Tuesday, November 20, 2012

External tables to monitor OS space on UNIX



Follow the steps below to create the external table in an Oracle database.
(Note: here we create an external table to monitor OS space information, assuming a UNIX-flavored OS.)

SQL> create directory Directory_Name as '/usr/tmp';

SQL> CREATE TABLE server_space
(
  FILESYSTEM VARCHAR2(100),
  TOTAL_GB VARCHAR2(10),
  USED_GB VARCHAR2(10),
  FREE_GB VARCHAR2(10),
  USED_SPACE VARCHAR2(10),
  MOUNTED_ON VARCHAR2(100)
)
ORGANIZATION EXTERNAL
 (  TYPE ORACLE_LOADER
    DEFAULT DIRECTORY Directory_Name
    ACCESS PARAMETERS
    (
       RECORDS DELIMITED BY NEWLINE
       NOBADFILE
       NODISCARDFILE
       NOLOGFILE
       SKIP 0
       FIELDS TERMINATED BY WHITESPACE
       MISSING FIELD VALUES ARE NULL
       REJECT ROWS WITH ALL NULL FIELDS
       (
         FILESYSTEM CHAR,
         TOTAL_GB CHAR,
         USED_GB CHAR,
         FREE_GB CHAR,
         USED_SPACE CHAR,
         MOUNTED_ON CHAR
       )
    )
    LOCATION (Directory_Name:'server.txt')
 ) REJECT LIMIT UNLIMITED;


Schedule the line below in crontab so that the file is refreshed periodically:
df -h | grep -v Filesystem | awk '{print $1" "$2" "$3" "$4" "$5" "$6}' > /usr/tmp/server.txt
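As a sanity check, the grep/awk transformation can be mirrored in Python (`format_df_output` is a hypothetical helper for illustration, not part of the setup above):

```python
# Mirror of: df -h | grep -v Filesystem | awk '{print $1" "$2" "$3" "$4" "$5" "$6}'
def format_df_output(df_text):
    rows = []
    for line in df_text.splitlines():
        # grep -v Filesystem: drop any line containing "Filesystem" (the header)
        if "Filesystem" in line or not line.strip():
            continue
        # awk: keep the first six whitespace-separated fields
        rows.append(" ".join(line.split()[:6]))
    return "\n".join(rows)

sample = (
    "Filesystem Size Used Avail Use% Mounted on\n"
    "/dev/sda1 50G 20G 30G 40% /\n"
    "tmpfs 7.8G 0 7.8G 0% /dev/shm\n"
)
print(format_df_output(sample))
```

Each output row lines up with the six columns of the SERVER_SPACE external table, so the ORACLE_LOADER driver can split them on whitespace.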

Now you can monitor server-side disk space without logging in to the server.

SQL> select * from server_space;




Oracle's RMAN Validation for Identifying Corruption


Follow the steps below to identify corruption, and the corrupted objects, in the database.

Step 1: Identify the corrupt blocks:

RMAN> connect target /
RMAN> backup validate check logical database;

or

1-- RMAN> configure device type disk parallelism 4;  
    RMAN> backup validate check logical database;    
--------------------------------------------------------------------------------

The corrupted blocks are listed in the view v$database_block_corruption:

2-- SQL> select * from v$database_block_corruption;
--------------------------------------------------------------------------------

Step 2: Identify the corrupt segments:

1-- drop table corrupted_objects;
------------------------------------------------------------------------------

2--
create table corrupted_objects
as
select owner, segment_name, partition_name, segment_type, 1 file#, 1 block#
from dba_extents
where 1=2;
------------------------------------------------------------------------------

3--
declare
cursor C1 is select file#, block#, blocks
from v$database_block_corruption;
begin
for r1 in C1 loop
  for r2 in 1..r1.blocks loop
   insert into corrupted_objects
   select owner, segment_name, partition_name,segment_type, r1.file#, r1.block#+r2-1
     from dba_extents
    where file_id=r1.file#
      and (r1.block#+r2-1) between block_id and block_id + blocks - 1;
   if sql%notfound then
   /* segment header block might be corrupt causing dba_extents not returning rows. */
    insert into corrupted_objects
      select owner, segment_name, partition_name, segment_type, r1.file#, r1.block#+r2-1
       from dba_segments
      where header_file=r1.file# and header_block = (r1.block#+r2-1);
   end if;
   commit;
  end loop;
 end loop;
end;
/
----------------------------------------------------------------------------------
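The heart of the PL/SQL in step 3 is a simple range test: a corrupt block belongs to the extent whose [block_id, block_id + blocks - 1] range contains it. A minimal Python sketch of that lookup, using hypothetical in-memory stand-ins for v$database_block_corruption and dba_extents:

```python
def map_corrupt_blocks(corruptions, extents):
    """corruptions: list of (file#, block#, blocks) rows as in v$database_block_corruption.
    extents: list of (owner, segment_name, file_id, block_id, blocks) rows as in dba_extents.
    Returns one (owner, segment_name, file#, block#) row per corrupt block found."""
    hits = []
    for cfile, cblock, cblocks in corruptions:
        for b in range(cblock, cblock + cblocks):   # walk each block in the corrupt range
            for owner, seg, efile, eblock, eblocks in extents:
                # same predicate as the PL/SQL:
                # block between block_id and block_id + blocks - 1
                if efile == cfile and eblock <= b <= eblock + eblocks - 1:
                    hits.append((owner, seg, cfile, b))
    return hits
```

In the real script, the fallback query against dba_segments covers the case where the corrupt block is the segment header itself, so dba_extents returns no rows.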

4-- Select all the objects populated in the table corrupted_objects:

SQL> select distinct owner, segment_name, partition_name, segment_type from corrupted_objects;

Export/Import and Corrupt Blocks in an Oracle Database


While performing an export backup of a database, schema, or individual objects:
If a corrupted object is encountered, an error is written to the export log identifying the object that was not exported.
Therefore, no corrupt object blocks end up in the export file.

1. Export does not detect all corruptions:
  - does not detect disk corruptions above the high water mark;
  - does not detect corruptions in indexes, or in free or temporary extents;
  - does not detect all corruptions in the data dictionary (the export
     utility does not read all of the data dictionary, so there could still be undetected corruptions in the
     SYSTEM tablespace).

2. Only a Conventional path export is able to detect logically corrupt blocks.
  When using a Direct path export, the data is read from disk directly
  into the export session's program global area (PGA): the rows are
  transferred directly to the Export session's private buffer. This means
  that the SQL command-processing layer (evaluation buffer) is bypassed,
  because the data is already in the format that Export expects.
  Therefore, do not specify the export parameter DIRECT=Y; use DIRECT=N,
  which is also the default value for the DIRECT parameter.

Therefore, use a conventional export and not the direct path.

A Physical Block Corruption Scenario


SMON is unable to clean up the temporary segments because the underlying blocks are fractured.

A fractured block is a clear symptom of serious issues in the OS/hardware layers.
Oracle keeps track of each block's header and, before writing the block to disk, builds a small
4-byte value in the tail of the block to guarantee that it is written correctly.

See the following example:
-----------------------
Corrupt block relative dba: 0x0380e573 (file 14, block 58739)
Fractured block found during buffer read
Data in bad block -
type: 6 format: 2 rdba: 0x0380e573
last change scn: 0x0288.8e5a2f78 seq: 0x1 flg: 0x04
consistency value in tail: 0x00780601 <- should be built from the least significant bytes of the scn, plus the block type and seq
check value in block header: 0x8739, computed block checksum: 0x2f00
spare1: 0x0, spare2: 0x0, spare3: 0x0
***
Reread of rdba: 0x0380e573 (file 14, block 58739) found same corrupted data
-----------------------
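From the trace above, the expected tail can be reconstructed by hand: the two least significant bytes of the SCN base (0x2f78), the block type (0x06), and the sequence (0x01) pack into 0x2f780601, whereas the block on disk carried 0x00780601. A small sketch of that check (field layout as read from the trace; this is an illustration, not Oracle's internal code):

```python
def expected_tail(scn_base, block_type, seq):
    # low 2 bytes of the SCN base, then the block type byte, then the seq byte
    return ((scn_base & 0xFFFF) << 16) | ((block_type & 0xFF) << 8) | (seq & 0xFF)

# Values taken from the alert log excerpt above
computed = expected_tail(0x8e5a2f78, 0x06, 0x01)
found_on_disk = 0x00780601
print(hex(computed))                # 0x2f780601
print(computed == found_on_disk)    # False -> fractured block
```

The mismatch between the computed tail and the one found on disk is exactly what "Fractured block found during buffer read" reports.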

This value was correct on the block Oracle asked the OS to write, but unfortunately the write did not
complete as a whole; only a partial write was done.

There are checks that can be run against datafiles to verify the tail values of all their blocks.
DBVERIFY (dbv) catches this kind of failure and can be used against your database file(s) to check this.

Likewise, there is a clear path to follow when this happens.
These blocks were badly written by the OS/hardware, so Oracle's own operations on the affected block(s) were correct
(otherwise, a different kind of error would have been printed out).

. What to do to recover from this scenario (fractured block(s)):
----------------------------------------------------------------
0.- Offline the affected datafile(s) on the production instance, or shutdown + startup mount the DB.
If the SYSTEM datafile is affected, you must shut down the DB + startup mount (you cannot offline the SYSTEM tablespace).
1.- Restore previous backup of affected datafile(s) into different location
2.- Check with DBV the file restored
- corruption exists -> Return to step 1 by selecting an 'older' backup
- corruption does not exist -> Continue with step 3.-
3.- Rename current datafiles at o.s. level to avoid overwriting in case something fails
4.- Recover affected datafile(s)
-> recover database ; -- this will recover exclusively the old datafiles found
5.- Online affected datafile(s) or open database after media recovery complete

. What if no backup exists or corruption exists in all backups:
---------------------------------------------------------------
In this case we can only rebuild the affected object.
We will not be able to get rid of these temporary segments.

I believe the only things that can be done here are:
.- Determine where the problem resides (at the OS/HW level)
.- In the meantime, move this DB (the correct part of it) to a new box
-> 733824.1 HowTo Recreate a database using TTS

Block Corruption in Oracle Database


Oracle classifies data file block corruptions as Physical and Logical:

Physical Block Corruptions

This kind of block corruption is normally reported by Oracle with error ORA-1578, and a detailed corruption description is printed in the alert log.

Corruption Examples are:

    Bad header - the beginning of the block (cache header) is corrupt with invalid values
    The block is Fractured/Incomplete - header and footer of the block do not match
    The block checksum is invalid
    The block is misplaced
    Zeroed out blocks / ORA-8103

Detailed Corruption Description:

Fractured Block:
A Fractured block means that the block is incomplete. Information from the block header does not match the block tail.

Bad Checksum:
Block checksums are used to identify whether a block was changed by something external to Oracle after the block was last written by Oracle.
The checksum is calculated by DBWR or the direct loader before writing the block to disk, and is stored in the block header. Every time the block is read, if db_block_checksum is not FALSE, Oracle calculates a checksum and compares it to the one stored in the block header. Reference Note 30706.1.
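The idea can be illustrated with a simplified 16-bit XOR fold over the block contents (illustrative only; Oracle's actual checksum algorithm is internal and differs in detail):

```python
def xor16_checksum(block: bytes) -> int:
    # Fold the block into a 16-bit value by XOR-ing successive little-endian 2-byte words.
    acc = 0
    for i in range(0, len(block) - 1, 2):
        acc ^= block[i] | (block[i + 1] << 8)
    return acc

block = bytes([0x01, 0x02, 0x03, 0x04])
stored = xor16_checksum(block)              # checksum written along with the block
tampered = bytes([0x01, 0x02, 0x03, 0x05])  # block modified outside Oracle
print(stored == xor16_checksum(tampered))   # False -> checksum mismatch detected on read
```

Any bit flipped between write and read changes the folded value, which is how a stored-vs-computed mismatch flags external modification.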

Block Misplaced:
This is when Oracle detects that the content of the block being read belongs to a different block, yet the checksum is valid.


********************************************************************************

Logical Block Corruptions

This is when the block contains a valid checksum but the structure below the beginning of the block is corrupt (the block content is corrupt). It may cause different ORA-600 errors.

The detailed corruption description for Logical Corruptions are not normally printed in the alert.log. DBVerify will report what is logically corrupted in the block.

Corruption Examples are:

    row locked by non-existent transaction - ORA-600 [4512], etc.
    the amount of space used is not equal to the block size
    avsp bad
    etc.

When db_block_checking is enabled, it may produce the internal errors ORA-600 [kddummy_blkchk] or ORA-600 [kdBlkCheckError].


If db_block_checking is enabled and the block is already logically corrupt on disk, the next block update will mark the block as Soft Corrupt and future reads of this block will produce the error ORA-1578. In that case DBVerify reports this corruption with error "DBV-200: Block, dba <rdba>, already marked corrupted".

How to Find Special Characters in a Table



To find special characters in a particular column:

SELECT *
  FROM table_name
 WHERE RTRIM
          ((LTRIM
               (TRANSLATE
                   (REPLACE (&p_column, 'A', ''),
                       'ABCDEFGHIJKLMNOPQRSTUVWXYZ
abcdefghijklmnopqrstuvwxyz0123456789@#$%^&*()_+-=,!\`~{}./?:";''[]'
                    || CHR (160)
                    || CHR (32)
                    || CHR (0)
                    || CHR (253),
                    'A'
                   )
               )
           )
          ) IS NOT NULL;
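The TRANSLATE/REPLACE trick above deletes every expected character and flags rows where anything is left over (the query additionally whitelists CHR(0), CHR(32), CHR(160), and CHR(253)). The same logic in Python, with the allowed set copied from the query (`has_special_chars` is a hypothetical helper for illustration):

```python
# Characters the query treats as "normal"; anything outside this set is special.
ALLOWED = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
           "abcdefghijklmnopqrstuvwxyz"
           "0123456789"
           "@#$%^&*()_+-=,!\\`~{}./?:\";'[] ")

def has_special_chars(value):
    # Delete every allowed character; whatever survives is "special".
    return value.translate({ord(c): None for c in ALLOWED})

print(has_special_chars("Hello_World-123"))  # returns "" -> clean row
print(has_special_chars("naïve"))            # returns "ï" -> row would be flagged
```

A non-empty result corresponds to the query's RTRIM(LTRIM(...)) IS NOT NULL condition.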




You can also use the PL/SQL block below to find special characters across all the columns of a table.

SET SERVEROUTPUT ON SIZE 1000000
DECLARE
  PROCEDURE GOOEY(V_TABLE VARCHAR2, V_COLUMN VARCHAR2) IS
    TYPE T_ID   IS TABLE OF NUMBER;
    TYPE T_DUMP IS TABLE OF VARCHAR2(20000);
    TYPE T_DATA IS TABLE OF VARCHAR2(20000);
    L_ID   T_ID;
    L_DATA T_DATA;
    L_DUMP T_DUMP;
    CURSOR A IS
      SELECT DISTINCT COLUMN_NAME
        FROM DBA_TAB_COLUMNS
       WHERE TABLE_NAME = V_TABLE
         AND DATA_TYPE = 'VARCHAR2'
         AND COLUMN_NAME NOT IN ('CUSTOMER_KEY','ADDRESS_KEY');
  BEGIN
    FOR X IN A LOOP
      L_ID   := NULL;
      L_DATA := NULL;
      L_DUMP := NULL;
      EXECUTE IMMEDIATE 'SELECT ' || V_COLUMN || ', ' || X.COLUMN_NAME || ', '
        || 'dump(' || X.COLUMN_NAME || ')'
        || ' FROM ' || V_TABLE
        || ' WHERE RTRIM((LTRIM(REPLACE(TRANSLATE(' || X.COLUMN_NAME
        || ',''ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789@#$%^&*()_+-=,!\`~{}./?:";''''[ ]'',''A''), ''A'', '''')))) IS NOT NULL'
        BULK COLLECT INTO L_ID, L_DATA, L_DUMP;
      IF L_ID.COUNT > 0 THEN
        FOR K IN 1 .. L_ID.COUNT LOOP
          DBMS_OUTPUT.PUT_LINE(V_TABLE || ' - ' || X.COLUMN_NAME || ' - '
            || TO_CHAR(L_ID(K), '999999999999'));
          DBMS_OUTPUT.PUT_LINE(L_DATA(K));
          DBMS_OUTPUT.PUT_LINE(L_DUMP(K));
          DBMS_OUTPUT.PUT_LINE('*********************');
        END LOOP;
      END IF;
    END LOOP;
  END GOOEY;
BEGIN
  GOOEY('TABLENAME1','COLUMN_NAME');
  GOOEY('TABLENAME2','COLUMN_NAME');
  GOOEY('TABLENAME3','COLUMN_NAME');
END;
/

Friday, November 2, 2012

Service Management in Concurrent Processing

An Oracle Applications system depends on a variety of services such as Forms Listeners, HTTP Servers, Concurrent Managers, and Workflow Mailers. Such services are composed of one or more processes that must be kept running for the proper functioning of the applications. Previously many of these processes had to be individually started and monitored by system administrators. Management of these processes was complicated by the fact that these services could be distributed across multiple host machines. Service Management helps to greatly simplify the management of these processes by providing a fault tolerant service framework and a central management console built into Oracle Applications Manager.

Service Management is an extension of concurrent processing, which provides a powerful framework for managing processes on multiple host machines. With Service Management, virtually any application tier service can be integrated into this framework. Services such as the Oracle Forms Listener, Oracle Reports Server, Apache Web listener, and Oracle Workflow Mailer can be run under Service Management.

With Service Management, the Internal Concurrent Manager (ICM) manages the various service processes across multiple hosts. On each host, a Service Manager acts on behalf of the ICM, allowing the ICM to monitor and control service processes on that host. System administrators can then configure, monitor, and control services through a management console which communicates with the ICM.

Generic Service Management

 

Service Management provides a fault tolerant system. If a service process exits unexpectedly, the ICM will automatically attempt to restart the process. If a host fails, the ICM may start the affected service processes on a secondary host. The ICM itself is monitored and kept alive by Internal Monitor processes located on various hosts.

Service Management provides significant improvements in the manageability of Oracle Applications. System administrators can now use the central console in Oracle Applications Manager (OAM) to manage a variety of services that formerly had to be managed independently on separate hosts. The entire set of system services may be started or stopped with a single action. Service Management also provides a great benefit by automatically compensating for certain system failures.

Service processes are very much like concurrent manager and transaction manager processes. They must be kept running on a middle tier for the proper functioning of their respective products. The concurrent processing management feature has been built for concurrent managers and transaction managers, to provide fault tolerance, process distribution, and simplified configuration and control.


Benefits of Service Management
• The service processes will no longer need to be manually and individually started and monitored by Oracle Applications system administrators.
• Services can take advantage of the process distribution and fault tolerance capabilities that have been developed for concurrent processing.
• As with concurrent manager processes, system administrators can use work shifts to determine the number of processes that will be active for a service on a given node for a given time period.

To extend process management support to the various Applications services, the Internal Concurrent Manager must be able to start, monitor, and control processes on all Applications tiers. Every node of every tier will have an Oracle RPC-based Service Manager installed. The ICM will use the Service Manager to manage processes.

Service
A service is a process or collection of processes that perform actions at the request of client processes. A concurrent manager is a type of service where the client submits a request for actions to be processed while the client continues to do other work. While active, a service must have one or more listener processes that wait to process requests from clients. An example of a listener is a concurrent manager process which periodically polls a queue for requests to process.

Service Instance
Each service controlled by service management may have multiple service instances.
Each instance may consist of one or more processes.

Concurrent:GSM Enabled Profile Option
The Concurrent:GSM Enabled profile option should be set to Y to enable Service Management. It is set automatically to Y by AutoConfig. Disabling Service Management is not recommended as that may prevent necessary services from starting.

Service Management and Control Scripts
With Service Management, the Apache Server, Forms Server, Forms Metrics Server, Forms Metrics Client, and Reports services can be managed through Oracle Applications Manager. When these services are enabled for Service Management, they can still be controlled using the control scripts listed below; for example, using adapcctl.sh (UNIX) or adapcctl (Windows).

These control scripts are generated by AutoConfig for the Forms Listener, Reports Server, and other application tier services, and they synchronize with Service Management. If you start or stop a service using one of these scripts, Service Management is notified of the change. If the Service Management infrastructure is not running, the control scripts can be used to control individual services; the service status is synchronized with Service Management when the Internal Concurrent Manager (ICM) is restarted. Running one of these control scripts from the command line starts or stops the respective service synchronously, and the FNDSVCRG program and the ICM handle the data collection.
The control scripts that can be managed by Service Management are:
• adapcctl.sh (Apache)
• adfrmctl.sh (Forms)
• adfmsctl.sh (Metrics Server)
• adfmcctl.sh (Metrics Client)
• adrepctl.sh (Reports)

Network Failure Recovery
As part of its shutdown process, the ICM determines whether it is being forced to shut down because it lost its database connection. This is done by looking for the specific error messages ORA-3113, ORA-3114, or ORA-1041. If one of these errors is detected, the ICM spawns the reviver process, which attempts to make a database connection. If unsuccessful, it sleeps for a period before trying again. This continues until either a successful connection is made or it receives a signal to shut itself down. When a successful connection is made, the process kills the old ICM database session and then starts a new ICM using the normal start manager script. Once the ICM is restarted, it starts up any other managers that had also shut down, and normal processing resumes.

Failover Sensitive Workshifts
Nodes can become overloaded when a middle-tier node fails and service instances on that node failover to their secondary nodes. The Load Distribution feature allows the System Administrator to control the allocation of resources during normal processing. The Failover Sensitivity feature allows Work Shifts to failover with fewer processes than on the original node. This lessens the impact on the existing resources allocated on the secondary node.
The number of failover processes is entered as part of the standard Work Shift settings in the Service Instance Definition. When failover occurs, the ICM uses the Failover Processes value in place of the normal running processes value as it iterates through service instances to perform queue sizing.