Tuesday, November 20, 2012

Scenario for the Physical Block Corruption


SMON is unable to clean up the temporary segments because the blocks they occupy are currently fractured.

A fractured block is a clear symptom of serious problems in the O.S./H.W. layers.
Before writing a block down to disk, Oracle builds a small 4-byte value in the tail of the block from
fields in the block header, so that it can later verify that the block was written completely.

Please see the following example:
-----------------------
Corrupt block relative dba: 0x0380e573 (file 14, block 58739)
Fractured block found during buffer read
Data in bad block -
type: 6 format: 2 rdba: 0x0380e573
last change scn: 0x0288.8e5a2f78 seq: 0x1 flg: 0x04
consistency value in tail: 0x00780601 <- Should be 0x2f780601: low-order 2 bytes of the scn base, then block type, then seq
check value in block header: 0x8739, computed block checksum: 0x2f00
spare1: 0x0, spare2: 0x0, spare3: 0x0
***
Reread of rdba: 0x0380e573 (file 14, block 58739) found same corrupted data
-----------------------

This value was correct in the block that Oracle asked the O.S. to write but, unfortunately, the write did not
complete as a whole and only a partial write was done.
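
As a rough illustration of the tail check, the sketch below recomputes the expected tail value from the header fields printed in the trace above and compares it with the value found in the tail. It assumes the commonly described layout (low-order 2 bytes of the SCN base, then the block type, then seq), which matches the trailing 0601 (type 6, seq 0x1) in the example. The function names are hypothetical and are not part of any Oracle tool.

```python
import re


def expected_tail(scn_base, block_type, seq):
    """Assumed layout: low 2 bytes of the SCN base, block type, seq."""
    return ((scn_base & 0xFFFF) << 16) | ((block_type & 0xFF) << 8) | (seq & 0xFF)


def tails_from_trace(trace):
    """Return (expected, observed) tail values parsed from a trace excerpt."""
    block_type = int(re.search(r"\btype:\s*(\d+)", trace).group(1))
    scn = re.search(
        r"last change scn:\s*0x[0-9a-f]+\.([0-9a-f]+)\s+seq:\s*0x([0-9a-f]+)",
        trace,
        re.I,
    )
    tail = re.search(r"consistency value in tail:\s*0x([0-9a-f]+)", trace, re.I)
    expected = expected_tail(int(scn.group(1), 16), block_type, int(scn.group(2), 16))
    return expected, int(tail.group(1), 16)


# Header and tail lines copied from the trace above.
TRACE = """type: 6 format: 2 rdba: 0x0380e573
last change scn: 0x0288.8e5a2f78 seq: 0x1 flg: 0x04
consistency value in tail: 0x00780601"""

expected, observed = tails_from_trace(TRACE)
print(f"expected {expected:#010x}, observed {observed:#010x}")
# prints: expected 0x2f780601, observed 0x00780601
```

The mismatch in the upper bytes is what marks the block as fractured: the header and the tail no longer belong to the same write.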

Checks can be run against datafiles to validate the tail value of every block in them.
DBV (DBVERIFY) catches this kind of failure and can be run against your DB file(s) to check for it.
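
A minimal sketch of driving DBV from a script follows. It assumes the `dbv` executable is on the PATH and uses its `file`, `blocksize` and `logfile` parameters. The helper names, the 8192 default block size and the exact wording of the summary line being parsed are assumptions to check against your own DBV output.

```python
import re
import subprocess


def build_dbv_command(datafile, blocksize=8192, logfile=None):
    """Build the dbv argument list; blocksize must match the datafile's block size."""
    cmd = ["dbv", f"file={datafile}", f"blocksize={blocksize}"]
    if logfile:
        cmd.append(f"logfile={logfile}")
    return cmd


def corrupt_page_count(report):
    """Read the corrupt-page total from a DBV report (summary wording assumed)."""
    match = re.search(r"Total Pages Marked Corrupt\s*:\s*(\d+)", report)
    return int(match.group(1)) if match else None


def verify_datafile(datafile, blocksize=8192):
    """Run dbv and return the corrupt-page count, or None if no summary was found."""
    result = subprocess.run(
        build_dbv_command(datafile, blocksize),
        capture_output=True,
        text=True,
    )
    # Search both streams, since the report may not arrive on stdout.
    return corrupt_page_count(result.stdout + result.stderr)
```

A result of 0 from `verify_datafile` would suggest the file is clean; anything else (including None) deserves a manual look at the full report.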

Likewise, there is a clear path to follow when this happens.
These blocks were badly written by the O.S./H.W.; Oracle's own operations on the affected block(s) were correct
(otherwise, a different kind of error would have been printed out).

. What to do to recover from this scenario (fractured block(s)):
----------------------------------------------------------------
0.- Take the affected datafile(s) offline on the production instance, or shut down and startup mount the DB.
If a SYSTEM datafile is affected, you must shut down the DB and startup mount (you can't offline the SYSTEM TS).
1.- Restore a previous backup of the affected datafile(s) into a different location
2.- Check the restored file with DBV
- corruption exists -> Return to step 1 and select an older backup
- corruption does not exist -> Continue with step 3
3.- Rename the current datafiles at O.S. level to avoid overwriting them in case something fails
4.- Recover the affected datafile(s)
-> recover database ; -- this will recover exclusively the old (restored) datafiles found
5.- Bring the affected datafile(s) online, or open the database, after media recovery completes
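
The steps above can be sketched roughly as follows. This is only an outline under stated assumptions: `pick_clean_backup` and `recovery_plan` are hypothetical helpers, the `is_clean` callable stands in for whatever DBV check you use, and the offline path uses `RECOVER DATAFILE` as a narrower variant of the `recover database` shown in step 4. Taking a datafile offline this way also assumes the database runs in ARCHIVELOG mode.

```python
def pick_clean_backup(backups, is_clean):
    """Steps 1-2: try backups newest first and return the first clean one.

    backups  -- iterable of (taken_at, path) pairs, taken_at as an ISO date string
    is_clean -- callable(path) -> bool, e.g. a DBV run on the restored copy
    """
    for _taken_at, path in sorted(backups, reverse=True):
        if is_clean(path):
            return path
    return None  # every backup is corrupt: see the next section


def recovery_plan(file_numbers, system_affected=False):
    """Steps 0, 4 and 5: SQL*Plus commands to run before and after the restore."""
    if system_affected:
        # The SYSTEM tablespace cannot be taken offline, so work in MOUNT state.
        before = ["SHUTDOWN IMMEDIATE", "STARTUP MOUNT"]
        after = ["RECOVER DATABASE", "ALTER DATABASE OPEN"]
        return before, after
    before = [f"ALTER DATABASE DATAFILE {n} OFFLINE" for n in file_numbers]
    after = [f"RECOVER DATAFILE {n}" for n in file_numbers]
    after += [f"ALTER DATABASE DATAFILE {n} ONLINE" for n in file_numbers]
    return before, after


# File 14 is the datafile named in the trace; the backup paths are made up.
backups = [("2012-11-10", "/restore/f14_a.dbf"), ("2012-11-18", "/restore/f14_b.dbf")]
chosen = pick_clean_backup(backups, lambda path: path != "/restore/f14_b.dbf")
before, after = recovery_plan([14])
```

Step 3 (renaming the current datafiles at O.S. level) happens between the two command lists, once a clean backup has been chosen.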

. What if no backup exists or corruption exists in all backups:
---------------------------------------------------------------
In this case the only option is to rebuild the affected object.
We will not be able to get rid of these temporary segments themselves.

I believe the only thing that can be done here is:
.- Determine where the problem resides (at O.S./H.W. level)
.- In the meantime, move this DB (the correct part of it) to a new box
-> Note 733824.1: HowTo Recreate a database using TTS
