Venzi's Tech-Blog

14 November 2007

When the database dies…

Filed under: Oracle,Work — Venzi @ 15:34

Yesterday was a bad day for me as a DBA. It all began with two distributed transactions that held a lock on a table. It turned out that both were in-doubt transactions and could be rolled back. So what do you do in this case? Query the DBA_2PC_PENDING view, get the global transaction ID and execute a ROLLBACK FORCE.
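
A minimal sketch of that step in Python, assuming the pending entries have already been fetched from DBA_2PC_PENDING (for example with SELECT local_tran_id, global_tran_id, state FROM dba_2pc_pending). The helper only builds the ROLLBACK FORCE statements and does not talk to a database. The function name and the sample transaction IDs are made up for illustration, and treating only the 'prepared' state as in-doubt is an assumption.

```python
def rollback_force_statements(pending_rows):
    """Build one ROLLBACK FORCE statement per in-doubt transaction.

    pending_rows: iterable of (local_tran_id, global_tran_id, state) tuples,
    as they might be selected from DBA_2PC_PENDING.
    """
    statements = []
    for local_id, global_id, state in pending_rows:
        # Assumption: only entries in state 'prepared' are in-doubt.
        if state.strip().lower() != "prepared":
            continue
        # Double any single quote so the ID stays a valid SQL string literal.
        quoted = global_id.replace("'", "''")
        statements.append(f"ROLLBACK FORCE '{quoted}'")
    return statements


# Hypothetical rows, not taken from the incident described in this post.
rows = [
    ("1.21.17", "ORCL.example.1.21.17", "prepared"),
    ("2.5.301", "ORCL.example.2.5.301", "committed"),
]
for stmt in rollback_force_statements(rows):
    print(stmt)
```

Generating the statements first and reading them before running anything is deliberate: forcing a rollback is a manual override, so it is worth a second look at which transactions it will hit.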

Just two minutes later the first developer came to me and asked why the hell the database was down at 10 am. “What, database down?”, I said and reconnected. And really, the database had gone down. So I restarted it and opened the alert log to see what was going on. Nice: ORA-00600: internal error code, arguments: [kcbz_check_objd_typ_3], [0], [0], [1], [], [], [], []. While I was still reading the log file, barely a minute later, the database died again. Same ORA-600 three times. And again and again and again. Shit… OK, I opened Metalink and the ORA-600 lookup tool and searched for it. Nothing useful, just something about block corruption, damn. OK, time to log an SR with severity 1 with Oracle, since around 40 developers couldn’t work anymore. And what turned out? A little bit funny, I think:

The two transactions involved two different primary keys whose indexes contained a corrupt block. The instance detected that when I rolled back the transactions, and it went down. After startup the instance saw that there was something to recover, including of course these two distributed transactions. So it crashed again. Once Oracle Support and I had found out which two indexes were causing the trouble, it was an easy job. I mounted the database, took the datafile of the indexes offline, opened the database, dropped the indexes, recovered the datafiles, recreated the indexes and put the datafile back online.
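
The recovery sequence can be written down as an ordered list of SQL*Plus commands. The Python sketch below only generates that list and follows the order of the steps as told here. The datafile path, index names and function name are placeholders, and the exact statements were not part of the original write-up. In particular, indexes that back primary key constraints may have to be removed through the constraint rather than with a plain DROP INDEX, and the OFFLINE FOR DROP variant is an assumption based on the database running in NOARCHIVELOG mode.

```python
def index_recovery_script(datafile, index_names, archivelog=False):
    """Return the recovery steps as an ordered list of SQL*Plus commands."""
    # Assumption: in NOARCHIVELOG mode a datafile is taken offline with FOR DROP.
    offline = "OFFLINE" if archivelog else "OFFLINE FOR DROP"
    steps = [
        "STARTUP MOUNT",
        f"ALTER DATABASE DATAFILE '{datafile}' {offline};",
        "ALTER DATABASE OPEN;",
    ]
    steps += [f"DROP INDEX {name};" for name in index_names]
    steps.append(f"RECOVER DATAFILE '{datafile}'")
    # The original index definitions are unknown, so emit a reminder
    # comment instead of guessing a CREATE INDEX statement.
    steps += [f"-- recreate index {name} from its original DDL" for name in index_names]
    steps.append(f"ALTER DATABASE DATAFILE '{datafile}' ONLINE;")
    return steps


# Hypothetical file and index names, for illustration only.
for line in index_recovery_script("/u01/oradata/ORCL/indx01.dbf",
                                  ["APP.ORDERS_PK", "APP.ITEMS_PK"]):
    print(line)
```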

Quite another funny day in a DBA’s life…

2 Comments »

  1. Were you ever able to figure out why the indexes were corrupted? We are facing a similar situation when we use the XA driver. It seems it's a bug in Oracle distributed transactions using a temp table.

    Comment by chad — 12 December 2007 @ 00:21 | Reply

  2. Hi chad,

    Unfortunately not! The error occurred when rolling back the two in-doubt transactions. They both held a lock on the tables where the indexes got corrupted. Oracle Support was no longer able to determine the reason, because we didn’t run the database in archivelog mode, so the interesting redo logs were already gone.

    Regards

    Venzi

    Comment by venzi — 12 December 2007 @ 00:36 | Reply

