The Oradbsafe : Hot backup mode Misconcepts

Misconception: Hot backup mode stops writing to the datafiles

During an Oracle tablespace hot backup, a script or program puts a tablespace into backup mode, then copies the datafiles to disk or tape, then takes the tablespace out of backup mode. Although these steps are widely understood, the idea that datafiles are not written during backup mode is probably the most alarming and widely held misconception about Oracle. So many people think this is true, that it is actually asserted as fact in the backup/recovery sections of a number of major published books on Oracle. Numerous websites on the topic also make the assertion that datafiles are not writable during backup. Many people have reported that Oracle Education DBA classes and third-party classes teach this fallacy as fact.

The erroneous descriptions have several permutations. Most claim that while the datafiles are allegedly not writable, changes are stored somewhere in the SGA, the redologs, the rollback segments or some combination thereof, then written back into the datafile when the tablespace is taken out of backup mode. The passage from Oracle Unleashed describing this supposed mechanism is representative.

When you place a tablespace in backup mode, the Oracle instance notes that a backup is being performed and internally compensates for it. As you know, it is impossible to make an authentic copy of a database file that is being written to. On receipt of the command to begin the backup, however, Oracle ceases to make direct changes to the database file. It uses a complex combination of rollback segments, buffers, redo logs, and archive logs to store the data until the end backup command is received and the database files are brought back in sync.

Simplifying a hot backup in this way is tantamount to classifying the USS Nimitz as a boat. The complexity of the actions taken by the Oracle RDBMS under a hot backup could consume an entire chapter and is beyond the scope of this book. What you should understand is the trade-off for taking a hot backup is increased use of rollback segments, redo logs, archive logs, and internal buffer areas within the SGA.

(From Oracle Unleashed, Copyright Ó SAMS/Macmillan, Inc. 1997, chapter 14)

In fact, Oracle’s tablespace hot backup does not work this way at all. Rather, it is a simple and failure-resistant mechanism. It absolutely does not stop writing the datafiles, and actually allows continued operation of the database almost exactly as during normal operation. Contrary to the characterization as “complex” in SAMS Oracle Unleashed, it can be almost completely summarized in one sentence.

· The tablespace is checkpointed, the checkpoint SCN marker in the datafile headers cease to increment with checkpoints, and full images of changed DB blocks are written to the redologs.

Those three actions are all that is required to guarantee consistency once the file is restored and recovery is applied. By freezing the checkpoint SCN in the file headers, any subsequent recovery on that backup copy of the file will know that it must commence at that SCN. Having an old SCN in the file header tells recovery that the file is an old one, and that it should look for the redolog file containing that SCN, and apply recovery starting there. Note that checkpoints to datafiles in hot backup mode are not suppressed during the backup, only the incrementing of the main checkpoint SCN flag. A “hot backup checkpoint” SCN marker in the file header continues to increment as periodic or incremental checkpoints progress normally.

There is a confusing side effect of having the checkpoint SCN artificially frozen at an SCN earlier than the true checkpointed SCN of the datafile. In the event of a system crash or a ‘shutdown abort’ during hot backup of a tablespace, the automatic crash recovery routine at startup will think that the files for that tablespace are quite out of date, and will actually suggest the application of old archived redologs in order to bring them back into sync with the rest of the database. It is unnecessary, in this case, to heed Oracle’s suggestion. With the database started up in mount mode, simply check v$backup and v$datafile to determine which datafiles were in backup mode at the time the database crashed. For each file in backup mode, issue an ‘alter database datafile '' end backup;’ This action will update the checkpoint SCN in the file headers to be the same as the hot backup checkpoint SCN (which is a true representation of the last SCN to which the datafile is truly checkpointed. Once this action is taken, it allows normal crash recovery to proceed during the ‘alter database open;’ command.

By initially checkpointing the datafiles that comprise the tablespace and logging full block images to redo, Oracle guarantees that any blocks changed in the datafile while in hot backup mode will also be available in the redologs in case they are ever used for a recovery.

It is well understood by much of the Oracle user community that during hot backup mode, a greater volume of redo is generated for changes to the tablespace being backed up than when the tablespace is not in backup mode. This is the result of the logging of full images of changed blocks in these tablespaces to the redologs. Normally, Oracle logs an entry in the redologs for every change in the database, but it does not log the whole image of the database block. By logging full images of changed DB blocks to the redologs during backup mode, Oracle eliminates the possibility of the backup containing irresolvable split blocks. To understand this reasoning, you must first understand what a split block is.

Typically, Oracle database blocks are a multiple of O/S blocks. For instance, most Unix filesystems have a default block size of 512 bytes, while Oracle’s default block size is 2k. This means that the filesystem stores data in 512 byte chunks, while Oracle performs reads and writes in 2k chunks, or multiples thereof. While backing up a datafile, your backup script makes a copy of the datafile from the filesystem, using O/S utilities such as copy, dd, cpio, or OCOPY. As it is making this copy, it is reading in O/S-block sized increments. If the database writer happens to be writing a DB block into the datafile at the same time that your script is reading that block’s constituent O/S blocks, your backup copy of the DB block could contain some O/S blocks from before the database performed the write, and some from after. This would be a split block.

By logging the full block image of the changed block to the redologs, it guarantees that in the event of a recovery, any split blocks that might be in the backup copy of the datafile will be resolved by overlaying them with the full legitimate image of the block from the redologs. Upon completion of a recovery, any blocks that got copied in a split state into the backup will have been resolved through application of full block images from the redologs.

It is important to remember that all of these mechanisms exist for the benefit of the backup copy of the files and the recovery process, and have very little effect on the datafiles of the database being backed up. The database files are read by server processes and written by the database writer throughout the backup, as they are when a backup is not taking place. The only difference manifested in the open database files is the freezing of the checkpoint SCN, and the incrementing of the hot-backup SCN.

To demonstrate the principle, we can formulate a simple proof.

· A table is created and a row of data inserted.

SQL> create table fruit (name varchar2(32)) tablespace administrator;

Table created.

SQL> insert into fruit values ('orange');

1 row created.

SQL> commit;

Commit complete.

· A checkpoint is forced, to flush dirty DB block buffers to the datafiles.

SQL> alter system checkpoint;

System altered.

· The block number in the datafile where the data resides is determined.

SQL> select dbms_rowid.rowid_block_number(rowid) blk, name from fruit;

BLK NAME

------- ----------------

3 orange

· The DB block containing an identifiable piece of the data is excised from a datafile using the Unix dd command, allowing 9 DB blocks for the raw partition header and datafile header, plus the block offset of three.

unixhost% dd ibs=8192 skip=11 count=1 \

> if=/raw-01/databases/toy/administrator-01.dbf
strings

1+0 records in

16+0 records out

orange

· The tablespace is placed into hot backup mode.

SQL> alter tablespace administrator begin backup;

Tablespace altered.

· The row is updated, committed, and a global checkpoint forced on the database.

SQL> update fruit set name = 'plum';

1 row updated

SQL> commit;

Commit complete.

SQL> alter system checkpoint;

System altered.

· The same block is extracted, showing that the DB block has been written on disk.

unixhost% dd ibs=8192 skip=11 count=1 \

> if=/raw-01/databases/toy/administrator-01.dbf
strings

1+0 records in

16+0 records out

orange,

plum

· The tablespace is taken out of backup mode

SQL> alter tablespace administrator end backup;

It is quite clear from this demonstration that datafiles receive writes even during hot backup mode.

During an Oracle tablespace hot backup, a script or program puts a tablespace into backup mode, then copies the datafiles to disk or tape, then takes the tablespace out of backup mode. Although these steps are widely understood, the idea that datafiles are not written during backup mode is probably the most alarming and widely held misconception about Oracle. So many people think this is true, that it is actually asserted as fact in the backup/recovery sections of a number of major published books on Oracle. Numerous websites on the topic also make the assertion that datafiles are not writable during backup. Many people have reported that Oracle Education DBA classes and third-party classes teach this fallacy as fact.

The erroneous descriptions have several permutations. Most claim that while the datafiles are allegedly not writable, changes are stored somewhere in the SGA, the redologs, the rollback segments or some combination thereof, then written back into the datafile when the tablespace is taken out of backup mode. The passage from Oracle Unleashed describing this supposed mechanism is representative.

When you place a tablespace in backup mode, the Oracle instance notes that a backup is being performed and internally compensates for it. As you know, it is impossible to make an authentic copy of a database file that is being written to. On receipt of the command to begin the backup, however, Oracle ceases to make direct changes to the database file. It uses a complex combination of rollback segments, buffers, redo logs, and archive logs to store the data until the end backup command is received and the database files are brought back in sync.

Simplifying a hot backup in this way is tantamount to classifying the USS Nimitz as a boat. The complexity of the actions taken by the Oracle RDBMS under a hot backup could consume an entire chapter and is beyond the scope of this book. What you should understand is the trade-off for taking a hot backup is increased use of rollback segments, redo logs, archive logs, and internal buffer areas within the SGA.

(From Oracle Unleashed, Copyright Ó SAMS/Macmillan, Inc. 1997, chapter 14)

In fact, Oracle’s tablespace hot backup does not work this way at all. Rather, it is a simple and failure-resistant mechanism. It absolutely does not stop writing the datafiles, and actually allows continued operation of the database almost exactly as during normal operation. Contrary to the characterization as “complex” in SAMS Oracle Unleashed, it can be almost completely summarized in one sentence.

· The tablespace is checkpointed, the checkpoint SCN marker in the datafile headers cease to increment with checkpoints, and full images of changed DB blocks are written to the redologs.

Those three actions are all that is required to guarantee consistency once the file is restored and recovery is applied. By freezing the checkpoint SCN in the file headers, any subsequent recovery on that backup copy of the file will know that it must commence at that SCN. Having an old SCN in the file header tells recovery that the file is an old one, and that it should look for the redolog file containing that SCN, and apply recovery starting there. Note that checkpoints to datafiles in hot backup mode are not suppressed during the backup, only the incrementing of the main checkpoint SCN flag. A “hot backup checkpoint” SCN marker in the file header continues to increment as periodic or incremental checkpoints progress normally.

There is a confusing side effect of having the checkpoint SCN artificially frozen at an SCN earlier than the true checkpointed SCN of the datafile. In the event of a system crash or a ‘shutdown abort’ during hot backup of a tablespace, the automatic crash recovery routine at startup will think that the files for that tablespace are quite out of date, and will actually suggest the application of old archived redologs in order to bring them back into sync with the rest of the database. It is unnecessary, in this case, to heed Oracle’s suggestion. With the database started up in mount mode, simply check v$backup and v$datafile to determine which datafiles were in backup mode at the time the database crashed. For each file in backup mode, issue an ‘alter database datafile '' end backup;’ This action will update the checkpoint SCN in the file headers to be the same as the hot backup checkpoint SCN (which is a true representation of the last SCN to which the datafile is truly checkpointed. Once this action is taken, it allows normal crash recovery to proceed during the ‘alter database open;’ command.

By initially checkpointing the datafiles that comprise the tablespace and logging full block images to redo, Oracle guarantees that any blocks changed in the datafile while in hot backup mode will also be available in the redologs in case they are ever used for a recovery.

It is well understood by much of the Oracle user community that during hot backup mode, a greater volume of redo is generated for changes to the tablespace being backed up than when the tablespace is not in backup mode. This is the result of the logging of full images of changed blocks in these tablespaces to the redologs. Normally, Oracle logs an entry in the redologs for every change in the database, but it does not log the whole image of the database block. By logging full images of changed DB blocks to the redologs during backup mode, Oracle eliminates the possibility of the backup containing irresolvable split blocks. To understand this reasoning, you must first understand what a split block is.

Typically, Oracle database blocks are a multiple of O/S blocks. For instance, most Unix filesystems have a default block size of 512 bytes, while Oracle’s default block size is 2k. This means that the filesystem stores data in 512 byte chunks, while Oracle performs reads and writes in 2k chunks, or multiples thereof. While backing up a datafile, your backup script makes a copy of the datafile from the filesystem, using O/S utilities such as copy, dd, cpio, or OCOPY. As it is making this copy, it is reading in O/S-block sized increments. If the database writer happens to be writing a DB block into the datafile at the same time that your script is reading that block’s constituent O/S blocks, your backup copy of the DB block could contain some O/S blocks from before the database performed the write, and some from after. This would be a split block.

By logging the full block image of the changed block to the redologs, it guarantees that in the event of a recovery, any split blocks that might be in the backup copy of the datafile will be resolved by overlaying them with the full legitimate image of the block from the redologs. Upon completion of a recovery, any blocks that got copied in a split state into the backup will have been resolved through application of full block images from the redologs.

It is important to remember that all of these mechanisms exist for the benefit of the backup copy of the files and the recovery process, and have very little effect on the datafiles of the database being backed up. The database files are read by server processes and written by the database writer throughout the backup, as they are when a backup is not taking place. The only difference manifested in the open database files is the freezing of the checkpoint SCN, and the incrementing of the hot-backup SCN.

To demonstrate the principle, we can formulate a simple proof.

· A table is created and a row of data inserted.

SQL> create table fruit (name varchar2(32)) tablespace administrator;

Table created.

SQL> insert into fruit values ('orange');

1 row created.

SQL> commit;

Commit complete.

· A checkpoint is forced, to flush dirty DB block buffers to the datafiles.

SQL> alter system checkpoint;

System altered.

· The block number in the datafile where the data resides is determined.

SQL> select dbms_rowid.rowid_block_number(rowid) blk, name from fruit;

BLK NAME

------- ----------------

3 orange

· The DB block containing an identifiable piece of the data is excised from a datafile using the Unix dd command, allowing 9 DB blocks for the raw partition header and datafile header, plus the block offset of three.

unixhost% dd ibs=8192 skip=11 count=1 \

> if=/raw-01/databases/toy/administrator-01.dbf
strings

1+0 records in

16+0 records out

orange

· The tablespace is placed into hot backup mode.

SQL> alter tablespace administrator begin backup;

Tablespace altered.

· The row is updated, committed, and a global checkpoint forced on the database.

SQL> update fruit set name = 'plum';

1 row updated

SQL> commit;

Commit complete.

SQL> alter system checkpoint;

System altered.

· The same block is extracted, showing that the DB block has been written on disk.

unixhost% dd ibs=8192 skip=11 count=1 \

> if=/raw-01/databases/toy/administrator-01.dbf
strings

1+0 records in

16+0 records out

orange,

plum

· The tablespace is taken out of backup mode

SQL> alter tablespace administrator end backup;

It is quite clear from this demonstration that datafiles receive writes even during hot backup mode.

The Oradbsafe

Wednesday, June 13, 2012

Hot backup mode Misconcepts

No comments:

Post a Comment

ORA-12519: TNS:no appropriate service handler found error