This section focuses on the recovery methods used by Oracle and various options available to the DBA. There are three basic types of recovery -- online block recovery, thread recovery, and media recovery. In all three cases, the algorithm that applies the redo records against an individual block is the same. But first, you need to understand the concepts of redo application, roll forward, and rollback mechanisms, and how Oracle determines that recovery is required for a data file(s).
When a database is started with the startup command from SVRMGR, there are various stages that the database goes through. The database first goes into the nomount state. In this state, Oracle reads the INIT.ORA file to determine the size of the SGA, creates the SGA, and starts the background processes. The DBA sees a message on the terminal at this time that says "instance started."
Next, the instance mounts the database. In this state, the control file is opened and the "database mounted" message is displayed. In the mounted state, commands such as recover database or any alter database command can be issued. The alter session command can be used to dump trace information from the control file, redo log file headers, data file headers, and data blocksto trace files.
In the third and final stage, the instance opens the database displaying the "database opened" message to the user screen. In this stage, it is verified that all the data and log files can be opened. If the instance is opening the database for the first time after a database crash, crash recovery needs to be performed. There are two steps to crash recovery. The first is to roll forward the database, where all the redo stored in the redo log files will be applied to the database files and a new thread will be opened. As part of the second step (known as transaction recovery), all uncommitted transactions are rolled back.
A common question asked is how does Oracle know when to apply recovery to a particular data file or data files? We have learned that each data file, in its header, has a checkpoint counter that gets incremented every time Oracle performs a checkpoint on the data file. The control file keeps a checkpoint counter for every data file as well. We have also learned that every data file header contains an SCN as part of its checkpoint structure. This is called the start SCN. Corresponding to every data file, the control file has a stop SCN. During normal operation of the database, the stop SCN in the control file is set to infinity. The start SCN in the data file is incremented every time a checkpoint is done.
When the database is shut down with the NORMAL or IMMEDIATE option, the checkpoint that is issued will set the stop SCN in the control file equal to the corresponding start SCN in the data file header for each data file. When the database is opened the next time, Oracle makes two checks. The first check is to see if the checkpoint counter in the data file header matches its corresponding checkpoint counter in the control file. Once it is the same, it does the second check. This check compares the start SCN value in the data file header to its corresponding stop SCN in the control file. If the stop SCN equals the start SCN, then no recovery is required for that file. This check is performed for every data file and then the database is opened. As part of the open, the stop SCNs are set to infinity again.
Now, take the case where you shut the database down hard using the shutdown abort command. In this case, a checkpoint is not performed and the stop SCN remains at infinity when the database goes down. During the next startup, the checkpoint counters are again matched first. If they are the same (i.e., you didn't replace the data files with a backup copy), then Oracle compares the stop and start SCNs. In this case, since the stop SCN is infinite and the start SCN has some value, Oracle determines that they are not the same, so thread recovery needs to be performed, in this case, since you are starting up the Instance after a crash, crash recovery will be performed. As part of the crash recovery, Oracle reads the online log files and applies the changes to the database as part of the roll forward, and reads the rollback segment's transaction table to perform transaction recovery (roll backward). Thread recovery is discussed later in this section.
After shutting down the database, if you replace one of the data files with a backup copy, Oracle detects this as part of the checkpoint counter check and asks you to apply media recovery. From the data file header, Oracle also knows the beginning log sequence number of the archived redo log file where recovery starts. Oracle requests that you apply media recovery starting from that log file sequence number.