----------------------------------------------------------------------------------------------------
[2009-07] http://www.itpub.net/thread-1181636-3-1.html

> After Oracle8i, a complete checkpoint only occurs when you alter system checkpoint and when you
> shutdown a database in a mode other than abort. An incremental checkpoint occurs every 3 seconds
> and every time a logfile switch occurs.

I had in my mind two concepts, "fast vs. slow checkpoint" (which determines checkpoint priority) and "complete vs. incremental checkpoint" (which is about whether to write *all* buffers on the checkpoint queue). I searched on Metalink and Google and realized the first concept is old, used in Oracle 8 or older, and it seems to be the same as the second one. That is, there's only one concept, not two independent, "orthogonal" ones. Oracle just quietly renamed fast to complete and slow to incremental. The renaming is a good move, because it's really not about speed, but about how much of the checkpoint queue is processed.

With that understanding, I pulled my book from the shelf, "Oracle8 Backup and Recovery Handbook" by Rama Velpuri, published in 1998. On p.244, there's a table listing all types of checkpoints. Since this book is hard to find and there's no ebook version, let me summarize here (some post-Oracle8 concepts are of course not listed):

Fast checkpoints: alter system checkpoint (local or global), alter tablespace begin backup or offline (normal, temporary), instance shutdown (normal, immediate), log file switch stuck.
Slow checkpoints: alter system switch logfile, log file switch normal, checkpoint due to log_checkpoint_(timeout|interval).

So is the checkpoint triggered by a logfile switch complete or incremental? According to Rama's book, normally it is incremental, consistent with what you said. It only becomes complete when the switch gets stuck, which Rama clearly explains in a paragraph: for instance, when you only have 2 small logfiles, Oracle can't switch to the other file because the checkpoint against that old file has not completed yet.

----------------------------------------------------------------------------------------------------
Active Checkpoint Queue

Ref: http://www.vldb.org/conf/1998/p665.pdf

Assuming one single checkpoint queue (CQ) is wrong. The above article has a good picture, Figure 2, where there're two CQs, BCQ1 and BCQ2. (It doesn't show multiple DB writers, although the paragraph right before the end of Sec 3.1 mentions that multiple CQs make multiple DBWR processes possible.) Let's assume each DBWR writes the buffers on its own CQ.

Although there're multiple CQs, there's only one Active Checkpoint Queue (ACQ). If you check v$latch_children, you'll see multiple CQ latches but only one ACQ latch. So we can reasonably assume that it's the ACQ that records the ultimate checkpoint progress record, and it is this record that CKPT periodically writes into the control file.

Suppose DBWR1 works on BCQ1 and DBWR2 on BCQ2. If a checkpoint request comes in saying "please advance the checkpoint to c3" (see the picture), both DB writers start to work. Suppose DBWR1 works faster for some reason and finishes writing both buffers b1 and b2, but DBWR2 only finishes b4 and hasn't finished b5. The checkpoint progress cannot be said to have advanced to c3 yet. It's either c0 (somewhere to the left of c1), as if nothing had been done, or c1. (I'm not sure whether it'll be reported as c0 or c1; I doubt the ACQ is updated as frequently as every time a single buffer on a CQ is written.)
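A quick way to check the latch counts mentioned above; a minimal sketch, assuming an 11.2-era database and sysdba access (exact latch names may differ across versions):

  select name, count(*)
  from v$latch_children
  where name like '%checkpoint queue%'
  group by name;

Per the observation above, expect many rows for the checkpoint queue latch and a single one for the active checkpoint queue latch. As for the progress record that CKPT writes into the control file, a controlfile dump should show it; this dump event is well known, but the exact section heading may vary by version:

  alter session set events 'immediate trace name controlf level 10';
  -- the trace file should contain a checkpoint progress records section,
  -- one record per thread, showing the low cache RBA and on-disk RBA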
----------------------------------------------------------------------------------------------------
> Writing dirty buffers on CKPT-Q. These buffers won't be removed from the LRUW.

I have some doubt about that statement. On a small 11.2.0.3 database, I made some dirty buffers. Then

  alter session set events 'immediate trace name buffers level 1'

creates a trace file, and

  grep -c buffer_dirty tracefile

shows 87. Also

  grep ckptq tracefile | grep -cv 'ckptq: \[NULL\]'

shows 87. Then I ran alter system checkpoint local (it's RAC) and dumped the buffers again. The trace file contains 0 lines for both dirty buffers and non-empty checkpoint queues. So it seems that when I checkpoint, all dirty buffers become clean as well. (A consolidated sketch of this test appears at the end of this section.)

There may be better ways to verify whether the LRUW list becomes empty upon checkpoint. I thought I could verify that x$kcbwds.cnum_write (number of buffers on the main write list) or anum_write (the same but on the aux write list) would change from non-zero to zero. But it's always 0 for me.

I think I know why cnum_write and anum_write of x$kcbwds (number of blocks on the main and auxiliary LRUWs) are "always" zero (see my last paragraph). When DBWn is urged by a foreground session to write dirty buffers from the replacement lists, the foreground moves them to the main LRUW, and DBWn moves them to the aux LRUW and almost immediately writes them to disk. All this happens so quickly that there's very little time for me to catch non-empty LRUW lists (either main or aux).

vage's test: http://www.itpub.net/thread-1632432-1-1.html

"A dirty buffer may exist on two lists at the same time: LRUW and CKPT-Q. After a buffer is written from the LRUW, it's removed from the CKPT-Q. How about the other way? If it's written to disk from the CKPT-Q, will it be removed from the LRUW after the write? The answer is Yes."

I think this is what happens. On a regular DBWn write, dirty buffers are written from the LRUW (and immediately taken off the checkpoint queue). On a checkpoint, they are written from the checkpoint queue (and immediately cleaned off the LRUW).

(http://www.itpub.net/thread-1628433-1-1.html)
(http://www.itpub.net/thread-1455763-1-1.html)
(http://www.itpub.net/thread-1875065-1-1.html)

----------------------------------------------------------------------------------------------------
One question about Rama's book: according to the second bullet on p.247, checkpoint works from the latest dirty buffers toward the oldest. But p.246 says DBWR writes out the buffers in ascending, not descending, low-redo order. The two pages seem to contradict each other.
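Here is the dirty-buffer test above consolidated into one runnable sequence. A minimal sketch, assuming sqlplus connected as sysdba on 11.2 RAC; scott.emp is a hypothetical table and "tracefile" is a placeholder for the trace file actually produced:

  -- hypothetical DML to dirty some buffers; any small update works
  update scott.emp set sal = sal;
  commit;

  -- dump all buffer headers to a trace file
  alter session set events 'immediate trace name buffers level 1';

  -- from the shell, count dirty buffers and buffers linked on a checkpoint queue:
  --   grep -c buffer_dirty tracefile
  --   grep ckptq tracefile | grep -cv 'ckptq: \[NULL\]'

  alter system checkpoint local;

  -- dump again and re-run the two greps; per the test above, both counts drop to 0
  alter session set events 'immediate trace name buffers level 1';

And a sketch for watching the LRUW lists directly, using the x$kcbwds columns described above; since the lists empty almost immediately, it would have to be run in a tight loop while DBWn writes are in progress to have any chance of catching non-zero values:

  select set_id, cnum_write, anum_write
  from x$kcbwds
  where cnum_write > 0 or anum_write > 0;

----------------------------------------------------------------------------------------------------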