在Oracle中有一个事件叫Heartbeat,这个词在很多地方被提及,并且有着不同的含义(比如RAC中),我们这里要讨论的是CKPT的Heartbeat机制。
Oracle通过CKPT进程每3秒将Heartbeat写入控制文件,以减少故障时的恢复时间(这个我们后面再详细阐述)。
我们可以通过如下方法验证这个过程。
1.首先在系统级启用10046时间跟踪
并重新启动数据库使之生效
[oracle@jumper oracle]$ sqlplus "/ as sysdba"
SQL*Plus: Release 9.2.0.4.0 - Production on Thu Jan 19 09:24:04 2006
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning option
JServer Release 9.2.0.4.0 - Production
SQL> alter system set event='10046 trace name context forever,level 12' scope=spfile;
System altered.
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area 114365800 bytes
Fixed Size 451944 bytes
Variable Size 50331648 bytes
Database Buffers 62914560 bytes
Redo Buffers 667648 bytes
Database mounted.
Database opened.
SQL> exit
Disconnected from Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning option
JServer Release 9.2.0.4.0 - Production
2.检查bdump目录下生成的跟踪文件
目录在$ORACLE_HOME/admin/$ORACLE_SID/bdump目录下,每个后台进程都会生成一个跟踪文件。
[oracle@jumper bdump]$ ls
20050424_alert_conner.log conner_arc0_2569.trc conner_dbw0_2559.trc conner_reco_2567.trc
alert_conner.log conner_arc1_2571.trc conner_lgwr_2561.trc conner_smon_2565.trc
a.sql conner_ckpt_2563.trc conner_pmon_2557.trc
3.检查CKPT进程的跟踪文件
我们可以很容易的发现CKPT进程每3秒都对控制文件进行一次写入
[oracle@jumper bdump]$ tail -f conner_ckpt_2563.trc
WAIT #0: nam='rdbms ipc message' ela= 2994710 p1=300 p2=0 p3=0
WAIT #0: nam='control file parallel write' ela= 2442 p1=3 p2=3 p3=3
WAIT #0: nam='rdbms ipc message' ela= 2995171 p1=300 p2=0 p3=0
WAIT #0: nam='control file parallel write' ela= 2586 p1=3 p2=3 p3=3
WAIT #0: nam='rdbms ipc message' ela= 2994962 p1=300 p2=0 p3=0
WAIT #0: nam='control file parallel write' ela= 2582 p1=3 p2=3 p3=3
WAIT #0: nam='rdbms ipc message' ela= 2995020 p1=300 p2=0 p3=0
WAIT #0: nam='control file parallel write' ela= 2455 p1=3 p2=3 p3=3
WAIT #0: nam='rdbms ipc message' ela= 2995188 p1=300 p2=0 p3=0
WAIT #0: nam='control file parallel write' ela= 2412 p1=3 p2=3 p3=3
WAIT #0: nam='rdbms ipc message' ela= 2995187 p1=300 p2=0 p3=0
WAIT #0: nam='control file parallel write' ela= 2463 p1=3 p2=3 p3=3
WAIT #0: nam='rdbms ipc message' ela= 2995095 p1=300 p2=0 p3=0
WAIT #0: nam='control file parallel write' ela= 2448 p1=3 p2=3 p3=3
4.检查控制文件的变更
通过2次dump控制文件,比较其trace文件输出可以比较得到不同之处,我们发现,Oracle仅仅更新了Heartbeat这个数值。
[oracle@jumper udump]$ sqlplus "/ as sysdba"
SQL*Plus: Release 9.2.0.4.0 - Production on Wed Jan 18 22:44:10 2006
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning option
JServer Release 9.2.0.4.0 - Production
SQL> alter session set events 'immediate trace name CONTROLF level 10';
Session altered.
SQL> exit
Disconnected from Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning option
JServer Release 9.2.0.4.0 - Production
[oracle@jumper udump]$ sqlplus "/ as sysdba"
SQL*Plus: Release 9.2.0.4.0 - Production on Wed Jan 18 22:44:18 2006
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning option
JServer Release 9.2.0.4.0 - Production
SQL> alter session set events 'immediate trace name CONTROLF level 10' ;
Session altered.
SQL> exit
Disconnected from Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
With the Partitioning option
JServer Release 9.2.0.4.0 - Production
[oracle@jumper udump]$ ls
conner_ora_21896.trc conner_ora_21898.trc
[oracle@jumper udump]$ diff conner_ora_21896.trc conner_ora_21898.trc
1c1
< /opt/oracle/admin/conner/udump/conner_ora_21896.trc
---
> /opt/oracle/admin/conner/udump/conner_ora_21898.trc
14c14
< Unix process pid: 21896, image: oracle@jumper.hurray.com.cn (TNS V1-V3)
---
> Unix process pid: 21898, image: oracle@jumper.hurray.com.cn (TNS V1-V3)
16c16
< *** SESSION ID9.813) 2006-01-18 22:44:14.314
---
> *** SESSION ID9.815) 2006-01-18 22:44:21.569
63c63
< heartbeat: 579991793 mount id: 3191936000
---
> heartbeat: 579991796 mount id: 3191936000
[oracle@jumper udump]$
In 8.0.5 a heartbeat mechanism was included in CKPT's timeout action (every 3 seconds) to update the checkpoint progress record for the thread in the controlfile.