一 :环境说明
1.1 硬件需求
至少需要三台服务器, 1 主、 1 从、 1 监视器,由于资源有限,将监视器安装在主库机器上,本次实验使用 2 台虚拟机。
1.2 网络需求
心跳 IP 和外部 IP 不同网段,本次实验没有单独配置,和外部 IP 相同。
1.3 注意事项
在搭建数据守护系统前,应注意数据守护系统中各实例使用的 DM 服务器版本应一致,同时还应注意各实例所在主机的操作系统位数、大小端模式、时区及时间设置都应一致,以及使用同一个用户启动 DM 服务器和守护进程 dmwatcher ,以免系统在运行时出现意想不到的错误。
|
主库 |
备库 |
监视库 |
主机名 |
cjc-db-03 |
cjc-db-04 |
cjc-db-03 |
物理 IP |
172.16.6.101 |
172.16.6.102 |
172.16.6.101 |
心跳 IP |
172.16.6.101 |
172.16.6.102 |
172.16.6.101 |
数据库端口 |
5238 |
5238 |
5238 |
MAL_INST_DW_PORT |
33141 |
33142 |
不涉及 |
MAL_PORT |
61141 |
61142 |
|
MAL_DW_PORT |
52141 |
52142 |
|
DBNAME |
CHEN |
CJC |
|
数据库版本 |
V8 |
V8 |
|
参考:
https://eco.dameng.com/document/dm/zh-cn/pm/data-guard-construction.html
二:环境准备
配置数据守护 V4.0 之前,必须先通过备份还原方式同步各数据库的数据,确保各数据的数据保持完全一致。主库可以是新初始化的数据库,也可以是正在生产、使用中的数据库。
不能使用分别初始化库或者直接拷贝数据文件的方法,原因如下:
1. 每个库都有一个永久魔数( permenant_magic ),一经生成,永远不会改变,主库传送日志时会判断这个值是否一样,确保是来自同一个数据守护环境中的库,否则传送不了日志。
2. 由于 dminit 初始化数据库时,会生成随机密钥用于加密,每次生成的密钥都不相同,备库无法解析采用主库密钥加密的数据。
3. 每个库都有一个数据库魔数( DB_MAGIC ),每经过一次还原、恢复操作, DB_MAGIC 就会产生变化,需要通过这种方式来区分同一个数据守护环境中各个不同的库。
2.1 主库脱机备份
启动 DMAP 服务
[dmdba@cjc-db-03 dm8]$ cd /dm8/bin/service_template/ [dmdba@cjc-db-03 service_template]$ ./DmAPService start
脱机备份
[dmdba@cjc-db-03 service_template]$ dmrman CTLSTMT="BACKUP DATABASE '/dm8/data/CHEN/dm.ini' FULL TO BACKUP_FILE1 BACKUPSET '/dm8/bak/BACKUP_FILE_01'"
查看备份文件
[dmdba@cjc-db-03 bak]$ ll -rth BACKUP_FILE_01/ total 7.1M -rw-r--r--. 1 dmdba dinstall 7.0M Oct 10 15:33 BACKUP_FILE_01.bak -rw-r--r--. 1 dmdba dinstall 81K Oct 10 15:33 BACKUP_FILE_01.meta
压缩
[dmdba@cjc-db-03 dm8]$ tar -zcvf bak.tar.gz bak/
传到备库
[dmdba@cjc-db-03 dm8]$ scp bak.tar.gz dmdba@172.16.6.102:/dm8
2.2 备库脱机恢复
安装 DM 数据库软件
[dmdba@cjc-db-04 mnt]$ ./DMInstall.bin -i
创建 DM数据库实例
[dmdba@cjc-db-04 mnt]$ dminit path=/dm8/data DB_NAME=CJC instance_name=CJC port_num=5238
恢复备份
[dmdba@cjc-db-04 dm8]$ tar -zxvf bak.tar.gz [dmdba@cjc-db-04 dm8]$ /dm8/bin/service_template/DmAPService start [dmdba@cjc-db-04 dm8]$ dmrman CTLSTMT="restore database '/dm8/data/CJC/dm.ini' from backupset '/dm8/bak/BACKUP_FILE_01'" [dmdba@cjc-db-04 dm8]$ dmrman CTLSTMT="recover database '/dm8/data/CJC/dm.ini' from backupset '/dm8/bak/BACKUP_FILE_01'" [dmdba@cjc-db-04 dm8]$ dmrman CTLSTMT="recover database '/dm8/data/CJC/dm.ini' update DB_MAGIC "
三:配置主库 CHEN
库名和实例名 :CHEN
[dmdba@cjc-db-03 CHEN]$ disql SYSDBA/Dameng123:5238 Server[LOCALHOST:5238]:mode is normal, state is open login used time : 6.215(ms) disql V8 SQL> select name from v$database; LINEID NAME ---------- ---- 1 CHEN used time: 14.585(ms). Execute id is 500. SQL> select instance_name from v$instance; LINEID INSTANCE_NAME ---------- ------------- 1 CHEN used time: 1.817(ms). Execute id is 501.
3.1配置 dm.ini(主库 )
在主库机器上配置主库的实例名为 CHEN , dm.ini 参数修改如下:
注意:
1. 实例名,总长度不能超过 16
2. 下面参数已经存在 dm.ini 文件中,需要修改对应的值。
[dmdba@cjc-db-03 CHEN]$ vi dm.ini
INSTANCE_NAME = CHEN PORT_NUM = 5238 ##数据库实例监听端口 DW_INACTIVE_INTERVAL = 60 ##接收守护进程消息超时时间,默认60 ALTER_MODE_STATUS = 1 ##不允许手工方式修改实例模式/状态/OGUID,默认1 ENABLE_OFFLINE_TS = 2 ##不允许备库OFFLINE表空间,默认1 MAL_INI = 1 ##打开MAL系统,默认0 ARCH_INI = 1 ##打开归档配置,默认0 RLOG_SEND_APPLY_MON = 64 ##统计最近64次的日志发送信息,默认64
3.2配置 dmmal.ini(主库 )
配置 MAL 系统,各主备库的 dmmal.ini 配置必须完全一致,
MAL_HOST 使用内部网络 IP ,
MAL_PORT 与 dm.ini 中 PORT_NUM 使用不同的端口值, MAL_DW_PORT 是各实例对应的守护进程之间,以及守护进程和监视器之间的通信端口,配置如下:
[dmdba@cjc-db-03 CHEN]$ vi dmmal.ini
MAL_CHECK_INTERVAL = 5 ##MAL链路检测时间间隔 MAL_CONN_FAIL_INTERVAL = 5 ##判定MAL链路断开的时间 [MAL_INST1] MAL_INST_NAME = CHEN ##实例名,和dm.ini中的INSTANCE_NAME一致 MAL_HOST = 172.16.6.101 ##MAL系统监听TCP连接的IP地址 MAL_PORT = 61141 ##MAL系统监听TCP连接的端口 MAL_INST_HOST = 172.16.6.101 ##实例的对外服务IP地址 MAL_INST_PORT = 5238 ##实例的对外服务端口,和dm.ini中的PORT_NUM一致 MAL_DW_PORT = 52141 ##实例本地的守护进程监听TCP连接的端口 MAL_INST_DW_PORT = 33141 ##实例监听守护进程TCP连接的端口 [MAL_INST2] MAL_INST_NAME = CJC MAL_HOST = 172.16.6.102 MAL_PORT = 61142 MAL_INST_HOST = 172.16.6.102 MAL_INST_PORT = 5238 MAL_DW_PORT = 52142 MAL_INST_DW_PORT = 33142
3.3配置 dmarch.ini(主库 )
修改 dmarch.ini ,配置本地归档和实时归档。
除了本地归档外,其他归档配置项中的 ARCH_DEST 表示实例是 Primary 模式时,需要同步归档数据的目标实例名。
当前实例 CHEN 是主库,需要向 CJC (实时备库)同步数据,因此实时归档的 ARCH_DEST 配置为 CJC 。
[dmdba@cjc-db-03 CHEN]$ mkdir arch
[dmdba@cjc-db-03 CHEN]$ pwd
/dm8/data/CHEN
[dmdba@cjc-db-03 CHEN]$ vi dmarch.ini
[ARCHIVE_REALTIME] ARCH_TYPE = REALTIME ##实时归档类型 ARCH_DEST = CJC ##实时归档目标实例名 [ARCHIVE_LOCAL1] ARCH_TYPE = LOCAL ##本地归档类型 ARCH_DEST = /dm8/data/CHEN/arch ##本地归档文件存放路径 ARCH_FILE_SIZE = 128 ##单位Mb,本地单个归档文件最大值 ARCH_SPACE_LIMIT = 0 ##单位Mb,0表示无限制,范围1024~2147483647M
3.4配置 dmwatcher.ini(主库 )
修改 dmwatcher.ini 配置守护进程,配置为全局守护类型,使用自动切换模式。
[dmdba@cjc-db-03 CHEN]$ vi dmwatcher.ini
[GRP1] DW_TYPE = GLOBAL ##全局守护类型 DW_MODE = AUTO ##自动切换模式 DW_ERROR_TIME = 10 ##远程守护进程故障认定时间 INST_RECOVER_TIME = 60 ##主库守护进程启动恢复的间隔时间 INST_ERROR_TIME = 10 ##本地实例故障认定时间 INST_OGUID = 453331 ##守护系统唯一OGUID值 INST_INI = /dm8/data/CHEN/dm.ini ##dm.ini配置文件路径 INST_AUTO_RESTART = 1 ##打开实例的自动启动功能 INST_STARTUP_CMD = /dm8/bin/dmserver ##命令行方式启动 RLOG_SEND_THRESHOLD = 0 ##指定主库发送日志到备库的时间阈值,默认关闭 RLOG_APPLY_THRESHOLD = 0 ##指定备库重演日志的时间阈值,默认关闭
3.5启动主库
以 Mount 方式启动主库
[root@cjc-db-03 dm8]# systemctl stop DmServiceCJC.service [dmdba@cjc-db-03 CHEN]$ dmserver /dm8/data/CHEN/dm.ini mount file dm.key not found, use default license! version info: develop DM Database Server 64 V8 03134283890-20220720-165295-10045 startup... Normal of FAST Normal of DEFAULT Normal of RECYCLE Normal of KEEP Normal of ROLL Database mode = 0, oguid = 0 License will expire on 2023-07-20 file lsn: 39941 ndct db load finished ndct second level fill fast pool finished ndct third level fill fast pool finished ndct second level fill fast pool finished ndct third level fill fast pool finished ndct fill fast pool finished nsvr_startup end. aud sys init success. aud rt sys init success. systables desc init success. ndct_db_load_info success. SYSTEM IS READY.
注意
一定要以 Mount 方式启动数据库实例,否则系统启动时会重构回滚表空间,生成 Redo 日志;
并且,启动后应用可能连接到数据库实例进行操作,破坏主备库的数据一致性。
数据守护配置结束后,守护进程会自动 Open 数据库。
3.6 设置 OGUID
启动命令行工具 DIsql ,登录主库设置 OGUID 值。
[dmdba@cjc-db-03 CHEN]$ disql SYSDBA/Dameng123:5238 SQL>SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 1); SQL>sp_set_oguid(453331); SQL>SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 0);
注意
系统通过 OGUID 值确定一个守护进程组,由用户保证 OGUID 值的唯一性,并确保数据守护系统中,数据库、守护进程和监视器配置相同的 OGUID 值。
3.7 修改数据库模式
启动命令行工具 DIsql ,登录主库修改数据库为 Primary 模式
SQL>alter database primary;
四:配置备库 CJC
4.1 配置 dm.ini( 备库 )
在主库机器上配置主库的实例名为 CJC , dm.ini 参数修改如下:
注意:
1. 实例名,总长度不能超过 16
2. 下面参数已经存在 dm.ini 文件中,需要修改对应的值。
[dmdba@cjc-db-04 dm8]$ cd /dm8/data/CJC/
[dmdba@cjc-db-04 CJC]$ vi dm.ini
INSTANCE_NAME = CJC PORT_NUM = 5238 ##数据库实例监听端口 DW_INACTIVE_INTERVAL = 60 ##接收守护进程消息超时时间,默认60 ALTER_MODE_STATUS = 0 ##不允许手工方式修改实例模式/状态/OGUID,默认1 ENABLE_OFFLINE_TS = 2 ##不允许备库OFFLINE表空间,默认1 MAL_INI = 1 ##打开MAL系统,默认0 ARCH_INI = 1 ##打开归档配置,默认0 RLOG_SEND_APPLY_MON = 64 ##统计最近64次的日志重演信息
4.2 配置 dmmal.ini( 备库 )
配置 MAL 系统,各主备库的 dmmal.ini 配置必须完全一致, MAL_HOST 使用内部网络 IP , MAL_PORT 与 dm.ini 中 PORT_NUM 使用不同的端口值, MAL_DW_PORT 是各实例对应的守护进程之间,以及守护进程和监视器之间的通信端口,配置如下:
[dmdba@cjc-db-04 CJC]$ vi dmmal.ini
MAL_CHECK_INTERVAL = 5 ##MAL链路检测时间间隔 MAL_CONN_FAIL_INTERVAL = 5 ##判定MAL链路断开的时间 [MAL_INST1] MAL_INST_NAME = CHEN ##实例名,和dm.ini中的INSTANCE_NAME一致 MAL_HOST = 172.16.6.101 ##MAL系统监听TCP连接的IP地址 MAL_PORT = 61141 ##MAL系统监听TCP连接的端口 MAL_INST_HOST = 172.16.6.101 ##实例的对外服务IP地址 MAL_INST_PORT = 5238 ##实例的对外服务端口,和dm.ini中的PORT_NUM一致 MAL_DW_PORT = 52141 ##实例本地的守护进程监听TCP连接的端口 MAL_INST_DW_PORT = 33141 ##实例监听守护进程TCP连接的端口 [MAL_INST2] MAL_INST_NAME = CJC MAL_HOST = 172.16.6.102 MAL_PORT = 61142 MAL_INST_HOST = 172.16.6.102 MAL_INST_PORT = 5238 MAL_DW_PORT = 52142 MAL_INST_DW_PORT = 33142
4.3配置 dmarch.ini(备库 )
修改 dmarch.ini ,配置本地归档和实时归档。
除了本地归档外,其他归档配置项中的 ARCH_DEST 表示实例是 Primary 模式时,需要同步归档数据的目标实例名。
当前实例 CJC 是备库,守护系统配置完成后,可能在各种故障处理中, CJC 切换为新的主库,正常情况下, CHEN 会切换为新的备库,需要向 CHEN 同步数据,因此实时归档的 ARCH_DEST 配置为 CHEN 。
[dmdba@cjc-db-04 CJC]$ vi dmarch.ini
[ARCHIVE_REALTIME] ARCH_TYPE = REALTIME ##实时归档类型 ARCH_DEST = CHEN ##实时归档目标实例名 [ARCHIVE_LOCAL1] ARCH_TYPE = LOCAL ##本地归档类型 ARCH_DEST = /dm8/data/CJC/arch ##本地归档文件路径 ARCH_FILE_SIZE = 128 ##单位Mb,本地单个归档文件最大值 ARCH_SPACE_LIMIT = 0 ##单位Mb,0表示无限制,范围1024~2147483647M
4.4 配置 dmwatcher.ini( 备库 )
修改 dmwatcher.ini 配置守护进程,配置为全局守护类型,使用自动切换模式。
[GRP1] DW_TYPE = GLOBAL ##全局守护类型 DW_MODE = AUTO ##自动切换模式 DW_ERROR_TIME = 10 ##远程守护进程故障认定时间 INST_RECOVER_TIME = 60 ##主库守护进程启动恢复的间隔时间 INST_ERROR_TIME = 10 ##本地实例故障认定时间 INST_OGUID = 453331 ##守护系统唯一OGUID值 INST_INI = /dm8/data/CJC/dm.ini##dm.ini配置文件路径 INST_AUTO_RESTART = 1 ##打开实例的自动启动功能 INST_STARTUP_CMD = /dm8/bin/dmserver ##命令行方式启动 RLOG_APPLY_THRESHOLD = 0 ##指定备库重演日志的时间阈值,默认关闭
4.5 启动备库
以 Mount 方式启动备库
[dmdba@cjc-db-04 CJC]$ dmserver /dm8/data/CJC/dm.ini mount [dmdba@cjc-db-04 dm8]$ disql SYSDBA/Dameng123:5238
注意
一定要以 Mount 方式启动数据库实例,否则系统启动时会重构回滚表空间,生成 Redo 日志;并且,启动后应用可能连接到数据库实例进行操作,破坏主备库的数据一致性。数据守护配置结束后,守护进程会自动 Open 数据库。
4.6 设置 OGUID
启动命令行工具 DIsql ,登录备库设置 OGUID 值为 453331
SQL>SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 1); SQL>sp_set_oguid(453331); SQL>SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 0);
注意
系统通过 OGUID 值确定一个守护进程组,由用户保证 OGUID 值的唯一性,并确保数据守护系统中,数据库、守护进程和监视器配置相同的 OGUID 值。
4.6 修改数据库模式
启动命令行工具 DIsql ,登录备库修改数据库为 Standby 模式。
如果当前数据库不是 Normal 模式,需要先修改 dm.ini 中 ALTER_MODE_STATUS 值为 1 ,允许修改数据库模式,修改 Standby 模式成功后再改回为 0 。
如果是 Normal 模式,请忽略下面的第 1 步和第 3 步。
SQL>SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 1); ----第1步 SQL>alter database standby; ----第2步 SQL>SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 0); ----第3步
五:配置监视器
在 CJC-DB-03 机器上配置。
配置监视器时可以选择配置单实例监视器或配置多实例监视器,下面将分别介绍配置单实例监视器以及多实例监视器时 dmmonitor.ini 配置文件中的配置信息。
配置单实例监视器
由于主库和实时备库的守护进程配置为自动切换模式,因此这里选择配置确认监视器。
和普通监视器相比,确认监视器除了相同的命令支持外,在主库发生故障时,能够自动通知实时备库接管为新的主库,具有自动故障处理的功能。
注意
故障自动切换模式下,必须配置确认监视器,且确认监视器最多只能配置一个
5.1 配置 dmmonitor.ini
修改 dmmonitor.ini 配置确认监视器,其中 MON_DW_IP 中的 IP 和 PORT 和 dmmal.ini 中的 MAL_HOST 和 MAL_DW_PORT 配置项保持一致。
[dmdba@cjc-db-03 CHEN]$ vi dmmonitor.ini
MON_DW_CONFIRM = 1 ##确认监视器模式 MON_LOG_PATH = /dm8/data/log ##监视器日志文件存放路径 MON_LOG_INTERVAL = 60 ##每隔60s定时记录系统信息到日志文件 MON_LOG_FILE_SIZE = 32 ##每个日志文件最大32M MON_LOG_SPACE_LIMIT = 0 ##不限定日志文件总占用空间 [GRP1] MON_INST_OGUID = 453331 ##组GRP1的唯一OGUID值 MON_DW_IP = 172.16.6.101:52141 MON_DW_IP = 172.16.6.102:52142
六:配置主从
6.1 启动守护进程
启动各个主备库上的守护进程:
主:
[dmdba@cjc-db-03 CHEN]$ dmwatcher /dm8/data/CHEN/dmwatcher.ini
从:
[dmdba@cjc-db-04 ~]$ dmwatcher /dm8/data/CHEN/dmwatcher.ini
守护进程启动后,进入 Startup 状态,此时实例都处于 Mount 状态。
守护进程开始广播自身和其监控实例的状态信息,结合自身信息和远程守护进程的广播信息,守护进程将本地实例 Open ,并切换为 Open 状态。
6.2 启动监视器
[dmdba@cjc-db-03 CHEN]$ dmmonitor /dm8/data/CHEN/dmmonitor.ini
监视器提供一系列命令,支持当前守护系统状态查看以及故障处理,可输入 help 命令,查看各种命令说明使用,结合实际情况选择使用。
至此一主一备的实时数据守护系统搭建完毕,在搭建步骤和各项配置都正确的情况下,在监视器上执行 show 命令,可以监控到所有实例都处于 Open 状态,所有守护进程也都处于 Open 状态,即为正常运行状态。
七:数据同步测试
7.1 主库新增数据
主库
[dmdba@cjc-db-03 dm8]$ disql SYSDBA/Dameng123:5238 SQL> select name,status$,role$ from v$database; LINEID NAME STATUS$ ROLE$ ---------- ---- ----------- ----------- 1 CHEN 4 1
创建表
SQL> create table t1(id int); executed successfully used time: 39.548(ms). Execute id is 6. SQL> insert into t1 values(1),(2),(3); affect rows 3 used time: 3.138(ms). Execute id is 7. SQL> commit; executed successfully used time: 30.124(ms). Execute id is 8.
7.2从库检查数据同步
从库
[dmdba@cjc-db-04 ~]$ disql SYSDBA/Dameng123:5238 Server[LOCALHOST:5238]:mode is standby, state is open login used time : 5.973(ms) disql V8 SQL> select name,status$,role$ from v$database; LINEID NAME STATUS$ ROLE$ ---------- ---- ----------- ----------- 1 CHEN 4 2 used time: 9.511(ms). Execute id is 100. SQL> select * from t1; LINEID ID ---------- ----------- 1 1 2 2 3 3 used time: 3.325(ms). Execute id is 102.
从库只读
SQL> insert into t1 values(4),(5); insert into t1 values(4),(5); [-2018]:Error in line: 1 Try to insert/update/delete table table is not temporary or contains lob on standby mode. used time: 2.128(ms). Execute id is 0.
八:数据库主从切换
使用 dmmonitor 工具进行主从切换
相关命令如下
37.switchover [group_name[.]] [db_name] --switchover specified database of specified group as primary database 38.takeover [group_name[.]] [db_name] --use specified database of specified group to takeover as primary database 39.takeover force [group_name[.]] [db_name] --use specified database of specified group to takeover as primary database forcefully
7.1 手动切换
查看主从状态
show global info
2022-10-10 16:50:51 #================================================================================# GROUP OGUID MON_CONFIRM MODE MPP_FLAG GRP1 453331 TRUE AUTO FALSE <> DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 172.16.6.101 52141 2022-10-10 16:50:50 GLOBAL VALID OPEN CHEN OK 1 1 OPEN PRIMARY DSC_OPEN REALTIME VALID DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 172.16.6.102 52142 2022-10-10 16:50:50 GLOBAL VALID OPEN CJC OK 1 1 OPEN STANDBY DSC_OPEN REALTIME VALID #================================================================================#
开始切换
switchover GRP1.CJC [monitor] 2022-10-10 16:40:15: Not login dmmonitor or server public key changed, please try to login again!
登录
login username:SYSDBA password: [monitor] 2022-10-10 16:50:19: Login dmmonitor success!
再次切换
switchover GRP1.CJC
日志如下:
[monitor] 2022-10-10 16:51:27: Start to switchover instance CJC [monitor] 2022-10-10 16:51:27: Notify dmwatcher(CHEN) switch to SWITCHOVER status [monitor] 2022-10-10 16:51:27: Dmwatcher process CHEN status switching [OPEN-->SWITCHOVER] [monitor] 2022-10-10 16:51:29: Switch dmwatcher CHEN to SWITCHOVER status success [monitor] 2022-10-10 16:51:29: Notify dmwatcher(CJC) switch to SWITCHOVER status [monitor] 2022-10-10 16:51:29: Dmwatcher process CJC status switching [OPEN-->SWITCHOVER] [monitor] 2022-10-10 16:51:29: Switch dmwatcher CJC to SWITCHOVER status success [monitor] 2022-10-10 16:51:29: Instance CHEN start to execute sql SP_SET_GLOBAL_DW_STATUS(0, 6) [monitor] 2022-10-10 16:51:29: Instance CHEN execute sql SP_SET_GLOBAL_DW_STATUS(0, 6) success [monitor] 2022-10-10 16:51:29: Instance CJC start to execute sql SP_SET_GLOBAL_DW_STATUS(0, 6) [monitor] 2022-10-10 16:51:29: Instance CJC execute sql SP_SET_GLOBAL_DW_STATUS(0, 6) success [monitor] 2022-10-10 16:51:29: Instance CHEN start to execute sql ALTER DATABASE MOUNT [monitor] 2022-10-10 16:51:29: Instance CHEN execute sql ALTER DATABASE MOUNT success [monitor] 2022-10-10 16:51:29: Instance CJC start to execute sql SP_APPLY_KEEP_PKG() [monitor] 2022-10-10 16:51:29: Instance CJC execute sql SP_APPLY_KEEP_PKG() success [monitor] 2022-10-10 16:51:29: Instance CJC start to execute sql ALTER DATABASE MOUNT [monitor] 2022-10-10 16:51:29: Instance CJC execute sql ALTER DATABASE MOUNT success [monitor] 2022-10-10 16:51:29: Instance CHEN start to execute sql ALTER DATABASE STANDBY [monitor] 2022-10-10 16:51:30: Instance CHEN execute sql ALTER DATABASE STANDBY success [monitor] 2022-10-10 16:51:30: Instance CJC start to execute sql ALTER DATABASE PRIMARY [monitor] 2022-10-10 16:51:31: Instance CJC execute sql ALTER DATABASE PRIMARY success [monitor] 2022-10-10 16:51:31: Notify instance CJC to change all arch status to be invalid [monitor] 2022-10-10 16:51:31: Succeed to change all instances arch status to be invalid [monitor] 2022-10-10 16:51:31: Instance CHEN start to execute sql ALTER DATABASE OPEN FORCE [monitor] 2022-10-10 16:51:32: Instance CHEN execute sql ALTER DATABASE OPEN FORCE success [monitor] 2022-10-10 16:51:32: Instance CJC start to execute sql ALTER DATABASE OPEN FORCE [monitor] 2022-10-10 16:51:32: Instance CJC execute sql ALTER DATABASE OPEN FORCE success [monitor] 2022-10-10 16:51:32: Instance CHEN start to execute sql SP_SET_GLOBAL_DW_STATUS(6, 0) [monitor] 2022-10-10 16:51:32: Instance CHEN execute sql SP_SET_GLOBAL_DW_STATUS(6, 0) success [monitor] 2022-10-10 16:51:32: Instance CJC start to execute sql SP_SET_GLOBAL_DW_STATUS(6, 0) [monitor] 2022-10-10 16:51:32: Instance CJC execute sql SP_SET_GLOBAL_DW_STATUS(6, 0) success [monitor] 2022-10-10 16:51:32: Notify dmwatcher(CHEN) switch to OPEN status [monitor] 2022-10-10 16:51:32: Dmwatcher process CHEN status switching [SWITCHOVER-->OPEN] [monitor] 2022-10-10 16:51:32: Switch dmwatcher CHEN to OPEN status success [monitor] 2022-10-10 16:51:32: Notify dmwatcher(CJC) switch to OPEN status [monitor] 2022-10-10 16:51:33: Dmwatcher process CJC status switching [SWITCHOVER-->OPEN] [monitor] 2022-10-10 16:51:33: Switch dmwatcher CJC to OPEN status success [monitor] 2022-10-10 16:51:33: Notify group(GRP1)'s dmwatcher to do clear [monitor] 2022-10-10 16:51:34: Clean request of dmwatcher processer CHEN success 2022-10-10 16:51:34 #================================================================================# GROUP OGUID MON_CONFIRM MODE MPP_FLAG GRP1 453331 TRUE AUTO FALSE <> DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 172.16.6.102 52142 2022-10-10 16:51:33 GLOBAL VALID OPEN CJC OK 1 1 OPEN PRIMARY DSC_OPEN REALTIME VALID EP INFO: INST_IP INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG 172.16.6.102 5238 OK CJC OPEN PRIMARY 0 0 REALTIME VALID 6432 43497 6432 43498 NONE < > DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 172.16.6.101 52141 2022-10-10 16:51:34 GLOBAL VALID OPEN CHEN OK 1 1 OPEN STANDBY DSC_OPEN REALTIME INVALID EP INFO: INST_IP INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG 172.16.6.101 5238 OK CHEN OPEN STANDBY 0 0 REALTIME INVALID 6431 42139 6431 42139 NONE DATABASE(CHEN) APPLY INFO FROM (CJC), REDOS_PARALLEL_NUM (1): DSC_SEQNO[0], (RSEQ, SSEQ, KSEQ)[6431, 6431, 6431], (RLSN, SLSN, KLSN)[42139, 42139, 42139], N_TSK[0], TSK_MEM_USE[0] REDO_LSN_ARR: (42139) #================================================================================# [monitor] 2022-10-10 16:51:34: Clean request of dmwatcher processer CJC success [monitor] 2022-10-10 16:51:34: Switchover instance CJC success [monitor] 2022-10-10 16:51:35: Dmwatcher process CJC status switching [OPEN-->RECOVERY] WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-10-10 16:51:35 RECOVERY OK CJC OPEN PRIMARY VALID 5 43498 43498 [monitor] 2022-10-10 16:51:38: Dmwatcher process CJC status switching [RECOVERY-->OPEN] WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-10-10 16:51:37 OPEN OK CJC OPEN PRIMARY VALID 5 43499 43499
查看主从状态,切换成功
show global info
2022-10-10 16:53:01 #================================================================================# GROUP OGUID MON_CONFIRM MODE MPP_FLAG GRP1 453331 TRUE AUTO FALSE <> DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 172.16.6.102 52142 2022-10-10 16:53:01 GLOBAL VALID OPEN CJC OK 1 1 OPEN PRIMARY DSC_OPEN REALTIME VALID DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 172.16.6.101 52141 2022-10-10 16:53:01 GLOBAL VALID OPEN CHEN OK 1 1 OPEN STANDBY DSC_OPEN REALTIME VALID #================================================================================#
新主库插入数据测试
[dmdba@cjc-db-04 ~]$ disql SYSDBA/Dameng123:5238 SQL> insert into t1 values(4),(5); SQL> commit;
新从库查询数据同步
[dmdba@cjc-db-03 ~]$ disql SYSDBA/Dameng123:5238 SQL> select * from t1; LINEID ID ---------- ----------- 1 1 2 2 3 3 4 4 5 5 used time: 6.220(ms). Execute id is 600.
回切
switchover GRP1.CHEN
查看主从状态,回切成功
show global info
2022-10-10 16:56:38 #================================================================================# GROUP OGUID MON_CONFIRM MODE MPP_FLAG GRP1 453331 TRUE AUTO FALSE <> DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 172.16.6.101 52141 2022-10-10 16:56:38 GLOBAL VALID OPEN CHEN OK 1 1 OPEN PRIMARY DSC_OPEN REALTIME VALID DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 172.16.6.102 52142 2022-10-10 16:56:37 GLOBAL VALID OPEN CJC OK 1 1 OPEN STANDBY DSC_OPEN REALTIME VALID #================================================================================#
7.2自动切换
停止主库
SQL> shutdown immediate; executed successfully used time: 2.060(ms). Execute id is 0.
查看主从状态
show global info
2022-10-10 16:59:20 #================================================================================# GROUP OGUID MON_CONFIRM MODE MPP_FLAG GRP1 453331 TRUE AUTO FALSE <> DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 172.16.6.101 52141 2022-10-10 16:59:20 GLOBAL VALID OPEN CHEN OK 1 1 OPEN PRIMARY DSC_OPEN REALTIME VALID DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 172.16.6.102 52142 2022-10-10 16:59:19 GLOBAL VALID OPEN CJC OK 1 1 OPEN STANDBY DSC_OPEN REALTIME VALID #================================================================================#
原主库被自动拉起了,没有发生自动切换
查看 dmwatcher 自动启动主库日志:
Waitpid error! file dm.key not found, use default license! version info: develop DM Database Server 64 V8 03134283890-20220720-165295-10045 startup... Normal of FAST Normal of DEFAULT Normal of RECYCLE Normal of KEEP Normal of ROLL Database mode = 1, oguid = 453331 License will expire on 2023-07-20 file lsn: 45230 ndct db load finished ndct second level fill fast pool finished ndct third level fill fast pool finished ndct second level fill fast pool finished ndct third level fill fast pool finished ndct fill fast pool finished nsvr_startup end. aud sys init success. aud rt sys init success. systables desc init success. ndct_db_load_info success. SYSTEM IS READY. iid page's trxid[13037] NEXT TRX ID = 13038 pseg_collect_mgr_items, total collect 0 active_trxs, 0 cmt_trxs, 0 pre_cmt_trxs, 0 to_release_trxs, 0 active_pages, 0 cmt_pages, 0 pre_cmt_pages, 0 to_release_pages, 0 mgr pages, 0 mgr recs! iid page's trxid[14039] NEXT TRX ID = 15041. total 0 active crash trx, pseg_crash_trx_rollback sys_only(0) begin ... pseg_crash_trx_rollback end, total 0 active crash trx, include 0 empty_trxs, 0 empty_pages which only need to delete mgr recs. pseg_crash_trx_rollback end pseg recv finished nsvr_process_before_open begin. nsvr_process_before_open success. total 0 active crash trx, pseg_crash_trx_rollback sys_only(0) begin ... pseg_crash_trx_rollback end, total 0 active crash trx, include 0 empty_trxs, 0 empty_pages which only need to delete mgr recs. pseg_crash_trx_rollback end
九:日常维护
9.1 停数据库
dmmonitor 相关命令
19.startup dmwatcher [group_name] --startup watching specified group 20.stop dmwatcher [group_name] --stop watching specified group 21.startup group [group_name] --startup all databases of specified group 22.stop group [group_name] --exit all databases of specified group 23.kill group [group_name] --kill all active databases of specified group
stop group GRP1
停止所有数据库
[monitor] 2022-10-10 17:07:25: Stop dmwatcher process of instance CHEN[PRIMARY, OPEN, ISTAT_SAME:TRUE] [monitor] 2022-10-10 17:07:25: Dmwatcher process CHEN status switching [OPEN-->SHUTDOWN] [monitor] 2022-10-10 17:07:26: Stop dmwatcher process of instance CHEN[PRIMARY, OPEN, ISTAT_SAME:TRUE] success [monitor] 2022-10-10 17:07:26: Stop dmwatcher process of instance CJC[STANDBY, OPEN, ISTAT_SAME:TRUE] [monitor] 2022-10-10 17:07:26: Dmwatcher process CJC status switching [OPEN-->SHUTDOWN] [monitor] 2022-10-10 17:07:26: Stop dmwatcher process of instance CJC[STANDBY, OPEN, ISTAT_SAME:TRUE] success [monitor] 2022-10-10 17:07:26: Notify instance(CHEN) shutdown.
查看状态,停止成功
show global info
2022-10-10 17:08:02 #================================================================================# GROUP OGUID MON_CONFIRM MODE MPP_FLAG GRP1 453331 TRUE AUTO FALSE ERROR DATABASE: <> DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 172.16.6.101 52141 2022-10-10 17:08:01 GLOBAL VALID SHUTDOWN CHEN ERROR 1 1 SHUTDOWN PRIMARY DSC_OPEN REALTIME VALID DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 172.16.6.102 52142 2022-10-10 17:08:01 GLOBAL VALID SHUTDOWN CJC ERROR 1 1 SHUTDOWN STANDBY DSC_OPEN NONE VALID #================================================================================#
9.2停止 dmwatcher
stop dmwatcher GRP1
在 SHUTDOWN 状态下无法停止 dmwatcher
[monitor] 2022-10-10 17:08:59: Dmwatcher CHEN is already in SHUTDOWN status [monitor] 2022-10-10 17:08:59: Dmwatcher CJC is already in SHUTDOWN status [monitor] 2022-10-10 17:08:59: Cannot stop dmwatcher process group GRP1
执行 crtl+c 强制中断
9.3 停止 dmmonitor
执行 crtl+c 强制中断
9.4 启动
先启动 dmwatcher
主库
dmwatcher /dm8/data/CHEN/dmwatcher.ini
从库
dmwatcher /dm8/data/CJC/dmwatcher.ini
再启动 dmmonitor
dmmonitor /dm8/data/CHEN/dmmonitor.ini
可以看到,数据库自动启动了
[dmdba@cjc-db-03 CHEN]$ dmmonitor /dm8/data/CHEN/dmmonitor.ini [monitor] 2022-10-10 17:16:14: DMMONITOR[4.0] V8 [monitor] 2022-10-10 17:16:14: DMMONITOR[4.0] IS READY. [monitor] 2022-10-10 17:16:15: Received message from(CHEN) WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-10-10 17:16:14 OPEN OK CHEN OPEN PRIMARY VALID 8 48148 48149 [monitor] 2022-10-10 17:16:15: Received message from(CJC) WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-10-10 17:16:14 OPEN OK CJC OPEN STANDBY VALID 8 48148 48148
查看进程
[dmdba@cjc-db-03 ~]$ ps -ef|grep dmm|grep -v grep dmdba 6322 4182 0 17:16 pts/1 00:00:00 dmmonitor /dm8/data/CHEN/dmmonitor.ini [dmdba@cjc-db-03 ~]$ ps -ef|grep dmw|grep -v grep dmdba 6213 4731 0 17:15 pts/3 00:00:00 dmwatcher /dm8/data/CHEN/dmwatcher.ini [dmdba@cjc-db-03 ~]$ ps -ef|grep dms|grep -v grep dmdba 6227 1 0 17:15 ? 00:00:00 /dm8/bin/dmserver /dm8/data/CHEN/dm.ini mount
通过 dmmonitor启动
启动数据库
startup group GRP1
启动 dmwatcher
startup dmwatcher GRP1
手动启动方式
监控服务器
dmmonitor /dm8/data/CHEN/dmmonitor.ini
主库
dmserver /dm8/data/CHEN/dm.ini dmwatcher /dm8/data/CHEN/dmwatcher.ini
从库
dmserver /dm8/data/CJC/dm.ini dmwatcher /dm8/data/CJC/dmwatcher.ini
9.5 dmwatcher常用命令
查看帮助信息
help Dameng dmwatcher supports the following command: 1.help --dmwatcher help info 2.status --show dmwatcher status 3.show --show local database information of all dmwatcher groups 4.show group group_name --show local database information of specified group 5.show version --show self dmwatcher version 6.show monitor config --help for dmmonitor configuration 7.show link --show tcp connect info of local dmwatcher 8.exit --exit dmwatcher status 2022-10-10 17:06:17 -------------------------- GROUP_NAME = GRP1 DW_STATUS = OPEN DW_SUB_STATUS = SUB_STATE_START
9.6 dmmonitor常用命令
查看帮助信息 help
help Dmmonitor supports the following commands: NOTE: [group_name] in command should be specified if exists more than one group, [db_name] in command should be specified if exists more than one instance, [group_name] and [db_name] should be separated by '.' for [show monitor] command, if [db_name] is empty, then use first active dmwatcher in dmmonitor ini file. #=============================================================================================# #------------------------------------FOR GLOBAL COMMAND---------------------------------------# 1.help --show help information 2.exit --exit dmmonitor 3.show version --show self dmmonitor version 4.show global info --show global database information of all groups 5.show database [group_name.]db_name --show detail database information of specified database 6.show [group_name] --show detail database information of specified group, if not specified, it will show all groups 7.show i[nterval] n --auto show database information in console every n seconds 8.q --cancel 'auto show' in console 9.list [[group_name.]db_name] --list configuration of specified dmwatcher, if not specified, it will show all dmwatchers 10.show open info [group_name.]db_name --show open history of specified database 11.show arch send info [group_name.]db_name --show source database arch send info to specified database(include recover time info) 12.show apply stat [group_name.]db_name --show specified database apply stat 13.show monitor [group_name[.]] [db_name] --show all dmmonitor connection info of specified dmwatcher 14 show state --show state of all monitors which are in the same monitor group with current monitor 15.tip --show system currently working status 16.login --login dmmonitor 17.logout --logout dmmonitor 18.get takeover time --get how much time will delay before standby instance will do takeover 19.startup dmwatcher [group_name] --startup watching specified group 20.stop dmwatcher [group_name] --stop watching specified group 21.startup group [group_name] --startup all databases of specified group 22.stop group [group_name] --exit all databases of specified group 23.kill group [group_name] --kill all active databases of specified group 24.choose switchover [group_name] --choose databases which can do switchover 25.choose takeover [group_name] --choose databases which can do takeover 26.choose takeover force [group_name] --choose databases which can do takeover force 27.set group [group_name] auto restart on --set all instances of specified group auto restart on 28.set group [group_name] auto restart off --set all instances of specified group auto restart off 29.set group [group_name] para_name para_value --set all dmwatchers of specified group configuration(set both ini file and value in memory) --if group_name not specified, it will notify all groups to execute --para_name: specified parameter name, para_value: specified parameter value --can set parameters: DW_ERROR_TIME/INST_RECOVER_TIME/INST_ERROR_TIME/INST_AUTO_RESTART/ INST_SERVICE_IP_CHECK/RLOG_SEND_THRESHOLD/RLOG_APPLY_THRESHOLD 30.set group [group_name] recover time time_value --set all standby database of specified group recover time with time_value(time_value is an integer: 3~86400(s))(only set value in dmwatcher memory) --if group_name not specified, it will notify all groups to execute 31.set group [group_name] arch invalid --set all standby databases of specified group arch status invalid --if group_name not specified, it will notify all groups to execute 32.clear group [group_name] arch send info --clear source database recent arch send info to all standby databases of specified group(notify source database to execute) --if group_name not specified, it will notify all groups to execute 33.clear group [group_name] apply stat --clear all standby databases of specified group recent arch apply stat(notify all standby databases to execute) --if group_name not specified, it will notify all groups to execute 34.check recover [group_name.]db_name --check specified database of specified group can be auto recovered or not 35.check open [group_name.]db_name --check specified database of specified group can be auto opened or not 36.open database [group_name.]db_name --open specified database of specified group 37.switchover [group_name[.]] [db_name] --switchover specified database of specified group as primary database 38.takeover [group_name[.]] [db_name] --use specified database of specified group to takeover as primary database 39.takeover force [group_name[.]] [db_name] --use specified database of specified group to takeover as primary database forcefully 40.set database [group_name.]db_name recover time time_value --set specified database of specified group recover time with time_value(time_value is an integer: 3~86400(s))(only set in dmwatcher memory) 41.set database [group_name.]db_name arch invalid --set specified database of specified group arch status invalid 42.detach database [group_name.]db_name --detach specified standby database from specified group 43.attach database [group_name.]db_name --attach specified standby database to specified group 44.startup dmwatcher database [group_name.]db_name --startup watching specified database 45.stop dmwatcher database [group_name.]db_name --stop watching specified database 46.startup database [group_name.]db_name --startup specified database of specified group 47.stop database [group_name.]db_name --stop specified database of specified group 48.kill database [group_name.]db_name --kill specified database of specified group 49.clear database [group_name.]db_name arch send info --clear primary database recent arch send info to specified standby database(notify primary database to execute) 50.clear database [group_name.]db_name apply stat --clear specified standby database recent arch apply stat(notify standby database to execute) 51.set database [group_name.]db_name until time time_val --set specified database of specified group archive send until time with time_val 52.cancel database [group_name.]db_name until time --cancel specified database of specified group archive send until time 53.show mpp --show mpp site info 54.startup dmwatcher all --startup watching all groups 55.stop dmwatcher all --stop watching all groups 56.startup group all --startup all database of all groups 57.stop group all --exit all database of all groups 58.kill group all --kill active database of all groups 59.check mppctl --check all active primary databases' dmmpp.ctl files are same or not 60.recover mppctl --recover all active primary databases' dmmpp.ctl files to be same ---Type q/Qto quit, or to continue---
查看当前工作状态
tip
[monitor] 2022-10-10 16:41:05: Instance CHEN[PRIMARY, OPEN, ISTAT_SAME:TRUE] cannot join other instances, dmwatcher status is OPEN, SYSOPENHISTORY status is VALID [monitor] 2022-10-10 16:41:05: Instance CHEN[PRIMARY, OPEN, ISTAT_SAME:TRUE] has no command to execute currently [monitor] 2022-10-10 16:41:05: Instance CHEN[PRIMARY, OPEN, ISTAT_SAME:TRUE] is OK, dmwatcher status is OPEN, dw_type is GLOBAL [monitor] 2022-10-10 16:41:05: Instance CJC[STANDBY, OPEN, ISTAT_SAME:TRUE] can join instance CHEN[PRIMARY, OPEN, ISTAT_SAME:TRUE] [monitor] 2022-10-10 16:41:05: Instance CJC[STANDBY, OPEN, ISTAT_SAME:TRUE] has no command to execute currently [monitor] 2022-10-10 16:41:05: Instance CJC[STANDBY, OPEN, ISTAT_SAME:TRUE] is OK, dmwatcher status is OPEN, dw_type is GLOBAL [monitor] 2022-10-10 16:41:05: Group(GRP1) current active instances are OK [monitor] 2022-10-10 16:41:05: All groups' current active instances are OK!
查看详细信息 show
show
2022-10-10 16:07:22 #================================================================================# GROUP OGUID MON_CONFIRM MODE MPP_FLAG GRP1 453331 TRUE AUTO FALSE <> DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 172.16.6.101 52141 2022-10-10 16:07:22 GLOBAL VALID OPEN CHEN OK 1 1 OPEN PRIMARY DSC_OPEN REALTIME VALID EP INFO: INST_IP INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG 172.16.6.101 5238 OK CHEN OPEN PRIMARY 0 0 REALTIME VALID 5527 41213 5527 41214 NONE < > DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 172.16.6.102 52142 2022-10-10 16:07:21 GLOBAL VALID OPEN CJC OK 1 1 OPEN STANDBY DSC_OPEN REALTIME VALID EP INFO: INST_IP INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG 172.16.6.102 5238 OK CJC OPEN STANDBY 0 0 REALTIME VALID 5510 41212 5510 41212 NONE DATABASE(CJC) APPLY INFO FROM (CHEN), REDOS_PARALLEL_NUM (1): DSC_SEQNO[0], (RSEQ, SSEQ, KSEQ)[5526, 5526, 5527], (RLSN, SLSN, KLSN)[41212, 41212, 41213], N_TSK[0], TSK_MEM_USE[512] REDO_LSN_ARR: (41212) #================================================================================#
查看数据库信息
show global info
2022-10-10 16:23:28 #================================================================================# GROUP OGUID MON_CONFIRM MODE MPP_FLAG GRP1 453331 TRUE AUTO FALSE <> DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 172.16.6.101 52141 2022-10-10 16:23:26 GLOBAL VALID OPEN CHEN OK 1 1 OPEN PRIMARY DSC_OPEN REALTIME VALID DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 172.16.6.102 52142 2022-10-10 16:23:27 GLOBAL VALID OPEN CJC OK 1 1 OPEN STANDBY DSC_OPEN REALTIME VALID #================================================================================#
查看日志
[dmdba@cjc-db-03 log]$ tail -10f dmmonitor_20221010155019.log
[monitor] 2022-10-10 15:54:19: GROUP OGUID MON_CONFIRM MODE MPP_FLAG GRP1 453331 TRUE AUTO FALSE <> DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 172.16.6.101 52141 2022-10-10 15:54:18 GLOBAL VALID STARTUP CHEN OK 1 1 MOUNT PRIMARY DSC_OPEN REALTIME VALID EP INFO: INST_IP INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG 172.16.6.101 5238 OK CHEN MOUNT PRIMARY 0 0 REALTIME VALID 5510 39941 5510 39941 NONE
十:故障排查
10.1 启动 dmwatcher 后数据库没有自动 OPEN
问题现象:
1. 启动 dmwatcher 后数据库没有自动 OPEN 。
2. 启动 dmmonitor 后,没有备库信息,主库状态为 MOUNT 。
主库
[dmdba@cjc-db-03 dm8]$ disql SYSDBA/Dameng123:5238 Server[LOCALHOST:5238]:mode is primary, state is mount login used time : 4.614(ms) disql V8 SQL> select status$ from v$instance; LINEID STATUS$ ---------- ------- 1 MOUNT used time: 12.026(ms). Execute id is 0.
尝试手动启动,报错
SQL> alter database open; alter database open; [-720]:Error in line: 1 Dmwatcher is active, or current configuration(ALTER_MODE_STATUS) not allowed to alter database. used time: 0.733(ms). Execute id is 0.
问题原因:
主、备库的防火墙和 selinux 没有关闭
解决方案:
关闭主、备库的防火墙和 selinux ,恢复正常。