In a 10.2.0.4 RAC environment, a change to SERVICE_NAMES caused a large number of sessions to be killed.
The alert log shows the following:
Wed Oct 24 20:06:16 2012
ALTER SYSTEM SET service_names='' SCOPE=MEMORY SID='orcl2';
Wed Oct 24 20:06:16 2012
ALTER SYSTEM SET service_names='orcl' SCOPE=MEMORY SID='orcl2';
Wed Oct 24 20:06:16 2012
Immediate Kill Session#: 1418, Serial#: 22066
Immediate Kill Session: sess: 0x18dc79b70
OS pid: 4879
Immediate Kill Session#: 1424, Serial#: 108
Immediate Kill Session: sess: 0x18dc81be0
OS pid: 15110
Immediate Kill Session#: 1425, Serial#: 22
Immediate Kill Session: sess: 0x18dc83148
OS pid: 15112
Immediate Kill Session#: 1426, Serial#: 9
Immediate Kill Session: sess: 0x18dc846b0
OS pid: 15157
Immediate Kill Session#: 1427, Serial#: 17
Immediate Kill Session: sess: 0x18dc85c18
OS pid: 15119
Immediate Kill Session#: 1429, Serial#: 24221
Immediate Kill Session: sess: 0x18dc886e8
OS pid: 1044
Immediate Kill Session#: 1430, Serial#: 9
Immediate Kill Session: sess: 0x18dc89c50
OS pid: 15126
.
.
.
Immediate Kill Session#: 1605, Serial#: 60258
Immediate Kill Session: sess: 0x18dd73e68
OS pid: 11966
Immediate Kill Session#: 1606, Serial#: 18413
Immediate Kill Session: sess: 0x18dd753d0
OS pid: 11999
Immediate Kill Session#: 1607, Serial#: 18517
Immediate Kill Session: sess: 0x18dd76938
OS pid: 15378
Immediate Kill Session#: 1608, Serial#: 57825
Immediate Kill Session: sess: 0x18dd77ea0
OS pid: 1035
Wed Oct 24 20:06:27 2012
Immediate Kill Session#: 1616, Serial#: 30253
Immediate Kill Session: sess: 0x18dd829e0
OS pid: 11977
Immediate Kill Session#: 1626, Serial#: 34413
Immediate Kill Session: sess: 0x18dd8fff0
OS pid: 4863
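To confirm that the kills line up with the service_names change, the alert log can be summarized per timestamp. The following is a minimal sketch in Python, assuming the 10g plain-text alert log layout shown above (a timestamp line followed by the messages logged at that time); the function name `summarize_alert_log` and the regular expressions are illustrative, not part of any Oracle tool.

```python
import re
from collections import Counter

# Timestamp lines look like "Wed Oct 24 20:06:16 2012" in the excerpt above.
TIMESTAMP_RE = re.compile(
    r"^[A-Z][a-z]{2} [A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}$"
)
# One "Immediate Kill Session#:" line is written per killed session.
KILL_RE = re.compile(r"^Immediate Kill Session#: (\d+), Serial#: (\d+)$")
# Any change to the service_names parameter.
SERVICE_RE = re.compile(r"^ALTER SYSTEM SET service_names=", re.IGNORECASE)


def summarize_alert_log(lines):
    """Return (service_changes, kills_per_timestamp) for an alert log.

    service_changes is a list of (timestamp, statement) tuples and
    kills_per_timestamp is a Counter keyed by the timestamp line that
    precedes each kill message.
    """
    current_ts = None
    service_changes = []
    kills = Counter()
    for raw in lines:
        line = raw.strip()
        if TIMESTAMP_RE.match(line):
            current_ts = line
        elif SERVICE_RE.match(line):
            service_changes.append((current_ts, line))
        elif KILL_RE.match(line):
            kills[current_ts] += 1
    return service_changes, kills


if __name__ == "__main__":
    import sys

    # Usage: python summarize_alert.py alert_orcl2.log  (file name is an example)
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        changes, kills = summarize_alert_log(fh)
    for ts, stmt in changes:
        print(f"{ts}  {stmt}")
    for ts, count in kills.items():
        print(f"{ts}  sessions killed: {count}")
```

If the timestamps of the service_names changes and the bulk of the kill messages coincide, as they do in the excerpt above, that supports the correlation described next.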
The mass session kills are clearly tied directly to the ALTER SYSTEM SET service_names statements issued in the same second. According to the MOS note Sessions Get Killed if Connection Use Default Service name (Same as db_name) [ID 730315.1], this is caused by unpublished Bug 6955040 ALL THE SESSIONS LOST CONNECTION AFTER KILLING CRSD.BIN.
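When the CRSD process is killed or crashes on its own, the cluster can no longer detect that the VIP resource is running, so the database removes the default service name and disconnects every connection that uses it.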
Oracle fixed this problem in 10.2.0.5 and 11.1.0.7. If no upgrade is planned, do not connect through a service name that is the same as DB_NAME.
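One way to audit existing client configuration for this risk is to check whether a connect descriptor names the default service. The following is a rough sketch in Python; the helper `uses_default_service`, the host name, and the service name `orcl_app` are hypothetical, and the comparison assumes the default service name equals the DB_NAME value passed in (it does not account for a DB_DOMAIN suffix).

```python
import re

# Matches the (SERVICE_NAME=...) clause of an Oracle Net connect descriptor.
SERVICE_NAME_RE = re.compile(
    r"\(\s*SERVICE_NAME\s*=\s*([^)\s]+)\s*\)", re.IGNORECASE
)


def uses_default_service(connect_descriptor, db_name):
    """Return True if the descriptor connects through a service named db_name.

    Assumption: the comparison is case-insensitive and ignores any
    domain suffix; descriptors without SERVICE_NAME (for example those
    using SID) return False.
    """
    match = SERVICE_NAME_RE.search(connect_descriptor)
    if match is None:
        return False
    return match.group(1).lower() == db_name.lower()


# Example descriptors; host and service names are made up for illustration.
risky = (
    "(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=rac-vip1)(PORT=1521))"
    "(CONNECT_DATA=(SERVICE_NAME=orcl)))"
)
safe = (
    "(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=rac-vip1)(PORT=1521))"
    "(CONNECT_DATA=(SERVICE_NAME=orcl_app)))"
)
```

Here `uses_default_service(risky, "orcl")` flags the first descriptor, while the second, which uses a separately defined service, is not flagged.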