Troubleshooting ORA-27300, ORA-27301 and ORA-27302

Last week, a customer's Oracle 10.2.0.4 RAC running on AIX hit a series of ORA-2730* errors, after which node 1 was rebooted. Let's look at the problem in detail.

Environment:
AIX 6100-06
Oracle 10.2.0.4 RAC on raw devices

1. The system error log shows error 268DA6A3
$ errpt -aj 268DA6A3
---------------------------------------------------------------------------
LABEL:          DR_DMA_MAPPER_FAIL
IDENTIFIER:     268DA6A3

Date/Time:       Thu Sep 18 18:29:00 GMT+08:00 2014
Sequence Number: 12463954
Machine Id:      00F7025F4C00
Class:           S
Type:            TEMP
WPAR:            Global
Resource Name:   DR_KER_MEM     

Description
Memory related DR operation failed

Probable Causes
DMA Mapper DR handler failure

Failure Causes
DMA specific memory mapper failed

        Recommended Actions
        Try DR operation on other memory resources

Detail Data
Return Code
           1          -1
Memory Address
0000 0007 E84C 0000
Handler Address
0000 0000 0413 0840
Module Name
/usr/lib/drivers/pci/pci_busdd
---------------------------------------------------------------------------
LABEL:          DR_DMA_MAPPER_FAIL
IDENTIFIER:     268DA6A3

Date/Time:       Thu Sep 18 18:28:00 GMT+08:00 2014
Sequence Number: 12463947
Machine Id:      00F7025F4C00
Class:           S
Type:            TEMP
WPAR:            Global
Resource Name:   DR_KER_MEM     

Description
Memory related DR operation failed

Probable Causes
DMA Mapper DR handler failure

Failure Causes
DMA specific memory mapper failed

        Recommended Actions
        Try DR operation on other memory resources

Detail Data
Return Code
           1          22
Memory Address
0000 0006 72B5 0000
Handler Address
0000 0000 0443 3680
Module Name
/usr/lib/drivers/headd
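
When many such entries pile up, it can help to reduce the errpt output to a few fields per entry. The following is a minimal sketch, not an official tool: it assumes the `errpt -a` layout shown above (a `LABEL:` line opening each entry, and the `Return Code` and `Module Name` values on the line after their headings). The function name `parse_errpt` and the trimmed `SAMPLE` text are illustrative only.

```python
# Sketch: summarize `errpt -a` output, assuming the layout shown in this post.
SAMPLE = """\
LABEL:          DR_DMA_MAPPER_FAIL
IDENTIFIER:     268DA6A3

Date/Time:       Thu Sep 18 18:29:00 GMT+08:00 2014
Detail Data
Return Code
           1          -1
Module Name
/usr/lib/drivers/pci/pci_busdd
---------------------------------------------------------------------------
LABEL:          DR_DMA_MAPPER_FAIL
IDENTIFIER:     268DA6A3

Date/Time:       Thu Sep 18 18:28:00 GMT+08:00 2014
Detail Data
Return Code
           1          22
Module Name
/usr/lib/drivers/headd
"""


def parse_errpt(text):
    """Return one dict per errpt entry with a few selected fields."""
    entries = []
    current = None   # entry currently being filled
    pending = None   # detail heading whose value is on the next line
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("LABEL:"):
            current = {"LABEL": line.split(":", 1)[1].strip()}
            entries.append(current)
            pending = None
        elif current is None:
            continue  # ignore anything before the first entry
        elif pending:
            current[pending] = line
            pending = None
        elif line.startswith(("IDENTIFIER:", "Date/Time:")):
            key, value = line.split(":", 1)
            current[key] = value.strip()
        elif line in ("Return Code", "Module Name"):
            pending = line
    return entries


for entry in parse_errpt(SAMPLE):
    print(entry["Date/Time"], entry["Return Code"].split(), entry["Module Name"])
```

In practice the text would come from saving the output of `errpt -aj 268DA6A3` to a file and reading it in place of `SAMPLE`.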

2. The cluster's CRSD process reported errors and eventually terminated
2014-09-18 18:28:05.716: [  CRSEVT][16919]32CAAMonitorHandler :: 0:Could not execute /oracle/product/10.2/crs/bin/racgwrap(check) for ora.vip
category: 1234, operation: scls_process_spawn, loc: read_pipe, OS error: 12, other: EOF on read pipe
2014-09-18 18:28:05.750: [  CRSAPP][16919]32CheckResource error for ora.vip error code = -1
……
[  OCRAPI][3368]procr_ctx_set_invalid_no_abort: ctx set to invalid
2014-09-18 18:36:51.127: [ CSSCLNT][11080]clsssRecvMsg: comm error received, comrc 11, con (114b682f0), msg (114b63150), msgl 144

2014-09-18 18:36:51.136: [ CSSCLNT][7994]clsssRecvMsg: comm error received, comrc 11, con (1134b5890), msg (1135b99f0), msgl 144

2014-09-18 18:36:51.178: [ CSSCLNT][7994]clssgsGGetStatus:  communications failed (0/3/324770392)

2014-09-18 18:36:51.178: [ CSSCLNT][7994]clssgsGGetStatus: returning 8

2014-09-18 18:36:51.156: [ CSSCLNT][11080]clssgsGGetStatus:  communications failed (0/3/0)

2014-09-18 18:36:51.178: [ CSSCLNT][11080]clssgsGGetStatus: returning 8

2014-09-18 18:36:51.218: [  CRSEVT][11080]32Error in clssgsgrpstat rc =8
2014-09-18 18:36:51.232: [    CRSD][7994][PANIC]32 termination by CSS, ret=
2014-09-18 18:36:51.241: [    CRSD][7994]32Done.

3. The database alert log shows status 12 errors
Thu Sep 18 18:30:35 2014
Process startup failed, error stack:
Thu Sep 18 18:30:35 2014
Errors in file /oracle/admin/orcl/bdump/orcl1_psp0_11469288.trc:
ORA-27300: OS system dependent operation:fork failed with status: 12
ORA-27301: OS failure message: Not enough space
ORA-27302: failure occurred at: skgpspawn3

4. Node 2 then rebooted node 1 and took over node 1's VIP
5. After node 1 came back up, the database alert log reported
AUTO SGA: Disabling background sga auto-tuning.
Thu Sep 18 18:47:46 2014
Error 0 in kwqmnpartition(), aborting txn
Thu Sep 18 18:47:49 2014
ORA-376 encountered when generating server alert SMG-4120
Thu Sep 18 18:47:50 2014
Errors in file /oracle/admin/orcl/bdump/orcl1_smon_10551752.trc:
ORA-01595: error freeing extent (12) of rollback segment (133))
ORA-00376: file 2 cannot be read at this time
ORA-01110: data file 2: '/dev/vx/rdsk/vgorc/lvorcl_undotbs1_1'

After running recover datafile, the database returned to normal.

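Analysis:
An ORA-2730* failure with status 12 is usually caused by one of two things:
1. Server resources are exhausted, for example memory or swap space, or some other resource. On some systems the nproc or maxnproc parameter may be set too low and need to be raised (see document 579365_1)
2. The AIX system needs the IV37048 fix (IV37048 CIFS_FS LEAVES BEHIND DEFUNCT KERNEL PROCESSES) (see MOS Doc ID 1541121.1)
  In this case the server may show the following symptoms:
  - The AIX system becomes unusable
  - System commands such as ps return fork or malloc errors
  - Connections to the database fail
  - The command line hangs
  - Zombie (defunct) processes appear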
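
To check for the defunct-process symptom, one option is to count `<defunct>` entries per parent PID in saved `ps -ef` output. The sketch below assumes the first three columns are UID, PID and PPID. The sample text is hypothetical and is not taken from the affected system, and `defunct_by_parent` is an illustrative name.

```python
from collections import Counter

# Hypothetical `ps -ef` excerpt, for illustration only.
PS_SAMPLE = """\
     UID      PID     PPID   C    STIME    TTY  TIME CMD
    root        1        0   0   Sep 10      -  0:05 /etc/init
    root  4718744        1   0                  0:00 <defunct>
    root  4784282        1   0                  0:00 <defunct>
  oracle  5243110  4915434   0                  0:00 <defunct>
"""


def defunct_by_parent(ps_output):
    """Count <defunct> processes per parent PID."""
    counts = Counter()
    for line in ps_output.splitlines():
        if "<defunct>" not in line:
            continue
        fields = line.split()
        # Assumed column order: UID PID PPID ...
        if len(fields) >= 3 and fields[2].isdigit():
            counts[int(fields[2])] += 1
    return counts


for ppid, count in defunct_by_parent(PS_SAMPLE).most_common():
    print(ppid, count)
```

A parent PID with a steadily growing count is a reasonable place to start looking. Whether the fix in item 2 applies still has to be confirmed against the MOS note.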
