RHEL 6.6: IPC Send timeout/node eviction etc with high packet reassembles failure (Doc ID 2008933.1)
APPLIES TO:
Oracle Database - Enterprise Edition - Version 11.2.0.1 and later
Generic Linux

SYMPTOMS
On Red Hat Enterprise Linux, or Oracle Linux running the Red Hat compatible kernel, after an upgrade to 6.6 the database or node fails with messages such as:
Fri May 01 03:05:48 2015
Please check instance 1 alert and LMON trace files for detail.
Please check instance 1 alert and LMON trace files for detail.
LMS0 (ospid: 28660): terminating the instance due to error 481
Fri May 01 03:06:43 2015
System state dump requested by (instance=3, osid=28660 (LMS0)), summary=[abnormal instance termination].
While this is happening, "netstat" shows a huge jump in "packet reassembles failed":
==>> before the issue, the following number is more or less stable or increasing slowly
6817 packet reassembles failed
....
==>> in 30 minutes it increased by 50
6867 packet reassembles failed
==>> now the issue is happening and in 10 seconds it increased by 7533 - 6867 = 666
7533 packet reassembles failed
==>> in another 10 seconds it increased by 9630 - 7533 = 2097
9630 packet reassembles failed
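The counter comparison above can be scripted. The following is a minimal sketch, assuming the counter appears in "netstat -s" output on a line containing "packet reassembles failed" as in the samples; the helper names reasm_failed and reasm_delta are hypothetical and not part of any Oracle or Red Hat tool.

```shell
# Sketch: extract the "packet reassembles failed" counter from
# `netstat -s` style text read on stdin. Prints 0 if the line is absent.
reasm_failed() {
    awk '/packet reassembles failed/ { n = $1 } END { print n + 0 }'
}

# Hypothetical helper: print the increase between two samples
# (first argument is the older value, second the newer one).
reasm_delta() {
    echo $(( $2 - $1 ))
}

# Example loop, commented out because it needs netstat and runs
# until interrupted. It samples the counter every 10 seconds.
# prev=$(netstat -s | reasm_failed)
# while sleep 10; do
#     cur=$(netstat -s | reasm_failed)
#     echo "$(date) failed=$cur delta=$(reasm_delta "$prev" "$cur")"
#     prev=$cur
# done
```

A delta of a few hundred or more per 10 seconds, as in the samples above, would match the problem pattern; a delta near zero would not.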
Other symptoms could be:
1. node eviction
2. after an instance/node eviction, the instance/node cannot rejoin the cluster until the node where "packet reassembles failed" is happening has been rebooted

CAUSE
RHEL 6.6 includes a few ipfrag fixes and increased the default ipfrag_*_thresh values:
cat /proc/sys/net/ipv4/ipfrag_low_thresh
3145728
cat /proc/sys/net/ipv4/ipfrag_high_thresh
4194304
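The two defaults are byte counts (3145728 is 3 MB, 4194304 is 4 MB). A small sketch for printing a threshold in both units follows; show_thresh is a hypothetical helper that takes the file path as its argument, so it can also be pointed at a saved copy of the value.

```shell
# Sketch: print an ipfrag threshold file's value in bytes and whole MB.
# show_thresh is a hypothetical helper, not a system command.
show_thresh() {
    bytes=$(cat "$1") || return 1
    echo "$(basename "$1") = $bytes bytes ($(( bytes / 1048576 )) MB)"
}

# Usage on a live system (commented out; the paths exist only on Linux):
# show_thresh /proc/sys/net/ipv4/ipfrag_low_thresh
# show_thresh /proc/sys/net/ipv4/ipfrag_high_thresh
```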
However, the issue is still happening. For Oracle Linux running the Red Hat compatible kernel, the issue is being tracked in:
BUG 21036841 - LCOV5/7/17 SERVER CRASHED AFTER PATCH UPGRADE AND KERNEL UPGRADE

SOLUTION
The issue is not fixed at the time of this writing. The temporary workaround is to enable jumbo frames or to increase the value of the following kernel parameter:
net.ipv4.ipfrag_high_thresh = 16M
Here 16M means 16 MB. The kernel parameter itself is expressed in bytes, so 16 MB corresponds to 16777216.
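One way to apply the 16 MB value is sketched below, assuming the kernel parameter takes its value in bytes, as the defaults shown under CAUSE suggest. The helpers mb_to_bytes and ipfrag_high_line are hypothetical names used only for illustration; the sysctl commands are left commented out so that nothing changes until they are reviewed and run as root.

```shell
# Sketch: convert a size in MB to the byte value written to the sysctl.
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Hypothetical helper: format a sysctl.conf line for the high threshold.
ipfrag_high_line() {
    echo "net.ipv4.ipfrag_high_thresh = $(mb_to_bytes "$1")"
}

# Apply at runtime (as root); commented out so nothing changes by accident:
# sysctl -w net.ipv4.ipfrag_high_thresh="$(mb_to_bytes 16)"
# Persist across reboots, then reload:
# ipfrag_high_line 16 >> /etc/sysctl.conf
# sysctl -p
```

Running ipfrag_high_line 16 prints the line that would be appended, which allows the value to be checked before it is written to /etc/sysctl.conf.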