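Today I found that the SMON process of an 11g database (version 11.2.0.2.0) had generated a huge trace file of about 200 GB, along with a .trm file of several tens of GB, which filled the server's local file system. As a result, the database could no longer write its logs.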
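A quick way to spot this kind of runaway file is to scan the diagnostic directory for oversized .trc and .trm files. The following is a minimal sketch only: the function name `find_large_files` and the directory used in the usage note are assumptions for illustration, not something taken from this system.

```python
import os


def find_large_files(root, min_bytes, suffixes=(".trc", ".trm")):
    """Return (size, path) pairs for trace files under root that are
    at least min_bytes in size, largest first."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # The file disappeared or is unreadable; skip it.
                continue
            if size >= min_bytes:
                hits.append((size, path))
    return sorted(hits, reverse=True)
```

For example, `find_large_files("/u01/app/oracle/diag", 1024 ** 3)` would list trace files of 1 GiB or more, assuming the diagnostic destination lives under that (hypothetical) path.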
The trace file contents were:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/11.2.0
System name: Linux
Node name: cpemsii-db04
Release: 2.6.18-194.1.AXS3
Version: #1 SMP Fri May 7 10:03:53 CST 2010
Machine: x86_64
Instance name: exttrack2
Redo thread mounted by this instance: 2
Oracle process number: 25
Unix process pid: 5598, image: oracle@cpemsii-db04 (SMON)
*** 2011-12-22 17:41:47.604
*** SESSION ID:(1776.1) 2011-12-22 17:41:47.604
*** CLIENT ID:() 2011-12-22 17:41:47.604
*** SERVICE NAME:() 2011-12-22 17:41:47.604
*** MODULE NAME:() 2011-12-22 17:41:47.604
*** ACTION NAME:() 2011-12-22 17:41:47.604
* kju_tsn_aff_drm_pending TRACEUD: called with tsn x5, dissolve 0
* kju_tsn_aff_drm_pending TRACEUD: tsn_pkey = x5.1
* >> RM REQ QS ---:
single window RM request queue is empty
multi-window RM request queue is empty
* Global DRM state ---:
There is no dynamic remastering
RM lock state = 0
pkey 5.1 undo 1 stat 0 masters[1, 2->2] reminc 14 RM# 144
flg x1 type x0 afftime x1bd1148f
nreplays by lms 0 = 0
nreplays by lms 1 = 0
nreplays by lms 2 = 0
* kju_tsn_aff_drm_pending TRACEUD: matching request not found on swin queue
* kju_tsn_aff_drm_pending TRACEUD: pp found, stat x0
* kju_tsn_aff_drm_pending TRACEUD: 2 return true
*** 2011-12-22 17:46:49.792
* kju_tsn_aff_drm_pending TRACEUD: called with tsn x5, dissolve 0
* kju_tsn_aff_drm_pending TRACEUD: tsn_pkey = x5.1
* >> RM REQ QS ---:
single window RM request queue is empty
multi-window RM request queue is empty
* Global DRM state ---:
There is no dynamic remastering
RM lock state = 0
pkey 5.1 undo 1 stat 0 masters[1, 2->2] reminc 14 RM# 144
flg x1 type x0 afftime x1bd1148f
nreplays by lms 0 = 0
nreplays by lms 1 = 0
nreplays by lms 2 = 0
* kju_tsn_aff_drm_pending TRACEUD: matching request not found on swin queue
* kju_tsn_aff_drm_pending TRACEUD: pp found, stat x0
* kju_tsn_aff_drm_pending TRACEUD: 2 return true
......
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
domid 65535 (addhv 0 numhv 0), pkey 74912.0 tobe 2 , options x0
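The telltale symptom in the trace above is one line repeated endlessly. A rough sketch of how a sample of a trace file could be checked for such a run follows; the helper name `longest_run` is hypothetical, and this is only one simple way to do it.

```python
from itertools import groupby


def longest_run(lines):
    """Return (line, count) for the longest run of consecutive
    identical lines, or (None, 0) if there are no lines."""
    best_line, best_count = None, 0
    stripped = (line.rstrip("\n") for line in lines)
    for line, group in groupby(stripped):
        count = sum(1 for _ in group)
        if count > best_count:
            best_line, best_count = line, count
    return best_line, best_count
```

Since the file here was around 200 GB, it would be sensible to pass only a bounded sample, for instance `longest_run(itertools.islice(open(path, errors="replace"), 100000))`, rather than reading the whole file.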
Analysis showed that this is a bug in Oracle 11.2.0.2.0:
Bug 12615778: SMON GENERATED HUGE TRACE FILE
Bug Attributes
Type | B - Defect | Fixed in Product Version | 11.2.0.3.0 |
Severity | 2 - Severe Loss of Service | Product Version | 11.2.0.2.0 |
Status | 35 - To Filer for Review | Platform | 226 - Linux x86-64 |
Created | 02-Jun-2011 | Platform Version | ORACLE LINUX 5 |
Updated | 27-Apr-2012 | Base Bug | - |
Database Version | 11.2.0.2.0 |
Affects Platforms | Generic |
Product Source | Oracle |
Related Products
Line | Oracle Database Products | Family | Oracle Database |
Area | Oracle Database | Product | 5 - Oracle Server - Enterprise Edition |
Hdr: 12615778 11.2.0.2.0 RDBMS 11.2.0.2.0 RAC PRODID-5 PORTID-226
Abstract: SMON GENERATED HUGE TRACE FILE
*** 06/02/11 01:15 am ***
----
PROBLEM:
--------
2 nodes RAC
SMON generated huge trace file (55G byte). Then no space was
available on the file system.
* the customer need to restart the instance
Because SMON was holding the trace file.
(They did not try "oradebug close_trace")
* shutdown hang more than 24 hours
SMON process seemed to be spinning.
DIAGNOSTIC ANALYSIS:
--------------------
WORKAROUND:
-----------
none (shutdown abort for terminating the instance)
RELATED BUGS:
-------------
REPRODUCIBILITY:
----------------
unknown
It happened just once.
TEST CASE:
----------
none
STACK TRACE:
------------
SUPPORTING INFORMATION:
-----------------------
requesting
- alert.log on the other instance
- trace files for RAC related processes (e.g. LMS LMON)
- applied patch list
24 HOUR CONTACT INFORMATION FOR P1 BUGS:
----------------------------------------
DIAL-IN INFORMATION:
--------------------
IMPACT DATE:
------------
*** 06/02/11 01:23 am *** (CHG: Sta->16)
*** 06/02/11 01:23 am ***
*** 06/02/11 02:11 am ***
*** 06/02/11 03:56 am ***
*** 06/08/11 11:53 pm ***
smon trace file is filled with following entry
domid 65535 (addhv 0 numhv 0), pkey 52114.0 tobe 1 , options x0
Last lines before starting to write above are
*** 13:36:03.217
* kju_tsn_aff_drm_pending TRACEUD: called with tsn x2, dissolve 0
* kju_tsn_aff_drm_pending TRACEUD: tsn_pkey = x2.1
* >> RM REQ QS ---:
single window RM request queue is empty
multi-window RM request queue is empty
* Global DRM state ---:
There is no dynamic remastering
RM lock state = 0
pkey 2.1 undo 1 stat 0 masters[32768, 1->1] reminc 4 RM# 1
flg x0 type x0 afftime xf0e10f85
nreplays by lms 0 = 0
nreplays by lms 1 = 0
* kju_tsn_aff_drm_pending TRACEUD: matching request not found on swin queue
* kju_tsn_aff_drm_pending TRACEUD: pp found, stat x0
* kju_tsn_aff_drm_pending TRACEUD: 2 return true
*** 13:38:38.250
* kju_tsn_aff_drm_pending TRACEUD: called with tsn x2, dissolve 0
* kju_tsn_aff_drm_pending TRACEUD: tsn_pkey = x2.1
* >> RM REQ QS ---:
single window RM request queue is empty
multi-window RM request queue:
domid 65535 (addhv 0 numhv 0), pkey 52114.0 tobe 1 , options x0
domid 65535 (addhv 0 numhv 0), pkey 52114.0 tobe 1 , options x0
:
*** 06/09/11 12:27 am ***
*** 06/09/11 12:38 am ***
*** 06/09/11 12:41 am ***
*** 06/09/11 12:43 am ***
*** 06/09/11 12:44 am ***
*** 06/09/11 12:46 am ***
*** 06/09/11 12:46 am *** (ADD: Impact/Symptom->PROCESS HANG )
*** 06/09/11 12:49 am *** (CHG: Sta->11)
*** 06/09/11 12:49 am ***
*** 06/09/11 12:49 am ***
*** 07/11/11 08:47 am ***
*** 07/11/11 08:48 am ***
*** 07/14/11 12:26 pm *** (CHG: Sta->30)
*** 07/14/11 12:26 pm ***
*** 07/14/11 02:38 pm ***
*** 07/14/11 06:40 pm *** (CHG: Sta->11)
*** 07/14/11 06:40 pm ***
*** 07/15/11 11:32 am ***
*** 07/15/11 11:32 am *** (CHG: Sta->35)
*** 07/15/11 11:32 am ***
*** 02/29/12 12:25 am ***
*** 02/29/12 12:26 am ***
*** 03/13/12 03:34 am ***
*** 03/13/12 04:41 am ***
*** 03/21/12 01:44 pm ***
*** 03/21/12 01:45 pm ***
*** 03/26/12 01:19 am ***
*** 03/26/12 02:47 am ***
*** 04/04/12 09:50 pm ***
*** 04/04/12 09:51 pm ***
*** 04/04/12 09:53 pm ***
*** 04/10/12 10:49 am ***
*** 04/11/12 02:26 am ***
*** 04/17/12 01:28 am ***
*** 04/22/12 10:46 am ***
*** 04/22/12 10:47 am ***
*** 04/24/12 04:52 am ***
*** 04/24/12 05:41 am ***
*** 04/27/12 04:01 am ***
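Because SMON keeps the trace file open, removing the file alone does not free the space. Have SMON close the trace file with the commands below, then delete the file with rm. Attach oradebug to the OS pid of the SMON process; flush writes the trace buffer contents out to the trace file, and close_trace closes the file.
sqlplus / as sysdba
SQL> oradebug setospid <OS pid of the SMON process>
SQL> oradebug flush
SQL> oradebug close_trace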
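The oradebug setospid step needs the OS pid of SMON. One way to look it up is to parse `ps` output for the background process name; the sketch below assumes the usual Unix naming of `ora_smon_<instance name>`, and the helper name `smon_pid` is hypothetical.

```python
def smon_pid(ps_output, instance):
    """Return the pid of the ora_smon_<instance> process found in the
    text output of `ps -eo pid,args`, or None if it is not listed."""
    # Assumption: the SMON background process is named ora_smon_<instance>.
    target = "ora_smon_" + instance
    for line in ps_output.splitlines():
        fields = line.split(None, 1)
        if len(fields) != 2 or not fields[0].isdigit():
            continue  # header line or malformed row
        if fields[1].strip() == target:
            return int(fields[0])
    return None
```

The input text could come from `subprocess.run(["ps", "-eo", "pid,args"], capture_output=True, text=True).stdout`. The exact match on the process name is deliberate, so that an instance called exttrack2 is not confused with one called exttrack21.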
Reference:
SMON process Spinning & Huge Trace files With "domid 65535" messages is generated in RAC may show ORA-600 [kjsmbesmi:DDET!] [ID 1440902.1]