Linux parameters: /proc/sys/fs explained

aio-max-nr/aio-nr

Maximum number of AIO requests allowed system-wide / current number of AIO requests

Kernels before 2.6 also had aio-max-size; since 2.6, AIO has been a default part of Linux.

If aio-max-nr is set too low, Oracle may hit ORA-27090. A similar case is documented online:

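An 11.2.0.3 database running on Exadata, supporting up to 8000 database connections, frequently hit ORA-27090. On inspection, aio-max-nr was set to 3145728, while aio-nr had already reached 3145726.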

The author used SystemTap to find out why so many AIO requests were being consumed:

stap -ve '
# contexts created, events requested, and contexts destroyed, keyed by PID
global allocated, allocatedctx, freed

# io_setup(2) creates an AIO context sized for maxevents events
probe syscall.io_setup {
  allocatedctx[pid()] += maxevents; allocated[pid()]++;
  printf("%d AIO events requested by PID %d (%s)\n",
    maxevents, pid(), cmdline_str());
}

# io_destroy(2) tears an AIO context down
probe syscall.io_destroy {freed[pid()]++}

# drop the counters of processes that exit during the trace
probe kprocess.exit {
  if (allocated[pid()]) {
     printf("PID %d exited\n", pid());
     delete allocated[pid()];
     delete allocatedctx[pid()];
     delete freed[pid()];
  }
}

# on Ctrl-C, print a per-PID summary
probe end {
  foreach (pid in allocated) {
    printf("PID %d allocated=%d allocatedevents=%d freed=%d\n",
      pid, allocated[pid], allocatedctx[pid], freed[pid]);
  }
}
'

The output was as follows:

Pass 1: parsed user script. and 76 library script(s) using 147908virt/22876res/2992shr kb, in 130usr/10sys/146real ms.

Pass 2: analyzed script. 4 probe(s), 10 function(s), 3 embed(s), 4 global(s) using 283072virt/49864res/4052shr kb, in 450usr/140sys/586real ms.

Pass 3: using cached /root/.systemtap/cache/11/stap_111c870f2747cede20e6a0e2f0a1b1ae_6256.c

Pass 4: using cached /root/.systemtap/cache/11/stap_111c870f2747cede20e6a0e2f0a1b1ae_6256.ko

Pass 5: starting run.

128 AIO events requested by PID 32885 (oracledbm1 (LOCAL=NO))

4096 AIO events requested by PID 32885 (oracledbm1 (LOCAL=NO))

128 AIO events requested by PID 69099 (oracledbm1 (LOCAL=NO))

4096 AIO events requested by PID 69099 (oracledbm1 (LOCAL=NO))

128 AIO events requested by PID 69142 (oracledbm1 (LOCAL=NO))

4096 AIO events requested by PID 69142 (oracledbm1 (LOCAL=NO))

128 AIO events requested by PID 69099 (oracledbm1 (LOCAL=NO))

128 AIO events requested by PID 69142 (oracledbm1 (LOCAL=NO))

128 AIO events requested by PID 32885 (oracledbm1 (LOCAL=NO))

4096 AIO events requested by PID 69142 (oracledbm1 (LOCAL=NO))

4096 AIO events requested by PID 69099 (oracledbm1 (LOCAL=NO))

128 AIO events requested by PID 69142 (oracledbm1 (LOCAL=NO))

128 AIO events requested by PID 69099 (oracledbm1 (LOCAL=NO))

...

(and when control-C is pressed):

  

PID 99043 allocated=6 allocatedevents=12672 freed=3

PID 37074 allocated=12 allocatedevents=25344 freed=6

PID 99039 allocated=18 allocatedevents=38016 freed=9

PID 69142 allocated=24 allocatedevents=50688 freed=12

PID 32885 allocated=36 allocatedevents=76032 freed=18

PID 69099 allocated=6 allocatedevents=12672 freed=3

 

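The Oracle processes were consuming a large number of AIO events, with some processes requesting 4096 in a single call. After consulting Oracle Support, aio-max-nr was raised to 50 million, and ORA-27090 has not appeared since.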

http://www.pythian.com/blog/troubleshooting-ora-27090-async-io-errors/
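
The per-PID summary above can be reduced to a few numbers with a short script. The following is only a sketch: the function name summarize and the regular expression are illustrative, and it assumes the summary lines look exactly like the ones printed above.

```python
import re

# Matches the per-PID summary lines shown above (illustrative pattern).
SUMMARY_RE = re.compile(
    r"PID (\d+) allocated=(\d+) allocated ?events=(\d+) freed=(\d+)"
)


def summarize(lines):
    """Return {pid: (contexts, events, freed, avg_events_per_context)}."""
    result = {}
    for line in lines:
        match = SUMMARY_RE.search(line)
        if not match:
            continue  # skip anything that is not a summary line
        pid, contexts, events, freed = map(int, match.groups())
        average = events / contexts if contexts else 0.0
        result[pid] = (contexts, events, freed, average)
    return result


sample = [
    "PID 99043 allocated=6 allocatedevents=12672 freed=3",
    "PID 32885 allocated=36 allocatedevents=76032 freed=18",
]

for pid, (contexts, events, freed, average) in summarize(sample).items():
    print(f"PID {pid}: {contexts} contexts, {average:.0f} events per context "
          f"on average, {contexts - freed} contexts not yet destroyed")
```

In the sample, every process averages 2112 events per context, which is consistent with alternating requests of 128 and 4096 events, and only half of the contexts had been destroyed when the trace ended.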

 

How to check whether the system is using AIO

justin_$ cat /proc/slabinfo | grep kio

kioctx               579    920    384   10    1 : tunables   54   27    8 : slabdata     92     92      1

kiocb                 35     45    256   15    1 : tunables  120   60    8 : slabdata      3      3      0

 

How to determine whether Oracle is linked against AIO

justin_$ /usr/bin/ldd $ORACLE_HOME/bin/oracle | grep libaio

        libaio.so.1 => /usr/lib64/libaio.so.1 (0x00007fca30cb4000)

If nothing is returned, shut down the database and relink the binary. The makefile is $ORACLE_HOME/rdbms/lib/ins_rdbms.mk:

make PL_ORALIBS=-laio -f ins_rdbms.mk async_on

 

justin_$ more aio-max-nr

1048576

justin_$ more aio-nr

98482
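
To see how close a system is to the limit, the two values can be compared directly. This is a minimal sketch; the helper names aio_usage and read_int are made up for illustration, and the script assumes the standard /proc/sys/fs paths.

```python
from pathlib import Path


def aio_usage(aio_nr, aio_max_nr):
    """Return (remaining events, fraction of aio-max-nr already used)."""
    return aio_max_nr - aio_nr, aio_nr / aio_max_nr


def read_int(path):
    """Read the first integer from a /proc file."""
    return int(Path(path).read_text().split()[0])


if __name__ == "__main__":
    try:
        nr = read_int("/proc/sys/fs/aio-nr")
        max_nr = read_int("/proc/sys/fs/aio-max-nr")
    except OSError as exc:
        # Not Linux, or /proc is not available in this environment.
        print(f"cannot read AIO counters: {exc}")
    else:
        remaining, used = aio_usage(nr, max_nr)
        print(f"aio-nr={nr} aio-max-nr={max_nr} "
              f"remaining={remaining} used={used:.1%}")
```

With the values from the Exadata case (aio-nr 3145726, aio-max-nr 3145728) only 2 events remain, which is why further requests failed.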

 

 

file-max/nr_open

Maximum number of file handles the kernel will allocate / maximum number of file handles a single process can use

justin_$ more file-max

6815744

justin_$ more nr_open

1048576

 

file-nr

The three columns are: number of allocated file handles / number of allocated but unused file handles / maximum number of file handles

Since Linux 2.6 the second column is always 0, meaning every allocated file handle is in use; the first column, however, should change frequently.

justin_$ more file-nr

786048  0       6815744

http://space.itpub.net/15480802/viewspace-734062 
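
The three columns are easy to turn into a usage figure. The sketch below assumes the three-column layout described above; parse_file_nr is an illustrative name, not an existing tool.

```python
from pathlib import Path


def parse_file_nr(text):
    """Split the three file-nr columns and compute the share of file-max in use."""
    allocated, unused, maximum = (int(field) for field in text.split())
    return {
        "allocated": allocated,
        "unused": unused,
        "max": maximum,
        "used_fraction": (allocated - unused) / maximum,
    }


if __name__ == "__main__":
    try:
        info = parse_file_nr(Path("/proc/sys/fs/file-nr").read_text())
    except OSError as exc:
        # Not Linux, or /proc is not available in this environment.
        print(f"cannot read file-nr: {exc}")
    else:
        print(f"{info['allocated']} of {info['max']} file handles allocated "
              f"({info['used_fraction']:.1%})")
```

For the sample output above, roughly 11.5% of file-max is in use.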

 

 

inode-nr/inode-state

inode-max: the maximum number of inodes, usually 3 to 4 times file-max, because stdin, stdout, and sockets each need an inode as well; it has been removed as of 2.6.

inode-nr: lists the first two items of inode-state, so it can be skipped.

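inode-state: the first three columns are nr_inodes/nr_free_inodes/preshrink. The first two give the number of allocated inodes and the number of free inodes. When nr_inodes > inode-max, preshrink becomes nonzero, and the system has to prune the inode list instead of allocating more.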

justin_$ more inode-state

134123  41514   0       0       0       0       0

justin_$ more inode-nr

134123  41514
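
The number of inodes actually in use is the difference between the first two columns. The sketch below works that out; parse_inode_state is an illustrative name, and it assumes that only the first three columns of inode-state carry meaning, as described above.

```python
from pathlib import Path


def parse_inode_state(text):
    """Interpret the first three columns of inode-state."""
    fields = [int(field) for field in text.split()]
    nr_inodes, nr_free_inodes, preshrink = fields[:3]
    return {
        "nr_inodes": nr_inodes,
        "nr_free_inodes": nr_free_inodes,
        "preshrink": preshrink,
        "in_use": nr_inodes - nr_free_inodes,
    }


if __name__ == "__main__":
    try:
        state = parse_inode_state(Path("/proc/sys/fs/inode-state").read_text())
    except OSError as exc:
        # Not Linux, or /proc is not available in this environment.
        print(f"cannot read inode-state: {exc}")
    else:
        print(f"{state['in_use']} inodes in use, "
              f"{state['nr_free_inodes']} free, preshrink={state['preshrink']}")
```

For the sample output above this gives 134123 - 41514 = 92609 inodes in use.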

 

overflowgid/overflowuid

Linux UIDs and GIDs are 32 bits wide, but some filesystems support only 16-bit UIDs/GIDs, so a larger ID cannot be written to them as is.

A UID or GID above 65535 is automatically translated to a fixed value, which is what these two parameters set.

justin_$ more overflowgid

65534

justin_$ more overflowuid

65534
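
The translation rule can be illustrated in a few lines. This is a simplified model of the behavior described above, not kernel code; to_16bit_id is a hypothetical helper, and its default of 65534 is taken from the output shown.

```python
def to_16bit_id(id32, overflow_id=65534):
    """Map a 32-bit UID/GID to what a 16-bit-only filesystem would store.

    IDs that fit into 16 bits are kept unchanged; larger ones are replaced
    by the overflow value (overflowuid or overflowgid).
    """
    return id32 if id32 <= 0xFFFF else overflow_id


for uid in (0, 1000, 65535, 65536, 100000):
    print(f"uid {uid} is stored as {to_16bit_id(uid)}")
```

Any ID up to 65535 is stored unchanged, while 65536 and 100000 both become 65534.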

 

 

leases-enable/lease-break-time

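Linux also supports file leases, a kind of file lock. leases-enable turns them on or off system-wide, and lease-break-time is the grace period in seconds a lease holder gets before the lease is forcibly broken. For details see: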

http://blog.csdn.net/yebanghua/article/details/7301904

 

justin_$ more lease-break-time

45

justin_$ more leases-enable

1

justin_$ more suid_dumpable

0

 

 

 

/proc/sys/fs also contains several subdirectories, such as mqueue, quota, nfs, and inotify.

 

The mqueue directory

POSIX message queues let processes exchange data, much like System V message queues. The following three parameters control their basic limits.

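Current message queue usage on the system can be viewed with ipcs -q. Note that ipcs reports System V queues only; POSIX queues appear under the mqueue filesystem, typically mounted at /dev/mqueue.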

justin_$ ipcs -q

 

------ Message Queues --------

key        msqid      owner      perms      used-bytes   messages   

msg_max

Maximum number of messages in a single queue; the default is 10.

justin_$ more msg_max

10

msgsize_max

Maximum size of a single message, in bytes.

justin_$ more msgsize_max

8192

queues_max

Maximum number of message queues.

justin_$ more queues_max

256
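
As a rough illustration of how the first two limits are applied, the sketch below checks the attributes an unprivileged process might request for a new queue against msg_max and msgsize_max. check_mq_request is a hypothetical helper, its defaults are the values shown above, and privileged processes may be allowed to exceed these ceilings.

```python
def check_mq_request(maxmsg, msgsize, msg_max=10, msgsize_max=8192):
    """Return the reasons a requested queue would exceed the mqueue limits."""
    problems = []
    if maxmsg > msg_max:
        problems.append(f"maxmsg {maxmsg} exceeds msg_max {msg_max}")
    if msgsize > msgsize_max:
        problems.append(f"msgsize {msgsize} exceeds msgsize_max {msgsize_max}")
    return problems


for request in ((10, 8192), (11, 9000)):
    issues = check_mq_request(*request)
    print(request, "ok" if not issues else "; ".join(issues))
```

A request for 10 messages of 8192 bytes fits the limits shown above, while 11 messages of 9000 bytes violates both.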

 

 

 
