Causes of Oracle Buffer Busy Waits

From www.hotsos.com

One of the most confounding problems with Oracle is the resolution of buffer busy wait events. Buffer busy waits are common in an I/O-bound Oracle system, as evidenced by any system with read (sequential/scattered) waits in the top-five waits in the Oracle STATSPACK report, like this:

Top 5 Timed Events
% Total
Event Waits Time (s) Ela Time
--------------------------- ------------ ----------- -----------
db file sequential read 2,598 7,146 48.54
db file scattered read 25,519 3,246 22.04

library cache load lock 673 1,363 9.26
CPU time 2,154 934 7.83
log file parallel write 19,157 837 5.68

The main way to reduce buffer busy waits is to reduce the total I/O on the system. This can be done by tuning the SQL to access rows with fewer block reads (i.e., by adding indexes). Even if we have a huge db_cache_size, we may still see buffer busy waits, and increasing the buffer size won't help.

In order to look at system-wide wait events, we can query the v$system_event performance view. This view, shown below, provides the name of the wait event, the total number of waits and timeouts, the total time waited, and the average wait time per event.

select * from v$system_event where event like ‘%wait%’;
 

EVENT                       TOTAL_WAITS TOTAL_TIMEOUTS TIME_WAITED AVERAGE_WAIT

--------------------------- ----------- -------------- ----------- ------------

buffer busy waits                636528           1557      549700   .863591232

write complete waits               1193              0       14799   12.4048617

free buffer waits                  1601              0         622   .388507183

The type of buffer that causes the wait can be queried using the v$waitstat view. This view lists the waits per buffer type for buffer busy waits, where COUNT is the sum of all waits for the class of block, and TIME is the sum of all wait times for that class:
select * from v$waitstat;

CLASS COUNT TIME
------------------ ---------- ----------
data block 1961113 1870278
segment header 34535 159082
undo header 233632 86239
undo block 1886 1706

Buffer busy waits occur when an Oracle session needs to access a block in the buffer cache, but cannot because the buffer copy of the data block is locked. This buffer busy wait condition can happen for either of the following reasons:
  • The block is being read into the buffer by another session, so the waiting session must wait for the block read to complete.
  • Another session has the buffer block locked in a mode that is incompatible with the waiting session's request.

Because buffer busy waits are due to contention between particular blocks, there's nothing you can do until you know which blocks are in conflict and why the conflicts are occurring. Tuning therefore involves identifying and eliminating the cause of the block contention.

The v$session_wait performance view, shown below, can give some insight into what is being waited for and why the wait is occurring.

SQL> desc v$session_wait Name Null? Type ----------------------------------------- -------- --------------------- SID NUMBER SEQ# NUMBER EVENT VARCHAR2(64) P1TEXT VARCHAR2(64) P1 NUMBER P1RAW RAW(4) P2TEXT VARCHAR2(64) P2 NUMBER P2RAW RAW(4) P3TEXT VARCHAR2(64) P3 NUMBER P3RAW RAW(4) WAIT_TIME NUMBER SECONDS_IN_WAIT NUMBER STATE VARCHAR2(19)

The columns of the v$session_wait view that are of particular interest for a buffer busy wait event are:

  • P1—The absolute file number for the data file involved in the wait.
  • P2—The block number within the data file referenced in P1 that is being waited upon.
  • P3—The reason code describing why the wait is occurring.

Here's an Oracle data dictionary query for these values:select p1 "File #", p2 "Block #", p3 "Reason Code"
from v$session_wait
where event = 'buffer busy waits';

If the output from repeatedly running the above query shows that a block or range of blocks is experiencing waits, the following query should show the name and type of the segment:
select owner, segment_name, segment_type
from dba_extents
where file_id = &P1
and &P2 between block_id and block_id + blocks -1;


Once the segment is identified, the v$segment_statistics performance view facilitates real-time monitoring of segment-level statistics. This enables a DBA to identify performance problems associated with individual tables or indexes, as shown below.

select object_name, statistic_name, value from V$SEGMENT_STATISTICS where object_name = 'SOURCE$'; OBJECT_NAME STATISTIC_NAME VALUE----------- ------------------------- ----------SOURCE$ logical reads 11216SOURCE$ buffer busy waits 210SOURCE$ db block changes 32SOURCE$ physical reads 10365SOURCE$ physical writes 0SOURCE$ physical reads direct 0SOURCE$ physical writes direct 0SOURCE$ ITL waits 0SOURCE$ row lock waits

We can also query the dba_data_files to determine the file_name for the file involved in the wait by using the P1 value from v$session_wait for the file_id.

SQL> desc dba_data_files Name Null? Type ----------------------------------------- -------- ---------------------------- FILE_NAME VARCHAR2(513) FILE_ID NUMBER TABLESPACE_NAME VARCHAR2(30) BYTES NUMBER BLOCKS NUMBER STATUS VARCHAR2(9) RELATIVE_FNO NUMBER AUTOEXTENSIBLE VARCHAR2(3) MAXBYTES NUMBER MAXBLOCKS NUMBER INCREMENT_BY NUMBER USER_BYTES NUMBER USER_BLOCKS NUMBER

Interrogating the P3 (reason code) value from v$session_wait for a buffer busy wait event will tell us why the session is waiting. The reason codes range from 0 to 300 and can be decoded, as shown in Table A.

Table A

Code

Reason for wait

-

A modification is happening on a SCUR or XCUR buffer but has not yet completed.

0

The block is being read into the buffer cache.

100

We want to NEW the block, but the block is currently being read by another session (most likely for undo).

110

We want the CURRENT block either shared or exclusive but the block is being read into cache by another session, so we have to wait until its read() is completed.

120

We want to get the block in current mode, but someone else is currently reading it into the cache. Wait for the user to complete the read. This occurs during buffer lookup.

130

Block is being read by another session, and no other suitable block image was found, so we wait until the read is completed. This may also occur after a buffer cache assumed deadlock. The kernel can't get a buffer in a certain amount of time and assumes a deadlock. Therefore it will read the CR version of the block.

200

We want to NEW the block, but someone else is using the current copy, so we have to wait for that user to finish.

210

The session wants the block in SCUR or XCUR mode. If this is a buffer exchange or the session is in discrete TX mode, the session waits for the first time and the second time escalates the block as a deadlock, so does not show up as waiting very long. In this case, the statistic: "exchange deadlocks" is incremented, and we yield the CPU for the "buffer deadlock" wait event.

220

During buffer lookup for a CURRENT copy of a buffer, we have found the buffer but someone holds it in an incompatible mode, so we have to wait.

230

Trying to get a buffer in CR/CRX mode, but a modification has started on the buffer that has not yet been completed.

231

CR/CRX scan found the CURRENT block, but a modification has started on the buffer that has not yet been completed.

Reason codes

As I mentioned at the beginning of this article, buffer busy waits are prevalent in I/O-bound systems. I/O contention, resulting in waits for data blocks, is often due to numerous sessions repeatedly reading the same blocks, as when many sessions scan the same index. In this scenario, session one scans the blocks in the buffer cache quickly, but then a block has to be read from disk. While session one awaits the disk read to complete, other sessions scanning the same index soon catch up to session one and want the same block currently being read from disk. This is where the buffer busy wait occurs—waiting for the buffer blocks that are being read from disk.
The following rules of thumb may be useful for resolving each of the noted contention situations:

  • Data block contention—Identify and eliminate HOT blocks from the application via changing PCTFREE and or PCTUSED values to reduce the number of rows per data block. Check for repeatedly scanned indexes. Since each transaction updating a block requires a transaction entry, increase the INITRANS value.
  • Freelist block contention—Increase the FREELISTS value. Also, when using Parallel Server, be certain that each instance has its own FREELIST GROUPs.
  • Segment header contention—Again, increase the number of FREELISTs and use FREELIST GROUPs, which can make a difference even within a single instance.
  • Undo header contention—Increase the number of rollback segments.

Waits and Statistics

Time in Oracle

6 July 1999

What resolution are timed statistics reported in by Oracle?

Almost all Oracle times are in centiseconds, but some are in seconds. The SQL*Plus SET TIMING ON is an exception. It is port specific, and gives milliseconds on some platforms, but not all.

Recursive calls

12 October 1999

A utlestat report I have run on a particular instance shows 245 recursive calls per second. Is there any way I can work how much of the recursive activity is due to dynamic space management, and how much is due to the heavy PL/SQL usage? In your experience, how much recursive SQL activity is it reasonable to expect from PL/SQL intensive applications? One of the tuning documents on MetaLink says that ratio of the recursive calls to user calls should be less than 10%.

The contribution of system tasks (such as populating the dictionary cache, dynamic extension, and so) is almost always insignificant by comparison with the number of recursive calls from stored program units. The target of 10% for recursive calls is an old version 6 tuning rule that no longer applies.

Timing in Oracle

27 October 1999

I have noticed that Oracle measures some statistics in seconds, some in 100ths of a second, and some in milliseconds. I have been told by Oracle Support that these units may be documented incorrectly. For example, today I had question about WRITETIM in V$FILESTAT, and they said that this parameter is in milliseconds, whereas the documentation says 100ths of a second. Anyway, how can the true units for a statistic be determined?

All statistics in the V$ views are either in seconds or centiseconds, and I am not aware of any documentation bugs regarding this. Support are definitely wrong about V$FILESTAT showing milliseconds. There was a paper circulating about 2 years ago that made that claim, and that may be where the information is coming from, but it is clearly wrong. The only use of milliseconds in Oracle is that the SET TIMING ON command returns milliseconds on some platforms. If you suspect a documentation bug, the only way to check is to measure whatever it is against a known timer such as V$TIMER or DBMS_UTILITY.GET_TIME.

Slow inserting

28 October 1999

Here is a listing from V$SYSTEM_EVENT. What's causes all the pmon timer and smon timer waits? Increasing the number of block buffers, the shared pool and sort area size only has marginal impact on these waits. There are 50 rollback segments and I have resized them with large extents with only marginal impact. I am using Oracle 8.0.5.1 on HP K-class with 6 processors and raw partition. This database is used to benchmark a heavily inserting application. Could these waits be caused by the CPU being so much faster than I/O?

EVENT                       TOTAL_WAITS TOTAL_TIMEOUTS TIME_WAITED AVERAGE_WAIT
--------------------------- ----------- -------------- ----------- ------------
rdbms ipc message                105790          14558     5503890   52.0265621
SQL*Net message from client     2818486              0     4438999   1.57495868
log file sync                   2771634            210     2842901   1.02571299
pmon timer                         4041           4035     1215290   300.739916
smon timer                           43             39     1193952   27766.3256
enqueue                          157396              2      918233   5.83390302
latch free                       623534         451472      914831     1.467171
buffer busy waits                636528           1557      549700   .863591232
log file parallel write          558785              0      187438   .335438496
db file parallel write            17350              4      108726   6.26662824
buffer deadlock                   48272          46659       93475   1.93642277
rdbms ipc reply                    8000              0       82468      10.3085
write complete waits               1193              0       14799   12.4048617
control file parallel write        4056              0        8055   1.98594675
SQL*Net message to client       2818487              0        4344   .001541252
db file sequential read            6823              0        1463   .214421809
log file switch completion           66              0        1126   17.0606061
free buffer waits                  1601              0         622   .388507183
db file scattered read              341              0         242   .709677419
sort segment request                  1              1         101          101

This database has several problems, but they are not the ones you think. The pmon timer and smon timer events are idle waits. These background processes wait on these event when they have no work to do.

Your first problem is that you don't have sufficient freelists on the table to which you are inserting, or possibly have a primary key index on a sequence generated primary key that is not a reverse key index. This is probably what is causing your buffer busy waits and buffer deadlock waits.

You also have enqueue waits that are probably a side effect of the lack of process freelists. Also, if your latch free waits are largely on the cache buffers chains latches, then that is probably another side effect.

The ratio of you log file sync to log file parallel write waits indicates that your log_buffer is much too big. Drop it back to 160K.

Your write complete waits and rdbms ipc reply waits indicate the either DBWn's write batch is too big, or you are checkpointing too intensively. Make sure that db_files is not much higher than the number of datafiles that you actually have. If you are using incremental checkpoints ( db_block_max_dirty_target) don't be too ambitious.

Here is the V$WAITSTAT output. We are benchmarking around 150 transactions per second (several inserts and one update). The table is using 10 freelists. We use an internally generated id instead of an Oracle sequence. The log buffer is 1.6M. The log files are 1G and do not switch during the benchmark period. The database block is 2K. Is this the cause of all my problems?

CLASS                   COUNT       TIME
------------------ ---------- ----------
data block            1961113    1870278
segment header          34535     159082
undo header            233632      86239
undo block               1886       1706

Yes, that db_block_size is a problem, because it will prevent you having more than 24 process freelists, and you need more than that. I recommend an 8K block size for this level of concurrency. However, if you have time, you may also want to benchmark 16K. Also, put 50 process freelists on the key tables, and 2 freelist groups, and drop your log_buffer to 160K.

However, your real problem is probably your failure to use a proper Oracle sequence to generate the id numbers!

Event 10046

3 November 1999

I don't understand what "event 10046, level 8" is?

Event 10046 is a widely used superset of sql_trace. The syntax for the parameter file setting is

event = "10046 trace name context forever, level 8"

The syntax for a single session is

alter session set events '10046 trace name context forever, level 8';

Have a play with it and get familiar with the output. It is a very valuable tuning/diagnostic tool. See the excellent Oracle Note 39817.1 for a detailed explanation of the information in the trace file.

<

请使用浏览器的分享功能分享到微信等