在上篇(http://space.itpub.net/17203031/viewspace-755198)中,我们介绍了自动捕获SQL Plan Baseline的初步过程。当存在一个baseline的时候,SQL CBO生成的执行计划又和Baseline相匹配,在shared pool中会建立父子共享游标。共享执行计划中被标注上Plan Baseline。
5、使用10053跟踪匹配过程
今天我们使用10053这个执行计划跟踪过程,看看刚才那个过程是怎么样进行决策的。首先需要清理shared pool。
SQL> alter system flush shared_pool;
System altered
SQL> alter system flush buffer_cache;
System altered
--查看Trace File信息
SQL> select value from v$diag_info where name='Default Trace File';
VALUE
----------------------------------------------------------------------------
/u01/diag/rdbms/wilson/wilson/trace/wilson_ora_3651.trc
启动10053监控。
SQL> alter session set events '10053 trace name context forever, level 12';
会话已更改。
SQL> select /*+ SPM DEMO*/* from t where wner='SCOTT';
(结果省略……)
SQL> alter session set events '10053 trace name context off';
会话已更改。
10053 Trace File信息关键部分如下:
****************
QUERY BLOCK TEXT
****************
select /*+ SPM DEMO*/* from t where wner='SCOTT' –处理SQL,变换之后
---------------------
QUERY BLOCK SIGNATURE
---------------------
signature (optimizer): qb_name=SEL$1 nbfros=1 flg=0
fro(0): flg=0 bjn=77122 hint_alias="T"@"SEL$1"
-----------------------------
SYSTEM STATISTICS INFORMATION(系统统计量信息,参与CPU成本计算公式)
-----------------------------
Using NOWORKLOAD Stats
CPUSPEEDNW: 2722 millions instructions/sec (default is 100)
IOTFRSPEED: 4096 bytes per millisecond (default is 4096)
IOSEEKTIM: 12 milliseconds (default is 10)
MBRC: -1 blocks (default is 8)
***************************************
BASE STATISTICAL INFORMATION(对象统计量,参与CPU成本计算公式)
***********************
Table Stats::
Table: T Alias: T
#Rows: 12 #Blks: 1 AvgRowLen: 78.00
Index Stats::
Index: IDX_T_OWNER Col#: 1
LVLS: 0 #LB: 1 #DK: 1 LB/K: 1.00 DB/K: 1.00 CLUF: 1.00
Access path analysis for T
***************************************
SINGLE TABLE ACCESS PATH
Single Table Cardinality Estimation for T[T]
Table: T Alias: T
Card: Original: 12.000000 Rounded: 12 Computed: 12.00 Non Adjusted: 12.00
Access Path: TableScan
Cost: 2.00 Resp: 2.00 Degree: 0
Cost_io: 2.00 Cost_cpu: 12881
Resp_io: 2.00 Resp_cpu: 12881
Access Path: index (AllEqRange)
Index: IDX_T_OWNER
resc_io: 2.00 resc_cpu: 21803
ix_sel: 1.000000 ix_sel_with_filters: 1.000000
Cost: 2.00 Resp: 2.00 Degree: 1
Best:: AccessPath: TableScan (最终决策,走FTS路径)
Cost: 2.00 Degree: 1 Resp: 2.00 Card: 12.00 Bytes: 0
***************************************
OPTIMIZER STATISTICS AND COMPUTATIONS
***************************************
GENERAL PLANS
***************************************
Considering cardinality-based initial join order.
Permutations for Starting Table :0
Join order[1]: T[T]#0
***********************
Best so far: Table#: 0 cost: 2.0004 card: 12.0000 bytes: 936
***********************
(newjo-stop-1) k:0, spcnt:0, perm:1, maxperm:2000
*********************************
Number of join permutations tried: 1
*********************************
Trying or-Expansion on query block SEL$1 (#0)
Transfer Optimizer annotations for query block SEL$1 (#0)
id=0 frofand predicate="T"."OWNER"='SCOTT'
Final cost for query block SEL$1 (#0) - All Rows Plan:
Best join order: 1
Cost: 2.0004 Degree: 1 Card: 12.0000 Bytes: 936
Resc: 2.0004 Resc_io: 2.0000 Resc_cpu: 12881
Resp: 2.0004 Resp_io: 2.0000 Resc_cpu: 12881
kkoqbc-subheap (delete addr=0x8566b0, in-use=11048, alloc=14176)
kkoqbc-end:
:
call(in-use=16448, alloc=32736), compile(in-use=55872, alloc=59324), execution(in-use=1468, alloc=4060)
kkoqbc: finish optimizing query block SEL$1 (#0)
apadrv-end
:
call(in-use=16448, alloc=32736), compile(in-use=56468, alloc=59324), execution(in-use=1468, alloc=4060)
--Oracle发现竟然有SPM Baseline,并且相匹配
SPM: cost-based plan found in the plan baseline, planId = 2498539100
SPM: cost-based plan was successfully matched, planId = 2498539100
Starting SQL statement dump
user_id=0 user_name=SYS module=sqlplus.exe action=
sql_id=3kn82wdfqh9j3 plan_hash_value=1601196873 problem_type=3
从10053事件诊断过程中,我们可以发现Oracle在出现Baseline的时候,并不是武断的去按照Baseline的方法去做。Oracle首先使用CBO生成执行计划,之后拿到SPM baseline中进行匹配。如果匹配上了,也就是CBO生成的执行计划和baseline相匹配,Oracle直接就使用这个执行计划了。并且这个过程中会在shared pool中生成shared cursor信息。
如果没有匹配上,也就是说baseline不是最优执行计划,那么怎么办?
6、数据修改策略
我们可以修改SQL底层的数据,让FTS并不是最优执行计划。
SQL> insert into t select * from dba_objects where owner<>'SCOTT';
72732 rows inserted
SQL> commit;
Commit complete
SQL> exec dbms_stats.gather_table_stats(user,'T',cascade => true);
PL/SQL procedure successfully completed
在原有的12条记录基础上,添加了7万多条记录。那么SQL应该是Index Range Scan是最好执行计划。
SQL> explain plan for select * from t where wner='SCOTT';
Explained
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1516787156
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 13 | 1261 | 2 (0)|
| 1 | TABLE ACCESS BY INDEX ROWID| T | 13 | 1261 | 2 (0)|
|* 2 | INDEX RANGE SCAN | IDX_T_OWNER | 13 | | 1 (0)|
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OWNER"='SCOTT')
14 rows selected
执行SQL,同时生成对应的10053 Trace File。生成的Trace File我们之后分析。
--启动10053事件
SQL> alter session set events '10053 trace name context forever, level 12';
会话已更改。
SQL> select /*+ SPM DEMO*/* from t where wner='SCOTT';
(结果省略)
SQL> alter session set events '10053 trace name context off';
会话已更改。
SQL> select value from v$diag_info where name='Default Trace File';
VALUE
---------------------------------------------------------------------------
/u01/diag/rdbms/wilson/wilson/trace/wilson_ora_3781.trc
注意,此时监控内存中shared pool状态。
SQL> select sql_id, executions, version_count from v$sqlarea where sql_text like 'select /*+ SPM DEMO*/* from t%';
SQL_ID EXECUTIONS VERSION_COUNT
------------- ---------- -------------
我们知道,如果是从CBO单纯的过程看,FTS(也就是baseline规定的执行计划)不是最优的执行计划,而是Index Range Scan。但是,SQL执行了,没有在shared pool中形成任何共享游标。这是不正常的,在之后的实验中,我们证明是由于10053造成的执行计划不在shared pool中出现。
此时,SPM Baseline状态如下:
SQL> select sql_handle, plan_name, ENABLED, ACCEPTED, FIXED, AUTOPURGE,executions from dba_sql_plan_baselines;
SQL_HANDLE PLAN_NAME ENABLED ACCEPTED FIXED AUTOPURGE EXECUTIONS
------------------------------ ------------------------------ ------- -------- ----- --------- ----------
SYS_SQL_66ab2c3cf329ff6c SQL_PLAN_6datc7mtkmzvc7f85a438 YES NO NO YES 0
SYS_SQL_66ab2c3cf329ff6c SQL_PLAN_6datc7mtkmzvc94ecae5c YES YES NO YES 0
我们发现在SPM中增加了一个baseline plan,但是和之前的区别是Accepted状态为NO。
我们首先查看一下,刚刚的那个SQL执行计划是什么。
SQL> explain plan for select /*+ SPM DEMO*/* from t where wner='SCOTT';
Explained
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 13 | 1261 | 266 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| T | 13 | 1261 | 266 (1)| 00:00:04 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OWNER"='SCOTT')
Note
-----
- SQL plan baseline "SQL_PLAN_6datc7mtkmzvc94ecae5c" used for this statement
17 rows selected
SQL还是按照了Accepted的Baseline计划来执行的SQL,此时SQL Plan编号是:SQL_PLAN_6datc7mtkmzvc94ecae5c。那么,生成的那个非Accepted的执行计划基线是什么呢?
SQL> set pagesize 10000;
SQL> select * from table(dbms_xplan.display_sql_plan_baseline(sql_handle => 'SYS_SQL_66ab2c3cf329ff6c',plan_name => 'SQL_PLAN_6datc7mtkmzvc7f85a438',format => 'advanced'));
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL handle: SYS_SQL_66ab2c3cf329ff6c
SQL text: select /*+ SPM DEMO*/* from t where wner='SCOTT'
--------------------------------------------------------------------------------
Plan name: SQL_PLAN_6datc7mtkmzvc7f85a438 Plan id: 2139464760
Enabled: YES Fixed: NO Accepted: NO Origin: AUTO-CAPTURE
--------------------------------------------------------------------------------
Outline Data from SMB:
/*+
BEGIN_OUTLINE_DATA
INDEX_RS_ASC(@"SEL$1" "T"@"SEL$1" ("T"."OWNER"))
OUTLINE_LEAF(@"SEL$1")
ALL_ROWS
DB_VERSION('11.2.0.1')
OPTIMIZER_FEATURES_ENABLE('11.2.0.1')
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
*/
--------------------------------------------------------------------------------
Plan hash value: 1516787156
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 13 | 1261 | 2 (0)|
| 1 | TABLE ACCESS BY INDEX ROWID| T | 13 | 1261 | 2 (0)|
|* 2 | INDEX RANGE SCAN | IDX_T_OWNER | 13 | | 1 (0)|
--------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$1 / T@SEL$1
2 - SEL$1 / T@SEL$1
(篇幅原因,省略部分……)
72 rows selected
在Baseline运行规则上看,我们才到了这样的过程:一个SQL出现之后,Oracle CBO的确会重新进行评估,成本公式告诉Oracle说Index Range Scan是最优的执行计划。但是之后,发现SPM中有一个强制的Baseline,里面只有一个Accepted的执行计划。根据SPM的工作原理,新生成的执行计划如果成本值低于Baseline中Accepted的执行计划,这个执行计划也不会在这次操作中执行。而是添加为一个非Accepted的Plan。这次的执行还是会按照Baseline中Accepted的计划去执行。
同时,此时Shared Pool中没有对应的共享游标信息。下面,我们通过简单的实验,来证明其本质。
注意:由于是在其他环境上进行相同实验,sql_id和plan name可能有所不同。
此时数据库中包括FTS的baseline。
SQL> select sql_handle, plan_name, accepted from dba_sql_plan_baselines;
SQL_HANDLE PLAN_NAME ACCEPTED
------------------------------ ------------------------------ --------
SYS_SQL_f82ffff93a977037 SQL_PLAN_ghbzzz4x9fw1r94ecae5c YES
SQL> alter system flush shared_pool;
System altered
SQL> alter system flush buffer_cache;
System altered
首先我们使用10053监控。
SQL> alter session set events '10053 trace name context forever, level 12';
会话已更改。
SQL> select /*+ DEMP */* from t where wner='SCOTT';
(篇幅原因,有省略.....)
SQL> alter session set events '10053 trace name context off';
会话已更改。
数据shared pool中环境:
SQL> select sql_id, child_number, plan_hash_value, SQL_PLAN_BASELINE, executions from v$sql where sql_text like 'select /*+ DEMP */* from t%';
SQL_ID CHILD_NUMBER PLAN_HASH_VALUE SQL_PLAN_BASELINE EXECUTIONS
------------- ------------ --------------- ------------------------------ ----------
在非跟踪模式下:
SQL> select /*+ DEMP */* from t where wner='SCOTT';
(结果集合省略......)
SQL> select sql_id, child_number, plan_hash_value, SQL_PLAN_BASELINE, executions from v$sql where sql_text like 'select /*+ DEMP */* from t%';
SQL_ID CHILD_NUMBER PLAN_HASH_VALUE SQL_PLAN_BASELINE EXECUTIONS
------------- ------------ --------------- ------------------------------ ----------
cz2dt520pdh97 1 1601196873 SQL_PLAN_ghbzzz4x9fw1r94ecae5c 1
在正常情况下,我们可以看到不匹配的过程:如果执行过程中开启10053事件跟踪,shared cursor可能就不会生成。标准的方式是:如果一个新的执行计划被加入为非Accept类型Baseline Plan,真正执行的时候选择的依然是确认的Baseline,生成的是确认的Baseline执行计划。
那么,我们看到的是SPM在不匹配的时候的工作结果,那么在Oracle解析过程中,发生了什么呢?我们可以从刚刚生成的那个Trace文件中看到结果。我们在下篇中继续分析: