OCP版本:Version: 4.3.0-20240711151555
observer版本: oceanbase-4.2.1.8-108010012024082314.el7.x86_64.rpm
使用的是企业版 OceanBase
问题发现
在跑批完成后,对整个跑批进行分析,在OCP,slowsql中发现耗时长,cpu消耗高的SQL,如下图:


分析此类sql
库表已做脱敏,XXXX表示
SELECT
"SEQ_NO",
"CHANNEL_SEQ_NO",
"SUB_SEQ_NO",
"TAE_SUB_SEQ_NO",
TRAN_DATEFROM
XXX."XXX" WHERE ("SEQ_NO", "TRAN_DATE") >= (10023931,'2024-01-19 00:00:00') AND ("SEQ_NO", "TRAN_DATE") <= (10024360,'2024-01-19 00:00:00');执行计划如下:
=================================================================|ID|OPERATOR |NAME |EST.ROWS|EST.TIME(us)|-----------------------------------------------------------------|0 |PX COORDINATOR | |1281311 |5678820 ||1 |└─EXCHANGE OUT DISTR |:EX10000 |1281311 |4161250 ||2 | └─PX PARTITION ITERATOR| |1281311 |748539 ||3 | └─TABLE FULL SCAN |XXXXXX |1281311 |748539 |=================================================================Outputs & filters:-------------------------------------
0 - output([INTERNAL_FUNCTION(XXXXXX.SEQ_NO(0x7fdf66625ea0), XXXXXX.CHANNEL_SEQ_NO(0x7fdf6662dbe0), XXXXXX.SUB_SEQ_NO(0x7fdf6662e1e0),
XXXXXX.TAE_SUB_SEQ_NO(0x7fdf6662e7e0), XXXXXX.TRAN_DATE(0x7fdf66623ac0))(0x7fdf667ed850)]), filter(nil), rowset=256
1 - output([INTERNAL_FUNCTION(XXXXXX.SEQ_NO(0x7fdf66625ea0), XXXXXX.CHANNEL_SEQ_NO(0x7fdf6662dbe0), XXXXXX.SUB_SEQ_NO(0x7fdf6662e1e0),
XXXXXX.TAE_SUB_SEQ_NO(0x7fdf6662e7e0), XXXXXX.TRAN_DATE(0x7fdf66623ac0))(0x7fdf667ed850)]), filter(nil), rowset=256
dop=1
2 - output([XXXXXX.SEQ_NO(0x7fdf66625ea0)], [XXXXXX.TRAN_DATE(0x7fdf66623ac0)], [XXXXXX.CHANNEL_SEQ_NO(0x7fdf6662dbe0)], [XXXXXX.SUB_SEQ_NO(0x7fdf6662e1e0)],
[XXXXXX.TAE_SUB_SEQ_NO(0x7fdf6662e7e0)]), filter(nil), rowset=256
force partition granule 3 - output([XXXXXX.SEQ_NO(0x7fdf66625ea0)], [XXXXXX.TRAN_DATE(0x7fdf66623ac0)], [XXXXXX.CHANNEL_SEQ_NO(0x7fdf6662dbe0)], [XXXXXX.SUB_SEQ_NO(0x7fdf6662e1e0)],
[XXXXXX.TAE_SUB_SEQ_NO(0x7fdf6662e7e0)]), filter([(cast(XXXXXX.SEQ_NO(0x7fdf66625ea0), NUMBER(-1, -1))(0x7fdf66626d70), XXXXXX.TRAN_DATE(0x7fdf66623ac0))(0x7fdf66635dc0)
>= (10023931, cast('2024-01-19 00:00:00', DATE(-1, -1))(0x7fdf66627fc0))(0x7fdf666364f0)(0x7fdf666256d0)], [(cast(XXXXXX.SEQ_NO(0x7fdf66625ea0), NUMBER(-1,
-1))(0x7fdf6662b800), XXXXXX.TRAN_DATE(0x7fdf66623ac0))(0x7fdf66636f90) <= (10024360, cast('2024-01-19 00:00:00', DATE(-1, -1))(0x7fdf6662ca50))(0x7fdf666376c0)(0x7fdf6662a440)]), rowset=256
access([XXXXXX.SEQ_NO(0x7fdf66625ea0)], [XXXXXX.TRAN_DATE(0x7fdf66623ac0)], [XXXXXX.CHANNEL_SEQ_NO(0x7fdf6662dbe0)], [XXXXXX.SUB_SEQ_NO(0x7fdf6662e1e0)],
[XXXXXX.TAE_SUB_SEQ_NO(0x7fdf6662e7e0)]), partitions(p[0-14])
is_index_back=false, is_global_index=false, filter_before_indexback[false,false],
range_key([XXXXXX.SEQ_NO(0x7fdf66625ea0)], [XXXXXX.TRAN_DATE(0x7fdf66623ac0)]), range(MIN,MIN ; MAX,MAX)always trueUsed Hint:-------------------------------------
/*+
*/Qb name trace:-------------------------------------
stmt_id:0, stmt_type:T_EXPLAIN
stmt_id:1, SEL$1Outline Data:
-------------------------------------
/*+
BEGIN_OUTLINE_DATA
FULL(@"SEL$1" "XXXXXX"."XXXXXX"@"SEL$1")
OPTIMIZER_FEATURES_ENABLE('4.2.1.0')
END_OUTLINE_DATA
*/Optimization Info:-------------------------------------
XXXXXX:
table_rows:11531798
physical_range_rows:11531798
logical_range_rows:11531798
index_back_rows:0
output_rows:1281310
table_dop:1
dop_method:Table DOP
avaiable_index_name:[IDX_XXXXXX_3, IDX_XXXXXX_15, IDX_XXXXXX_11, IDX_XXXXXX_7, IDX_XXXXXX_9, IDX_XXXXXX_4, IDX_XXXXXX_2, IDX_XXXXXX_12, IDX_XXXXXX_6, IDX_XXXXXX_13, IDX_XXXXXX_ORIG_TRAN_TIMESTAMP, IDX_XXXXXX_10, IDX_XXXXXX_14, IDX_XXXXXX_8, IDX_XXXXXX_5, IDX_XXXXXX_1, XXXXXX]
stats version:1729088173113156
dynamic sampling level:0
Plan Type: DISTRIBUTED
Note:
Degree of Parallelisim is 1 because of table propert过滤逻辑步骤如下:
WHERE ("SEQ_NO", "TRAN_DATE") >= (10023931,'2024-01-19 00:00:00') AND ("SEQ_NO", "TRAN_DATE") <= (10024360,'2024-01-19 00:00:00');1.过滤出SEQ_NO 大于值为10023931的所有数据;
2.过滤出SEQ_NO 等于值为10023931时,再看TRAN_DATE过滤条件;
3.过滤出SEQ_NO 小于值为10024360的所有数据;
4.过滤出SEQ_NO 等于值为10024360时,再看TRAN_DATE过滤条件;
取交集后,查询结果如下:

是不是不太好理解呢?那我们看下,实际写成and的方式,结果会是什么样子的。
WHERE SEQ_NO >= 10023931 AND SEQ_NO <= 10024360 AND TRAN_DATE>='2024-01-19 00:00:00' AND TRAN_DATE<='2024-01-19 00:00:00';
结果如下:

SEQ_NO=10024358的数据不在了。这个很好理解。
我们再来看下,要是不加TRAN_DATE 条件,结果会是什么样子的。
WHERE SEQ_NO >= 10023931 AND SEQ_NO <= 10024360;
结果如下:

与结果时3条的对比,第四行,是不符合向量的过滤,所以去掉了。
是否走索引
and方式,走了索引。如下图。 向量方式,是full scan,如一开始的explain。
========================================================|ID|OPERATOR |NAME |EST.ROWS|EST.TIME(us)|--------------------------------------------------------|0 |TABLE RANGE SCAN|XXXXXX |1 |4 |========================================================Outputs & filters:-------------------------------------
0 - output([XXXXXX.SEQ_NO(0x7fb3f86244d0)], [XXXXXX.CHANNEL_SEQ_NO(0x7fb3f862af00)], [XXXXXX.SUB_SEQ_NO(0x7fb3f862b500)], [XXXXXX.TAE_SUB_SEQ_NO(0x7fb3f862bb00)],
[XXXXXX.TRAN_DATE(0x7fb3f8623500)]), filter([XXXXXX.TRAN_DATE(0x7fb3f8623500) <= cast('2024-01-19 00:00:00', DATE(19, 0))(0x7fb3f8627e10)(0x7fb3f8629270)],
[XXXXXX.TRAN_DATE(0x7fb3f8623500) >= cast('2024-01-19 00:00:00', DATE(19, 0))(0x7fb3f8627e10)(0x7fb3f8627680)]), rowset=16
access([XXXXXX.SEQ_NO(0x7fb3f86244d0)], [XXXXXX.TRAN_DATE(0x7fb3f8623500)], [XXXXXX.CHANNEL_SEQ_NO(0x7fb3f862af00)], [XXXXXX.SUB_SEQ_NO(0x7fb3f862b500)],
[XXXXXX.TAE_SUB_SEQ_NO(0x7fb3f862bb00)]), partitions(p14)
is_index_back=false, is_global_index=false, filter_before_indexback[false,false],
range_key([XXXXXX.SEQ_NO(0x7fb3f86244d0)], [XXXXXX.TRAN_DATE(0x7fb3f8623500)]), range[10023931,2024-01-19 00:00:00 ; 10024360,2024-01-19
00:00:00],
range_cond([XXXXXX.SEQ_NO(0x7fb3f86244d0) >= cast('10023931', VARCHAR2(1048576 ))(0x7fb3f8624a80)(0x7fb3f8623d80)], [XXXXXX.SEQ_NO(0x7fb3f86244d0)
<= cast('10024360', VARCHAR2(1048576 ))(0x7fb3f8626590)(0x7fb3f8625b70)])Used Hint:-------------------------------------
/*+
*/Qb name trace:-------------------------------------
stmt_id:0, stmt_type:T_EXPLAIN
stmt_id:1, SEL$1Outline Data:
-------------------------------------
/*+
BEGIN_OUTLINE_DATA
FULL(@"SEL$1" "XXXXXX"."XXXXXX"@"SEL$1")
OPTIMIZER_FEATURES_ENABLE('4.2.1.0')
END_OUTLINE_DATA
*/Optimization Info:-------------------------------------
XXXXXX:
table_rows:10215491
physical_range_rows:3
logical_range_rows:3
index_back_rows:0
output_rows:0
table_dop:1
dop_method:Table DOP
avaiable_index_name:[IDX_XXXXXX_3, IDX_XXXXXX_15, IDX_XXXXXX_11, IDX_XXXXXX_7, IDX_XXXXXX_9, IDX_XXXXXX_4, IDX_XXXXXX_2, IDX_XXXXXX_12, IDX_XXXXXX_6, IDX_XXXXXX_13, IDX_XXXXXX_ORIG_TRAN_TIMESTAMP, IDX_XXXXXX_10, IDX_XXXXXX_14, IDX_XXXXXX_8, IDX_XXXXXX_5, IDX_XXXXXX_1, XXXXXX]
pruned_index_name:[IDX_XXXXXX_3, IDX_XXXXXX_15, IDX_XXXXXX_11, IDX_XXXXXX_9, IDX_XXXXXX_4, IDX_XXXXXX_2, IDX_XXXXXX_12, IDX_XXXXXX_6, IDX_XXXXXX_13, IDX_XXXXXX_ORIG_TRAN_TIMESTAMP, IDX_XXXXXX_10, IDX_XXXXXX_14, IDX_XXXXXX_8, IDX_XXXXXX_5, IDX_XXXXXX_1]
stats version:1728914914354723
dynamic sampling level:0
Plan Type: LOCAL
Note:
Degree of Parallelisim is 1 because of table property总结:
禁止使用WHERE ("SEQ_NO", "TRAN_DATE") >(?, ?) AND ("SEQ_NO", "TRAN_DATE") <=(?, ?) 向量方式过滤查询数据库, 性能低并且 结果集大。