【WITH Clause】使用WITH子句提高查询统计效率-颠覆思维定势

这里介绍的的是一种SQL查询方法,颠覆您日常以select开始的SQL查询写法。
这种神奇的使用方法的背后隐藏着性能的提升!

我们一起来体验一下其中的奥妙。

1.创建表T,并简单初始化几条数据
sec@ora10g> create table t (x number(10), y number(10));
sec@ora10g> insert into t values (1,110);
sec@ora10g> insert into t values (2,120);
sec@ora10g> insert into t values (2,80);
sec@ora10g> insert into t values (3,150);
sec@ora10g> insert into t values (3,30);
sec@ora10g> insert into t values (3,60);
sec@ora10g> commit;

sec@ora10g> select * from t;

         X          Y
---------- ----------
         1        110
         2        120
         2         80
         3        150
         3         30
         3         60

6 rows selected.

2.需求描述
按照x列分组后统计y列的总值,我们的最终目标是选出比y列总值的三分之一大的那些分组统计信息。

3.使用子查询方式实现
这可能是大家最容易想到的方法。
sec@ora10g> SELECT x, SUM (y) AS total_y
  2    FROM t
  3  GROUP BY x
  4  HAVING SUM (y) > (SELECT SUM (y) / 3 FROM t)
  5  ORDER BY total_y
  6  /

         X    TOTAL_Y
---------- ----------
         2        200
         3        240

4.WITH Clause方法闪亮登场
sec@ora10g> WITH secooler_sum AS (SELECT x, SUM (y) total_y
  2                          FROM t
  3                        GROUP BY x)
  4  SELECT x, total_y
  5    FROM secooler_sum
  6   WHERE total_y > (SELECT SUM (total_y) / 3 FROM secooler_sum)
  7  ORDER BY total_y
  8  /

         X    TOTAL_Y
---------- ----------
         2        200
         3        240


是不是很神奇!这里的查询语句不是以select开始的,而是以“WITH”关键字开头。可以认为Oracle在真正进行查询之前预先构造了一个临时表secooler_sum,之后我们便可多次使用它做进一步的分析和处理。

5.WITH Clause方法的优点
使用WITH Clause方法有什么好处呢?
首先,增加了SQL的易读性,如果构造了多个子查询,结构会更清晰;
更重要的是:“一次分析,多次使用”,这也是为什么会提供性能的地方,达到了“少读”的目标。

6.知其所以然
为什么WITH Clause方法会提高效率?通过查看上面两种方法的执行计划便可略知一二。
1)使用子查询的执行计划
sec@ora10g> set autot trace exp
sec@ora10g> SELECT x, SUM (y) AS total_y
  2    FROM t
  3  GROUP BY x
  4  HAVING SUM (y) > (SELECT SUM (y) / 3 FROM t)
  5  ORDER BY total_y
  6  /

Execution Plan
----------------------------------------------------------
Plan hash value: 4167292448

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |     6 |   156 |     5  (40)| 00:00:01 |
|   1 |  SORT ORDER BY       |      |     6 |   156 |     5  (40)| 00:00:01 |
|*  2 |   FILTER             |      |       |       |            |          |
|   3 |    HASH GROUP BY     |      |     6 |   156 |     5  (40)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| T    |     6 |   156 |     3   (0)| 00:00:01 |
|   5 |    SORT AGGREGATE    |      |     1 |    13 |            |          |
|   6 |     TABLE ACCESS FULL| T    |     6 |    78 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(SUM("Y")> (SELECT SUM("Y")/3 FROM "T" "T"))

Note
-----
   - dynamic sampling used for this statement


2)使用子WITH Clause的执行计划
sec@ora10g> WITH secooler_sum AS (SELECT x, SUM (y) total_y
  2                          FROM t
  3                        GROUP BY x)
  4  SELECT x, total_y
  5    FROM secooler_sum
  6   WHERE total_y > (SELECT SUM (total_y) / 3 FROM secooler_sum)
  7  ORDER BY total_y
  8  /

Execution Plan
----------------------------------------------------------
Plan hash value: 706070671

---------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                       | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                            |     6 |   156 |     9  (23)| 00:00:01 |
|   1 |  TEMP TABLE TRANSFORMATION |                            |       |       |            |          |
|   2 |   LOAD AS SELECT           |                            |       |       |            |          |
|   3 |    HASH GROUP BY           |                            |     6 |   156 |     4  (25)| 00:00:01 |
|   4 |     TABLE ACCESS FULL      | T                          |     6 |   156 |     3   (0)| 00:00:01 |
|   5 |   SORT ORDER BY            |                            |     6 |   156 |     5  (20)| 00:00:01 |
|*  6 |    VIEW                    |                            |     6 |   156 |     2   (0)| 00:00:01 |
|   7 |     TABLE ACCESS FULL      | SYS_TEMP_0FD9D6660_23A196E |     6 |   156 |     2   (0)| 00:00:01 |
|   8 |     SORT AGGREGATE         |                            |     1 |    13 |            |          |
|   9 |      VIEW                  |                            |     6 |    78 |     2   (0)| 00:00:01 |
|  10 |       TABLE ACCESS FULL    | SYS_TEMP_0FD9D6660_23A196E |     6 |   156 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - filter("TOTAL_Y"> (SELECT SUM("TOTAL_Y")/3 FROM  (SELECT /*+ CACHE_TEMP_TABLE ("T1") */
              "C0" "X","C1" "TOTAL_Y" FROM "SYS"."SYS_TEMP_0FD9D6660_23A196E" "T1") "SECOOLER_SUM"))

Note
-----
   - dynamic sampling used for this statement


可见,第一种使用子查询的方法T表被扫描了两次,而使用WITH Clause方法,T表仅被扫描一次。
这也是为什么在大型数据仓库系统中推荐使用WITH Clause方法进行查询统计的原因,这样可以大大的提高数据分析和查询的效率。

另外,观察WITH Clause方法执行计划,其中“SYS_TEMP_0FD9D6660_23A196E”便是在运行过程中构造的中间统计结果临时表。

7.Oracle官方文档中有关WITH Clause的简单描述
Computation Using the WITH Clause

The WITH clause (formally known as subquery_factoring_clause) enables you to reuse the same query block in a SELECT statement when it occurs more than once within a complex query. WITH is a part of the SQL-99 standard. This is particularly useful when a query has multiple references to the same query block and there are joins and aggregations. Using the WITH clause, Oracle retrieves the results of a query block and stores them in the user's temporary tablespace. Note that Oracle Database does not support recursive use of the WITH clause.

参考链接:http://download.oracle.com/docs/cd/B19306_01/server.102/b14223/aggreg.htm#sthref1670

8.小结
WITH Clause方法在数据仓库或大数据量查询中有着自己的优势。
一切提高性能的措施都是推崇的。灵活掌握,裨益无限。在遇到具体问题时可善加利用。

Good luck.

secooler
10.03.22

-- The End --

请使用浏览器的分享功能分享到微信等