行列转换

http://space.itpub.net/23278472/viewspace-689961有许多内容是转载此处的 。

假如有如下表,其中各个i值对应的行数是不定的

SQL> select * from t;

IA D
———- ———- ——————-
1 b 2008-03-27 10:55:42
1 a2008-03-27 10:55:46
1 d 2008-03-27 10:55:30
2 z 2008-03-27 10:55:55
2 t 2008-03-27 10:55:59

要获得如下结果,注意字符串需要按照D列的时间排序:

1 d,b,a
2 z,t

这是一个比较典型的行列转换,有好几种实现方法

1.自定义函数实现

create or replace function my_concat(n number)
return varchar2
is
type typ_cursor is ref cursor;
v_cursor typ_cursor;
v_temp varchar2(10);
v_result varchar2(4000):= ”;
v_sql varchar2(200);
begin
v_sql := ‘select a from t where i=’ || n ||’ order by d’;
open v_cursor for v_sql;
loop
fetch v_cursor into v_temp;
exit when v_cursor%notfound;
v_result := v_result ||’,’ || v_temp;
end loop;
return substr(v_result,2);
end;

SQL>select i,my_concat(i) from t group by i;

IMY_CONCAT(I)
———- ——————–
1 d,b,a
2 z,t

虽然这种方式可以实现需求,但是如果表t的数据量很大,i的值又很多的情况下,因为针对每个i值都要执行一句select,扫描和排序的次数和i的值成正比,性能会非常差。

2.使用sys_connect_by_path

select i,ltrim(max(sys_connect_by_path(a,’,')),’,') a
from
(
select i,a,d,min(d) over(partition by i) d_min,
(row_number() over(order by i,d))+(dense_rank() over (order by i)) numid
from t
)
start with d=d_min connect by numid-1=prior numid
group by i;

从执行计划上来看,这种方式只需要扫描两次表,比自定义函数的方法,效率要高很多,尤其是表中数据量较大的时候。


3.使用wm_sys.wm_concat

这个函数也可以实现类似的行列转换需求,但是似乎没有办法做到直接根据另外一列排序,所以需要先通过子查询或者临时表排好序


SQL> select i,wmsys.wm_concat(a) from t group by i;

IWMSYS.WM_CONCAT(A)
———- ——————–
1 b,a,d
2 z,t

SQL>select i,wmsys.wm_concat(a)
2 from
3 (select * from t order by i,d)
4 group by i;

IWMSYS.WM_CONCAT(A)
———- ——————–
1 d,b,a
2 z,t

执行计划上看,只需要做一次表扫描就可以了,但是这个函数是加密过的,执行计划并不能显示函数内部的操作。


-----行列转换一

数据格式一
CARD_CODE         Q        BAL
--------- ---------- ----------
001               1         27
001               2         10
001               3         36
001               4         97
002               1         96
002               2         12
002               3         15
002               4         32

数据格式二
CARD_CODE        Q1        Q2        Q3         Q4
--------- ---------- ---------- ---------- ----------
001              27        10        36         97
002              96        12        15         32

--格式一到格式二

SELECTa.card_code, SUM(decode(a.q, 1, a.bal, 0)) q1, SUM(decode(a.q, 2, a.bal, 0))q2,
    SUM(decode(a.q, 3, a.bal, 0)) q3, SUM(decode(a.q, 4, a.bal,0)) q4
 FROM my_card a
 GROUP BY a.card_code
 ORDER BY 1;
 
 
--
格式二到格式一
SELECT t.card_code, t.rn q, decode(t.rn, 1, t.q1, 2, t.q2, 3, t.q3, 4, t.q4)bal
 FROM (SELECT a.*, b.rn
      FROM my_card_two a,
        (SELECT rownum rn
          FROM dual
         CONNECT BY rownum <=4) b) t
 ORDER BY 1, 2;
 
 
------
行列转换二
数据格式一
CARD_CODE Q
--------- ------------------------------------------------
001       quarter_1
001       quarter_2
001       quarter_3
001       quarter_4
002       quarter_1
002       quarter_2
002       quarter_3
002       quarter_4

数据格式二
CARD_CODE Q
--------- -----------------------------
002       quarter_1;quarter_2;quarter_3;quarter_4
001       quarter_1;quarter_2;quarter_3;quarter_4


--
格式一到格式二
SELECT t1.card_code, substr(MAX(sys_connect_by_path(t1.q, ';')), 2) q
  FROM (SELECT a.card_code, a.q,
               row_number() over(PARTITION BY a.card_code ORDER BY a.q) rn
           FROM my_card_t3 a) t1
 START WITH t1.rn = 1
CONNECT BY t1.card_code = PRIOR t1.card_code
           AND t1.rn - 1 =PRIOR t1.rn
 GROUP BY t1.card_code;
 
--
格式二到格式一
SELECT t.card_code,
    substr(t.q,
        instr(';' || t.q, ';', 1, rn),
        instr(t.q || ';', ';', 1, rn) -instr(';' || t.q, ';', 1, rn)) q
 FROM (SELECT a.card_code, a.q, b.rn
      FROM my_card_t4 a,
        (SELECT rownum rn
          FROM dual
         CONNECT BY rownum <=100) b
     WHERE instr(';' || a.q, ';', 1, rn) > 0) t
 ORDER BY 1, 2;

实验

CLASS1     CALLDATE         CALLCOUNT
1          2005-08-08       40
1          2005-08-07       6
2          2005-08-08       77
3          2005-08-09       33
3          2005-08-08       9
3          2005-08-07       21

转置后:
CALLDATE     CallCount1 CallCount2 CallCount3
------------ ---------- ---------- ----------
2005-08-09   0          0          33
2005-08-08   40         77         9
2005-08-07  6      0          21

试验如下:
1). 建立测试表和数据
CREATE TABLE t(
    class1 VARCHAR2(2 BYTE),
    calldate DATE,
    callcount INTEGER
);

INSERT INTO t(class1, calldate, callcount)
VALUES ('1', TO_DATE ('08/08/2005', 'MM/DD/YYYY'), 40);

INSERT INTO t(class1, calldate, callcount)
VALUES ('1', TO_DATE ('08/07/2005', 'MM/DD/YYYY'), 6);

INSERT INTO t(class1, calldate, callcount)
VALUES ('2', TO_DATE ('08/08/2005', 'MM/DD/YYYY'), 77);

INSERT INTO t(class1, calldate, callcount)
VALUES ('3', TO_DATE ('08/09/2005', 'MM/DD/YYYY'), 33);

INSERT INTO t(class1, calldate, callcount)
VALUES ('3', TO_DATE ('08/08/2005', 'MM/DD/YYYY'), 9);

INSERT INTO t(class1, calldate, callcount)
VALUES ('3', TO_DATE ('08/07/2005', 'MM/DD/YYYY'), 21); 

COMMIT ; 

1.(1)建立ref cursor准备输出结果集 
CREATE OR REPLACE PACKAGE pkg_getrecord
IS
    TYPE myrctype IS REF CURSOR;
END pkg_getrecord;
/

 建立动态
sql交叉表函数,输出结果集 
CREATE OR REPLACE FUNCTION fn_rs
    RETURN pkg_getrecord.myrctype --返回类型为pkg_getrecord.myrctype类型
IS
    s VARCHAR2 (4000); --申明s变量用于写入sql
    CURSOR c1 IS
 
  SELECT ',sum(case when Class1='--看到‘,’逗号没,这个游标c1将结合下面s一起使用
            || class1
            || ' then CallCount else 0 end)'
            || ' "CallCount'
            || class1
            || '"' c2

    FROM t
    GROUP BY class1;
    r1 c1%ROWTYPE;--申明r1用于存储cursor c1的结果
    list_cursor pkg_getrecord.myrctype;--申明list_cursor 用于存储总结果
BEGIN
    s := 'select CallDate ';
    OPEN c1;--打开 游标
    LOOP
        FETCH c1 INTO r1;
        EXIT WHEN c1%NOTFOUND;
       
 s := s || r1.c2;--意思是select calldate,callcount
    END LOOP;
    CLOSE c1;
    s := s || ' from T group by CallDate order by CallDate desc ';
    OPEN list_cursor FOR s;
    RETURN list_cursor;
END fn_rs;


测试在sql plus下执行:
var results refcursor;
exec :results := fn_rs;
print results;
CALLDATE        CallCount1 CallCount2 CallCount3
--------------- ---------- ---------- ----------
2005-08-09      0          0          33
2005-08-08      40         77         9
2005-08-07      6          0          21

(2)另一种写法

Create or replace function my_concat(n number)

Return varchar2

As

Type typ_cursor is ref cursor;

v_cursor typ_cursor;

v_temp varchar2(20);

v_results varchar2(4000):='';

v_sql varchar2(200);

begin

v_sql := 'select CallCount from t where class1= '|| n ||' order by CallCount';

Open v_cursor for v_sql;

Loop

Fetch v_cursor into v_temp;

Exit when v_cursor%notfound;

v_results := v_results ||','|| v_temp;

End loop;

Return ltrim(v_results,2);

End;

/

Select distinct class1,my_concat(class1) from t group by class1;

----------------------------------------------------------

Plan hash value: 47235625

---------------------------------------------------------------------------

| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |

---------------------------------------------------------------------------

|   0 | SELECT STATEMENT   |      |     6 |    18 |     4  (25)| 00:00:01 |

|   1 |  HASH GROUP BY     |      |     6 |    18 |     4  (25)| 00:00:01 |

|   2 |   TABLE ACCESS FULL| T    |     6 |    18 |     3   (0)| 00:00:01 |

---------------------------------------------------------------------------

Note

-----

   - dynamic sampling used for this statement (level=2)

统计信息

----------------------------------------------------------

         18  recursive calls

          0  db block gets

         28  consistent gets

          0  physical reads

          0  redo size

        560  bytes sent via SQL*Net to client

        416  bytes received via SQL*Net from client

          2  SQL*Net roundtrips to/from client

          3  sorts (memory)

          0  sorts (disk)

          3  rows processed



其实写这么长个函数,就是下面的意思:
2.用聚合函数组合decode
(1)SQL> Select CallDate,
  2  sum(decode(class1,1,CallCount,0)) callcount1,
  3  sum(decode(class1,2,CallCount,0)) callcount2,
  4  sum(decode(class1,3,CallCount,0)) callcount3
  5  from T
  6  group by CallDate
  7  order by CallDate desc
  8  ;

CALLDATE       CALLCOUNT1 CALLCOUNT2 CALLCOUNT3
-------------- ---------- ---------- ----------

09-8月 -05              0          0         33
08-8月 -05             40         77          9
07-8月 -05              6          0         21
----------------------------------------------------------
Plan hash value: 1476560607

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     6 |   150 |     4  (25)| 00:00:01 |
|   1 |  SORT GROUP BY     |      |     6 |   150 |     4  (25)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T    |     6 |   150 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)


统计信息
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          7  consistent gets
          0  physical reads
          0  redo size
        700  bytes sent via SQL*Net to client
        416  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          3  rows processed


 

3.聚合函数组合case when

 
SQL> Select CallDate,sum(case when class1=1 then Callcount else 0 end)callcount,
  2  sum(case when class1=2 then Callcount else 0 end)callcount2,
  3  sum(case when class1=3 then Callcount else 0 end)callcount3
  4  From t
  5  Group by CallDate
  6  Order by
CallDate desc;

CALLDATE       CALLCOUNT1 CALLCOUNT2 CALLCOUNT3
-------------- ---------- ---------- ----------

09-8月 -05              0          0         33
08-8月 -05             40         77          9
07-8月 -05              6          0         21

 ----------------------------------------------------------
Plan hash value: 1476560607

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     6 |   150 |     4  (25)| 00:00:01 |
|   1 |  SORT GROUP BY     |      |     6 |   150 |     4  (25)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T    |     6 |   150 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)


统计信息
----------------------------------------------------------
          4  recursive calls
          0  db block gets
         15  consistent gets
          0  physical reads
          0  redo size
        700  bytes sent via SQL*Net to client
        416  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          3  rows processed
4.11g在SELECT语句中新加了关键词PIVOT和UNPIVOT,用这两个关键词,重写上面的两个查询,就变成这个样子的了:
SQL> Select *
  2  From t
  3  Pivot(max(CallCount) for class1 in(1 as callcount1,2 as callcount2, 3 as ca
llcount3))
  4  Order by CallDate desc
  5  /

CALLDATE       CALLCOUNT1 CALLCOUNT2 CALLCOUNT3
-------------- ---------- ---------- ----------
09-8月 -05                                   33
08-8月 -05             40         77          9
07-8月 -05              6                    21


执行计划
----------------------------------------------------------
Plan hash value: 2492022148

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     3 |    39 |     4  (25)| 00:00:01 |
|   1 |  SORT GROUP BY PIVOT|      |     3 |    39 |     4  (25)| 00:00:01 |
|   2 |   TABLE ACCESS FULL | T    |     6 |    78 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------


统计信息
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          7  consistent gets
          0  physical reads
          0  redo size
        697  bytes sent via SQL*Net to client
        416  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          3  rows processed

 

对比前面四种写法,在这里可以看出直接使用sql的性能是比使用自定义函数的性能是好一些的。

接下来我想做的就是输出每个class1的callcount 

 5.sys_connect_by_path

Select class1,substr(max(sys_connect_by_path(CallCount,',')),2)

From(select class1,CallCount,rn,lead(rn) over(partition by class1 order by rn )rn1

From (select class1,CallCount,row_number() over(order by CallCount desc) rn from t ))

Start with rn1 is null

connect by rn1 = prior rn

Group by class1

Order by class1;

执行计划
----------------------------------------------------------
Plan hash value: 2753391344

--------------------------------------------------------------------------------

-----------------

| Id  | Operation                                | Name | Rows  | Bytes | Cost (

%CPU)| Time     |

--------------------------------------------------------------------------------

-----------------

|   0 | SELECT STATEMENT                         |      |     3 |   123 |     6
 (50)| 00:00:01 |

|   1 |  SORT GROUP BY                           |      |     3 |   123 |     6
 (50)| 00:00:01 |

|*  2 |   CONNECT BY NO FILTERING WITH START-WITH|      |       |       |
     |          |

|   3 |    VIEW                                  |      |     6 |   252 |     5
 (40)| 00:00:01 |

|   4 |     WINDOW SORT                          |      |     6 |   174 |     5
 (40)| 00:00:01 |

|   5 |      VIEW                                |      |     6 |   174 |     4
 (25)| 00:00:01 |

|   6 |       WINDOW SORT                        |      |     6 |    30 |     4
 (25)| 00:00:01 |

|   7 |        TABLE ACCESS FULL                 | T    |     6 |    30 |     3
  (0)| 00:00:01 |

--------------------------------------------------------------------------------

-----------------


Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("RN1"=PRIOR "RN")
       filter("RN1" IS NULL)


统计信息
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          7  consistent gets
          0  physical reads
          0  redo size
        543  bytes sent via SQL*Net to client
        416  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          5  sorts (memory)
          0  sorts (disk)
          3  rows processed

 

 6.wmsys.wm_concat()

Select class1,wmsys.wm_concat(CallCount)

From (select class1,CallCount from t order by class1,CallCount)

Group by class1;

执行计划
----------------------------------------------------------
Plan hash value: 511366736

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |     3 |    45 |     4  (25)| 00:00:01 |
|   1 |  SORT GROUP BY       |      |     3 |    45 |     4  (25)| 00:00:01 |
|   2 |   VIEW               |      |     6 |    90 |     4  (25)| 00:00:01 |
|   3 |    SORT ORDER BY     |      |     6 |    30 |     4  (25)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| T    |     6 |    30 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------


统计信息
----------------------------------------------------------
        504  recursive calls
          0  db block gets
        488  consistent gets
         30  physical reads
          0  redo size
        560  bytes sent via SQL*Net to client
        416  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
         32  sorts (memory)
          0  sorts (disk)
          3  rows processed
可以看出wmsys.wm_concat()消耗这么大!!!
请使用浏览器的分享功能分享到微信等