About Clusters
-
Disk I/O is reduced and access time improves for joins of clustered tables.
-
The cluster key is the column, or group of columns, that the clustered tables have in common. You specify the columns of the cluster key when creating the cluster. You subsequently specify the same columns when creating every table added to the cluster. Each cluster key value is stored only once each in the cluster and the cluster index, no matter how many rows of different tables contain the value.
Therefore, less storage might be required to store related table and index data in a cluster than is necessary in non-clustered table format. For example, in Figure 18-1, notice how each cluster key (each deptno) is stored just once for many rows that contain the same value in both the emp and dept tables.
Connected.
11:47:08 SCOTT@ test1 >select e.ename,e.sal,e.deptno,d.dname,d.loc from emp e ,dept d
11:48:21 2 where e.deptno=d.deptno and d.deptno=10;
ENAME SAL DEPTNO DNAME LOC
---------- ---------- ---------- -------------- -------------
CLARK 2450 10 ACCOUNTING NEW YORK
KING 5000 10 ACCOUNTING NEW YORK
MILLER 1300 10 ACCOUNTING NEW YORK
执行计划:
Execution Plan
----------------------------------------------------------
Plan hash value: 568005898
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 99 | 4 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 3 | 99 | 4 (0)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 20 | 1 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
|* 4 | TABLE ACCESS FULL | EMP | 3 | 39 | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("D"."DEPTNO"=10)
4 - filter("E"."DEPTNO"=10)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
10 consistent gets
0 physical reads
0 redo size
766 bytes sent via SQL*Net to client
419 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed
查看emp和dept表所在的块:
17:46:11 SCOTT@ test1 >select dbms_rowid.ROWID_OBJECT(rowid) "OBJ",DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) "BLOCK_ID" FROM EMP WHERE ROWNUM=1;
OBJ BLOCK_ID
---------- ----------
17571 149
Elapsed: 00:00:00.01
17:46:40 SCOTT@ test1 >select dbms_rowid.ROWID_OBJECT(rowid) "OBJ",DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) "BLOCK_ID" FROM DEPT WHERE ROWNUM=1;
OBJ BLOCK_ID
---------- ----------
17569 133
对于emp和dept表,属于不同的object,数据存储在不同的数据块上。
create cluster dept_emp_clu (deptno number(3))
pctfree 20 pctused 60
size 500 tablespace users;
Size选项
:是用来告诉Oracle预计有500字节数据和每个cluser key相关。Oracle将使用这个信息来计算每个block能容纳的最大cluster key数目。因此size太高,在每一block将得到很少的key,并且将使用比需要的更多的空间;设置容量太低,将得到过多的数据连接,这将偏离使用cluster的目的。Size是cluster的重要参数。
---- 建立簇表
create table department(
id number(3) primary key,
dname varchar(14) ,loc varchar2(13))
cluster dept_emp_clu(id);
create table employee(
eno number(4) primary key ,
ename varchar2(10),
job varchar2(9),
mgr number(4),
hiredate date,
sal number(7,2),
comm number(7,2),
dept_id number(3) references department
) cluster dept_emp_clu(dept_id);
-----在簇上建立索引
create index dept_emp_idx on cluster dept_emp_clu
tablespace indx;
11:49:43 SCOTT@ test1 >analyze table department compute statistics;
Table analyzed.
11:50:15 SCOTT@ test1 >analyze table employee compute statistics;
Table analyzed.
11:50:31 SCOTT@ test1 >select e.ename,e.sal,e.DEPT_ID,d.dname,d.loc from employee e,department d
11:50:39 2 where e.dept_id=d.id and d.id=10;
ENAME SAL DEPT_ID DNAME LOC
---------- ---------- ---------- -------------- -------------
CLARK 2450 10 ACCOUNTING NEW YORK
KING 5000 10 ACCOUNTING NEW YORK
MILLER 1300 10 ACCOUNTING NEW YORK
执行计划:
Execution Plan
----------------------------------------------------------
Plan hash value: 2165989181
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 140 | 2 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 5 | 140 | 2 (0)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| DEPARTMENT | 1 | 18 | 1 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN | SYS_C005404 | 1 | | 0 (0)| 00:00:01 |
| 4 | TABLE ACCESS CLUSTER | EMPLOYEE | 5 | 50 | 1 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | DEPT_EMP_IDX | 1 | | 0 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("D"."ID"=10)
5 - access("E"."DEPT_ID"=10)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
5 consistent gets
0 physical reads
0 redo size
767 bytes sent via SQL*Net to client
419 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed
对于Cluster table连接访问,可以看到“5 consistent gets“,比普通表”10 consistent gets“减少了一半的一致性读。
查询簇表所在的块:
10:16:53 SYS@ test1>conn scott/tiger
Connected.
10:19:04 SCOTT@ test1>select rowid,ename from employee where rownum =1;
ROWID ENAME
------------------ ----------
AAACiCAAEAAAACEAAA SMITH
10:19:05 SCOTT@ test1>select rowid,dname from department where rownum=1;
ROWID DNAME
------------------ --------------
AAACiCAAEAAAACEAAA ACCOUNTING
17:49:17 SCOTT@ test1 >select dbms_rowid.ROWID_OBJECT('AAACiCAAEAAAACEAAA') "OBJ",DBMS_ROWID.ROWID_BLOCK_NUMBER('AAACiCAAEAAAACEAAA') "BLOCK_ID" FROM DUAL;
OBJ BLOCK_ID
---------- ----------
10370 132
------因为簇表的数据放在相同的块上,所以在表连接查询时,减少了I/O
1、创建散列簇
create cluster my_clu_two(empno number(10) )
pctused 70
pctfree 10
tablespace users
hash is empno
hashkeys 150 ;
说明:
* hash is 子句指明了进行散列的列,如果列是唯一的标示行,就可以将列指定为散列值
* hashkeys 指定和限制散列函数可以产生的唯一的散列值的数量
2、创建散列表
create table t2_emp (
empno number ( 10 ),
ename varchar2 ( 20 ),
birth_date date ,
deptno number )
cluster my_clu_two(empno);
注意:
* 必须设置数值的精度(具体原因不详)
* 散列簇不能也不用创建索引
* 散列簇不能ALTER:size、hashkeys、hash is参数
card_no varchar2(16),
transdate date sort)
hashkeys 10000
hash is ora_hash(card_no)
size 256;
Cluster created.
10:55:25 SCOTT@ test1 >create table credit_orders
(card_no varchar2(16),
transdate date,
amount number)
cluster credit_cluster(card_no,transdate);
Table created.
10:56:10 SCOTT@ test1 >select * from tab;
TNAME TABTYPE CLUSTERID
------------------------------ ------- ----------
BONUS TABLE
CREDIT_CLUSTER CLUSTER
CREDIT_ORDERS TABLE 1
10:56:15 SCOTT@ test1 >alter session set nls_date_format = "YYYYMMDDHH24MISS";
Session altered.
10:56:33 SCOTT@ test1 >insert into credit_orders (card_no,transdate,amount) values ('4111111111111111','20050131000123',57.99);
1 row created.
10:57:24 SCOTT@ test1 >insert into credit_orders (card_no,transdate,amount) values ('4111111111111111','20050130071216',16.59);
1 row created.
10:57:33 SCOTT@ test1 >insert into credit_orders (card_no,transdate,amount) values ('4111111111111111','20050131111111',39.00);
1 row created.
10:57:45 SCOTT@ test1 >insert into credit_orders (card_no,transdate,amount) values ('4111111111111111','20050130081001',25.16);
1 row created.
11:12:24 SCOTT@ test1 >insert into credit_orders (card_no,transdate,amount) values ('5111111111111111','20050131000123',57.99);
1 row created.
11:16:16 SCOTT@ test1 >insert into credit_orders (card_no,transdate,amount) values ('5111111111111111','20050130071216',16.59);
1 row created.
11:16:36 SCOTT@ test1 >insert into credit_orders (card_no,transdate,amount) values('5111111111111111','20050131111111',39.00);
1 row created.
11:16:57 SCOTT@ test1 >commit;
Commit complete.
11:17:02 SCOTT@ test1 >select * from credit_orders;
CARD_NO TRANSDATE AMOUNT
---------------- -------------- ----------
4111111111111111 20050131000123 57.99
4111111111111111 20050130071216 16.59
4111111111111111 20050131111111 39
4111111111111111 20050130081001 25.16
5111111111111111 20050131000123 57.99
5111111111111111 20050130071216 16.59
5111111111111111 20050131111111 39
7 rows selected.
10:58:14 SCOTT@ test1 >SET AUTOTRACE ON
10:59:53 SCOTT@ test1 >select * from credit_orders where card_no=4111111111111111;
CARD_NO TRANSDATE AMOUNT
---------------- -------------- ----------
4111111111111111 20050131000123 57.99
4111111111111111 20050130071216 16.59
4111111111111111 20050131111111 39
4111111111111111 20050130081001 25.16
Execution Plan
----------------------------------------------------------
Plan hash value: 1026124641
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 32 | 87 (0)| 00:00:02 |
|* 1 | TABLE ACCESS FULL| CREDIT_ORDERS | 1 | 32 | 87 (0)| 00:00:02 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TO_NUMBER("CARD_NO")=4111111111111111)
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
27 recursive calls
0 db block gets
418 consistent gets
0 physical reads
0 redo size
667 bytes sent via SQL*Net to client
415 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
4 rows processed
11:00:32 SCOTT@ test1 >analyze
11:01:10 2 table credit_orders compute statistics;
Table analyzed.
11:01:30 SCOTT@ test1 >select * from credit_orders where card_no=4111111111111111;
CARD_NO TRANSDATE AMOUNT
---------------- -------------- ----------
4111111111111111 20050131000123 57.99
4111111111111111 20050130071216 16.59
4111111111111111 20050131111111 39
4111111111111111 20050130081001 25.16
Elapsed: 00:00:00.02
Execution Plan
----------------------------------------------------------
Plan hash value: 1026124641
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 72 | 87 (0)| 00:00:02 |
|* 1 | TABLE ACCESS FULL| CREDIT_ORDERS | 3 | 72 | 87 (0)| 00:00:02 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TO_NUMBER("CARD_NO")=4111111111111111)
Statistics
----------------------------------------------------------
12 recursive calls
0 db block gets
352 consistent gets
0 physical reads
0 redo size
667 bytes sent via SQL*Net to client
415 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
4 rows processed