- 이 문서는 구루비에서 작성하였습니다.
- 이 문서를 다른 블로그나 홈페이지에 게재하실 경우에는 출처를 꼭 밝혀 주시면 고맙겠습니다.~^^
- 출처 : http://wiki.gurubee.net/pages/viewpage.action?pageId=9601028&
- 구루비 지식창고의 모든 문서는 크리에이티브 커먼즈의 저작자표시-비영리-동일조건변경허락(BY-NC-SA) 라이선스에 따라 자유롭게 사용할 수 있습니다.
I. 통계정보 조작
I.1. 개념 설명
- 통계정보를 조작하여 Cardinality를 제어하는 기법
- Cardinality는 Selectivity에 의해 결정되고, Selectivity는 주로 Density에 의해 결정됨
- 따라서, Density를 조작할 수 있으면 Cardinality를 조작할 수 있으며 Histogram 또한 마찬가지임
I.2. 테스트 Object 생성
DROP TABLE T1 PURGE; DROP TABLE T2 PURGE; CREATE TABLE T1(ID1 INT, STATUS1 CHAR(5)); CREATE TABLE T2(ID2 INT); CREATE INDEX T1_IDX ON T1(ID1); CREATE INDEX T2_IDX ON T2(ID2); INSERT INTO T1 SELECT -- column id1 is uniform MOD(R, 1000), -- column status1 is skewed!!! CASE WHEN R BETWEEN 1 AND 9000 THEN '-1' ELSE TO_CHAR(MOD(R, 1000)) END FROM (SELECT LEVEL AS R FROM DUAL CONNECT BY LEVEL <= 10000) ORDER BY DBMS_RANDOM.RANDOM ; INSERT INTO T2 SELECT MOD(R, 100) FROM (SELECT LEVEL AS R FROM DUAL CONNECT BY LEVEL <= 10000) ; BEGIN DBMS_STATS.GATHER_TABLE_STATS(USER, 'T1', CASCADE => TRUE, ESTIMATE_PERCENT => 100, METHOD_OPT => 'FOR ALL COLUMNS SIZE 1', NO_INVALIDATE => FALSE ); END; / BEGIN DBMS_STATS.GATHER_TABLE_STATS(USER, 'T2', CASCADE => TRUE, ESTIMATE_PERCENT => 100, METHOD_OPT => 'FOR ALL COLUMNS SIZE 1', NO_INVALIDATE => FALSE ); END; /
I.3. 통계정보 확인
-- T1 테이블의 컬럼 통계정보 SELECT S.TABLE_NAME, S.COLUMN_NAME, S.NUM_DISTINCT, S.NUM_NULLS, S.DENSITY, S.LOW_VALUE, S.HIGH_VALUE, S.HISTOGRAM FROM USER_TAB_COLS S WHERE S.TABLE_NAME = UPPER('T1') ;
TABLE_NAME | COLUMN_NAME | NUM_DISTINCT | NUM_NULLS | DENSITY | LOW_VALUE | HIGH_VALUE | HISTOGRAM |
---|---|---|---|---|---|---|---|
T1 | STATUS1 | 1001 | 0 | .000999001 | 2D31202020 | 3939392020 | NONE |
T1 | ID1 | 1001 | 0 | .001 | 80 | C20A64 | NONE |
I.4. 테스트
SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*) FROM T1 T1, T2 T2 WHERE T1.ID1 = T2.ID2 AND T1.ID1 BETWEEN 1 AND 100 AND T1.STATUS1 = '-1' ; COUNT(*) ---------- 89100 1 row selected. Elapsed: 00:00:00.12 SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'IOSTATS COST LAST')); ----------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | ----------------------------------------------------------------------------------------------------- | 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.10 | 1221 | | 2 | NESTED LOOPS | | 1 | 101 | 10 (0)| 89100 |00:00:00.18 | 1221 | |* 3 | TABLE ACCESS FULL| T1 | 1 | 1 | 9 (0)| 900 |00:00:00.01 | 31 | |* 4 | INDEX RANGE SCAN | T2_IDX | 900 | 100 | 1 (0)| 89100 |00:00:00.10 | 1190 | ----------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter(("T1"."ID1"<=100 AND "T1"."STATUS1"='-1' AND "T1"."ID1">=1)) 4 - access("T1"."ID1"="T2"."ID2") filter(("T2"."ID2">=1 AND "T2"."ID2"<=100))
- 'T1' 테이블에 -1 값이 9000로우가 있음에도 불구하고 통계정보에서는 이를 모르기 때문에 E-Rows를 1로 계산을 했다.
만약 정확히 9000로우를 알고 있었다면 Hash Join을 사용했을 것이다.
이를 해결하기 위해서는 히스토그램 생성과 통계정보 조작 2가지 방법이 있다.
I.5. 히스토그램 생성
1) 히스토그램 생성 전
SELECT TABLE_NAME, COLUMN_NAME, ENDPOINT_NUMBER, ENDPOINT_VALUE || '(' || ENDPOINT_ACTUAL_VALUE || ')' AS ENDPOINT_VALUE FROM USER_TAB_HISTOGRAMS WHERE TABLE_NAME = UPPER('T1') ORDER BY COLUMN_NAME, ENDPOINT_NUMBER ;
TABLE_NAME | COLUMN_NAME | ENDPOINT_NUMBER | ENDPOINT_VALUE |
---|---|---|---|
T1 | ID1 | 0 | 0() |
T1 | ID1 | 1 | 999() |
T1 | STATUS1 | 0 | 234649741948204000000000000000000000() |
T1 | STATUS1 | 1 | 297121544231514000000000000000000000() |
2) 히스토그램 생성
BEGIN DBMS_STATS.GATHER_TABLE_STATS(USER, 'T1', ESTIMATE_PERCENT => 100, CASCADE => TRUE, METHOD_OPT => 'FOR COLUMNS STATUS1 SIZE SKEWONLY', NO_INVALIDATE => FALSE); END; /
3) 히스토그램 생성 후
SELECT TABLE_NAME, COLUMN_NAME, ENDPOINT_NUMBER, ENDPOINT_VALUE || '(' || ENDPOINT_ACTUAL_VALUE || ')' AS ENDPOINT_VALUE FROM USER_TAB_HISTOGRAMS WHERE TABLE_NAME = UPPER('T1') ORDER BY COLUMN_NAME, ENDPOINT_NUMBER ;
TABLE_NAME | COLUMN_NAME | ENDPOINT_NUMBER | ENDPOINT_VALUE |
---|---|---|---|
T1 | ID1 | 0 | 0() |
T1 | ID1 | 1 | 999() |
T1 | STATUS1 | 228 | 234649741948204000000000000000000000() |
T1 | STATUS1 | 229 | 255400627676803000000000000000000000() |
T1 | STATUS1 | 230 | 255481361174405000000000000000000000() |
T1 | STATUS1 | 231 | 255562173900169000000000000000000000() |
T1 | STATUS1 | 232 | 260612890032291000000000000000000000() |
T1 | STATUS1 | 233 | 260693702758056000000000000000000000() |
T1 | STATUS1 | 234 | 260755025355842000000000000000000000() |
T1 | STATUS1 | 235 | 265825231615942000000000000000000000() |
T1 | STATUS1 | 236 | 265886554213728000000000000000000000() |
T1 | STATUS1 | 237 | 265967366939493000000000000000000000() |
T1 | STATUS1 | 238 | 271018083071615000000000000000000000() |
T1 | STATUS1 | 239 | 271098895797379000000000000000000000() |
T1 | STATUS1 | 240 | 276167834406879000000000000000000000() |
T1 | STATUS1 | 241 | 276230424655266000000000000000000000() |
T1 | STATUS1 | 242 | 276311158152868000000000000000000000() |
T1 | STATUS1 | 243 | 281361953513152000000000000000000000() |
T1 | STATUS1 | 244 | 281442687010754000000000000000000000() |
T1 | STATUS1 | 245 | 281504088836703000000000000000000000() |
T1 | STATUS1 | 246 | 286574215868640000000000000000000000() |
T1 | STATUS1 | 247 | 286653840171967000000000000000000000() |
T1 | STATUS1 | 248 | 286716351192191000000000000000000000() |
T1 | STATUS1 | 249 | 291786557452291000000000000000000000() |
T1 | STATUS1 | 250 | 291847880050077000000000000000000000() |
T1 | STATUS1 | 251 | 291928692775842000000000000000000000() |
T1 | STATUS1 | 252 | 296979408907964000000000000000000000() |
T1 | STATUS1 | 253 | 297060221633728000000000000000000000() |
T1 | STATUS1 | 254 | 297121544231514000000000000000000000() |
4) 플랜 다시 확인
SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*) FROM T1 T1, T2 T2 WHERE T1.ID1 = T2.ID2 AND T1.ID1 BETWEEN 1 AND 100 AND T1.STATUS1 = '-1' ; COUNT(*) ---------- 89100 1 row selected. Elapsed: 00:00:00.07 SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'IOSTATS COST LAST')); --------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | --------------------------------------------------------------------------------------------------- | 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.04 | 62 | |* 2 | HASH JOIN | | 1 | 14779 | 19 (6)| 89100 |00:00:00.01 | 62 | |* 3 | TABLE ACCESS FULL| T1 | 1 | 908 | 9 (0)| 900 |00:00:00.01 | 31 | |* 4 | TABLE ACCESS FULL| T2 | 1 | 9999 | 9 (0)| 9900 |00:00:00.01 | 31 | --------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("T1"."ID1"="T2"."ID2") 3 - filter(("T1"."ID1"<=100 AND "T1"."STATUS1"='-1' AND "T1"."ID1">=1)) 4 - filter(("T2"."ID2">=1 AND "T2"."ID2"<=100))
I.6. 통계정보 변경
1) Density 조작 전
SELECT S.TABLE_NAME, S.COLUMN_NAME, S.NUM_DISTINCT, S.NUM_NULLS, S.DENSITY, S.LOW_VALUE, S.HIGH_VALUE, S.HISTOGRAM FROM USER_TAB_COLS S WHERE S.TABLE_NAME = UPPER('T1') ;
TABLE_NAME | COLUMN_NAME | NUM_DISTINCT | NUM_NULLS | DENSITY | LOW_VALUE | HIGH_VALUE | HISTOGRAM |
---|---|---|---|---|---|---|---|
T1 | STATUS1 | 1001 | 0 | .000999001 | 2D31202020 | 3939392020 | NONE |
T1 | ID1 | 1001 | 0 | .001 | 80 | C20A64 | NONE |
2) Density 조작
BEGIN DBMS_STATS.SET_COLUMN_STATS(OWNNAME => USER, TABNAME => 'T1', COLNAME => 'STATUS1', DENSITY => 1 ); END; /
3) Density 조작 후
SELECT S.TABLE_NAME, S.COLUMN_NAME, S.NUM_DISTINCT, S.NUM_NULLS, S.DENSITY, S.LOW_VALUE, S.HIGH_VALUE, S.HISTOGRAM FROM USER_TAB_COLS S WHERE S.TABLE_NAME = UPPER('T1') ;
TABLE_NAME | COLUMN_NAME | NUM_DISTINCT | NUM_NULLS | DENSITY | LOW_VALUE | HIGH_VALUE | HISTOGRAM |
---|---|---|---|---|---|---|---|
T1 | STATUS1 | 1001 | 0 | 1 | 2D31202020 | 3939392020 | NONE |
T1 | ID1 | 1001 | 0 | .001 | 80 | C20A64 | NONE |
4) 플랜 다시 확인
SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*) FROM T1 T1, T2 T2 WHERE T1.ID1 = T2.ID2 AND T1.ID1 BETWEEN 1 AND 100 AND T1.STATUS1 = '-1' ; COUNT(*) ---------- 89100 1 row selected. Elapsed: 00:00:00.07 SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'IOSTATS COST LAST')); --------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | --------------------------------------------------------------------------------------------------- | 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.04 | 62 | |* 2 | HASH JOIN | | 1 | 15410 | 19 (6)| 89100 |00:00:00.01 | 62 | |* 3 | TABLE ACCESS FULL| T1 | 1 | 1011 | 9 (0)| 900 |00:00:00.01 | 31 | |* 4 | TABLE ACCESS FULL| T2 | 1 | 9999 | 9 (0)| 9900 |00:00:00.01 | 31 | --------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("T1"."ID1"="T2"."ID2") 3 - filter(("T1"."ID1"<=100 AND "T1"."ID1">=1 AND "T1"."STATUS1"='-1')) 4 - filter(("T2"."ID2">=1 AND "T2"."ID2"<=100)) ;
I.7. 통계정보 조작 시 주의할 점
![]() |
|
![]() |
문서정보
- 이 문서는 구루비에서 작성하였습니다.
- 이 문서를 다른 블로그나 홈페이지에 게재하실 경우에는 출처를 꼭 밝혀 주시면 고맙겠습니다.~^^
- 출처 : http://wiki.gurubee.net/pages/viewpage.action?pageId=9601028&
- 구루비 지식창고의 모든 문서는 크리에이티브 커먼즈의 저작자표시-비영리-동일조건변경허락(BY-NC-SA) 라이선스에 따라 자유롭게 사용할 수 있습니다.