View Source

h1. 06 스칼라 서브쿼리를 이용한 조인
h3. (1) 스칼라 서브쿼리
* 스칼라 서브쿼리 : 함수처럼 한 레코드당 정확히 하나의 값만을 리턴하는 서브쿼리를 '스칼라 서브쿼리'라고 한다.(select-list)
* 스칼라 서브쿼리의 테이블은 NL 조인에서 Inner 테이블의 역할을 수행하고 Driving 테이블로부터 조인 조건을 상수로 제공 받는다.
||스칼라 서브 쿼리||
|{code:sql}
SQL> select /*+ gather_plan_statistics */ empno, ename, sal, hiredate
2 , (select d.dname from dept d where d.deptno = e.deptno ) dname
3 from emp e
4 where sal >= 2000;
SQL> select * from table( dbms_xplan.display_cursor( null, null, 'ALLSTATS LAST'));
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------------
| 1 | TABLE ACCESS BY INDEX ROWID| DEPT | 3 | 1 | 3 |00:00:00.01 | 5 |

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
|* 2 | INDEX UNIQUE SCAN | PK_DEPT | 3 | 1 | 3 |00:00:00.01 | 2 |
|* 3 | TABLE ACCESS FULL | EMP | 1 | 11 | 6 |00:00:00.01 | 8 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("D"."DEPTNO"=:B1)
3 - filter("SAL">=2000)
{code}|
||Outer Join||
|{code:sql}
select /*+ gather_plan_statistics ordered use_nl(d)*/
e.empno, e.ename, e.sal, e.hiredate, d.dname
from emp e, dept d
where d.deptno(+) = e.deptno
and e.sal >= 2000;
SQL> select * from table( dbms_xplan.display_cursor( null, null, 'ALLSTATS LAST'));
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
| 1 | NESTED LOOPS OUTER | | 1 | 11 | 6 |00:00:00.01 | 16 |
|* 2 | TABLE ACCESS FULL | EMP | 1 | 11 | 6 |00:00:00.01 | 8 |
| 3 | TABLE ACCESS BY INDEX ROWID| DEPT | 6 | 1 | 6 |00:00:00.01 | 8 |
|* 4 | INDEX UNIQUE SCAN | PK_DEPT | 6 | 1 | 6 |00:00:00.01 | 2 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("E"."SAL">=2000)
4 - access("D"."DEPTNO"="E"."DEPTNO")
{code}
* 위에서 예시한 쿼리는 결과만 같은 것이 아니라 조인을 수행하는 처리 경로도 동일한데, NL 방식으로 수행되도록 힌트를 사용했기 때문임

h3. (2) 스칼라 서브쿼리의 캐싱 효과
* 오라클은 스칼라 서브 쿼리 수행 회수를 최소화 하려고 그 입력 값과 출력 값을 내부 캐시(Query Execution Cache)에 저장해 둔다.
* 스칼라 서브쿼리가 수행될 때면 일단 '입력 값'을 캐시에서 찾아 보고 거기 있으면 저장된 '출력 값'을 리턴한다.
* 캐시에서 찾지 못할 때만 쿼리를 수행하며, 결과는 버리지 않고 캐시에 저장해둔다.
||스칼라 서브 쿼리 케시 알고리즘||
|{code:sql}
select empno, ename, sal, hiredate
, (select d.dname -> 출력값 : d.dname
from dept d
where d.deptno = e.empno -> 입력값 : e.empno
)
from emp e
where sal >= 2000
{code}
||TEST 준비 스크립트||
|{CODE:SQL}
SQL> CREATE TABLE T_CODE AS
2 SELECT ROWNUM CODE, CHR(ROWNUM + 64) NAME FROM DUAL
3 CONNECT BY LEVEL <= 1000

SQL> CREATE INDEX IDX_T_CODE_PK ON T_CODE(CODE,NAME);

SQL> ALTER TABLE T_CODE ADD CONSTRAINT IDX_T_CODE_PK PRIMARY KEY(CODE);

SQL> CREATE TABLE T AS
2 SELECT ROWNUM NO, CEIL(ROWNUM/1000) CODE FROM DUAL
3 CONNECT BY LEVEL <= 1000000;

SQL> CREATE INDEX IDX_T_01 ON T( CODE);

SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T');

SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T_CODE');

CREATE or REPLACE FUNCTION GET_NAME(IN_CODE INT)
RETURN VARCHAR
IS
RET_NAME T_CODE.NAME%TYPE;
BEGIN
SELECT NAME
INTO RET_NAME
FROM T_CODE WHERE CODE = IN_CODE;
RETURN RET_NAME;
END;
/
{CODE}|
||일반 펑션||
|{CODE:SQL}
SQL> SELECT /*+ gather_plan_statistics*/COUNT(T)
2 FROM (
3 SELECT GET_NAME(CODE) AS T
4 FROM T)
5 ;

COUNT(T)
----------
1000000

SQL> select * from table( dbms_xplan.display_cursor( null, null, 'ALLSTATS LAST'));

-------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:17.42 | 2003K|

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
| 2 | TABLE ACCESS FULL| T | 1 | 994K| 1000K|00:00:00.01 | 1925 |
-------------------------------------------------------------------------------------
{CODE}|
||스칼라 서브 쿼리 사용한 펑션||
|{CODE:SQL}
SQL> SELECT /*+ gather_plan_statistics*/COUNT(T)
2 FROM (
3 SELECT (SELECT GET_NAME(CODE) FROM DUAL) AS T
4 FROM T)
5 ;

COUNT(T)
----------
1000000

SQL> select * from table( dbms_xplan.display_cursor( null, null, 'ALLSTATS LAST'));

-------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------
| 1 | FAST DUAL | | 1000 | 1 | 1000 |00:00:00.01 | 0 |

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------
| 2 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.18 | 3927 |
| 3 | TABLE ACCESS FULL| T | 1 | 994K| 1000K|00:00:00.01 | 1925 |
-------------------------------------------------------------------------------------

|{CODE}
* 입력값과 출력 값을 빠르게 저장하고 찾기 위해 오라클은 해싱 알고리즘을 사용한다.
* '비용기반의 오라클 원리'저자 조나단 루이스 설명에 의하면 8i, 9i에서는 256개 엔트리를 캐싱하고
* 10에서는 입력과 출력값 크기,_query_execution_cach_max_size파라미터에 의해 캐시 사이즈가 결정된다고 한다.
* 해시 충돌이 발생했을 때 기존 엔트리를 밀어내고 새로 수행한 입력 값과 출력 값으로 대체할 것 같지만,
오라클은 기존 캐시 엔트리를 그대로 둔 채 스칼라 서브 쿼리만 한 번 더 수행하고 만다.따라서 해시 충돌이 발생한 입력 값이 반복적으로
입력되면 스칼라 서브쿼리도 반복 수행된다.
||결론 : 스칼라 서브쿼리의 캐싱 효과는 입력 값의 종류가 소수여서 해시 충돌 가능성이 적은 때라야 효과가 있으며
, 반대의 경우라며 캐시를 확인하는 비용 때문에 오히려 성능은 저하되고 CPU 사용률만 높게 만든다.
게다가 스칼라 서브쿼리를 사용하면 NL조인에서 Inner 쪽인덱스와 테이블에 나타나는 버퍼 Pinning 효과도 사라진다는 사실을 기억할 필요가 있다.||

h3. (3) 두 개 이상의 값을 리턴하고 싶을 때
||'CHICAGO'인 부서만 대상으로 급여 수준을 집계하려는 것인데, 사원 테이블 전체를 다 읽어야 하는 비효율 발생이라는데...(어라 변형이...-_-)||
|{CODE:SQL}
SQL> select /*+gather_plan_statistics*/d.deptno, d.dname, avg_sal, min_sal, max_sal
2 from dept d
3 ,(select deptno, avg(sal) avg_sal, min(sal) min_sal, max(sal) max_sal
4 from emp group by deptno) e
5 where e.deptno(+) = d.deptno
6 and d.loc = 'CHICAGO';

DEPTNO DNAME AVG_SAL MIN_SAL MAX_SAL
---------- -------------- ---------- ---------- ----------
30 SALES 1566.66667 950 2850

SQL> select * from table( dbms_xplan.display_cursor( null, null, 'ALLSTATS LAST'));

----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Us
----------------------------------------------------------------------------------------------------

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
| 1 | HASH GROUP BY | | 1 | 1 | 1 |00:00:00.01 | 3 | | | |
| 2 | MERGE JOIN OUTER | | 1 | 5 | 6 |00:00:00.01 | 3 | | | |
|* 3 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 1 | 1 |00:00:00.01 | 2 | | | |
| 4 | INDEX FULL SCAN | PK_DEPT | 1 | 4 | 4 |00:00:00.01 | 1 | | | |
|* 5 | SORT JOIN | | 1 | 14 | 6 |00:00:00.01 | 1 | 2048 | 2048 | 2048 (0)|
| 6 | INDEX FULL SCAN | EMP_X01 | 1 | 14 | 14 |00:00:00.01 | 1 | | | |
----------------------------------------------------------------------------------------------------

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
3 - filter("D"."LOC"='CHICAGO')
5 - access("DEPTNO"="D"."DEPTNO")
filter("DEPTNO"="D"."DEPTNO")

{CODE}|
||(잠시 딴애기)위 쿼리가 변형 일어나 아래처럼 됨.. -_- 비효율 발생이 없는듯.. 근데 왜 머지로 풀리지.?? -_- 할것없다구했는데.. ㅠ||
|{CODE:SQL}
SQL> SELECT /*+gather_plan_statistics */D.deptno, D.DNAME, avg(sal) avg_sal, min(sal) min_sal, max(sal) max_sal
2 FROM EMP E
3 , DEPT D
4 WHERE E.DEPTNO(+) = D.DEPTNO
5 AND D.LOC = 'CHICAGO'
6 GROUP BY D.deptno, D.DNAME;

DEPTNO DNAME AVG_SAL MIN_SAL MAX_SAL
---------- -------------- ---------- ---------- ----------
30 SALES 1566.66667 950 2850

SQL> select * from table( dbms_xplan.display_cursor( null, null, 'ALLSTATS LAST'));

----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Us
----------------------------------------------------------------------------------------------------
| 1 | HASH GROUP BY | | 1 | 1 | 1 |00:00:00.01 | 3 | | | |

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
| 2 | MERGE JOIN OUTER | | 1 | 5 | 6 |00:00:00.01 | 3 | | | |
|* 3 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 1 | 1 |00:00:00.01 | 2 | | | |
| 4 | INDEX FULL SCAN | PK_DEPT | 1 | 4 | 4 |00:00:00.01 | 1 | | | |
|* 5 | SORT JOIN | | 1 | 14 | 6 |00:00:00.01 | 1 | 2048 | 2048 | 2048 (0)|
| 6 | INDEX FULL SCAN | EMP_X01 | 1 | 14 | 14 |00:00:00.01 | 1 | | | |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter("D"."LOC"='CHICAGO')

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
5 - access("E"."DEPTNO"="D"."DEPTNO")
filter("E"."DEPTNO"="D"."DEPTNO")

|{CODE}
||(잠시 딴애기)NO_MERGE 1 ||
|{CODE:SQL}
SQL> select /*+gather_plan_statistics LEADING(E) INDEX(E PK_EMP) USE_HASH(E) NO_MERGE(E)*/
2 d.deptno, d.dname, avg_sal, min_sal, max_sal

3 from dept d

4 ,(select deptno, avg(sal) avg_sal, min(sal) min_sal, max(sal) max_sal
5 from emp group by deptno) e
6 where e.deptno(+) = d.deptno
7 and d.loc = 'CHICAGO';

DEPTNO DNAME AVG_SAL MIN_SAL MAX_SAL
---------- -------------- ---------- ---------- ----------
30 SALES 1566.66667 950 2850

SQL> select * from table( dbms_xplan.display_cursor( null, null, 'ALLSTATS LAST'));

----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | U
----------------------------------------------------------------------------------------------------

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
|* 1 | HASH JOIN OUTER | | 1 | 1 | 1 |00:00:00.01 | 9 | 921K| 921K| 291K (0)|
|* 2 | TABLE ACCESS FULL | DEPT | 1 | 1 | 1 |00:00:00.01 | 7 | | | |
| 3 | VIEW | | 1 | 3 | 3 |00:00:00.01 | 2 | | | |
| 4 | HASH GROUP BY | | 1 | 3 | 3 |00:00:00.01 | 2 | | | |
| 5 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 14 | 14 |00:00:00.01 | 2 | | |
| 6 | INDEX FULL SCAN | PK_EMP | 1 | 14 | 14 |00:00:00.01 | 1 | | |
----------------------------------------------------------------------------------------------------

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
1 - access("E"."DEPTNO"="D"."DEPTNO")
2 - filter("D"."LOC"='CHICAGO')
|{CODE}
||(잠시 딴애기)NO_MERGE 2 ||
|{CODE:SQL}
SQL> select /*+gather_plan_statistics no_merge(e)*/d.deptno, d.dname, avg_sal, min_sal, max_sal
2 from dept d
3 ,(select deptno, avg(sal) avg_sal, min(sal) min_sal, max(sal) max_sal
4 from emp group by deptno) e
5 where e.deptno(+) = d.deptno
6 and d.loc = 'CHICAGO';

DEPTNO DNAME AVG_SAL MIN_SAL MAX_SAL
---------- -------------- ---------- ---------- ----------
30 SALES 1566.66667 950 2850

SQL> select * from table( dbms_xplan.display_cursor( null, null, 'ALLSTATS LAST'));

----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem
----------------------------------------------------------------------------------------------------

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
| 1 | MERGE JOIN OUTER | | 1 | 1 | 1 |00:00:00.08 | 11 | | | |
|* 2 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 1 | 1 |00:00:00.04 | 4 | | |
| 3 | INDEX FULL SCAN | PK_DEPT | 1 | 4 | 4 |00:00:00.03 | 2 | | | |
|* 4 | SORT JOIN | | 1 | 3 | 1 |00:00:00.04 | 7 | 2048 | 2048 | 20
| 5 | VIEW | | 1 | 3 | 3 |00:00:00.04 | 7 | | | |
| 6 | HASH GROUP BY | | 1 | 3 | 3 |00:00:00.04 | 7 | | | |
| 7 | TABLE ACCESS FULL | EMP | 1 | 14 | 14 |00:00:00.02 | 7 | |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------

2 - filter("D"."LOC"='CHICAGO')
4 - access("E"."DEPTNO"="D"."DEPTNO")
filter("E"."DEPTNO"="D"."DEPTNO")
|{CODE}
||스칼라 서브 쿼리는 한 레코드당 하나의 값만 리턴함(아래 쿼리 불가능(X))||
|{CODE:SQL}
select d.deptno, d.dname
,(select avg(sal), min(sal), max(sal) from emp where deptno = d.deptno)
from dept d
where d.loc = 'CHICAGO';
|{CODE}
|| 반복적 수행 (비효율)||
|{CODE:SQL}
SQL> select /*+gather_plan_statistics*/d.deptno, d.dname
2 ,(select avg(sal) from emp where deptno = d.deptno) avg_sal
3 ,(select min(sal) from emp where deptno = d.deptno) min_sal
4 ,(select max(sal) from emp where deptno = d.deptno) max_sal
5 from dept d
6 where d.loc = 'CHICAGO';

DEPTNO DNAME AVG_SAL MIN_SAL MAX_SAL
---------- -------------- ---------- ---------- ----------
30 SALES 1566.66667 950 2850

SQL> select * from table( dbms_xplan.display_cursor( null, null, 'ALLSTATS LAST'));

---------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 1 |
|* 2 | INDEX RANGE SCAN| EMP_X01 | 1 | 5 | 6 |00:00:00.01 | 1 |
| 3 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 1 |
|* 4 | INDEX RANGE SCAN| EMP_X01 | 1 | 5 | 6 |00:00:00.01 | 1 |
| 5 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 1 |
|* 6 | INDEX RANGE SCAN| EMP_X01 | 1 | 5 | 6 |00:00:00.01 | 1 |
|* 7 | TABLE ACCESS FULL| DEPT | 1 | 1 | 1 |00:00:00.01 | 8 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
---------------------------------------------------

2 - access("DEPTNO"=:B1)
4 - access("DEPTNO"=:B1)
6 - access("DEPTNO"=:B1)
7 - filter("D"."LOC"='CHICAGO')
|{CODE}
||substr함수 사용(오~ XPLAN 두 되는군요 =_=)||
|{code:sql}
SQL> SELECT /*+ GATHER_PLAN_STATISTICS */
2 DEPTNO
3 , DNAME
4 , TO_NUMBER(SUBSTR(SAL, 1, 7)) AVG_SAL
5 , TO_NUMBER(SUBSTR(SAL, 8, 7)) MIN_SAL
6 , TO_NUMBER(SUBSTR(SAL, 15)) MAX_SAL
7 FROM (SELECT D.DEPTNO, D.DNAME
8 , (SELECT LPAD(AVG(SAL), 7)||LPAD(MIN(SAL),7)||MAX(SAL)
9 FROM EMP
10 WHERE DEPTNO = D.DEPTNO) SAL
11 FROM DEPT D
12 WHERE D.LOC = 'CHICAGO'
13 )
14 ;

DEPTNO DNAME AVG_SAL MIN_SAL MAX_SAL
---------- -------------- ---------- ---------- ----------
30 SALES 1566.66 950 2850

SQL> select * from table( dbms_xplan.display_cursor( null, null, 'ALLSTATS LAST'));

---------------------------------------------------------------------------------------

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 1 |
|* 2 | INDEX RANGE SCAN| EMP_X01 | 1 | 5 | 6 |00:00:00.01 | 1 |
|* 3 | TABLE ACCESS FULL| DEPT | 1 | 1 | 1 |00:00:00.01 | 8 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("DEPTNO"=:B1)

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
3 - filter("D"."LOC"='CHICAGO')
|{code}
||오브젝트 TYPE을 사용하는 방식||
|{CODE:SQL}
SQL> CREATE OR REPLACE TYPE SAL_TYPE AS OBJECT
2 (AVG_SAL NUMBER, MIN_SAL NUMBER, MAX_SAL NUMBER)
3 /

유형이 생성되었습니다.

SQL> SELECT /*+ GATHER_PLAN_STATISTICS */DEPTNO, DNAME
2 , A.SAL.AVG_SAL, A.SAL.MIN_SAL, A.SAL.MAX_SAL
3 FROM (SELECT D.DEPTNO, D.DNAME
4 , (SELECT SAL_TYPE(AVG(SAL), MIN(SAL), MAX(SAL) )
5 FROM EMP WHERE DEPTNO = D.DEPTNO) SAL
6 FROM DEPT D WHERE D.LOC = 'CHICAGO')A
7 ;

DEPTNO DNAME SAL.AVG_SAL SAL.MIN_SAL SAL.MAX_SAL
---------- -------------- ----------- ----------- -----------
30 SALES 1566.66667 950 2850

SQL> select * from table( dbms_xplan.display_cursor( null, null, 'ALLSTATS LAST'));

---------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 1 |
|* 2 | INDEX RANGE SCAN| EMP_X01 | 1 | 5 | 6 |00:00:00.01 | 1 |
|* 3 | TABLE ACCESS FULL| DEPT | 1 | 1 | 1 |00:00:00.01 | 8 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("DEPTNO"=:B1)
3 - filter("D"."LOC"='CHICAGO')
|{CODE}