View Source

h1. Anti Join

* 역조인
* Not In, Not Exists Operation

{section}
{column:width=50}
{code:sql|title=Not In & Filter}
SQL> SELECT /*+ gather_plan_statistics */
2 t1.c1, t1.c2
3 FROM t1
4 WHERE t1.c1 NOT IN (SELECT t2.c1 FROM t2)
5 ;
1001 dummy
1002 dummy
...
9769 dummy
9770 dummy

9000 개의 행이 선택되었습니다.

SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor(null,null,'allstats cost last'));
SQL_ID 73szqa0z3h6tq, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */ t1.c1, t1.c2 FROM t1 WHERE t1.c1 NOT IN
(SELECT t2.c1 FROM t2)

Plan hash value: 895956251

--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------
|* 1 | FILTER | | 1 | | | 9000 |00:00:00.82 | 68908 |
| 2 | TABLE ACCESS FULL| T1 | 1 | 10000 | 11 (0)| 10000 |00:00:00.01 | 638 |
|* 3 | TABLE ACCESS FULL| T2 | 10000 | 1 | 3 (0)| 1000 |00:00:00.83 | 68270 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter( IS NULL)
3 - filter(LNNVL("T2"."C1"<>:B1))


21 개의 행이 선택되었습니다.

{code}
{column}
{column:width=50}
{column}
{section}

* Subquery Unnesting 이 이루어지지 않았다. Why?
** Anti Join 은 Null 값이 존재 하지 않는다는 것이 보장될 때만 동작한다.
** Filter 로 동작하면서 9000 Row 읽는데 68908 Block I/O 가 발생했다.
** 따라서 IS NOT NULL 조건을 추가하던가, 컬럼속성을 NOT NULL 로 변경하면 된다.

{section}
{column:width=50}
{code:sql|title=Not In & Is Not Null & Anti Join}
SQL> -- Not In & Is Not Null & Anti Join --
SQL> SELECT /*+ gather_plan_statistics */
2 t1.c1, t1.c2
3 FROM t1
4 WHERE t1.c1 NOT IN (SELECT t2.c1 FROM t2 WHERE t2.c1 IS NOT NULL)
5 AND t1.c1 IS NOT NULL
6 ;
1001 dummy
1002 dummy
...
9769 dummy
9770 dummy

9000 개의 행이 선택되었습니다.

SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor(null,null,'allstats cost last'));
SQL_ID 2j5s2ad2pwbzb, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */ t1.c1, t1.c2 FROM t1 WHERE t1.c1 NOT IN (SELECT t2.c1 FROM t2 WHERE t2.c1
IS NOT NULL) AND t1.c1 IS NOT NULL

Plan hash value: 667136805

---------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------
|* 1 | HASH JOIN RIGHT ANTI | | 1 | 9001 | 14 (8)| 9000 |00:00:00.01 | 645 | 1066K| 1066K| 1186K (0)|
|* 2 | INDEX FAST FULL SCAN| T2_N1 | 1 | 1000 | 2 (0)| 1000 |00:00:00.01 | 7 | | | |
|* 3 | TABLE ACCESS FULL | T1 | 1 | 10000 | 11 (0)| 10000 |00:00:00.01 | 638 | | | |
---------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("T1"."C1"="T2"."C1")
2 - filter("T2"."C1" IS NOT NULL)
3 - filter("T1"."C1" IS NOT NULL)


22 개의 행이 선택되었습니다.

{code}
{column}
{column:width=50}
{column}
{section}

* IS NOT NULL 조건 추가에 따른 효과.
** Subquery Unnesting 이 이루어졌다.
** HASH JOIN RIGHT ANTI 이 되면서 일량이 645 Block로 현격하게 개선되었다.
** 테이블 스캔 대신 인덱스 스캔을 사용한다.
** 따라서 Subquery Unnesting 이 되지 않더라도 IS NOT NULL 조건을 명확하게 추가하는것이 유리하다.

{section}
{column:width=50}
{code:sql|title=Not In & Is Not Null & No_Unnest}
SQL> -- Not In & Is Not Null & No_Unnest --
SQL> SELECT /*+ gather_plan_statistics */
2 t1.c1, t1.c2
3 FROM t1
4 WHERE t1.c1 NOT IN (SELECT /*+ No_Unnest */ t2.c1 FROM t2 WHERE t2.c1 IS NOT NULL)
5 AND t1.c1 IS NOT NULL
6 ;
1001 dummy
1002 dummy
...
9769 dummy
9770 dummy

9000 개의 행이 선택되었습니다.

SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor(null,null,'allstats cost last'));
SQL_ID 7jdn0xgkm8gzm, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */ t1.c1, t1.c2 FROM t1 WHERE t1.c1 NOT IN
(SELECT /*+ No_Unnest */ t2.c1 FROM t2 WHERE t2.c1 IS NOT NULL) AND t1.c1 IS NOT NULL

Plan hash value: 2243065295

---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
|* 1 | FILTER | | 1 | | | 9000 |00:00:00.05 | 20638 |
|* 2 | TABLE ACCESS FULL| T1 | 1 | 10000 | 11 (0)| 10000 |00:00:00.01 | 638 |
|* 3 | INDEX RANGE SCAN | T2_N1 | 10000 | 1 | 1 (0)| 1000 |00:00:00.04 | 20000 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter( IS NULL)
2 - filter("T1"."C1" IS NOT NULL)
3 - access("T2"."C1"=:B1)
filter("T2"."C1" IS NOT NULL)


23 개의 행이 선택되었습니다.

{code}
{column}
{column:width=50}
{column}
{section}
* IS NOT NULL 조건 추가에 따른 효과.
** Subquery Unnesting 이 되지 않더라도 인덱스 스캔만으로도 일량이 20638로 68908에 비해 큰 개선효과가 있다.

{section}
{column:width=50}
{code:sql|title=Not In & 11G }
SQL> -- Not In & 11G --
SQL> SELECT /*+ gather_plan_statistics */
2 t1.c1, t1.c2
3 FROM t1
4 WHERE t1.c1 NOT IN (SELECT t2.c1 FROM t2)
5 ;
1001 dummy
1002 dummy
...
9999 dummy
10000 dummy

9000 개의 행이 선택되었습니다.

SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor(null,null,'allstats cost last'));
SQL_ID 73szqa0z3h6tq, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */ t1.c1, t1.c2 FROM t1
WHERE t1.c1 NOT IN (SELECT t2.c1 FROM t2)

Plan hash value: 2739594415

----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 15 (100)| 9000 |00:00:00.05 | 645 | | | |
|* 1 | HASH JOIN RIGHT ANTI NA| | 1 | 9001 | 15 (7)| 9000 |00:00:00.05 | 645 | 1066K| 1066K| 1227K (0)|
| 2 | TABLE ACCESS FULL | T2 | 1 | 1000 | 3 (0)| 1000 |00:00:00.01 | 7 | | | |
| 3 | TABLE ACCESS FULL | T1 | 1 | 10000 | 11 (0)| 10000 |00:00:00.04 | 638 | | | |
----------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("T1"."C1"="T2"."C1")


21 개의 행이 선택되었습니다.

SQL>
{code}
{column}
{column:width=50}
{column}
{section}
* Oracle 11g 에서는 Anti Join을 사용하기 위해 더이상 NOT NULL 을 보장할 필요가 없다.
** Null Aware Anti Join(ANTI NA) 이라는 새로운 Join Operation 추가.