This is just an addendum to the
previous post demonstrating one example (out of many possible) where the join skew handling feature fails. The test case setup is the same as in the previous post.
As mentioned in the
AllThingsOracle.com article and in the introduction of the previous post, the feature at present only applies to a rather limited number of scenarios. To wrap things up and to give an idea what can happen with that new feature, here's a
three table join that actually makes use of the feature for
one join, only to suffer from the
skew problem in the
next join that uses the same join expression, but doesn't qualify (yet) for the skew handling feature:
-- Here the HYBRID SKEW distribution works for the B->C join
-- But the (B->C)->A join is affected by the same skew
-- So the HASH re-distribution of the (B->C) join result on B.ID is skewed
-- And hence the HASH JOIN/SORT AGGREGATE (operation 4+5) are affected by the skew
-- An interesting question is: Why is there a re-distribution (operation 11+12)?
-- The data is already distributed on B.ID / C.FK_ID
-- In pre-12c no redistribution happens
-- So it looks like a side-effect of the hybrid distribution
-- Which makes sense as it is not really HASH distributed, but hybrid
select count(t_2_filler) from (
select /*+ monitor
leading(b c a)
use_hash(c)
no_swap_join_inputs(c)
pq_distribute(c hash hash)
use_hash(a)
swap_join_inputs(a)
pq_distribute(a hash hash)
*/
a.id as t_1_id
, a.filler as t_1_filler
, c.id as t_2_id
, c.filler as t_2_filler
from t_1 a
, t_1 b
, t_2 c
where
c.fk_id = b.id
and regexp_replace(c.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c') >= regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c')
and a.id = b.id
and regexp_replace(a.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c') >= regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c')
);
Here's the execution plan from 12.1:
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | TQ |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 318 | | | |
| 1 | SORT AGGREGATE | | 1 | 318 | | | |
| 2 | PX COORDINATOR | | | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10004 | 1 | 318 | Q1,04 | P->S | QC (RAND) |
| 4 | SORT AGGREGATE | | 1 | 318 | Q1,04 | PCWP | |
|* 5 | HASH JOIN | | 5016 | 1557K| Q1,04 | PCWP | |
| 6 | PX RECEIVE | | 2000K| 202M| Q1,04 | PCWP | |
| 7 | PX SEND HYBRID HASH | :TQ10002 | 2000K| 202M| Q1,02 | P->P | HYBRID HASH|
| 8 | STATISTICS COLLECTOR | | | | Q1,02 | PCWC | |
| 9 | PX BLOCK ITERATOR | | 2000K| 202M| Q1,02 | PCWC | |
| 10 | TABLE ACCESS FULL | T_1 | 2000K| 202M| Q1,02 | PCWP | |
| 11 | PX RECEIVE | | 100K| 20M| Q1,04 | PCWP | |
| 12 | PX SEND HYBRID HASH | :TQ10003 | 100K| 20M| Q1,03 | P->P | HYBRID HASH|
|* 13 | HASH JOIN BUFFERED | | 100K| 20M| Q1,03 | PCWP | |
| 14 | PX RECEIVE | | 2000K| 202M| Q1,03 | PCWP | |
| 15 | PX SEND HYBRID HASH | :TQ10000 | 2000K| 202M| Q1,00 | P->P | HYBRID HASH|
| 16 | STATISTICS COLLECTOR | | | | Q1,00 | PCWC | |
| 17 | PX BLOCK ITERATOR | | 2000K| 202M| Q1,00 | PCWC | |
| 18 | TABLE ACCESS FULL | T_1 | 2000K| 202M| Q1,00 | PCWP | |
| 19 | PX RECEIVE | | 2000K| 202M| Q1,03 | PCWP | |
| 20 | PX SEND HYBRID HASH (SKEW)| :TQ10001 | 2000K| 202M| Q1,01 | P->P | HYBRID HASH|
| 21 | PX BLOCK ITERATOR | | 2000K| 202M| Q1,01 | PCWC | |
| 22 | TABLE ACCESS FULL | T_2 | 2000K| 202M| Q1,01 | PCWP | |
-----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("A"."ID"="B"."ID")
filter( REGEXP_REPLACE ("A"."FILLER",'^\s+([[:alnum:]]+)\s+$',' \1',1,1,'c')>= REGEXP_REPLACE
("B"."FILLER",'^\s+([[:alnum:]]+)\s+$',' \1',1,1,'c'))
13 - access("C"."FK_ID"="B"."ID")
filter( REGEXP_REPLACE ("C"."FILLER",'^\s+([[:alnum:]]+)\s+$',' \1',1,1,'c')>= REGEXP_REPLACE
("B"."FILLER",'^\s+([[:alnum:]]+)\s+$',' \1',1,1,'c'))
And here's the formatted output from V$PQ_TQSTAT (two popular values in T_2.FK_ID, DOP = 4):
TQ_ID SERVER_TYP INSTANCE PROCESS NUM_ROWS % GRAPH MB bytes/row
---------- ---------- ---------- -------- ---------- ---------- ---------- ---------- ----------
0 Producer 1 P000 508088 25 ########## 53 109
P001 497226 25 ########## 52 109
P002 489964 24 ########## 51 109
P003 504728 25 ########## 52 109
********** ********** ----------
Total 2000006
Consumer 1 P004 499616 25 ########## 52 109
P005 500736 25 ########## 52 109
P006 499523 25 ########## 52 109
P007 500131 25 ########## 52 109
********** ********** ----------
Total 2000006
1 Producer 1 P000 486339 24 ######### 50 107
P001 482280 24 ######### 50 108
P002 518636 26 ########## 53 107
P003 512745 26 ########## 52 107
********** ********** ----------
Total 2000000
Consumer 1 P004 500246 25 ########## 51 107
P005 500104 25 ########## 51 107
P006 499437 25 ########## 51 107
P007 500213 25 ########## 51 107
********** ********** ----------
Total 2000000
2 Producer 1 P004 503398 25 ########## 52 108
P005 498008 25 ########## 52 109
P006 507306 25 ########## 53 109
P007 491288 25 ########## 51 109
********** ********** ----------
Total 2000000
Consumer 1 P000 500363 25 ########## 52 109
P001 500256 25 ########## 52 109
P002 499609 25 ########## 52 109
P003 499772 25 ########## 52 109
********** ********** ----------
Total 2000000
3 Producer 1 P004 500246 25 ########## 100 210
P005 500104 25 ########## 100 210
P006 499437 25 ########## 100 210
P007 500213 25 ########## 100 210
********** ********** ----------
Total 2000000
Consumer 1 P000 199870 10 ## 40 211
P001 200300 10 ### 40 211
P002 799835 40 ########## 159 209
P003 799995 40 ########## 159 209
********** ********** ----------
Total 2000000
4 Producer 1 P000 1 25 ########## 0 36
P001 1 25 ########## 0 36
P002 1 25 ########## 0 36
P003 1 25 ########## 0 36
********** ********** ----------
Total 4
Consumer 1 QC 4 100 ########## 0 36
********** ********** ----------
Total 4
There are a couple of interesting things to notice:
1. The execution plan shows another
redistribution of the
(B->C) join result for joining to
(B->C)->A, although both joins use the same join expression (B.ID). So there is an additional table queue / redistribution (operations 11 + 12) and in consequence the HASH JOIN (operation 13) turns into a
HASH JOIN BUFFERED. You won't find such a re-distribution (and HASH JOIN BUFFERED) in a pre-12c plan, simply because the optimizer recognizes that the data is already distributed in a suitable way. But in case of the HYBRID HASH distribution the data isn't necessarily exactly distributed by HASH (but by a mixture of BROADCAST/HASH/ROUND-ROBIN) and so the optimizer needs to play safe and introduce another redistribution
2. This additional redistribution isn't skew aware - so while we can see from the
V$PQ_TQSTAT query result that for table queues 0 and 1 the skew detection / handling worked and ensured an even work distribution (the output above is from the variant running at a DOP of 4 and having two popular values) for table queues 2 and 3 a normal HASH distribution was used, leading to
skew as can be seen in the "Consumer" part of
TQ_ID = 3
So for the time being don't count on the new feature to solve parallel join skew problems in general. Sometimes it might work, but there are at present simply too many scenarios where it won't apply.
No comments:
Post a Comment
Note: Only a member of this blog may post a comment.