Sunday, May 18, 2014

12c Hybrid Hash Distribution with Skew Detection / Handling - Failing

This is just an addendum to the previous post demonstrating one example (out of many possible) where the join skew handling feature fails. The test case setup is the same as in the previous post.

As mentioned in the AllThingsOracle.com article and in the introduction of the previous post, the feature at present only applies to a rather limited number of scenarios. To wrap things up and to give an idea what can happen with that new feature, here's a three table join that actually makes use of the feature for one join, only to suffer from the skew problem in the next join that uses the same join expression, but doesn't qualify (yet) for the skew handling feature:

-- Here the HYBRID SKEW distribution works for the B->C join
-- But the (B->C)->A join is affected by the same skew
-- So the HASH re-distribution of the (B->C) join result on B.ID is skewed
-- And hence the HASH JOIN/SORT AGGREGATE (operation 4+5) are affected by the skew
-- An interesting question is: Why is there a re-distribution (operation 11+12)?
-- The data is already distributed on B.ID / C.FK_ID
-- In pre-12c no redistribution happens
-- So it looks like a side-effect of the hybrid distribution
-- Which makes sense as it is not really HASH distributed, but hybrid
select count(t_2_filler) from (
select  /*+ monitor
            leading(b c a)
            use_hash(c)
            no_swap_join_inputs(c)
            pq_distribute(c hash hash)
            use_hash(a)
            swap_join_inputs(a)
            pq_distribute(a hash hash)
        */
        a.id as t_1_id
      , a.filler as t_1_filler
      , c.id as t_2_id
      , c.filler as t_2_filler
from    t_1 a
      , t_1 b
      , t_2 c
where
        c.fk_id = b.id
and     regexp_replace(c.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c') >= regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c')
and     a.id = b.id
and     regexp_replace(a.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c') >= regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c')
);

Here's the execution plan from 12.1:
-----------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name     | Rows  | Bytes |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |          |     1 |   318 |        |      |            |
|   1 |  SORT AGGREGATE                     |          |     1 |   318 |        |      |            |
|   2 |   PX COORDINATOR                    |          |       |       |        |      |            |
|   3 |    PX SEND QC (RANDOM)              | :TQ10004 |     1 |   318 |  Q1,04 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE                  |          |     1 |   318 |  Q1,04 | PCWP |            |
|*  5 |      HASH JOIN                      |          |  5016 |  1557K|  Q1,04 | PCWP |            |
|   6 |       PX RECEIVE                    |          |  2000K|   202M|  Q1,04 | PCWP |            |
|   7 |        PX SEND HYBRID HASH          | :TQ10002 |  2000K|   202M|  Q1,02 | P->P | HYBRID HASH|
|   8 |         STATISTICS COLLECTOR        |          |       |       |  Q1,02 | PCWC |            |
|   9 |          PX BLOCK ITERATOR          |          |  2000K|   202M|  Q1,02 | PCWC |            |
|  10 |           TABLE ACCESS FULL         | T_1      |  2000K|   202M|  Q1,02 | PCWP |            |
|  11 |       PX RECEIVE                    |          |   100K|    20M|  Q1,04 | PCWP |            |
|  12 |        PX SEND HYBRID HASH          | :TQ10003 |   100K|    20M|  Q1,03 | P->P | HYBRID HASH|
|* 13 |         HASH JOIN BUFFERED          |          |   100K|    20M|  Q1,03 | PCWP |            |
|  14 |          PX RECEIVE                 |          |  2000K|   202M|  Q1,03 | PCWP |            |
|  15 |           PX SEND HYBRID HASH       | :TQ10000 |  2000K|   202M|  Q1,00 | P->P | HYBRID HASH|
|  16 |            STATISTICS COLLECTOR     |          |       |       |  Q1,00 | PCWC |            |
|  17 |             PX BLOCK ITERATOR       |          |  2000K|   202M|  Q1,00 | PCWC |            |
|  18 |              TABLE ACCESS FULL      | T_1      |  2000K|   202M|  Q1,00 | PCWP |            |
|  19 |          PX RECEIVE                 |          |  2000K|   202M|  Q1,03 | PCWP |            |
|  20 |           PX SEND HYBRID HASH (SKEW)| :TQ10001 |  2000K|   202M|  Q1,01 | P->P | HYBRID HASH|
|  21 |            PX BLOCK ITERATOR        |          |  2000K|   202M|  Q1,01 | PCWC |            |
|  22 |             TABLE ACCESS FULL       | T_2      |  2000K|   202M|  Q1,01 | PCWP |            |
-----------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   5 - access("A"."ID"="B"."ID")
       filter( REGEXP_REPLACE ("A"."FILLER",'^\s+([[:alnum:]]+)\s+$','        \1',1,1,'c')>= REGEXP_REPLACE 
              ("B"."FILLER",'^\s+([[:alnum:]]+)\s+$','        \1',1,1,'c'))
  13 - access("C"."FK_ID"="B"."ID")
       filter( REGEXP_REPLACE ("C"."FILLER",'^\s+([[:alnum:]]+)\s+$','        \1',1,1,'c')>= REGEXP_REPLACE 
              ("B"."FILLER",'^\s+([[:alnum:]]+)\s+$','        \1',1,1,'c'))

And here's the formatted output from V$PQ_TQSTAT (two popular values in T_2.FK_ID, DOP = 4):
     TQ_ID SERVER_TYP   INSTANCE PROCESS    NUM_ROWS          % GRAPH              MB  bytes/row
---------- ---------- ---------- -------- ---------- ---------- ---------- ---------- ----------
         0 Producer            1 P000         508088         25 ##########         53        109
                                 P001         497226         25 ##########         52        109
                                 P002         489964         24 ##########         51        109
                                 P003         504728         25 ##########         52        109
           ********** **********          ----------
           Total                             2000006

           Consumer            1 P004         499616         25 ##########         52        109
                                 P005         500736         25 ##########         52        109
                                 P006         499523         25 ##########         52        109
                                 P007         500131         25 ##########         52        109
           ********** **********          ----------
           Total                             2000006

         1 Producer            1 P000         486339         24 #########          50        107
                                 P001         482280         24 #########          50        108
                                 P002         518636         26 ##########         53        107
                                 P003         512745         26 ##########         52        107
           ********** **********          ----------
           Total                             2000000

           Consumer            1 P004         500246         25 ##########         51        107
                                 P005         500104         25 ##########         51        107
                                 P006         499437         25 ##########         51        107
                                 P007         500213         25 ##########         51        107
           ********** **********          ----------
           Total                             2000000

         2 Producer            1 P004         503398         25 ##########         52        108
                                 P005         498008         25 ##########         52        109
                                 P006         507306         25 ##########         53        109
                                 P007         491288         25 ##########         51        109
           ********** **********          ----------
           Total                             2000000

           Consumer            1 P000         500363         25 ##########         52        109
                                 P001         500256         25 ##########         52        109
                                 P002         499609         25 ##########         52        109
                                 P003         499772         25 ##########         52        109
           ********** **********          ----------
           Total                             2000000

         3 Producer            1 P004         500246         25 ##########        100        210
                                 P005         500104         25 ##########        100        210
                                 P006         499437         25 ##########        100        210
                                 P007         500213         25 ##########        100        210
           ********** **********          ----------
           Total                             2000000

           Consumer            1 P000         199870         10 ##                 40        211
                                 P001         200300         10 ###                40        211
                                 P002         799835         40 ##########        159        209
                                 P003         799995         40 ##########        159        209
           ********** **********          ----------
           Total                             2000000

         4 Producer            1 P000              1         25 ##########          0         36
                                 P001              1         25 ##########          0         36
                                 P002              1         25 ##########          0         36
                                 P003              1         25 ##########          0         36
           ********** **********          ----------
           Total                                   4

           Consumer            1 QC                4        100 ##########          0         36
           ********** **********          ----------
           Total                                   4

There are a couple of interesting things to notice:

1. The execution plan shows another redistribution of the (B->C) join result for joining to (B->C)->A, although both joins use the same join expression (B.ID). So there is an additional table queue / redistribution (operations 11 + 12) and in consequence the HASH JOIN (operation 13) turns into a HASH JOIN BUFFERED. You won't find such a re-distribution (and HASH JOIN BUFFERED) in a pre-12c plan, simply because the optimizer recognizes that the data is already distributed in a suitable way. But in case of the HYBRID HASH distribution the data isn't necessarily exactly distributed by HASH (but by a mixture of BROADCAST/HASH/ROUND-ROBIN) and so the optimizer needs to play safe and introduce another redistribution

2. This additional redistribution isn't skew aware - so while we can see from the V$PQ_TQSTAT query result that for table queues 0 and 1 the skew detection / handling worked and ensured an even work distribution (the output above is from the variant running at a DOP of 4 and having two popular values) for table queues 2 and 3 a normal HASH distribution was used, leading to skew as can be seen in the "Consumer" part of TQ_ID = 3

So for the time being don't count on the new feature to solve parallel join skew problems in general. Sometimes it might work, but there are at present simply too many scenarios where it won't apply.

No comments: