Sunday, October 26, 2014

Heuristic TEMP Table Transformation

There are at least three different ways how the Oracle optimizer can come up with a so called TEMP table transformation, that is materializing an intermediate result set:

- As part of a star transformation the repeated access to dimensions can be materialized

- As part of evaluating GROUPING SETs intermediate result sets can be materialized

- Common Subquery/Table Expressions (CTE, WITH clause)

Probably the most common usage of the materialization is in conjunction with the WITH clause.

This is nothing new but since I came across this issue several times recently, here's a short demonstration and a reminder that this so called "TEMP Table Transformation" - at least in the context of the WITH clause - isn't really cost-based, in contrast to most other optimizer transformations nowadays - although the unnest transformation of subqueries also has a "no-brainer" variant where costing isn't considered.

The logic simply seems to be: If the CTE expression is referenced more than once AND the CTE expression contains at least some (filter or join) predicate then it will be materialized.

While in most cases this makes sense to avoid the otherwise repeated evaluation of the CTE expression, there are cases where additional predicates that could be pushed inside the CTE would lead to different, significantly more efficient access paths than materializing the full CTE expression without applying the filters and filtering on the TEMP table afterwards.

Here are just two very simple examples that demonstrate the point, both based on this sample table setup:

create table t1
as
select 
        rownum as id
      , rpad('x', 100) as filler
from 
        dual
connect by 
        level <=1e5;

exec dbms_stats.gather_table_stats(null, 't1')

create index t1_idx on t1 (id);

The index on T1.ID opens up potentially a very precise access to rows.

Here is example number one:

with 
a as 
(
  select  /* inline */ 
          id
        , filler 
  from    
          t1 
  where 
          filler != 'x'
)
select
        t1.*
      , a1.filler
      , a2.filler
from
        t1
      , a a1
      , a a2
where
       a1.id = t1.id
and    a2.id = t1.id
and    t1.id = 1
and    a1.id = 1
and    a2.id = 1
;

-- 11.2.0.3 Plan without INLINE hint
------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                           |     1 |   236 |  1207   (1)| 00:00:01 |
|   1 |  TEMP TABLE TRANSFORMATION     |                           |       |       |            |          |
|   2 |   LOAD AS SELECT               | SYS_TEMP_0FD9D6619_229329 |       |       |            |          |
|*  3 |    TABLE ACCESS FULL           | T1                        | 99999 |    10M|   420   (1)| 00:00:01 |
|*  4 |   HASH JOIN                    |                           |     1 |   236 |   787   (1)| 00:00:01 |
|*  5 |    HASH JOIN                   |                           |     1 |   171 |   394   (1)| 00:00:01 |
|   6 |     TABLE ACCESS BY INDEX ROWID| T1                        |     1 |   106 |     2   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN          | T1_IDX                    |     1 |       |     1   (0)| 00:00:01 |
|*  8 |     VIEW                       |                           | 99999 |  6347K|   392   (1)| 00:00:01 |
|   9 |      TABLE ACCESS FULL         | SYS_TEMP_0FD9D6619_229329 | 99999 |    10M|   392   (1)| 00:00:01 |
|* 10 |    VIEW                        |                           | 99999 |  6347K|   392   (1)| 00:00:01 |
|  11 |     TABLE ACCESS FULL          | SYS_TEMP_0FD9D6619_229329 | 99999 |    10M|   392   (1)| 00:00:01 |
------------------------------------------------------------------------------------------------------------

-- 11.2.0.4 Plan without INLINE hint
--------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                           |  9999M|  2197G|       | 28468  (92)| 00:00:02 |
|   1 |  TEMP TABLE TRANSFORMATION     |                           |       |       |       |            |          |
|   2 |   LOAD AS SELECT               | SYS_TEMP_0FD9D661A_229329 |       |       |       |            |          |
|*  3 |    TABLE ACCESS FULL           | T1                        | 99999 |    10M|       |   420   (1)| 00:00:01 |
|*  4 |   HASH JOIN                    |                           |  9999M|  2197G|  7520K| 28048  (93)| 00:00:02 |
|*  5 |    VIEW                        |                           | 99999 |  6347K|       |   392   (1)| 00:00:01 |
|   6 |     TABLE ACCESS FULL          | SYS_TEMP_0FD9D661A_229329 | 99999 |    10M|       |   392   (1)| 00:00:01 |
|*  7 |    HASH JOIN                   |                           | 99999 |    16M|       |   394   (1)| 00:00:01 |
|   8 |     TABLE ACCESS BY INDEX ROWID| T1                        |     1 |   106 |       |     2   (0)| 00:00:01 |
|*  9 |      INDEX RANGE SCAN          | T1_IDX                    |     1 |       |       |     1   (0)| 00:00:01 |
|* 10 |     VIEW                       |                           | 99999 |  6347K|       |   392   (1)| 00:00:01 |
|  11 |      TABLE ACCESS FULL         | SYS_TEMP_0FD9D661A_229329 | 99999 |    10M|       |   392   (1)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------------

-- 11.2.0.3/11.2.0.4 Plan with INLINE hint
-----------------------------------------------------------------------------------------
| Id  | Operation                      | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |        |     1 |   318 |     6   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                  |        |       |       |            |          |
|   2 |   NESTED LOOPS                 |        |     1 |   318 |     6   (0)| 00:00:01 |
|   3 |    NESTED LOOPS                |        |     1 |   212 |     4   (0)| 00:00:01 |
|*  4 |     TABLE ACCESS BY INDEX ROWID| T1     |     1 |   106 |     2   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN          | T1_IDX |     1 |       |     1   (0)| 00:00:01 |
|   6 |     TABLE ACCESS BY INDEX ROWID| T1     |     1 |   106 |     2   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN          | T1_IDX |     1 |       |     1   (0)| 00:00:01 |
|*  8 |    INDEX RANGE SCAN            | T1_IDX |     1 |       |     1   (0)| 00:00:01 |
|*  9 |   TABLE ACCESS BY INDEX ROWID  | T1     |     1 |   106 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------


The filter in the CTE expression is just there to fulfill the rules I've stated above, without it the TEMP table transformation wouldn't be considered at all. It could also be a (non-filtering) join condition, for example.

Notice the big difference in cost estimates between the plans with and without materialization. Clearly a cost-based evaluation should have rejected the TEMP table transformation, simply because it is a bad idea to materialize 100K rows and afterwards access this TEMP table twice to filter out exactly a single row, instead of accessing the original, untransformed row source twice via precise index access.

This is by the way an example of another anomaly that was only recently introduced (apparently in the 11.2.0.4 patch set / 12.1 release): Notice the bad cardinality estimate in the 11.2.0.4 plan with the TEMP table transformation - the filter on the TEMP table isn't evaluated properly (was already there in previous releases) and in addition the join cardinality is way off - 10G rows instead of a single row is not really a good estimate - and as a side effect the HASH JOIN uses a bad choice for the build row sources.

Another possible, perhaps less common variant is this example:

with 
a as 
(
  select  /* inline */ 
          id
        , filler 
  from 
          t1 
  where 
          filler != 'x'
)
select
        id
      , (select filler from a where id = x.id) as scal_val1
      , (select filler from a where id = x.id) as scal_val2
from
        t1 x
;

-- 12.1.0.2 Plan without INLINE hint
--------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                           |   100K|   488K|    77M  (1)| 00:50:26 |
|*  1 |  VIEW                      |                           | 99999 |  6347K|   392   (1)| 00:00:01 |
|   2 |   TABLE ACCESS FULL        | SYS_TEMP_0FD9D660F_229329 | 99999 |    10M|   392   (1)| 00:00:01 |
|*  3 |  VIEW                      |                           | 99999 |  6347K|   392   (1)| 00:00:01 |
|   4 |   TABLE ACCESS FULL        | SYS_TEMP_0FD9D660F_229329 | 99999 |    10M|   392   (1)| 00:00:01 |
|   5 |  TEMP TABLE TRANSFORMATION |                           |       |       |            |          |
|   6 |   LOAD AS SELECT           | SYS_TEMP_0FD9D660F_229329 |       |       |            |          |
|*  7 |    TABLE ACCESS FULL       | T1                        | 99999 |    10M|   420   (1)| 00:00:01 |
|   8 |   TABLE ACCESS FULL        | T1                        |   100K|   488K|   420   (1)| 00:00:01 |
--------------------------------------------------------------------------------------------------------

-- 12.1.0.2 Plan with INLINE hint
----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |   100K|   488K|   398K  (1)| 00:00:16 |
|*  1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1     |     1 |   106 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | T1_IDX |     1 |       |     1   (0)| 00:00:01 |
|*  3 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1     |     1 |   106 |     2   (0)| 00:00:01 |
|*  4 |   INDEX RANGE SCAN                  | T1_IDX |     1 |       |     1   (0)| 00:00:01 |
|   5 |  TABLE ACCESS FULL                  | T1     |   100K|   488K|   420   (1)| 00:00:01 |
----------------------------------------------------------------------------------------------

This time I've shown plans from 12.1.0.2 - the latest available release as I write this - to demonstrate that this hasn't changed yet. What has changed in 12c is that the scalar subqueries are now actually represented in the final cost - in pre-12c these costs wouldn't be part of the total cost. So although due to that the cost difference between the two plans in 12c is much more significant than in pre-12c the optimizer still opts for materializing the CTE expression and running full table scans in the scalar subqueries on that temp table instead of taking advantage of the precise access path available - again very likely a pretty bad idea at runtime.

So whenever you make use of the WITH clause make sure you've considered the access paths that might be available when not materializing the result set.

Footnote

As of Oracle 12.1 the MATERIALIZE and INLINE hints are still not officially documented.

12 comments:

Dominic Brooks said...

I've blogged before about a restriction on the materialisation when part of a distributed / XA transaction (particularly relevant for many Java apps)

http://orastory.wordpress.com/2013/01/15/fun-with-distributed-transactions/

Your article has prompted me to retest and it looks like that is resolved in 12.1.

Randolf said...

Hi Dominic,

yes, I blogged about similar issues a while back, and in the comments of that post you can find two bug numbers that are apparently fixed in 12.1 - maybe even in 11.2.0.4, not sure though about the latter.

Randolf

Martin Preiss said...

Hi Randolf,
just out of curiosity: is there a special reason for your addition of the predicates a1.id=1 and a2.id=1 in the first query? I remember that there are (or have been) some strange effects of transitive closure - but playing with your examples in 12.1.0.2 I couldn't spot a difference with and without these predicates.
Regards
Martin

Chris Antognini said...

Hi Randolf

As of 12.1.0.2 also the vector transformation (a.k.a. in-memory aggregation) uses temp tables...

Cheers,
Chris

Randolf said...

@Martin:

I think you're right with your transitive closure idea: In some version I've tested, probably 11.2.0.3, I got a strange looking plan with lots of MERGE JOIN CARTESIAN due to the transitive closure. Since it didn't want to distract from the original topic, I decided to add these in principle superfluous predicates, which helped to arrive at a more reasonable looking plan - although it is really more about cosmetics than anything else in that case.

@Chris:
Thanks for the addition - I haven't had time yet to look in detail at the new aggregation transformation.

Randolf

Mark W. Farnham said...

My question is why they are losing the predicate. If memory serves that never or nearly never happened with the old rownum trick. Have they lost their ability to have their cake and eat it too by doing smartly accessed TEMP table transformation?

Randolf said...

Hi Mark,

I think the answer to your question is that the logic seems to be to materialize what is defined in the WITH clause. The re-usage of the generated TEMP table then applies whatever predicates on top.

Consider that it could be used with different predicates, like in my second example - there's no point in pushing the predicate into the materialization in that example.

Randolf

Mark W. Farnham said...

True. But one wonders why they don't check the cost in a bloom filter like way so that the TEMP table generation cost is minimized in a way that serves all masters (presumably either as ORs or unions) and if that cost is not less than the sum of the costs of doing it separately reject the TEMP. With literals they could pick and choose which times to use TEMP versus the original statement and with binds they check the sum of the probable costs. Projecting the entire possible set with no predicates just seems... bad.

Ahmed Hussain said...

I think I'm experiencing resource leaks with the temporary tables that get created by the TEMP TABLE TRANSFORMATION when using a WITH clause that is referenced more than once (i.e. triggering the transformation).

It appears to occur if you don't fetch to the end of a cursor that uses one. The following reproduces it with no specific data on 12.1.0.1.0:

DECLARE
jim SYS_REFCURSOR;

sheila VARCHAR2(20);
BEGIN
OPEN jim FOR 'with fred AS (SELECT * FROM all_tables)
SELECT status FROM fred WHERE owner = :x
UNION ALL
SELECT status FROM fred WHERE owner = :y'
USING 'SYS', 'SYSTEM';
FETCH jim
INTO sheila;
dbms_output.put_line(sheila);
CLOSE jim;
END;

If I monitor v$tempseg_usage I can see segments of segtype 'DATA' accumulate.

I thought initially it might be cursor leaks hanging onto the resources but I eventually tried it in a block with an explicit close, and monitoring for session open cursors doesn't seem to imply they are hanging around.

If you keep the same open/close semantics as above but wrap the FETCH with
LOOP
FETCH ...
EXIT WHEN fred%NOTFOUND;
END LOOP;

forcing the cursor to be fetched to the end, then the leak doesn't happen. The segments are freed up when the calling session is disconnected.

Randolf said...

Hi Ahmed,

thanks for interesting case and good description.

Doesn't reproduce in 11.2.0.2 and 12.1.0.2 but does in 12.1.0.1.

Looks like "Bug 18098207 - TEMP space leak from SQL using temp table transformation in a PLSQL REF cursor (Doc ID 18098207.8)", which applies to 11.2.0.3, 11.2.0.4 and 12.1.0.1.

Randolf

Ahmed Hussain said...

Thanks very much for your help. It sounds like exactly the issue I was seeing (except that I don't think it's in any way specific to a PL/SQL ref cursor, just execution plans that use the temp table transformation used by a cursor that is never fetched to the end). Apologies for not replying sooner - my company was supposed to be getting me access to their support identifier so I could look at the bug report for myself, but wheels are grinding slowly...

One of the reasons I asked a question here was that I couldn't find any obvious answers on the web, but now I try with that bug id there's a report here that also looks like the same thing.

One related point you might have some info on that I also couldn't find on the web - is there anything you know of in the dynamic data dictionary that shows up the lifecycle of those temporary tables anywhere?

As one sees table names of the form SYS_TEMP_guid_guid in the execution plans, I would have expected there to be a V$ view somewhere in which these tables can be monitored, but I haven't found anything that mentions them.

Randolf said...

Hi Ahmed,

that is a good question regarding dictionary information in V$ views about the TEMP tables generated automatically, but I've never investigated into that.

If I remember correctly, when looking at the recursive statements generated by Oracle to create these TEMP tables, they use special, undocumented hints that could indicate that the meta data related to these TEMP tables is stored/maintained in a "lightweight" way (since they are only valid/required during the execution of that cursor), so not sure they really leave that much footprint behind.

Randolf