Sunday, August 21, 2016

Adaptive Cursor Sharing Fail

Here is another example (besides the fact that Adaptive Cursor Sharing only gets evaluated during a PARSE call (still valid in 12c) and supports a maximum of 14 bind variables) I've recently come across at a client site where the default implementation of Adaptive Cursor Sharing fails to create a more suitable execution plan for different bind variable values.

Broken down to a bare minimum the query was sometimes executed using non-existing values for a particular bind variable, but other times these values were existing and very popular. There were two suitable candidate indexes and one of them appeared to the optimizer more attractive in case of the "non-existing" value case. Unfortunately this index was a very bad choice for the "existing and very popular" case.

The critical point of the case is that due to other, additional filters on the same table/index the final number of rows produced by the row source operation was (almost) the same for the "non-existing" and the "existing and very popular" case, but in the latter case a large proportion of the bad index had to be scanned due to the popularity of the bind value, which was the only predicate that could be used as ACCESS predicate and the additional ones could only be applied as FILTER on the index.

So although there was a suitable histogram available on the column in question and the optimizer in principle was aware of the different popularities of values and marked the cursor as "bind sensitive" it looks like when a cursor is then monitored it just is assigned to different buckets as exposed via V$SQL_CS_HISTOGRAM, based on the rows produced by the row source. Since the operation in question didn't differ much in the final number of rows produced between the two cases (either 0 rows or 1 row) the different executions were all assigned to the same bucket - although the amount of work required to produce this single row was huge, reflected in execution time and physical / logical I/Os performed per execution.

Obviously this vastly different amount of work performed to produce a similar number of rows is not used as criteria for marking a cursor as "bind aware" and evaluate different execution plans based on ranges of bind values.

There is a view V$SQL_CS_STATISTICS that includes information about CPU_TIME and BUFFER_GETS but in 11.2.0.4 it only reflects the initial execution when starting with the "non-existing" value case, but not the slow executions afterwards. In 12.1.0.2 the view is no longer populated at all, which suggests that this information is simply not used for deciding the "bind aware" status of a cursor.

Discussing the case with Mohamed Houri, who has spent significant time on investigating the "Adaptive Cursor Sharing" feature seemed to confirm this assumption.

Here is a simple test case that allows reproducing the issue:
-- FILTER1 is highly skewed (here one very popular value -1)
create table t
as
select
        rownum as id
      , rownum as n
      , case when rownum <= 100000 then rownum else -1 end as filter1
      , rownum as filter2
      , rpad('x', 200) as filler1
      , rownum as filter3
      , rpad('x', 200) as filler2
from
        dual
connect by level <= 1e6;

-- Histogram on FILTER1 should make the sample query "bind sensitive"
exec dbms_stats.gather_table_stats(null, 't', method_opt => 'for all columns size 1 for columns size 254 filter1')

-- The "good" index, ACCESS on all three columns
-- But higher BLEVEL (3) due to FILLER1 only one value and wide - needs to be part of every branch block
create index t_idx1 on t (filter1, filler1, filter2) compress 2 /*advanced low*/;

-- The "bad" index, ACCESS only on first column, FILTER on subsequent columns
-- But lower BLEVEL (2) due to FILTER3 before FILLER1, leads to much more compact branch blocks
create index t_idx2 on t (filter1, filter3, filler1, filler2, filter2) compress 1 /*advanced low*/;
The critical point is that the good index has a higher BLEVEL than the bad index. So in case a non-existing value for FILTER1 gets used the optimizer will favor the index with the lower BLEVEL, which is for that case the key cost component:
set echo on

-- Non-existing value
-- T_IDX2 gets preferred
explain plan for
select sum(n) from t where filter1 = -999999 and filler1 = rpad('x', 200) and filter2 = 999999;

select * from table(dbms_xplan.display);

-----------------------------------------------------------------------------------------------
| Id  | Operation                            | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |        |     1 |   216 |     4   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE                      |        |     1 |   216 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T      |     1 |   216 |     4   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | T_IDX2 |     1 |       |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("FILTER1"=(-999999) AND "FILLER1"='x

                                                                                         ' AND
              "FILTER2"=999999)
       filter("FILTER2"=999999 AND "FILLER1"='x

                                                                                      ')
                                                                                     
-- Non-existing value
-- T_IDX1 has a higher cost
explain plan for
select /*+ index(t t_idx1) */ sum(n) from t where filter1 = -999999 and filler1 = rpad('x', 200) and filter2 = 999999;

select * from table(dbms_xplan.display);

-----------------------------------------------------------------------------------------------
| Id  | Operation                            | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |        |     1 |   216 |     5   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE                      |        |     1 |   216 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T      |     1 |   216 |     5   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | T_IDX1 |     1 |       |     4   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("FILTER1"=(-999999) AND "FILLER1"='x

                                                                                         ' AND
              "FILTER2"=999999)
              
-- Existing and very popular value
-- T_IDX1 gets preferred
explain plan for
select sum(n) from t where filter1 = -1 and filler1 = rpad('x', 200) and filter2 = 999999;

select * from table(dbms_xplan.display);

-----------------------------------------------------------------------------------------------
| Id  | Operation                            | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |        |     1 |   216 |     5   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE                      |        |     1 |   216 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T      |     1 |   216 |     5   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | T_IDX1 |     1 |       |     4   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("FILTER1"=(-1) AND "FILLER1"='x

                                                                                    ' AND
              "FILTER2"=999999)


-- Existing and very popular value
-- T_IDX2 is a bad idea, correctly reflected in the cost estimate
explain plan for
select /*+ index(t t_idx2) */ sum(n) from t where filter1 = -1 and filler1 = rpad('x', 200) and filter2 = 999999;

select * from table(dbms_xplan.display);

-----------------------------------------------------------------------------------------------
| Id  | Operation                            | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |        |     1 |   216 | 58269   (1)| 00:00:03 |
|   1 |  SORT AGGREGATE                      |        |     1 |   216 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T      |     1 |   216 | 58269   (1)| 00:00:03 |
|*  3 |    INDEX RANGE SCAN                  | T_IDX2 |     1 |       | 58268   (1)| 00:00:03 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("FILTER1"=(-1) AND "FILLER1"='x

                                                                                    ' AND
              "FILTER2"=999999)
       filter("FILTER2"=999999 AND "FILLER1"='x

                                                                                      ')
So when optimizing for the two different cases the optimizer does the right thing and understands that for the case "existing and very popular" T_IDX2 is a bad choice. It's also obvious from the "Predicate Information" section that the index T_IDX2 only can use FILTER1 as ACCESS predicate.

But when using bind variables the different cases are not recognized and the bad index is used for both cases when the optimization is based on the "non-existing value" case:
-- Default behaviour, although being BIND_SENSITIVE thanks to the histogram, no ACS kicks in
-- The V$SQL_CS_HISTOGRAM shows all executions being in bucket 0, so according to BIND_SENSITIVE monitoring no need for action
set echo on timing on time on

alter session set statistics_level = all;

variable n number

exec :n := -999999

declare
  res number;
begin
  select sum(n) into res from t where filter1 = :n and filler1 = rpad('x', 200) and filter2 = 999999;
end;
/

exec :n := -1

declare
  res number;
begin
  for i in 1..1000 loop
    select sum(n) into res from t where filter1 = :n and filler1 = rpad('x', 200) and filter2 = 999999;
  end loop;
end;
/

SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.03
SQL>
SQL> variable n number
SQL>
SQL> exec :n := -999999

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.03
SQL>
SQL> declare
  2    res number;
  3  begin
  4    select sum(n) into res from t where filter1 = :n and filler1 = rpad('x', 200) and filter2 = 999999;
  5  end;
  6  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.28
SQL>
SQL> exec :n := -1

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.04
SQL>
SQL> declare
  2    res number;
  3  begin
  4    for i in 1..1000 loop
  5      select sum(n) into res from t where filter1 = :n and filler1 = rpad('x', 200) and filter2 = 999999;
  6    end loop;
  7  end;
  8  /

PL/SQL procedure successfully completed.

Elapsed: 00:04:48.64
SQL>

-- With BIND_AWARE a second child cursors gets generated and used on second parse, interesting
set echo on timing on time on

alter session set statistics_level = all;

variable n number

exec :n := -999999

declare
  res number;
begin
  select /*+ bind_aware */ sum(n) into res from t where filter1 = :n and filler1 = rpad('x', 200) and filter2 = 999999;
end;
/

exec :n := -1

declare
  res number;
begin
  for i in 1..1000 loop
    select /*+ bind_aware */ sum(n) into res from t where filter1 = :n and filler1 = rpad('x', 200) and filter2 = 999999;
  end loop;
end;
/

SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.03
SQL>
SQL> variable n number
SQL>
SQL> exec :n := -999999

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.03
SQL>
SQL> declare
  2    res number;
  3  begin
  4    select /*+ bind_aware */ sum(n) into res from t where filter1 = :n and filler1 = rpad('x', 200) and filter2 = 999999;
  5  end;
  6  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.04
SQL>
SQL> exec :n := -1

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.03
SQL>
SQL> declare
  2    res number;
  3  begin
  4    for i in 1..1000 loop
  5      select /*+ bind_aware */ sum(n) into res from t where filter1 = :n and filler1 = rpad('x', 200) and filter2 = 999999;
  6    end loop;
  7  end;
  8  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.20
SQL>

-- Or hinting for a fixed plan helps, too
set echo on timing on time on

alter session set statistics_level = all;

variable n number

exec :n := -999999

declare
  res number;
begin
  select /*+ index(t t_idx1) */ sum(n) into res from t where filter1 = :n and filler1 = rpad('x', 200) and filter2 = 999999;
end;
/

exec :n := -1

declare
  res number;
begin
  for i in 1..1000 loop
    select /*+ index(t t_idx1) */ sum(n) into res from t where filter1 = :n and filler1 = rpad('x', 200) and filter2 = 999999;
  end loop;
end;
/
Executing the unhinted, default case leads to a very inefficient execution of the thousand executions using the popular value (and the actual table / index at the client site was much larger than this example and the query got executed very frequently) - compare that to the execution time for the other two variants, either using the BIND_AWARE hint or simply requesting the good index.

So when requesting to mark the cursor immediately BIND_AWARE via the corresponding hint (only possible if the cursor qualifies to be bind sensitive) Oracle happily generates a second child cursor at the second PARSE call with a different, more efficient execution plan for the popular value, because when a cursor is marked bind aware a completely different algorithm gets used that is based on range of values as exposed via V$SQL_CS_SELECTIVITY, and since the different values are falling into different ranges of values a new plan gets generated that is different from the previous one and hence gets used from then on for those (range of) values.

3 comments:

Stefan Koehler said...

Hey Randolf,
thanks for reminding about ACS and PARSE call.

However it seems like something more is going on as ACS (with BIND_AWARE hint) works in case of procedures, but not for functions, e.g. here is a test case: https://community.oracle.com/message/11131628#11131628

If you wrap the PL/SQL function into an anonymous PL/SQL block (and not into a SQL) it works again, e.g. here is the test case: https://community.oracle.com/message/11132322#11132322

Any ideas?

Thanks.

Regards
Stefan

Randolf said...

Hi Stefan,

I vaguely remember that thread but didn't spend time at that time looking into it.

I think the more precise definition would be "ACS doesn't mark cursors of recursive SQL as bind sensitive in PL/SQL functions called from SQL", right?

Because it seems to behave as expected if you call the function from an anonymous PL/SQL block - have you also tried if it works when calling the PL/SQL function from a stored procedure?

Randolf

Mohamed Houri said...

Stephan, Randolf

I think that Oracle needs absolutely to see the cursor bind variables in order to mark this cursor bind sensitive.

However, SQL can't see bind variables used inside PL/SQL function (and procedure as well).

This is why when you select from a PL/SQL function using SQL your cursor is not bind sensitive

This also explains why a cursor is not bind sensitive as well when we replace the way of running a PL/SQL stored procedure, from example from this:

SQL> exec proc(1) -- this is in fact a PL/SQL anonymous block and the message "PL/SQL procedure successfully completed" is a reminder for that

to this:

SQL> call proc(1) -- this is a pure SQL function. For example this will ignore a no data found exception


In the trace file of a PL/SQL function called from SQL I can see that the parse and the execution of the top level sql is done before the parse, the execution and the fetch of the PL/SQL function query.

While in the trace file of a PL/SQL function called from an anonymous PL/SQL block I can see that this anonymous PL/SQL block is first parsed but it's execution is delayed until after the PL/SQL function query is parsed, executed, fetched and closed.

Best regards
Mohamed Houri