Tuesday, February 25, 2020

New Parallel Distribution Method For Direct Path Loads

Starting with version 12c Oracle obviously has introduced another parallel distribution method for direct path loads (applicable to INSERT APPEND and CTAS operations) when dealing with partitioned objects.

As you might already know, starting with version 11.2 Oracle supported a new variation of the PQ_DISTRIBUTE hint allowing more control how data gets distributed for the actual DML load step. In addition to the already documented methods (NONE, RANDOM / RANDOM_LOCAL, PARTITION) there is a new one EQUIPART which obviously only applies to scenarios where both, source and target table are equi partitioned.

In principle it looks like a "full-partition wise load", where the PX partition granule gets used as chunking method and each PX slave reads from the partition to process from source and writes into the corresponding partition of target. Therefore it doesn't require a redistribution of data and uses only a single PX slave set. Depending on the skew (partitions of different data volume) this might not be the best choice, but for massive data loads with evenly sized partitions it might give some advantage over the other distribution methods - the NONE distribution method being the closest, because it doesn't require additional redistribution either - but here all PX slaves read and write from any partition, so potentially there could be more contention.

Of course this new distributed method works only for the special case of equi partitioned source and target tables - and according to my tests only for the simple case of loading from the source table with no further operations like joins etc. involved.

A simple demonstration:

drop table t_part1 purge;
drop table t_part2 purge;

create table t_part1 (id, filler) partition by range (id) (
  partition n10000 values less than (10001),
  partition n20000 values less than (20001),
  partition n30000 values less than (30001),
  partition n40000 values less than (40001),
  partition n50000 values less than (50001),
  partition n60000 values less than (60001),
  partition n70000 values less than (70001),
  partition n80000 values less than (80001),
  partition n90000 values less than (90001),
  partition n100000 values less than (100001)
 )
as
select rownum as id, rpad('x', 200) as filler
from dual
connect by level <= 100000
;

create table t_part2 (id, filler) partition by range (id) (
  partition n10000 values less than (10001),
  partition n20000 values less than (20001),
  partition n30000 values less than (30001),
  partition n40000 values less than (40001),
  partition n50000 values less than (50001),
  partition n60000 values less than (60001),
  partition n70000 values less than (70001),
  partition n80000 values less than (80001),
  partition n90000 values less than (90001),
  partition n100000 values less than (100001)
 )
as
select rownum as id, rpad('x', 200) as filler
from dual
where 1 = 2
;

alter session enable parallel dml;

-- alter session set tracefile_identifier = equipart;

-- alter session set events 'trace [RDBMS.SQL_Optimizer.*] disk=highest';

--explain plan for
insert /*+ append parallel(2) pq_distribute(t_part2 equipart) */ into t_part2 select * from t_part1;

From 12.1.0.2 on the execution plan for the INSERT APPEND operation looks like this:

--------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT                   |          |   100K|    19M|   446   (1)| 00:00:01 |       |       |        |      |            |
|   1 |  PX COORDINATOR                    |          |       |       |            |          |       |       |        |      |            |
|   2 |   PX SEND QC (RANDOM)              | :TQ10000 |   100K|    19M|   446   (1)| 00:00:01 |       |       |  Q1,00 | P->S | QC (RAND)  |
|   3 |    LOAD AS SELECT (EQUI-PARTITION) | T_PART2  |       |       |            |          |       |       |  Q1,00 | PCWP |            |
|   4 |     OPTIMIZER STATISTICS GATHERING |          |   100K|    19M|   446   (1)| 00:00:01 |       |       |  Q1,00 | PCWP |            |
|   5 |      PX PARTITION RANGE ALL        |          |   100K|    19M|   446   (1)| 00:00:01 |     1 |    10 |  Q1,00 | PCWC |            |
|   6 |       TABLE ACCESS FULL            | T_PART1  |   100K|    19M|   446   (1)| 00:00:01 |     1 |    10 |  Q1,00 | PCWP |            |
--------------------------------------------------------------------------------------------------------------------------------------------
 
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
 
   1 - SEL$1
   6 - SEL$1 / T_PART1@SEL$1
 
Outline Data
-------------
 
  /*+
      BEGIN_OUTLINE_DATA
      FULL(@"SEL$1" "T_PART1"@"SEL$1")
      FULL(@"INS$1" "T_PART2"@"INS$1")
      PQ_DISTRIBUTE(@"INS$1" "T_PART2"@"INS$1" EQUIPART)
      OUTLINE_LEAF(@"INS$1")
      OUTLINE_LEAF(@"SEL$1")
      ALL_ROWS
      DB_VERSION('18.1.0')
      OPTIMIZER_FEATURES_ENABLE('18.1.0')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Oracle doesn't always automatically choose this distribution method. If you want to enforce it (and it is legal) you can use the PQ_DISTRIBUTE(EQUIPART) hint as outlined.

Monday, January 13, 2020

12.2.0.1 And Later Support (Limited) Extended Stats On Virtual columns / Column Groups of Expressions

I do have a demo as part of my optimizer related workshops that shows the restriction / limitation of DBMS_STATS not supporting extended statistics on virtual columns / group of expressions, so for example the combination of both expressions and column groups, like ((TRUNC(COL1)), (TRUNC(COL2))).

Surprisingly, when following a certain sequence of operation, this starts working (to some degree) from 12.2.0.1 on.

The official documentation up to and including 19c still mentions this as a restriction, and since it doesn't work when explicitly referencing virtual columns (see the test case what I exactly mean by this) I assume this is more like a side effect / unintended feature.

Nevertheless, the optimizer happily picks up this additional information and comes up with improved estimates when having a combination of skew and correlation on expressions, for example.

The following test case shows the change in behaviour from 12.2.0.1 on:

set echo on linesize 200 trimspool on trimout on tab off pagesize 999 timing on

alter session set nls_language = american;

drop table t1;

purge table t1;

-- Initialize the random generator for "reproducible" pseudo-randomness
exec dbms_random.seed(0)

-- ATTR1 and ATTR2 are both skewed and correlated
create table t1
as
select
        rownum as id
      , trunc(dbms_random.value(1, 1000000000000)) as fk
      , case when rownum <= 900000 then 1 else ceil(rownum / 10) end as attr1
      , case when rownum <= 900000 then 1 else ceil(rownum / 10) end as attr2
      , rpad('x', 100) as filler
from
        dual
connect by
        level <= 1000000
;

-- Histograms on ATTR1 and ATTR2 for representing skew properly
exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2')

-- In addition histograms / extended stats on (ATTR1, ATTR2) for covering both skew and correlation
exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2, (attr1, attr2)')

-- Assuming a query expression TRUNC(ATTR1) = x and TRUNC(ATTR2) = x create a histogram on each expression separately for representing skew properly
exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2, (trunc(attr1)), (trunc(attr2))')

-- In addition histogram / extended stats on (TRUNC(ATTR1), TRUNC(ATTR2)) for covering both skew and correlation on the expressions
-- Officially this is isn't allowed and errors out up to and including 12.1.0.2
-- ORA-20001: Error when processing extension -  missing right parenthesis
-- Starting with 12.2.0.1 this is successful, but documentation still lists this as restriction, including 19c
-- But: It's only supported when there is a corresponding virtual column already existing for each expression used - so Oracle can map this expression back to a virtual column
-- Remove the previous step and it will error out:
-- ORA-20001: Invalid Extension: Column group can contain only columns seperated by comma
exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2, ((trunc(attr1)), (trunc(attr2)))')

-- Although it doesn't look like officially supported / possibly just a side effect of some bug fix, the optimizer picks up the information properly
-- Pre 12.2.0.1 don't recognize the correlation and show approx. 800K rows as estimate
-- 12.2.0.1 and later with the column group on the expression in place show approx. 900K rows as estimate which is (almost) correct
explain plan for
select
        count(*)
from
        t1 a
where
        trunc(attr1) = 1
and     trunc(attr2) = 1;

select * from table(dbms_xplan.display(format => 'TYPICAL'));

-- But: Explicitly referencing a virtual column doesn't work
-- This will error out:
-- ORA-20001: Error when processing extension -  virtual column is referenced in a column expression

exec dbms_stats.drop_extended_stats(null, 't1', '(trunc(attr1))')

exec dbms_stats.drop_extended_stats(null, 't1', '(trunc(attr2))')

-- Remove this and the extended stats on the virtual columns below will work in 12.2.0.1 and later (because the extension still exists from above call)
exec dbms_stats.drop_extended_stats(null, 't1', '((trunc(attr1)), (trunc(attr2)))')


alter table t1 add (trunc_attr1 as (trunc(attr1)));

alter table t1 add (trunc_attr2 as (trunc(attr2)));

-- This works and is supported in all versions supporting virtual columns
exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2, trunc_attr1, trunc_attr2')

-- This errors out: ORA-20001: Error when processing extension -  virtual column is referenced in a column expression
-- even in 12.2.0.1 and later
-- But: Works in 12.2.0.1 and later if the call to
--   dbms_stats.drop_extended_stats(null, 't1', '((trunc(attr1)), (trunc(attr2)))')
-- above is removed, because the extension then already exists (!)
exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2, (trunc_attr1, trunc_attr2)')

-- But: This works again from 12.2.0.1 on, not explicitly referencing the virtual columns but using the expressions covered by the virtual columns instead
exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2, ((trunc(attr1)), (trunc(attr2)))')

And here is the output I get when using 12.1.0.2:

SQL*Plus: Release 11.2.0.1.0 Production on Fri Jan 10 12:39:00 2020

Copyright (c) 1982, 2010, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options


USERNAME             INST_NAME    HOST_NAME                 SID   SERIAL#  VERSION    STARTED  SPID            OPID  CPID            SADDR            PADDR
-------------------- ------------ ------------------------- ----- -------- ---------- -------- --------------- ----- --------------- ---------------- ----------------
CBO_TEST             orcl121      DELLXPS13                 368   46472    12.1.0.2.0 20200110 6908            59    15536:4996      00007FFA110E9B88 00007FFA12B6F1E8



SQL>
SQL> alter session set nls_language = american;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> drop table t1;

Table dropped.

Elapsed: 00:00:00.02
SQL>
SQL> purge table t1;

Table purged.

Elapsed: 00:00:00.06
SQL>
SQL> -- Initialize the random generator for "reproducible" pseudo-randomness
SQL> exec dbms_random.seed(0)

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.00
SQL>
SQL> -- ATTR1 and ATTR2 are both skewed and correlated
SQL> create table t1
  2  as
  3  select
  4          rownum as id
  5        , trunc(dbms_random.value(1, 1000000000000)) as fk
  6        , case when rownum <= 900000 then 1 else ceil(rownum / 10) end as attr1
  7        , case when rownum <= 900000 then 1 else ceil(rownum / 10) end as attr2
  8        , rpad('x', 100) as filler
  9  from
 10          dual
 11  connect by
 12          level <= 1000000
 13  ;

Table created.

Elapsed: 00:00:13.07
SQL>
SQL> -- Histograms on ATTR1 and ATTR2 for representing skew properly
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2')

PL/SQL procedure successfully completed.

Elapsed: 00:00:01.43
SQL>
SQL> -- In addition histograms / extended stats on (ATTR1, ATTR2) for covering both skew and correlation
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2, (attr1, attr2)')

PL/SQL procedure successfully completed.

Elapsed: 00:00:02.08
SQL>
SQL> -- Assuming a query expression TRUNC(ATTR1) = x and TRUNC(ATTR2) = x create a histogram on each expression separately for representing skew properly
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2, (trunc(attr1)), (trunc(attr2))')

PL/SQL procedure successfully completed.

Elapsed: 00:00:02.51
SQL>
SQL> -- In addition histogram / extended stats on (TRUNC(ATTR1), TRUNC(ATTR2)) for covering both skew and correlation on the expressions
SQL> -- Officially this is isn't allowed and errors out up to and including 12.1.0.2
SQL> -- ORA-20001: Error when processing extension -  missing right parenthesis
SQL> -- Starting with 12.2.0.1 this is successful, but documentation still lists this as restriction, including 19c
SQL> -- But: It's only supported when there is a corresponding virtual column already existing for each expression used - so Oracle can map this expression back to a virtual column
SQL> -- Remove the previous step and it will error out:
SQL> -- ORA-20001: Invalid Extension: Column group can contain only columns seperated by comma
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2, ((trunc(attr1)), (trunc(attr2)))')
BEGIN dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2, ((trunc(attr1)), (trunc(attr2)))'); END;

*
ERROR at line 1:
ORA-20001: Error when processing extension -  missing right parenthesis


Elapsed: 00:00:00.04
SQL>
SQL> -- Although it doesn't look like officially supported / possibly just a side effect of some bug fix, the optimizer picks up the information properly
SQL> -- Pre 12.2.0.1 don't recognize the correlation and show approx. 800K rows as estimate
SQL> -- 12.2.0.1 and later with the column group on the expression in place show approx. 900K rows as estimate which is (almost) correct
SQL> explain plan for
  2  select
  3          count(*)
  4  from
  5          t1 a
  6  where
  7          trunc(attr1) = 1
  8  and     trunc(attr2) = 1;

Explained.

Elapsed: 00:00:00.02
SQL>
SQL> select * from table(dbms_xplan.display(format => 'TYPICAL'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     8 |  4777   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     8 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |   807K|  6305K|  4777   (1)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TRUNC("ATTR1")=1 AND TRUNC("ATTR2")=1)

14 rows selected.

Elapsed: 00:00:00.11
SQL>
SQL> -- But: Explicitly referencing a virtual column doesn't work
SQL> -- This will error out:
SQL> -- ORA-20001: Error when processing extension -  virtual column is referenced in a column expression
SQL>
SQL> exec dbms_stats.drop_extended_stats(null, 't1', '(trunc(attr1))')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.13
SQL>
SQL> exec dbms_stats.drop_extended_stats(null, 't1', '(trunc(attr2))')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.09
SQL>
SQL> -- Remove this and the extended stats on the virtual columns below will work in 12.2.0.1 and later (because the extension still exists from above call)
SQL> exec dbms_stats.drop_extended_stats(null, 't1', '((trunc(attr1)), (trunc(attr2)))')
BEGIN dbms_stats.drop_extended_stats(null, 't1', '((trunc(attr1)), (trunc(attr2)))'); END;

*
ERROR at line 1:
ORA-20001: Error when processing extension -  missing right parenthesis


Elapsed: 00:00:00.02
SQL>
SQL>
SQL> alter table t1 add (trunc_attr1 as (trunc(attr1)));

Table altered.

Elapsed: 00:00:00.01
SQL>
SQL> alter table t1 add (trunc_attr2 as (trunc(attr2)));

Table altered.

Elapsed: 00:00:00.01
SQL>
SQL> -- This works and is supported in all versions supporting virtual columns
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2, trunc_attr1, trunc_attr2')

PL/SQL procedure successfully completed.

Elapsed: 00:00:02.57
SQL>
SQL> -- This errors out: ORA-20001: Error when processing extension -  virtual column is referenced in a column expression
SQL> -- even in 12.2.0.1 and later
SQL> -- But: Works in 12.2.0.1 and later if the call to
SQL> --   dbms_stats.drop_extended_stats(null, 't1', '((trunc(attr1)), (trunc(attr2)))')
SQL> -- above is removed, because the extension then already exists (!)
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2, (trunc_attr1, trunc_attr2)')
BEGIN dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2, (trunc_attr1, trunc_attr2)'); END;

*
ERROR at line 1:
ORA-20001: Error when processing extension -  virtual column is referenced in a column expression
ORA-06512: at "SYS.DBMS_STATS", line 34634
ORA-06512: at line 1


Elapsed: 00:00:00.06
SQL>
SQL> -- But: This works again from 12.2.0.1 on, not explicitly referencing the virtual columns but using the expressions covered by the virtual columns instead
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2, ((trunc(attr1)), (trunc(attr2)))')
BEGIN dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2, ((trunc(attr1)), (trunc(attr2)))'); END;

*
ERROR at line 1:
ORA-20001: Error when processing extension -  missing right parenthesis


Elapsed: 00:00:00.04
SQL>

And that is what I get from 12.2.0.1 on, here using 19.3:

SQL*Plus: Release 11.2.0.1.0 Production on Fri Jan 10 12:35:18 2020

Copyright (c) 1982, 2010, Oracle.  All rights reserved.


Verbunden mit:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production


USERNAME             INST_NAME    HOST_NAME                 SID   SERIAL#  VERSION    STARTED  SPID            OPID  CPID            SADDR            PADDR
-------------------- ------------ ------------------------- ----- -------- ---------- -------- --------------- ----- --------------- ---------------- ----------------
CBO_TEST             orcl19c      DELLXPS13                 146   48961    19.0.0.0.0 20200110 5648            53    7260:13644      00007FF91687B3D8 00007FF91656B858



SQL>
SQL> alter session set nls_language = american;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> drop table t1;

Table dropped.

Elapsed: 00:00:00.02
SQL>
SQL> purge table t1;

Table purged.

Elapsed: 00:00:00.10
SQL>
SQL> -- Initialize the random generator for "reproducible" pseudo-randomness
SQL> exec dbms_random.seed(0)

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.00
SQL>
SQL> -- ATTR1 and ATTR2 are both skewed and correlated
SQL> create table t1
  2  as
  3  select
  4          rownum as id
  5        , trunc(dbms_random.value(1, 1000000000000)) as fk
  6        , case when rownum <= 900000 then 1 else ceil(rownum / 10) end as attr1
  7        , case when rownum <= 900000 then 1 else ceil(rownum / 10) end as attr2
  8        , rpad('x', 100) as filler
  9  from
 10          dual
 11  connect by
 12          level <= 1000000
 13  ;

Table created.

Elapsed: 00:00:11.89
SQL>
SQL> -- Histograms on ATTR1 and ATTR2 for representing skew properly
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2')

PL/SQL procedure successfully completed.

Elapsed: 00:00:01.68
SQL>
SQL> -- In addition histograms / extended stats on (ATTR1, ATTR2) for covering both skew and correlation
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2, (attr1, attr2)')

PL/SQL procedure successfully completed.

Elapsed: 00:00:02.75
SQL>
SQL> -- Assuming a query expression TRUNC(ATTR1) = x and TRUNC(ATTR2) = x create a histogram on each expression separately for representing skew properly
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2, (trunc(attr1)), (trunc(attr2))')

PL/SQL procedure successfully completed.

Elapsed: 00:00:03.43
SQL>
SQL> -- In addition histogram / extended stats on (TRUNC(ATTR1), TRUNC(ATTR2)) for covering both skew and correlation on the expressions
SQL> -- Officially this is isn't allowed and errors out up to and including 12.1.0.2
SQL> -- ORA-20001: Error when processing extension -  missing right parenthesis
SQL> -- Starting with 12.2.0.1 this is successful, but documentation still lists this as restriction, including 19c
SQL> -- But: It's only supported when there is a corresponding virtual column already existing for each expression used - so Oracle can map this expression back to a virtual column
SQL> -- Remove the previous step and it will error out:
SQL> -- ORA-20001: Invalid Extension: Column group can contain only columns seperated by comma
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2, ((trunc(attr1)), (trunc(attr2)))')

PL/SQL procedure successfully completed.

Elapsed: 00:00:05.24
SQL>
SQL> -- Although it doesn't look like officially supported / possibly just a side effect of some bug fix, the optimizer picks up the information properly
SQL> -- Pre 12.2.0.1 don't recognize the correlation and show approx. 800K rows as estimate
SQL> -- 12.2.0.1 and later with the column group on the expression in place show approx. 900K rows as estimate which is (almost) correct
SQL> explain plan for
  2  select
  3          count(*)
  4  from
  5          t1 a
  6  where
  7          trunc(attr1) = 1
  8  and     trunc(attr2) = 1;

Explained.

Elapsed: 00:00:00.01
SQL>
SQL> select * from table(dbms_xplan.display(format => 'TYPICAL'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     8 |  4797   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     8 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |   893K|  6977K|  4797   (1)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TRUNC("ATTR1")=1 AND TRUNC("ATTR2")=1)

14 rows selected.

Elapsed: 00:00:00.10
SQL>
SQL> -- But: Explicitly referencing a virtual column doesn't work
SQL> -- This will error out:
SQL> -- ORA-20001: Error when processing extension -  virtual column is referenced in a column expression
SQL>
SQL> exec dbms_stats.drop_extended_stats(null, 't1', '(trunc(attr1))')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.33
SQL>
SQL> exec dbms_stats.drop_extended_stats(null, 't1', '(trunc(attr2))')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.33
SQL>
SQL> -- Remove this and the extended stats on the virtual columns below will work in 12.2.0.1 and later (because the extension still exists from above call)
SQL> exec dbms_stats.drop_extended_stats(null, 't1', '((trunc(attr1)), (trunc(attr2)))')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.11
SQL>
SQL>
SQL> alter table t1 add (trunc_attr1 as (trunc(attr1)));

Table altered.

Elapsed: 00:00:00.01
SQL>
SQL> alter table t1 add (trunc_attr2 as (trunc(attr2)));

Table altered.

Elapsed: 00:00:00.01
SQL>
SQL> -- This works and is supported in all versions supporting virtual columns
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2, trunc_attr1, trunc_attr2')

PL/SQL procedure successfully completed.

Elapsed: 00:00:03.49
SQL>
SQL> -- This errors out: ORA-20001: Error when processing extension -  virtual column is referenced in a column expression
SQL> -- even in 12.2.0.1 and later
SQL> -- But: Works in 12.2.0.1 and later if the call to
SQL> --   dbms_stats.drop_extended_stats(null, 't1', '((trunc(attr1)), (trunc(attr2)))')
SQL> -- above is removed, because the extension then already exists (!)
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2, (trunc_attr1, trunc_attr2)')
BEGIN dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2, (trunc_attr1, trunc_attr2)'); END;

*
ERROR at line 1:
ORA-20001: Error when processing extension -  virtual column is referenced in a column expression
ORA-06512: at "SYS.DBMS_STATS", line 40751
ORA-06512: at "SYS.DBMS_STATS", line 40035
ORA-06512: at "SYS.DBMS_STATS", line 38912
ORA-06512: at "SYS.DBMS_STATS", line 15726
ORA-06512: at "SYS.DBMS_STATS", line 22064
ORA-06512: at "SYS.DBMS_STATS", line 22162
ORA-06512: at "SYS.DBMS_STATS", line 22232
ORA-06512: at "SYS.DBMS_STATS", line 22864
ORA-06512: at "SYS.DBMS_STATS", line 38313
ORA-06512: at "SYS.DBMS_STATS", line 39738
ORA-06512: at "SYS.DBMS_STATS", line 40183
ORA-06512: at "SYS.DBMS_STATS", line 40732
ORA-06512: at line 1


Elapsed: 00:00:00.09
SQL>
SQL> -- But: This works again from 12.2.0.1 on, not explicitly referencing the virtual columns but using the expressions covered by the virtual columns instead
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1 for columns size 254 attr1, attr2, ((trunc(attr1)), (trunc(attr2)))')

PL/SQL procedure successfully completed.

Elapsed: 00:00:05.32
SQL>

So, from 12.2.0.1 it looks like extended statistics on virtual columns are supported to some degree, when following a certain sequence of operation - first creating extended statistics on each of the expressions used, which creates corresponding virtual columns under the cover, and afterwards creating extended statistics using those expressions as part of the column group expression.

Strange enough, when explicitly creating virtual columns for those expressions creating a column group explicitly referencing those virtual columns doesn't work - but using the expressions covered by the virtual columns instead works even with those virtual columns created explicitly.