Monday, January 14, 2008

Subquery factoring and cost based query transformation (CBQT)

Oracle 9i introduced the "subquery factoring" feature (the WITH clause) and also offers an optimizer feature called "complex view merging". Subquery factoring allows the database to optionally "materialize" the result of a WITH clause as a global temporary table (GTT), which can be beneficial if the WITH clause is referenced more than once in the main statement.
Complex views are view definitions or inline views that contain e.g. a distinct or group by clause.
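
Whether the result of a factored subquery gets materialized or is treated like an ordinary inline view is an optimizer decision that can be influenced via the undocumented MATERIALIZE and INLINE hints. A minimal sketch, using the test table created further below (the hints are undocumented and shown for illustration only):

-- referencing "a" twice makes instantiation as a temporary table attractive;
-- the undocumented materialize hint (inline does the opposite) forces the decision
with a as (
select /*+ materialize */ skewed_data, count(*) as cnt
from complex_view_merging_test
group by skewed_data
)
select * from a
union all
select * from a;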

Oracle 10g additionally introduced the "cost based query transformation" (CBQT) feature, which allows Oracle to work out whether e.g. applying complex view merging is beneficial, by comparing the cost of the transformed query with the cost of the original one. 9i, in contrast, always applies complex view merging if possible, even if the transformed query is not superior to the original one in terms of cost.
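
As an illustration of complex view merging, consider the following pair of queries against the classic EMP and DEPT demo tables; the optimizer can transform the first form into the second, performing the join before the group by:

-- unmerged: group by inside the inline view, join afterwards
select d.dname, v.cnt
from (select deptno, count(*) as cnt
from emp
group by deptno) v, dept d
where v.deptno = d.deptno;

-- merged equivalent: join first, then group by
select d.dname, count(*) as cnt
from emp e, dept d
where e.deptno = d.deptno
group by d.dname, e.deptno;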

Given the potential benefit of subquery factoring, one might come to the conclusion that it's a good idea to use it as often as possible, since in theory it should only offer advantages: in case the instantiation of the query result is not used, the plan should be no worse than the one generated for traditional views or inline views.

Unfortunately test cases show that the usage of subquery factoring actually disables certain options otherwise available to the optimizer; in particular the "cost based query transformation" feature seems to be skipped if subquery factoring is used.

This is a pity, as it means that you have to be careful with the usage of subquery factoring: rather than potentially improving performance, it might actually make things worse under certain circumstances.

This still holds true even for the current Oracle release 11g; it looks like things haven't changed there.

The following test case, run on Oracle 11.1.0.6.0 Windows 32-bit (default database installation), shows the different execution plans generated by the optimizer for two equivalent statements, one using traditional inline views and one using subquery factoring.

Tests on 10.2.0.3 Windows 32-Bit (default database installation) have shown similar results.

The test case setup uses two generic tables populated from ALL_OBJECTS. In a default 11g installation the first table contains approximately 50,000 records, the second one just 9 records.

create table complex_view_merging_test as
select
trunc(sqrt(rownum-1)) as skewed_data,
rownum-1 as id,
lpad(rownum-1,10) id_char,
rpad('x',50, 'x') as filler
from
all_objects;

create table complex_view_merging_test2 as
select * from complex_view_merging_test
where skewed_data between 0 and 2;

begin
dbms_stats.gather_table_stats(
ownname=>USER,
tabname=>'complex_view_merging_test',
method_opt=>'FOR ALL COLUMNS SIZE 1');

dbms_stats.gather_table_stats(
ownname=>USER,
tabname=>'complex_view_merging_test2',
method_opt=>'FOR ALL COLUMNS SIZE 1');

end;
/

The tables are analyzed using simple DBMS_STATS options, even omitting histograms in order to keep things as simple as possible, although using histograms wouldn't change the outcome in this case.
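
Since SKEWED_DATA is generated as TRUNC(SQRT(ROWNUM - 1)), the value n occurs 2n + 1 times, so the column is heavily skewed. The distribution and the size of the second table can be verified like this:

select skewed_data, count(*)
from complex_view_merging_test
group by skewed_data
order by skewed_data;

select count(*)
from complex_view_merging_test2;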

This is the query using traditional inline views:

select * from (
select skewed_data, id, count(*) as cnt
from complex_view_merging_test
group by skewed_data, id
) a, (
select * from complex_view_merging_test2
) b
where a.id = b.id;
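
The execution plans shown below can be obtained e.g. via EXPLAIN PLAN and DBMS_XPLAN (just a sketch; the formatting options can be varied as desired):

explain plan for
select * from (
select skewed_data, id, count(*) as cnt
from complex_view_merging_test
group by skewed_data, id
) a, (
select * from complex_view_merging_test2
) b
where a.id = b.id;

select * from table(dbms_xplan.display);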

Since the second table contains just a fraction of the rows of the first one, it is rather obvious that it is probably better to join the two tables first, in order to eliminate most of the rows, and to apply the group by to the greatly reduced result set afterwards. This is what the "complex view merging" feature can achieve, and the cost of the merged statement should reflect this: it should be less than the cost of the unmerged statement, which first applies the group by to the larger table and performs the join afterwards, since the group by in this case does not reduce the result set very much.
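
For illustration, this is roughly what the merged query block looks like when written by hand (the actual query text produced by the transformation may differ):

-- join first, then group by; the columns of the second table
-- are pulled into the group by clause
select t.skewed_data, t.id, count(*) as cnt,
b.skewed_data, b.id, b.id_char, b.filler
from complex_view_merging_test t,
complex_view_merging_test2 b
where t.id = b.id
group by t.skewed_data, t.id,
b.skewed_data, b.id, b.id_char, b.filler;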

As a side note it's worth mentioning that when using the parallel query feature things might be quite different, in case a HASH distribution is used by the "Hash Join" operation and the join key is very skewed. This results in a skewed distribution of the data to the parallel query slaves, which means that most of the work will be done by only a few of the slaves, rendering the parallel operation rather inefficient. In this particular case it could be more advantageous to reduce the skew first by applying the grouping before joining the resulting set, as the sketch following this paragraph illustrates.
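
The distribution method of a parallel hash join can be influenced with the documented PQ_DISTRIBUTE hint; the following is merely a sketch, with a made-up degree of parallelism:

-- hash/hash distributes both row sources on the join key: with a very
-- skewed key most rows end up in a few slaves; broadcasting the small
-- table instead (pq_distribute(b none broadcast)) avoids this
select /*+ parallel(t 4) parallel(b 4)
use_hash(b) pq_distribute(b hash hash) */
t.skewed_data, t.id, b.id
from complex_view_merging_test t,
complex_view_merging_test2 b
where t.id = b.id;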

The plan generated looks like this:

Plan hash value: 3999694253

--------------------------------------------------------------------------------------------------
| Id  | Operation           | Name                       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                            |     9 |   684 |   159   (2)| 00:00:02 |
|   1 |  HASH GROUP BY      |                            |     9 |   684 |   159   (2)| 00:00:02 |
|*  2 |   HASH JOIN         |                            |     9 |   684 |   158   (1)| 00:00:02 |
|   3 |    TABLE ACCESS FULL| COMPLEX_VIEW_MERGING_TEST2 |     9 |   603 |     2   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| COMPLEX_VIEW_MERGING_TEST  | 53755 |   472K|   156   (1)| 00:00:02 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID"="COMPLEX_VIEW_MERGING_TEST2"."ID")

It looks like the optimizer actually has come to the conclusion that merging the view, i.e. performing the join first and the group by afterwards, is the most efficient approach.

This is the same statement, but using subquery factoring:

with a as (
select skewed_data, id, count(*) as cnt
from complex_view_merging_test
group by skewed_data, id
), b as (
select * from complex_view_merging_test2
)
select * from a, b
where a.id = b.id;

The generated execution plan looks like this:

Plan hash value: 1393518913

-----------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name                       | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                            |     9 |   954 |       |   367   (2)| 00:00:05 |
|*  1 |  HASH JOIN           |                            |     9 |   954 |       |   367   (2)| 00:00:05 |
|   2 |   TABLE ACCESS FULL  | COMPLEX_VIEW_MERGING_TEST2 |     9 |   603 |       |     2   (0)| 00:00:01 |
|   3 |   VIEW               |                            | 53755 |  2047K|       |   364   (1)| 00:00:05 |
|   4 |    HASH GROUP BY     |                            | 53755 |   472K|  2120K|   364   (1)| 00:00:05 |
|   5 |     TABLE ACCESS FULL| COMPLEX_VIEW_MERGING_TEST  | 53755 |   472K|       |   156   (1)| 00:00:02 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"."ID"="COMPLEX_VIEW_MERGING_TEST2"."ID")

The generated plan is clearly more expensive than the one generated for the statement without subquery factoring (cost 367 vs. 159). In order to get the same plan as before, the "merge" hint can be used to force the view merging transformation to take place:

with a as (
select /*+ merge */ skewed_data, id, count(*) as cnt
from complex_view_merging_test
group by skewed_data, id
), b as (
select * from complex_view_merging_test2
)
select * from a, b
where a.id = b.id;

This results in the cheaper plan seen before, but the optimizer does not come up with this plan on its own. The reason can be found in the 10053 optimizer trace.
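
The trace can be enabled for the current session via the 10053 event before parsing the statement in question, for example:

alter session set events '10053 trace name context forever, level 1';

-- explain plan for / parse the statement in question here

alter session set events '10053 trace name context off';

Examining the trace of the subquery factoring statement, the following output can be found multiple times: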

Considering Query Transformations on query block (#0)
**************************
Query transformations (QT)
**************************
CBQT: copy not possible on query block (#0) because linked to with clause
CBQT bypassed for query block (#0): Cannot copy query block.
CBQT: Validity checks failed for 6phrc6zz5g12p.

Looking at the trace of the first statement, in contrast, the following can be found:

Considering Query Transformations on query block (#0)
**************************
Query transformations (QT)
**************************
CBQT: Validity checks passed for 24kj5u4mxr19r.
.
.
.
CVM: CBQT Marking query block as valid for CVM.
CVM: Merging complex view (#2) into (#1).
.
.
.

So this seems to confirm the assumption that the usage of subquery factoring effectively disables some other features of the optimizer, in this particular case the complex view merging part of the cost based query transformation.

As of now you probably have to test manually which of the two approaches offers the better performance. It is advisable not to use the subquery factoring option blindly, especially if the query contains views that could be merged.