Sunday, January 16, 2011

Hash Aggregation

Oracle introduced in Oracle 10g the hash aggregation as a new feature. It can be used for both GROUP BY and UNIQUE operations (HASH GROUP BY and HASH UNIQUE respectively) and is by default the preferred aggregation method if there is no particular reason that lets the cost based optimizer prefer the sort based aggregation (SORT GROUP BY and SORT UNIQUE), for example if the GROUP BY is followed by an ORDER BY on the same expression (using the SORT GROUP BY in such cases is not always beneficial by the way, see Guy Harrison's blog for an example).

Ever since its introduction from time to time I've heard complaints about performance degradations of aggregation operations that are based on the new hash aggregation algorithm compared to the previously used sort based aggregations.

Now there may be many potential reasons for such performance degradations (and possibly many of them might not have anything to do with the hash aggregation) but here is a surprising revelation that might explain why some of them were indeed caused by the switch to the new algorithm: The hash aggregation operation does not work very well together with the automatic PGA management (WORKAREA_SIZE_POLICY = AUTO, default since 9i). The fundamental defect is that it is not able to dynamically resize to a larger workarea size when using automatic PGA management and therefore remains more or less at its initial expected size based on the estimates at optimization time.

This effectively means that the efficiency of the hash aggregation operation when using automatic PGA management is heavily dependant on the cardinality estimates at optimization time - in case of estimates in the right ballpark, the memory used at execution time will correspond to the actual requirements at runtime, but in case of bad estimates, the operation potentially uses far less memory than available and unnecessarily spills to disk.

Let's start with a simple script to demonstrate the issue:


set echo on timing on

show parameter pga
show parameter processes

-- alter session set workarea_size_policy = manual sort_area_size = 40000000;

drop table t1 purge;

drop table t2 purge;

create table t1
as
select mod(rownum, 27) as user_id, rownum as object_id from dual connect by level <= 30000;

create table t2
as
select distinct user_id from t1;

exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1')

exec dbms_stats.gather_table_stats(null, 't2', method_opt => 'for all columns size 1')

column low_val new_value low_value
column high_val new_value high_value

select
max(user_id) + 1 as low_val
, max(user_id) + max(user_id) - min(user_id) + 1 as high_val
from
t2;

-- Invalidate any cursors using T1
comment on table t1 is '';

alter session set statistics_level = all;

-- alter session set tracefile_identifier = 'smm_trace_correct_cardinality';

-- alter session set "_smm_trace" = 65535;

select /*+ full(@inner t1)
full(@inner t2)
full(@inner t3)
leading(@inner t2 t3 t1)
use_nl(@inner t3)
use_hash(@inner t1)
no_swap_join_inputs(@inner t1)
use_hash_aggregation(@inner)
*/
max(cnt)
from
(
select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
from t1, t2, t2 t3
where t1.user_id = t2.user_id
group by t1.object_id, t3.user_id
)
;

-- alter session set "_smm_trace" = 0;

set pagesize 0 linesize 200 trimspool on tab off

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

set pagesize 14

-- Fudge the statistics so that the join cardinality between t1 and t2 will be 0 (rounded up to 1)
declare
srec dbms_stats.statrec;
novals dbms_stats.numarray;
distcnt number;
avgclen number;
nullcnt number;
density number;
srec2 dbms_stats.statrec;
begin
dbms_stats.get_column_stats(null, 't1', 'user_id', distcnt => distcnt, avgclen => avgclen, nullcnt => nullcnt, density => density, srec => srec);
srec2.epc := 2;
novals := dbms_stats.numarray(
&low_value,
&high_value
);
srec2.bkvals := null;
dbms_stats.prepare_column_values(srec2,novals);
dbms_stats.set_column_stats(
ownname=>null,
tabname=>'t1',
colname=>'user_id',
distcnt=>distcnt,
nullcnt=>nullcnt,
srec=>srec2,
avgclen=>avgclen,
density=>density
);
end;
/

-- Invalidate any cursors using T1
comment on table t1 is '';

alter session set statistics_level = all;

-- alter session set tracefile_identifier = 'smm_trace_wrong_cardinality';

-- alter session set "_smm_trace" = 65535;

select /*+ full(@inner t1)
full(@inner t2)
full(@inner t3)
leading(@inner t2 t3 t1)
use_nl(@inner t3)
use_hash(@inner t1)
no_swap_join_inputs(@inner t1)
use_hash_aggregation(@inner)
*/
max(cnt)
from
(
select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
from t1, t2, t2 t3
where t1.user_id = t2.user_id
group by t1.object_id, t3.user_id
)
;

-- alter session set "_smm_trace" = 0;

set pagesize 0 linesize 200 trimspool on tab off

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));



I first start with showing the current PGA_AGGREGATE_TARGET (P_A_T) and PROCESSES setting.

I create then two tables as sample data that will be joined and already collect information about the minimum and maximum value of the join column.

Any cursors that are dependent on table T1 will then be invalidated to make sure that a reoptimization of the next query will take place (of course running the complete script invalidates such cursors anyway due to the drop and re-create of the tables).

I then run a simple join followed by a group by operation. I've added a cartesian join by the way, but its only purpose is to ensure that the generated row source is sufficiently large to give the group by something to do.

I have used hints to ensure that I always get the same execution plan even if I later on manipulate the statistics to see the effect of incorrect cardinality estimates.

After getting the actual runtime execution plan along with execution statistics I modify the column statistics of one of the tables' join column in such a way that the optimizer thinks that there is no overlap between the join column values and therefore computes a very low join cardinality.

After making sure again that a reoptimization will take place I run the same statement again with the same data volume and the same hints in place.

And this is what I get for all major versions that are currently out there (10.2.0.4, 10.2.0.5, 11.1.0.7, 11.2.0.1 and 11.2.0.2):


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> show parameter pga

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 200M
SQL> show parameter processes

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
aq_tm_processes integer 0
db_writer_processes integer 1
gcs_server_processes integer 0
job_queue_processes integer 10
log_archive_max_processes integer 2
processes integer 150
SQL>
SQL> -- alter session set workarea_size_policy = manual sort_area_size = 40000000;
SQL>
SQL> drop table t1 purge;
drop table t1 purge
*
ERROR at line 1:
ORA-00942: table or view does not exist


Elapsed: 00:00:00.01
SQL>
SQL> drop table t2 purge;
drop table t2 purge
*
ERROR at line 1:
ORA-00942: table or view does not exist


Elapsed: 00:00:00.00
SQL>
SQL> create table t1
2 as
3 select mod(rownum, 27) as user_id, rownum as object_id from dual connect by level <= 30000;

Table created.

Elapsed: 00:00:00.07
SQL>
SQL> create table t2
2 as
3 select distinct user_id from t1;

Table created.

Elapsed: 00:00:00.06
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.09
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't2', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.04
SQL>
SQL> column low_val new_value low_value
SQL> column high_val new_value high_value
SQL>
SQL> select
2 max(user_id) + 1 as low_val
3 , max(user_id) + max(user_id) - min(user_id) + 1 as high_val
4 from
5 t2;

LOW_VAL HIGH_VAL
---------- ----------
27 53

Elapsed: 00:00:00.00
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.04
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> -- alter session set tracefile_identifier = 'smm_trace_correct_cardinality';
SQL>
SQL> -- alter session set "_smm_trace" = 65535;
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;

MAX(CNT)
----------
1

Elapsed: 00:00:00.71
SQL>
SQL> -- alter session set "_smm_trace" = 0;
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 2s1sdfhvhajv3, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner
t2 t3 t1) use_nl(@inner t3) use_hash(@inner t1) no_swap_join_inputs(@inner
t1) use_hash_aggregation(@inner) */ max(cnt) from ( select /*+ qb_name(inner) */ count(*)
as cnt, t1.object_id, t3.user_id from t1, t2, t2 t3 where t1.user_id = t2.user_id group by t1.object_id,
t3.user_id )

Plan hash value: 3134842094

--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.72 | 139 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.72 | 139 | | | |
| 2 | VIEW | | 1 | 570K| 810K|00:00:00.43 | 139 | | | |
| 3 | HASH GROUP BY | | 1 | 570K| 810K|00:00:00.43 | 139 | 35M| 5521K| 38M (0)|
|* 4 | HASH JOIN | | 1 | 807K| 810K|00:00:00.02 | 139 | 1348K| 1348K| 1093K (0)|
| 5 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | | | |
| 7 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | | | |
| 8 | TABLE ACCESS FULL | T1 | 1 | 29907 | 30000 |00:00:00.01 | 55 | | | |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("T1"."USER_ID"="T2"."USER_ID")


29 rows selected.

Elapsed: 00:00:00.12
SQL>
SQL> set pagesize 14
SQL>
SQL> -- Fudge the statistics so that the join cardinality between t1 and t2 will be 0 (rounded up to 1)
SQL> declare
2 srec dbms_stats.statrec;
3 novals dbms_stats.numarray;
4 distcnt number;
5 avgclen number;
6 nullcnt number;
7 density number;
8 srec2 dbms_stats.statrec;
9 begin
10 dbms_stats.get_column_stats(null, 't1', 'user_id', distcnt => distcnt, avgclen => avgclen, nullcnt => nullcnt, density => density, srec => srec);
11 srec2.epc := 2;
12 novals := dbms_stats.numarray(
13 &low_value,
14 &high_value
15 );
16 srec2.bkvals := null;
17 dbms_stats.prepare_column_values(srec2,novals);
18 dbms_stats.set_column_stats(
19 ownname=>null,
20 tabname=>'t1',
21 colname=>'user_id',
22 distcnt=>distcnt,
23 nullcnt=>nullcnt,
24 srec=>srec2,
25 avgclen=>avgclen,
26 density=>density
27 );
28 end;
29 /
old 13: &low_value,
new 13: 27,
old 14: &high_value
new 14: 53

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.03
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.03
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.01
SQL>
SQL> -- alter session set tracefile_identifier = 'smm_trace_wrong_cardinality';
SQL>
SQL> -- alter session set "_smm_trace" = 65535;
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;

MAX(CNT)
----------
1

Elapsed: 00:00:01.37
SQL>
SQL> -- alter session set "_smm_trace" = 0;
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 2s1sdfhvhajv3, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_hash(@inner t1) no_swap_join_inputs(@inner t1) use_hash_aggregation(@inner)
*/ max(cnt) from ( select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id from t1, t2, t2 t3 where t1.user_id =
t2.user_id group by t1.object_id, t3.user_id )

Plan hash value: 3134842094

------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp|
------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:01.37 | 139 | 2715 | 2715 | | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:01.37 | 139 | 2715 | 2715 | | | | |
| 2 | VIEW | | 1 | 1 | 810K|00:00:00.62 | 139 | 2715 | 2715 | | | | |
| 3 | HASH GROUP BY | | 1 | 1 | 810K|00:00:00.62 | 139 | 2715 | 2715 | 35M| 5521K| 7010K (1)| 23552 |
|* 4 | HASH JOIN | | 1 | 1 | 810K|00:00:00.01 | 139 | 0 | 0 | 1348K| 1348K| 1167K (0)| |
| 5 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | 0 | 0 | | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | 0 | 0 | | | | |
| 7 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | 0 | 0 | | | | |
| 8 | TABLE ACCESS FULL | T1 | 1 | 29907 | 30000 |00:00:00.01 | 55 | 0 | 0 | | | | |
------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("T1"."USER_ID"="T2"."USER_ID")


28 rows selected.

Elapsed: 00:00:00.03
SQL>


As it can be seen with an estimate in the right ballpark the HASH GROUP BY operation completes in memory (very close to the 20% PGA_AGGREGATE_TARGET maximum size of a single workarea from 10.2 on with automatic PGA management for PGA_AGGREGATE_TARGET < 500M, for more information see Joze Senegacnik's paper on the internals of automatic PGA management).

However repeating exactly the same operation with the fudged statistics it spills to disk and uses only 7M, although it could have used up to 40M (given that there is no concurrent workload).

The same does not happen when repeating this experiment with other operations that use a workarea - the most obvious one being a SORT GROUP BY, as can be seen from this output:


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> show parameter pga

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 200M
SQL> show parameter processes

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
aq_tm_processes integer 0
db_writer_processes integer 1
gcs_server_processes integer 0
job_queue_processes integer 10
log_archive_max_processes integer 2
processes integer 150
SQL>
SQL> drop table t1 purge;

Table dropped.

Elapsed: 00:00:00.04
SQL>
SQL> drop table t2 purge;

Table dropped.

Elapsed: 00:00:00.04
SQL>
SQL> create table t1
2 as
3 select mod(rownum, 15) as user_id, rownum as object_id from dual connect by level <= 30000;

Table created.

Elapsed: 00:00:00.06
SQL>
SQL> create table t2
2 as
3 select distinct user_id from t1;

Table created.

Elapsed: 00:00:00.03
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.09
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't2', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.04
SQL>
SQL> column low_val new_value low_value
SQL> column high_val new_value high_value
SQL>
SQL> select
2 max(user_id) + 1 as low_val
3 , max(user_id) + max(user_id) - min(user_id) + 1 as high_val
4 from
5 t2;

LOW_VAL HIGH_VAL
---------- ----------
15 29

Elapsed: 00:00:00.00
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> -- alter session set tracefile_identifier = 'smm_trace_correct_cardinality';
SQL>
SQL> -- alter session set "_smm_trace" = 65535;
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 no_use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;

MAX(CNT)
----------
1

Elapsed: 00:00:00.54
SQL>
SQL> -- alter session set "_smm_trace" = 0;
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 94tqnspjuzk8x, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner
t2 t3 t1) use_nl(@inner t3) use_hash(@inner t1) no_swap_join_inputs(@inner
t1) no_use_hash_aggregation(@inner) */ max(cnt) from ( select /*+ qb_name(inner) */
count(*) as cnt, t1.object_id, t3.user_id from t1, t2, t2 t3 where t1.user_id = t2.user_id group by
t1.object_id, t3.user_id )

Plan hash value: 2068941295

--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.53 | 103 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.53 | 103 | | | |
| 2 | VIEW | | 1 | 315K| 450K|00:00:00.37 | 103 | | | |
| 3 | SORT GROUP BY | | 1 | 315K| 450K|00:00:00.37 | 103 | 28M| 1913K| 25M (0)|
|* 4 | HASH JOIN | | 1 | 446K| 450K|00:00:00.01 | 103 | 1348K| 1348K| 1097K (0)|
| 5 | NESTED LOOPS | | 1 | 225 | 225 |00:00:00.01 | 48 | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 15 | 15 |00:00:00.01 | 3 | | | |
| 7 | TABLE ACCESS FULL| T2 | 15 | 15 | 225 |00:00:00.01 | 45 | | | |
| 8 | TABLE ACCESS FULL | T1 | 1 | 29787 | 30000 |00:00:00.01 | 55 | | | |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("T1"."USER_ID"="T2"."USER_ID")


29 rows selected.

Elapsed: 00:00:00.03
SQL>
SQL> set pagesize 14
SQL>
SQL> -- Fudge the statistics so that the join cardinality between t1 and t2 will be 0 (rounded up to 1)
SQL> declare
2 srec dbms_stats.statrec;
3 novals dbms_stats.numarray;
4 distcnt number;
5 avgclen number;
6 nullcnt number;
7 density number;
8 srec2 dbms_stats.statrec;
9 begin
10 dbms_stats.get_column_stats(null, 't1', 'user_id', distcnt => distcnt, avgclen => avgclen, nullcnt => nullcnt, density => density, srec => srec);
11 srec2.epc := 2;
12 novals := dbms_stats.numarray(
13 &low_value,
14 &high_value
15 );
16 srec2.bkvals := null;
17 dbms_stats.prepare_column_values(srec2,novals);
18 dbms_stats.set_column_stats(
19 ownname=>null,
20 tabname=>'t1',
21 colname=>'user_id',
22 distcnt=>distcnt,
23 nullcnt=>nullcnt,
24 srec=>srec2,
25 avgclen=>avgclen,
26 density=>density
27 );
28 end;
29 /
old 13: &low_value,
new 13: 15,
old 14: &high_value
new 14: 29

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.01
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> -- alter session set tracefile_identifier = 'smm_trace_wrong_cardinality';
SQL>
SQL> -- alter session set "_smm_trace" = 65535;
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 no_use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;

MAX(CNT)
----------
1

Elapsed: 00:00:00.56
SQL>
SQL> -- alter session set "_smm_trace" = 0;
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 94tqnspjuzk8x, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner
t2 t3 t1) use_nl(@inner t3) use_hash(@inner t1) no_swap_join_inputs(@inner
t1) no_use_hash_aggregation(@inner) */ max(cnt) from ( select /*+ qb_name(inner) */
count(*) as cnt, t1.object_id, t3.user_id from t1, t2, t2 t3 where t1.user_id = t2.user_id group by
t1.object_id, t3.user_id )

Plan hash value: 2068941295

--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.55 | 103 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.55 | 103 | | | |
| 2 | VIEW | | 1 | 1 | 450K|00:00:00.38 | 103 | | | |
| 3 | SORT GROUP BY | | 1 | 1 | 450K|00:00:00.38 | 103 | 28M| 1913K| 25M (0)|
|* 4 | HASH JOIN | | 1 | 1 | 450K|00:00:00.01 | 103 | 1348K| 1348K| 1053K (0)|
| 5 | NESTED LOOPS | | 1 | 225 | 225 |00:00:00.01 | 48 | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 15 | 15 |00:00:00.01 | 3 | | | |
| 7 | TABLE ACCESS FULL| T2 | 15 | 15 | 225 |00:00:00.01 | 45 | | | |
| 8 | TABLE ACCESS FULL | T1 | 1 | 29787 | 30000 |00:00:00.01 | 55 | | | |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("T1"."USER_ID"="T2"."USER_ID")


29 rows selected.

Elapsed: 00:00:00.03
SQL>


All I've done is to reduce the data set, because the SORT GROUP BY in this case required more memory for the same amount of data than the HASH GROUP BY, in order to prevent the operation to spill to disk with a 200M PGA_AGGREGATE_TARGET setting and change the USE_HASH_AGGREGATION hint to NO_USE_HASH_AGGREGATION.

As you can see, the operation completes both times in memory and uses the same amount of memory no matter what the estimates look like.

I've tested the serial execution of the most common workarea based operations like HASH JOIN, SORT ORDER BY, WINDOW SORT, SORT UNIQUE and all of them were able to dynamically resize the workarea in cases where the initial estimated size was too small.

If you carefully check then you'll notice that I haven't mentioned the HASH UNIQUE operation yet, and later on you'll see why.

A cunning feature of the automatic PGA management comes to help, however, which is a kind of "feedback loop" for the workareas based on statistics maintained by the automatic PGA memory management, and indeed when the HASH GROUP BY cursor based on the incorrect cardinality estimate gets shared (not invalidated) and re-executed, the next execution will look like this:


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_wrong_cardinality_repeated_execution';

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;

MAX(CNT)
----------
1

Elapsed: 00:00:00.76
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 2s1sdfhvhajv3, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_hash(@inner t1) no_swap_join_inputs(@inner t1)
use_hash_aggregation(@inner) */ max(cnt) from ( select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id,
t3.user_id from t1, t2, t2 t3 where t1.user_id = t2.user_id group by t1.object_id, t3.user_id )

Plan hash value: 3134842094

------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem | Used-Tmp|
------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.75 | 139 | | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.75 | 139 | | | | |
| 2 | VIEW | | 1 | 1 | 810K|00:00:00.44 | 139 | | | | |
| 3 | HASH GROUP BY | | 1 | 1 | 810K|00:00:00.44 | 139 | 35M| 5521K| 38M (0)| |
|* 4 | HASH JOIN | | 1 | 1 | 810K|00:00:00.01 | 139 | 1348K| 1348K| 1412K (0)| |
| 5 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | | | | |
| 7 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | | | | |
| 8 | TABLE ACCESS FULL | T1 | 1 | 29836 | 30000 |00:00:00.01 | 55 | | | | |
------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("T1"."USER_ID"="T2"."USER_ID")


28 rows selected.

Elapsed: 00:00:00.03
SQL>


So Oracle this time based the workarea memory requirements on the feedback from the previous execution and therefore allocated sufficient memory to complete the operation without spilling to disk.

Notice however that the feedback loop unfortunately does not work as desired when the workarea execution fails due to insufficient TEMP space. Ideally the feedback loop should allow the subsequent executions to grab more memory specifically in such cases, but at present it doesn't - the failure obviously prevents an update of the statistics and therefore subsequent executions continue to fail since they still use the same amount of memory.

You can test this by simply assigned a very small TEMP tablespace to the user executing the query so that the second execution fails due to insufficient TEMP space. If you repeat the execution in this case, it will fail again and keeps doing so.

What happens if I invalidate the cursor and repeat the execution?


SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.01
SQL> set echo on timing on
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_wrong_cardinality_repeated_exec_invalid';

Session altered.

Elapsed: 00:00:00.01
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;
1

Elapsed: 00:00:01.29
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 2s1sdfhvhajv3, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_hash(@inner t1) no_swap_join_inputs(@inner t1) use_hash_aggregation(@inner)
*/ max(cnt) from ( select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id from t1, t2, t2 t3 where t1.user_id =
t2.user_id group by t1.object_id, t3.user_id )

Plan hash value: 3134842094

------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp|
------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:01.28 | 139 | 2715 | 2715 | | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:01.28 | 139 | 2715 | 2715 | | | | |
| 2 | VIEW | | 1 | 1 | 810K|00:00:00.61 | 139 | 2715 | 2715 | | | | |
| 3 | HASH GROUP BY | | 1 | 1 | 810K|00:00:00.61 | 139 | 2715 | 2715 | 35M| 5521K| 7067K (1)| 23552 |
|* 4 | HASH JOIN | | 1 | 1 | 810K|00:00:00.01 | 139 | 0 | 0 | 1348K| 1348K| 1129K (0)| |
| 5 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | 0 | 0 | | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | 0 | 0 | | | | |
| 7 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | 0 | 0 | | | | |
| 8 | TABLE ACCESS FULL | T1 | 1 | 29836 | 30000 |00:00:00.01 | 55 | 0 | 0 | | | | |
------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("T1"."USER_ID"="T2"."USER_ID")


28 rows selected.

Elapsed: 00:00:00.03
SQL>


Back to square one - with the invalidation the statistics are gone, too, so the bad cardinality estimate again lead to the suboptimal execution of the HASH GROUP BY.

So you might think, why bother? Only the first execution of a cursor without workarea execution statistics will be affected by the problem, subsequent execution of the same cursor will benefit from the statistics from the previous executions.

The problem however is, that this is fine for applications that share cursors. Unfortunately applications that peform heavy duty aggregations like data warehouses typically do not share cursors, since they do not care about the optimization overhead and deliberately use literals to provide as much information to the optimizer as possible.

Also these heavy duty aggregations usually use Parallel Execution features, and as you'll see from the output of the same test case, if I run the HASH GROUP BY in parallel by simply setting table T1 to parallel degree 2, a similar problem occurs - so Parallel Execution is affected as well.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> show parameter pga

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 200M
SQL> show parameter processes

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
aq_tm_processes integer 0
db_writer_processes integer 1
gcs_server_processes integer 0
job_queue_processes integer 10
log_archive_max_processes integer 2
processes integer 150
SQL>
SQL> -- alter session set workarea_size_policy = manual sort_area_size = 40000000;
SQL>
SQL> drop table t1 purge;

Table dropped.

Elapsed: 00:00:00.01
SQL>
SQL> drop table t2 purge;

Table dropped.

Elapsed: 00:00:00.04
SQL>
SQL> create table t1
2 as
3 select mod(rownum, 27) as user_id, rownum as object_id from dual connect by level <= 30000;

Table created.

Elapsed: 00:00:00.06
SQL>
SQL> alter table t1 parallel 2;

Table altered.

Elapsed: 00:00:00.03
SQL>
SQL> create table t2
2 as
3 select distinct user_id from t1;

Table created.

Elapsed: 00:00:00.03
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.04
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't2', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.04
SQL>
SQL> column low_val new_value low_value
SQL> column high_val new_value high_value
SQL>
SQL> select
2 max(user_id) + 1 as low_val
3 , max(user_id) + max(user_id) - min(user_id) + 1 as high_val
4 from
5 t2;

LOW_VAL HIGH_VAL
---------- ----------
27 53

Elapsed: 00:00:00.00
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_correct_cardinality';

Session altered.

Elapsed: 00:00:00.01
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;

MAX(CNT)
----------
1

Elapsed: 00:00:00.42
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS'));
SQL_ID 2s1sdfhvhajv3, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_hash(@inner t1) no_swap_join_inputs(@inner t1)
use_hash_aggregation(@inner) */ max(cnt) from ( select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id,
t3.user_id from t1, t2, t2 t3 where t1.user_id = t2.user_id group by t1.object_id, t3.user_id )

Plan hash value: 464898991

--------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | O/1/M |
--------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.43 | 87 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.43 | 87 | | | |
| 2 | PX COORDINATOR | | 1 | | 2 |00:00:00.43 | 87 | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10002 | 0 | 1 | 0 |00:00:00.01 | 0 | | | |
| 4 | SORT AGGREGATE | | 2 | 1 | 2 |00:00:00.81 | 0 | | | |
| 5 | VIEW | | 2 | 575K| 810K|00:00:00.48 | 0 | | | |
| 6 | HASH GROUP BY | | 2 | 575K| 810K|00:00:00.48 | 0 | 35M| 5521K| 2/0/0|
| 7 | PX RECEIVE | | 2 | 814K| 810K|00:00:00.01 | 0 | | | |
| 8 | PX SEND HASH | :TQ10001 | 0 | 814K| 0 |00:00:00.01 | 0 | | | |
|* 9 | HASH JOIN | | 2 | 814K| 810K|00:00:00.01 | 89 | 1348K| 1348K| 2/0/0|
| 10 | BUFFER SORT | | 2 | | 1458 |00:00:00.01 | 0 | 29696 | 29696 | 2/0/0|
| 11 | PX RECEIVE | | 2 | 729 | 1458 |00:00:00.01 | 0 | | | |
| 12 | PX SEND BROADCAST | :TQ10000 | 0 | 729 | 0 |00:00:00.01 | 0 | | | |
| 13 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | | | |
| 14 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | | | |
| 15 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | | | |
| 16 | PX BLOCK ITERATOR | | 2 | 30164 | 30000 |00:00:00.01 | 89 | | | |
|* 17 | TABLE ACCESS FULL | T1 | 18 | 30164 | 30000 |00:00:00.01 | 89 | | | |
--------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

9 - access("T1"."USER_ID"="T2"."USER_ID")
17 - access(:Z>=:Z AND :Z<=:Z)


38 rows selected.

Elapsed: 00:00:00.03
SQL>
SQL> set pagesize 14
SQL>
SQL> -- Fudge the statistics so that the join cardinality between t1 and t2 will be 0 (rounded up to 1)
SQL> declare
2 srec dbms_stats.statrec;
3 novals dbms_stats.numarray;
4 distcnt number;
5 avgclen number;
6 nullcnt number;
7 density number;
8 srec2 dbms_stats.statrec;
9 begin
10 dbms_stats.get_column_stats(null, 't1', 'user_id', distcnt => distcnt, avgclen => avgclen, nullcnt => nullcnt, density => density, srec => srec);
11 srec2.epc := 2;
12 novals := dbms_stats.numarray(
13 &low_value,
14 &high_value
15 );
16 srec2.bkvals := null;
17 dbms_stats.prepare_column_values(srec2,novals);
18 dbms_stats.set_column_stats(
19 ownname=>null,
20 tabname=>'t1',
21 colname=>'user_id',
22 distcnt=>distcnt,
23 nullcnt=>nullcnt,
24 srec=>srec2,
25 avgclen=>avgclen,
26 density=>density
27 );
28 end;
29 /
old 13: &low_value,
new 13: 27,
old 14: &high_value
new 14: 53

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.00
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_wrong_cardinality';

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;

MAX(CNT)
----------
1

Elapsed: 00:00:01.17
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS'));
SQL_ID 2s1sdfhvhajv3, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner t2 t3 t1) use_nl(@inner t3)
use_hash(@inner t1) no_swap_join_inputs(@inner t1) use_hash_aggregation(@inner) */ max(cnt) from ( select /*+
qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id from t1, t2, t2 t3 where t1.user_id = t2.user_id group by t1.object_id, t3.user_id )

Plan hash value: 464898991

------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | O/1/M | Max-Tmp |
------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:01.16 | 87 | 0 | 0 | | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:01.16 | 87 | 0 | 0 | | | | |
| 2 | PX COORDINATOR | | 1 | | 2 |00:00:01.16 | 87 | 0 | 0 | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10002 | 0 | 1 | 0 |00:00:00.01 | 0 | 0 | 0 | | | | |
| 4 | SORT AGGREGATE | | 2 | 1 | 2 |00:00:02.25 | 0 | 2460 | 2460 | | | | |
| 5 | VIEW | | 2 | 1 | 810K|00:00:01.53 | 0 | 2460 | 2460 | | | | |
| 6 | HASH GROUP BY | | 2 | 1 | 810K|00:00:01.53 | 0 | 2460 | 2460 | 35M| 5521K| | 11264 |
| 7 | PX RECEIVE | | 2 | 1 | 810K|00:00:00.01 | 0 | 0 | 0 | | | | |
| 8 | PX SEND HASH | :TQ10001 | 0 | 1 | 0 |00:00:00.01 | 0 | 0 | 0 | | | | |
|* 9 | HASH JOIN | | 2 | 1 | 810K|00:00:00.01 | 89 | 0 | 0 | 1348K| 1348K| 2/0/0| |
| 10 | BUFFER SORT | | 2 | | 1458 |00:00:00.01 | 0 | 0 | 0 | 29696 | 29696 | 2/0/0| |
| 11 | PX RECEIVE | | 2 | 729 | 1458 |00:00:00.01 | 0 | 0 | 0 | | | | |
| 12 | PX SEND BROADCAST | :TQ10000 | 0 | 729 | 0 |00:00:00.01 | 0 | 0 | 0 | | | | |
| 13 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | 0 | 0 | | | | |
| 14 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | 0 | 0 | | | | |
| 15 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | 0 | 0 | | | | |
| 16 | PX BLOCK ITERATOR | | 2 | 30164 | 30000 |00:00:00.01 | 89 | 0 | 0 | | | | |
|* 17 | TABLE ACCESS FULL | T1 | 18 | 30164 | 30000 |00:00:00.01 | 89 | 0 | 0 | | | | |
------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

9 - access("T1"."USER_ID"="T2"."USER_ID")
17 - access(:Z>=:Z AND :Z<=:Z)


37 rows selected.

Elapsed: 00:00:00.04
SQL>


So, if your database uses automatic PGA management and

- uses the hash aggregation HASH GROUP BY (or HASH UNIQUE, more on that in a moment)
- and does not share cursors

every execution that is based on bad cardinality estimates potentially has a problem with the hash aggregation because it might not make efficient use of the available memory.

The same applies to applications that share cursors, however in that case only the first execution after re-optimization / invalidation is affected.

So you might want to carefully check the runtime execution statistics of your critial hash aggregations.

Mind you, things could be worse, and that is where the HASH UNIQUE operation comes into the picture.

When I realised the issue with the HASH GROUP BY operation I was quite certain that the HASH UNIQUE operation will be affected in a similar way, since internally Oracle seems to use the same mechanism for both operations (In the SMM trace files both are called HASH GROUP BY).

To my surprise I noticed that in 10g versions below 10.2.0.5 and 11g versions below 11.2.0.1 (which means in this case 10.2.0.4 and 11.1.0.7 respectively, I didn't test other versions) the HASH UNIQUE operation suffers from an even more dramatic problem: The cardinality estimate is not considered and the initial workarea size is always based on minimum assumptions.

In passing, according to the SMM trace files that can be activated using the undocumented "_smm_trace" parameter it looks like that many of the sort-based workareas like SORT ORDER BY or WINDOW SORT seem to suffer from the same defect, since they however are able to dynamically resize and make use of the workarea statistics feedback they effectively work as expected, they just start with a too low workarea size estimate every time they are executed for the first time after re-optimization / invalidation.

The combination of the two issues - inability to dynamically resize and ignoring the optimizer estimates - leads to a dire result: Every first execution of a HASH UNIQUE operation in those versions will only use the minimum amount of memory. The following test case shows the problem:


set echo on timing on

show parameter pga
show parameter processes

-- alter session set workarea_size_policy = manual sort_area_size = 40000000;

drop table t1 purge;

drop table t2 purge;

create table t1
as
select mod(rownum, 27) as user_id, rownum as object_id from dual connect by level <= 30000;

create table t2
as
select distinct user_id from t1;

exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1')

exec dbms_stats.gather_table_stats(null, 't2', method_opt => 'for all columns size 1')

column low_val new_value low_value
column high_val new_value high_value

select
max(user_id) + 1 as low_val
, max(user_id) + max(user_id) - min(user_id) + 1 as high_val
from
t2;

-- Invalidate any cursors using T1
comment on table t1 is '';

alter session set statistics_level = all;

alter session set tracefile_identifier = 'smm_trace_correct_cardinality';

alter session set "_smm_trace" = 65535;

select /*+ full(@inner t1)
full(@inner t2)
full(@inner t3)
leading(@inner t2 t3 t1)
use_nl(@inner t3)
use_nl(@inner t1)
use_hash_aggregation(@inner)
*/
max(user_id)
from
(
select /*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id
from t1, t2, t2 t3
where t1.user_id = t2.user_id
)
;

alter session set "_smm_trace" = 0;

set pagesize 0 linesize 200 trimspool on tab off

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

set pagesize 14

-- Fudge the statistics so that the join cardinality between t1 and t2 will be 0 (rounded up to 1)
declare
srec dbms_stats.statrec;
novals dbms_stats.numarray;
distcnt number;
avgclen number;
nullcnt number;
density number;
srec2 dbms_stats.statrec;
begin
dbms_stats.get_column_stats(null, 't1', 'user_id', distcnt => distcnt, avgclen => avgclen, nullcnt => nullcnt, density => density, srec => srec);
srec2.epc := 2;
novals := dbms_stats.numarray(
&low_value,
&high_value
);
srec2.bkvals := null;
dbms_stats.prepare_column_values(srec2,novals);
dbms_stats.set_column_stats(
ownname=>null,
tabname=>'t1',
colname=>'user_id',
distcnt=>distcnt,
nullcnt=>nullcnt,
srec=>srec2,
avgclen=>avgclen,
density=>density
);
end;
/

-- Invalidate any cursors using T1
comment on table t1 is '';

alter session set statistics_level = all;

alter session set tracefile_identifier = 'smm_trace_wrong_cardinality';

alter session set "_smm_trace" = 65535;

select /*+ full(@inner t1)
full(@inner t2)
full(@inner t3)
leading(@inner t2 t3 t1)
use_nl(@inner t3)
use_nl(@inner t1)
use_hash_aggregation(@inner)
*/
max(user_id)
from
(
select /*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id
from t1, t2, t2 t3
where t1.user_id = t2.user_id
)
;

alter session set "_smm_trace" = 0;

set pagesize 0 linesize 200 trimspool on tab off

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));


Obviously it is very similar to the HASH GROUP BY test case, this time however I've used a DISTINCT clause and replaced the HASH JOIN with a NESTED LOOP which makes the generated trace files easier to read, since there is exactly one workarea involved in this execution.

And this is what I get from 11.1.0.7, this time using AMM, and therefore the PGA_AGGREGAT_TARGET has been set to 0 (You'll get the same result from 10.2.0.4 with a corresponding P_A_T setting):


Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> show parameter pga

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 0
SQL> show parameter processes

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
aq_tm_processes integer 0
db_writer_processes integer 1
gcs_server_processes integer 0
global_txn_processes integer 1
job_queue_processes integer 1000
log_archive_max_processes integer 4
processes integer 150
SQL>
SQL> -- alter session set workarea_size_policy = manual sort_area_size = 40000000;
SQL>
SQL> drop table t1 purge;

Table dropped.

Elapsed: 00:00:00.01
SQL>
SQL> drop table t2 purge;

Table dropped.

Elapsed: 00:00:00.00
SQL>
SQL> create table t1
2 as
3 select mod(rownum, 27) as user_id, rownum as object_id from dual connect by level <= 30000;

Table created.

Elapsed: 00:00:00.04
SQL>
SQL> create table t2
2 as
3 select distinct user_id from t1;

Table created.

Elapsed: 00:00:00.04
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.06
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't2', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.01
SQL>
SQL> column low_val new_value low_value
SQL> column high_val new_value high_value
SQL>
SQL> select
2 max(user_id) + 1 as low_val
3 , max(user_id) + max(user_id) - min(user_id) + 1 as high_val
4 from
5 t2;

LOW_VAL HIGH_VAL
---------- ----------
27 53

Elapsed: 00:00:00.00
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_correct_cardinality';

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_nl(@inner t1)
7 use_hash_aggregation(@inner)
8 */
9 max(user_id)
10 from
11 (
12 select /*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id
13 from t1, t2, t2 t3
14 where t1.user_id = t2.user_id
15 )
16 ;

MAX(USER_ID)
------------
26

Elapsed: 00:00:01.79
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 7cnqz02uwcs1a, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2)
full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_nl(@inner t1)
use_hash_aggregation(@inner) */ max(user_id) from ( select /*+
qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id from t1,
t2, t2 t3 where t1.user_id = t2.user_id )

Plan hash value: 1541846686

------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp|
------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:01.78 | 40179 | 1470 | 1470 | | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:01.78 | 40179 | 1470 | 1470 | | | | |
| 2 | VIEW | | 1 | 572K| 810K|00:00:00.97 | 40179 | 1470 | 1470 | | | | |
| 3 | HASH UNIQUE | | 1 | 572K| 810K|00:00:00.97 | 40179 | 1470 | 1470 | 25M| 3296K| 6029K (1)| 13312 |
| 4 | NESTED LOOPS | | 1 | 810K| 810K|00:00:00.01 | 40179 | 0 | 0 | | | | |
| 5 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | 0 | 0 | | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | 0 | 0 | | | | |
| 7 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | 0 | 0 | | | | |
|* 8 | TABLE ACCESS FULL | T1 | 729 | 1111 | 810K|00:00:00.03 | 40095 | 0 | 0 | | | | |
------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

8 - filter("T1"."USER_ID"="T2"."USER_ID")


30 rows selected.

Elapsed: 00:00:00.03
SQL>
SQL> set pagesize 14
SQL>
SQL> -- Fudge the statistics so that the join cardinality between t1 and t2 will be 0 (rounded up to 1)
SQL> declare
2 srec dbms_stats.statrec;
3 novals dbms_stats.numarray;
4 distcnt number;
5 avgclen number;
6 nullcnt number;
7 density number;
8 srec2 dbms_stats.statrec;
9 begin
10 dbms_stats.get_column_stats(null, 't1', 'user_id', distcnt => distcnt, avgclen => avgclen, nullcnt => nullcnt, density => density, srec => srec);
11 srec2.epc := 2;
12 novals := dbms_stats.numarray(
13 &low_value,
14 &high_value
15 );
16 srec2.bkvals := null;
17 dbms_stats.prepare_column_values(srec2,novals);
18 dbms_stats.set_column_stats(
19 ownname=>null,
20 tabname=>'t1',
21 colname=>'user_id',
22 distcnt=>distcnt,
23 nullcnt=>nullcnt,
24 srec=>srec2,
25 avgclen=>avgclen,
26 density=>density
27 );
28 end;
29 /
old 13: &low_value,
new 13: 27,
old 14: &high_value
new 14: 53

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.03
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_wrong_cardinality';

Session altered.

Elapsed: 00:00:00.01
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_nl(@inner t1)
7 use_hash_aggregation(@inner)
8 */
9 max(user_id)
10 from
11 (
12 select /*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id
13 from t1, t2, t2 t3
14 where t1.user_id = t2.user_id
15 )
16 ;

MAX(USER_ID)
------------
26

Elapsed: 00:00:01.26
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 7cnqz02uwcs1a, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2)
full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_nl(@inner t1)
use_hash_aggregation(@inner) */ max(user_id) from ( select /*+
qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id from t1,
t2, t2 t3 where t1.user_id = t2.user_id )

Plan hash value: 1541846686

------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp|
------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:01.26 | 40179 | 1470 | 1470 | | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:01.26 | 40179 | 1470 | 1470 | | | | |
| 2 | VIEW | | 1 | 1 | 810K|00:00:00.98 | 40179 | 1470 | 1470 | | | | |
| 3 | HASH UNIQUE | | 1 | 1 | 810K|00:00:00.98 | 40179 | 1470 | 1470 | 25M| 3296K| 6093K (1)| 13312 |
| 4 | NESTED LOOPS | | 1 | 1 | 810K|00:00:00.01 | 40179 | 0 | 0 | | | | |
| 5 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | 0 | 0 | | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | 0 | 0 | | | | |
| 7 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | 0 | 0 | | | | |
|* 8 | TABLE ACCESS FULL | T1 | 729 | 1 | 810K|00:00:00.01 | 40095 | 0 | 0 | | | | |
------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

8 - filter("T1"."USER_ID"="T2"."USER_ID")


30 rows selected.

Elapsed: 00:00:00.03
SQL>


As you can see no matter what the estimates looks like, on this system the first execution of a HASH UNIQUE will not get more than 6M - subsequent executions of the same cursor benefit from the statistics from the previous run as seen before.

Again, switching to a SORT UNIQUE and reducing the data set accordingly, the problem can not be reproduced:


Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> show parameter pga

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 0
SQL> show parameter processes

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
aq_tm_processes integer 0
db_writer_processes integer 1
gcs_server_processes integer 0
global_txn_processes integer 1
job_queue_processes integer 1000
log_archive_max_processes integer 4
processes integer 150
SQL>
SQL> drop table t1 purge;

Table dropped.

Elapsed: 00:00:00.15
SQL>
SQL> drop table t2 purge;

Table dropped.

Elapsed: 00:00:00.07
SQL>
SQL> create table t1
2 as
3 select mod(rownum, 15) as user_id, rownum as object_id from dual connect by level <= 30000;

Table created.

Elapsed: 00:00:00.06
SQL>
SQL> create table t2
2 as
3 select distinct user_id from t1;

Table created.

Elapsed: 00:00:00.04
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.04
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't2', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.03
SQL>
SQL> column low_val new_value low_value
SQL> column high_val new_value high_value
SQL>
SQL> select
2 max(user_id) + 1 as low_val
3 , max(user_id) + max(user_id) - min(user_id) + 1 as high_val
4 from
5 t2;

LOW_VAL HIGH_VAL
---------- ----------
15 29

Elapsed: 00:00:00.00
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_correct_cardinality';

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_nl(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 no_use_hash_aggregation(@inner)
9 */
10 max(user_id)
11 from
12 (
13 select /*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 )
17 ;

MAX(USER_ID)
------------
14

Elapsed: 00:00:00.73
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.01
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID ckb2sbz3y2z14, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2)
full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_nl(@inner t1)
no_swap_join_inputs(@inner t1)
no_use_hash_aggregation(@inner) */ max(user_id) from ( select
/*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id from
t1, t2, t2 t3 where t1.user_id = t2.user_id )

Plan hash value: 3828303002

--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.71 | 12423 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.71 | 12423 | | | |
| 2 | VIEW | | 1 | 318K| 450K|00:00:00.62 | 12423 | | | |
| 3 | SORT UNIQUE | | 1 | 318K| 450K|00:00:00.62 | 12423 | 22M| 1744K| 20M (0)|
| 4 | NESTED LOOPS | | 1 | 450K| 450K|00:00:00.01 | 12423 | | | |
| 5 | NESTED LOOPS | | 1 | 225 | 225 |00:00:00.01 | 48 | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 15 | 15 |00:00:00.01 | 3 | | | |
| 7 | TABLE ACCESS FULL| T2 | 15 | 15 | 225 |00:00:00.01 | 45 | | | |
|* 8 | TABLE ACCESS FULL | T1 | 225 | 2000 | 450K|00:00:00.01 | 12375 | | | |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

8 - filter("T1"."USER_ID"="T2"."USER_ID")


31 rows selected.

Elapsed: 00:00:00.06
SQL>
SQL> set pagesize 14
SQL>
SQL> -- Fudge the statistics so that the join cardinality between t1 and t2 will be 0 (rounded up to 1)
SQL> declare
2 srec dbms_stats.statrec;
3 novals dbms_stats.numarray;
4 distcnt number;
5 avgclen number;
6 nullcnt number;
7 density number;
8 srec2 dbms_stats.statrec;
9 begin
10 dbms_stats.get_column_stats(null, 't1', 'user_id', distcnt => distcnt, avgclen => avgclen, nullcnt => nullcnt, density => density, srec => srec);
11 srec2.epc := 2;
12 novals := dbms_stats.numarray(
13 &low_value,
14 &high_value
15 );
16 srec2.bkvals := null;
17 dbms_stats.prepare_column_values(srec2,novals);
18 dbms_stats.set_column_stats(
19 ownname=>null,
20 tabname=>'t1',
21 colname=>'user_id',
22 distcnt=>distcnt,
23 nullcnt=>nullcnt,
24 srec=>srec2,
25 avgclen=>avgclen,
26 density=>density
27 );
28 end;
29 /
old 13: &low_value,
new 13: 15,
old 14: &high_value
new 14: 29

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.06
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.01
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_wrong_cardinality';

Session altered.

Elapsed: 00:00:00.01
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_nl(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 no_use_hash_aggregation(@inner)
9 */
10 max(user_id)
11 from
12 (
13 select /*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 )
17 ;

MAX(USER_ID)
------------
14

Elapsed: 00:00:00.74
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID ckb2sbz3y2z14, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2)
full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_nl(@inner t1)
no_swap_join_inputs(@inner t1)
no_use_hash_aggregation(@inner) */ max(user_id) from ( select
/*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id from
t1, t2, t2 t3 where t1.user_id = t2.user_id )

Plan hash value: 3828303002

--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.73 | 12423 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.73 | 12423 | | | |
| 2 | VIEW | | 1 | 1 | 450K|00:00:00.61 | 12423 | | | |
| 3 | SORT UNIQUE | | 1 | 1 | 450K|00:00:00.61 | 12423 | 22M| 1744K| 20M (0)|
| 4 | NESTED LOOPS | | 1 | 1 | 450K|00:00:00.01 | 12423 | | | |
| 5 | NESTED LOOPS | | 1 | 225 | 225 |00:00:00.01 | 48 | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 15 | 15 |00:00:00.01 | 3 | | | |
| 7 | TABLE ACCESS FULL| T2 | 15 | 15 | 225 |00:00:00.01 | 45 | | | |
|* 8 | TABLE ACCESS FULL | T1 | 225 | 1 | 450K|00:00:00.01 | 12375 | | | |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

8 - filter("T1"."USER_ID"="T2"."USER_ID")


31 rows selected.

Elapsed: 00:00:00.01
SQL>
SQL> set doc off
SQL> doc
SQL>


In 10.2.0.5 and 11.2.0.1/11.2.0.2 this issue is apparently fixed and the cardinality estimates are used for an initial workarea size calculation - however this doesn't mean that the dynamic resize problem is fixed - it simply means that the HASH UNIQUE in those versions behaves exactly as the HASH GROUP BY and the memory usage for the first execution depends on the cardinality estimates.

Summary

If your database uses automatic PGA management then for the hash aggregation HASH GROUP BY / HASH UNIQUE operations every initial execution that is based on bad cardinality estimates potentially has a problem because it might not make efficient use of the available memory.

The same applies to applications that share cursors, however in that case only the initial execution after re-optimization / invalidation is affected, subsequent executions benefit from the workarea statistics feedback mechanism.

Furthermore in 10g versions below 10.2.0.5 and 11g versions below 11.2.0.1 the initial execution of a HASH UNIQUE operation ignores the cardinality estimates and will always be based on minimum assumptions.

So you might want to carefully check the runtime execution statistics of your critical hash aggregations.

Possible Workarounds

Different strategies are available as workarounds, depending on the situation:

- Upgrading to versions where the HASH UNIQUE operation at least considers the optimizer estimates might be beneficial

- Obviously good cardinality estimates are crucial, in that case and when using the correct Oracle versions you should be fine. Using the undocumented OPT_ESTIMATE hint (or the deprecated undocumented CARDINALITY hint) might help in cases where other options like manually crafted statistics are not able to help the optimizer to come up with reasonable cardinality estimates.

- Applications that are able to share cursors might not be too much affected due to the ability to use the workarea executions for subsequent executions of the same cursor

- The described problems disappear when switching to manual workarea size policy and allowing for sufficient memory of workarea. Interestingly the less obvious SORT_AREA_SIZE is used for the manual control of the HASH GROUP BY operation, and not the HASH_AREA_SIZE. Of course using manual PGA memory management system-wide is highly unrecommended, so this might only be a workaround in certain cases of large batch jobs where manual workarea sizes are used anyway. Also note, when switching to manual workareas be aware of a nasty bug that was introduced in the 10.2.0.3 patchset. For more information see MOS note "6053134.8: ALTER SESSION to set SORT_AREA_SIZE no honoured" and for example mine and Jonathen Lewis' post about the issue. According to the MOS note the bug has been fixed in 10.2.0.4.3, 10.2.0.5 and 11.1.0.7 / 11.2

- Obviously the hash aggregation can be avoided on statement level using the NO_USE_HASH_AGGREGATION hint or on session / system level using the _GBY_HASH_AGGREGATION_ENABLED parameter. Since the sort based aggregation can be less efficient (but note that it doesn't have to be, it depends on the individual grouping and data pattern) again this doesn't necessarily solve the problem since aggregate workareas might spill to disk which wouldn't be necessary when hash aggregation was used and worked as expected.

Final Note

Of course there are a lot of things that I haven't touched yet or only briefly, like Parallel Execution of the hash aggregations, workareas used for index sorts, and all the other details when the workarea spills to disk.

If you want to get an idea what kind of nasty things can happen in that case you might want to read Jonathan Lewis' post on analytic functions.

I also haven't tried yet to fiddle around with the undocumented _smm* related parameters if setting them to non-default values allows to work around the issue.

And of course then there are bugs like this one: Bug 6817844 - Multi pass sort with auto memory management even with plenty of PGA [ID 6817844.8], which is also mentioned in Jonathan's post.

As a final note, the odd TEMP tablespace I/O pattern issue that Jonathan describes in his post (Bug 9041800) is marked as fixed in the 11.2.0.2 patch set, but I haven't tested this yet.

No comments:

Post a Comment