Sunday, January 15, 2012

Incremental Partition Statistics Review


Here is a summary of the findings while evaluating Incremental Partition Statistics that have been introduced in Oracle 11g.

The most important point to understand is that Incremental Partition Statistics are not "cost-free". Anyone who tells you that you can gather statistics on the lowest level (partition, or subpartition in case of composite partitioning) without any noticeable overhead compared to non-incremental statistics on that same level is not telling you the truth.

Although this might be obvious, I have personally heard someone make such claims, so it is probably worth mentioning.

In principle you need to test on your individual system whether the overhead that is added to each statistics update on the lowest level outweighs the overhead of actually gathering statistics on higher levels, in particular on the global level.

This might also depend on how, and how often, you have gathered statistics so far.

The overhead introduced by Incremental Partition Statistics can be significant, in terms of both runtime and data volume. You can expect the SYSAUX tablespace to grow by several GBs (for larger databases in the TB range easily by tens of GBs), depending on the number of partitions, the number of columns and the number of distinct values per column.

To give you an idea here are some example figures from the evaluation:

Table 1: 4 million total rows, 1 GB total size, 6 range partitions, 155 columns
Table 2: 200 million total rows, 53 GB total size, 629 range-list subpartitions, 104 columns

For Table 1 Incremental stats maintained 700,000 rows in SYS.WRI$_OPTSTAT_SYNOPSIS$. For Table 2 it was 3,400,000 rows. In total for these two tables approx. 4.1 million rows and 170 MB had to be maintained in the SYS.WRI$_OPTSTAT_SYNOPSIS$ tables.
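To get a feel for these volumes on your own system, something along the following lines can be used. This is only a sketch: it assumes access to DBA_SEGMENTS and V$SYSAUX_OCCUPANTS, and it assumes that the synopsis data lives in the two SYS tables named in the query (the internal layout differs between releases, so treat the table names as something to verify on your version).

```sql
-- Space currently allocated to the synopsis tables
-- (all partitions / subpartitions of each table summed up)
SELECT segment_name
     , COUNT(*)                         AS segments
     , ROUND(SUM(bytes) / 1024 / 1024)  AS size_mb
FROM   dba_segments
WHERE  owner = 'SYS'
AND    segment_name IN ('WRI$_OPTSTAT_SYNOPSIS$', 'WRI$_OPTSTAT_SYNOPSIS_HEAD$')
GROUP  BY segment_name
ORDER  BY segment_name;

-- Overall SYSAUX usage of the optimizer statistics component;
-- as far as I can tell this figure also covers the statistics history,
-- so it is an upper bound rather than the synopsis volume itself
SELECT occupant_name
     , ROUND(space_usage_kbytes / 1024) AS size_mb
FROM   v$sysaux_occupants
WHERE  occupant_name = 'SM/OPTSTAT';
```

Running this before and after enabling incremental statistics for a table gives a rough idea of the additional space that table requires.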

When I first saw the data volume generated for the synopsis meta data, I was pretty sure that processing that amount of data would cause significant overhead, too.

And that is exactly what happens - for example, a recursive DELETE statement on the SYS.WRI$_OPTSTAT_SYNOPSIS$ table took about 10 seconds out of 16 seconds total runtime of statistics gathering for a rather small partition of the partitioned table above. Here are some more figures from the test runs:

Timing comparison on an Exadata X2-8
(tests were performed as the only user on the system)

Exadata X2-8 was BP6, for comparison purposes a full rack V2 running BP6(?) was used

The following relevant parameters were used in the call to DBMS_STATS:

ownname => ...
, tabname => ...
, partname=>'<PARTNAME>'
, granularity=>'AUTO'
, estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE
, method_opt => 'FOR ALL COLUMNS SIZE 1'
, cascade => true

where <PARTNAME> is the name of the partition that was modified. Basically it was a simulation of a per-partition data load where the data is loaded into a separate segment and afterwards exchanged with the corresponding (sub)partition of the main table.

After the partition exchange the statistics were refreshed on the main table using the above call.
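Put together, one iteration of such a test could look like the following sketch. All object names (SALES, SALES_STAGE, P_2012_01) are hypothetical and not taken from the actual evaluation; only the DBMS_STATS parameters correspond to the call above.

```sql
-- Hypothetical names: SALES (partitioned table), SALES_STAGE (segment that
-- was loaded separately), P_2012_01 (partition receiving the new data)

-- One-off: switch incremental statistics on (or off with 'FALSE') for the table
BEGIN
  DBMS_STATS.SET_TABLE_PREFS(
    ownname => USER
  , tabname => 'SALES'
  , pname   => 'INCREMENTAL'
  , pvalue  => 'TRUE'
  );
END;
/

-- Per load: swap the freshly loaded segment into the main table
ALTER TABLE sales
  EXCHANGE PARTITION p_2012_01 WITH TABLE sales_stage
  INCLUDING INDEXES WITHOUT VALIDATION;

-- Refresh the statistics, naming only the partition that was modified
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => USER
  , tabname          => 'SALES'
  , partname         => 'P_2012_01'
  , granularity      => 'AUTO'
  , estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE
  , method_opt       => 'FOR ALL COLUMNS SIZE 1'
  , cascade          => TRUE
  );
END;
/
```

Timing the last block once with the INCREMENTAL preference set to FALSE and once with TRUE gives a comparison of the kind shown in the figures that follow.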

Modification of a single partition of Table 1 above, approx. 500,000 rows (110 MB of data) in this single partition.

INCREMENTAL => FALSE: 7-13 seconds
INCREMENTAL => TRUE : 16 seconds (the majority of the time is spent on the recursive DELETE from the synopsis table mentioned above)

Modification of a single subpartition of Table 2 above, approx. 300,000 rows (75 MB of data) in this single subpartition.

INCREMENTAL => FALSE: 67 seconds
INCREMENTAL => TRUE : 390 (!) seconds 30 seconds with fix_control=8917507:OFF: 70 seconds

So the overhead ratio depends largely on the time it actually takes to gather the statistics - for rather small partitions the meta data maintenance overhead will be enormous.
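The fix_control setting that appears in the Table 2 timings can be checked and changed at session level roughly as follows. This is a sketch rather than a recommendation: "_fix_control" is an underscore parameter, so changing it on a production system should be agreed with Oracle Support.

```sql
-- Current state of the fix control for this session
SELECT bugno, value, description
FROM   v$session_fix_control
WHERE  bugno      = 8917507
AND    session_id = SYS_CONTEXT('USERENV', 'SID');

-- Switch it off for the current session only,
-- then re-run the statistics gathering call in the same session
ALTER SESSION SET "_fix_control" = '8917507:OFF';
```

Setting it at session level keeps the change confined to the job that gathers the statistics.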

On the Exadata X2-8, the non-incremental approach of gathering lowest-level, partition and global statistics for the 53 GB table (629 range-list subpartitions, 104 columns) took almost the same time as the incremental approach needed to gather statistics on the lowest level only, plus the meta data maintenance / aggregation overhead.

Of course the activity performed by those two operations is vastly different. The conventional approach needs to throw all the processing power of the X2-8 at the problem, and any concurrent activity has to share CPU and I/O with that operation. The mostly meta data based incremental approach only allocates a single CPU and some I/O during processing, leaving most of the I/O and CPU resources available for other concurrent tasks.

With larger data volumes and/or on slower systems, Incremental Partition Statistics will probably outperform the non-incremental approach easily.

Furthermore it should be mentioned that the tests used the "FOR ALL COLUMNS SIZE 1" METHOD_OPT option, which doesn't generate any histograms. The INCREMENTAL partition statistics feature is however capable of deriving upper-level histograms from lower levels of statistics that have histograms in place. This can mean a significant saving in processing time if histograms need to be maintained on upper levels, since each histogram adds another pass to the DBMS_STATS processing. In fact the histograms generated by INCREMENTAL partition statistics might even be of better quality than those generated via explicit gathering, because explicit gathering by default uses a rather low sample size for histogram generation in order to keep the overhead as small as possible.

Note that according to the description the APPROX_GLOBAL AND PARTITION granularity also supports aggregation of histograms, but I haven't looked in detail into this option yet.
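For anyone who wants to look into that option, a call could be sketched as follows. The object names are hypothetical, and since this granularity was not part of the evaluation, the resulting statistics should be checked rather than assumed.

```sql
-- Hypothetical names: SALES (partitioned table), P_2012_01 (modified partition)
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname     => USER
  , tabname     => 'SALES'
  , partname    => 'P_2012_01'
  , granularity => 'APPROX_GLOBAL AND PARTITION'
  , method_opt  => 'FOR ALL COLUMNS SIZE AUTO'
  );
END;
/

-- Inspect what ended up on table level, including any histograms
SELECT column_name, num_distinct, histogram, sample_size, global_stats
FROM   user_tab_col_statistics
WHERE  table_name = 'SALES'
ORDER  BY column_name;
```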

As usual you'll have to test it yourself on your system and hardware, but the main point is that it doesn't come for free - it requires both significant space and runtime.

One idea that might make sense is limiting the column statistics to those columns that you are sure you'll use in predicates / GROUP BYs / ORDER BYs. Any columns that are only used for display purposes could be left without any column statistics. Depending on your data model this might save some volume and processing time, but it needs to be maintained on a per-table basis rather than as a one-size-fits-all approach.
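One way to maintain this per table is a METHOD_OPT table preference, sketched below. The table and column names are made up for illustration, and I haven't measured how much synopsis volume this saves.

```sql
-- Hypothetical table SALES; only CUST_ID, ORDER_DATE and STATUS are assumed
-- to be used in predicates, GROUP BYs or ORDER BYs
BEGIN
  DBMS_STATS.SET_TABLE_PREFS(
    ownname => USER
  , tabname => 'SALES'
  , pname   => 'METHOD_OPT'
  , pvalue  => 'FOR COLUMNS SIZE 1 CUST_ID, ORDER_DATE, STATUS'
  );
END;
/

-- Verify which preference is now in effect for the table
SELECT DBMS_STATS.GET_PREFS('METHOD_OPT', USER, 'SALES') AS method_opt
FROM   dual;
```

Note that the preference only takes effect if subsequent DBMS_STATS calls don't pass an explicit METHOD_OPT parameter, since an explicit parameter overrides the table preference.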

Further Findings

Here are some further findings that I found relevant:

- INCREMENTAL => TRUE means that ESTIMATE_PERCENT will be ignored - the new approximate NDV algorithm, which reads all data but doesn't add the grouping overhead of a conventional aggregation method, is mandatory for the new feature. For very large data sets this means that former approaches using very low sample sizes will now take significantly longer (approx. the time it takes to sample 10% of the data with the former approach), however with the benefit of producing almost 100% accurate statistics. There is currently no way around this - if you want to use INCREMENTAL you have to process 100% of the data using the new NDV algorithm. Note that this applies to 11.2 - I haven't tested this on 11.1.

- INCREMENTAL doesn't maintain any synopses for indexes, so in order to obtain higher-level index statistics for partitioned indexes it always includes a gathering of global index statistics. This, however, resorts to sampling and doesn't analyze the whole index. For very large indexes and/or a very large number of indexes the overhead can still be significant, so this is something to keep in mind: even with incremental partition statistics there is a component that depends on the global volume, in this case the index volume.

- In order to use INCREMENTAL effectively, the meta data for the synopses needs to be created initially for all partitions, even for those where the data doesn't change any longer. So for very large (historic) data volumes this initial synopsis generation can be a challenge that needs to be planned carefully. You need to be careful how incremental is enabled: if you simply switch it on and use GRANULARITY => AUTO as outlined in the manuals, the next gather statistics call on the table will gather the meta data for all (sub)partitions of the table - this might take very, very long. It might be more sensible to gather statistics with a different GRANULARITY. This still adds the meta data maintenance overhead, but you are in control of which partitions are going to be analyzed, allowing for a step-wise approach.

- In the underlying internal table structure has been changed significantly. In particular the table SYS.WRI$_OPTSTAT_SYNOPSIS$ has been changed from unpartitioned to composite partitioned. Interestingly it doesn't have a single index in - it looks like having it composite-partitioned seemed to be sufficient to the developers. The change very likely has been introduced due to bug 9038395 that addresses the problem that deleting the statistics for a single table used to be dependent on the total amount of tables using incremental statistics. So that problem should be addressed now, but it still doesn't mean that the meta data maintenance overhead is now negligible

- There is a bug in that basically rendered the incremental partition statistics unusable with composite partitioned tables used at that client. A particular recursive SQL statement got executed multiple thousand times. This means it took up to several minutes to complete the meta data operation (see above timings). This is tracked with bug 12833442. The behaviour can be changed by using fix control 8917507 - which helped in this case to arrive at reasonable runtimes although was still twice as fast.

- INCREMENTAL => TRUE doesn't work with locked statistics: you'll always end up with an ORA-20005 "object statistics are locked" error, even when specifying the FORCE => TRUE option. This is tracked with bug 12369250 (according to My Oracle Support fixed in the patch set)
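The step-wise approach to the initial synopsis creation mentioned in the list above could be sketched like this. The table name and the batch definition are hypothetical; the point is only that each call names one partition and uses GRANULARITY => 'PARTITION' instead of 'AUTO', so you decide how much work is done per run.

```sql
BEGIN
  -- Enable incremental statistics for the (hypothetical) table SALES
  DBMS_STATS.SET_TABLE_PREFS(
    ownname => USER
  , tabname => 'SALES'
  , pname   => 'INCREMENTAL'
  , pvalue  => 'TRUE'
  );

  -- First batch: the twelve oldest partitions; later runs pick the next range
  FOR p IN (
    SELECT partition_name
    FROM   user_tab_partitions
    WHERE  table_name = 'SALES'
    AND    partition_position BETWEEN 1 AND 12
    ORDER  BY partition_position
  ) LOOP
    DBMS_STATS.GATHER_TABLE_STATS(
      ownname     => USER
    , tabname     => 'SALES'
    , partname    => p.partition_name
    , granularity => 'PARTITION'
    );
  END LOOP;
END;
/
```

For a composite partitioned table the same pattern would presumably apply to subpartitions, using USER_TAB_SUBPARTITIONS and GRANULARITY => 'SUBPARTITION'.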


All of the above applies to resp. I haven't had the chance yet to repeat those tests on