Sunday, January 15, 2012

Incremental Partition Statistics Review

Introduction

Here is a summary of the findings while evaluating Incremental Partition Statistics that have been introduced in Oracle 11g.

The most important point to understand is that Incremental Partition Statistics are not "cost-free", so anyone who is telling you that you can gather statistics on the lowest level (partition or sub-partition in case of composite partitioning) without any noticeable overhead in comparison to non-incremental statistics (on the lowest level) is not telling you the truth.

Although this might be obvious I've already personally heard someone making such claims so it's probably worth to mention.

In principle you need to test on your individual system whether the overhead that is added to each statistics update on the lowest level outweighs the overhead of actually gathering statistics on higher levels, of course in particular on global level.

This might also depend on your strategy how and how often you used to gather statistics so far.

The overhead introduced by Incremental Partition Statistics can be significant, in terms of both runtime and data volume. You can expect the SYSAUX tablespace to grow by several GBs (for larger databases in the TB range easily in the tenth of GBs) depending on the number of partitions, number of columns and distinct values per column.

To give you an idea here are some example figures from the evaluation:

Table 1: 4 million total rows, 1 GB total size, 6 range partitions, 155 columns
Table 2: 200 million total rows, 53 GB total size, 629 range-list subpartitions, 104 columns

For Table 1 Incremental stats maintained 700,000 rows in SYS.WRI$_OPTSTAT_SYNOPSIS$. For Table 2 it was 3,400,000 rows. In total for these two tables approx. 4.1 million rows and 170 MB had to be maintained in the SYS.WRI$_OPTSTAT_SYNOPSIS$ tables.

When I first saw this significant data volume generated for the synopsis meta data I was pretty sure that processing that amount of data will clearly cause some significant overhead, too.

And that is exactly what happens - for example a recursive DELETE statement on the SYS.WRI$_OPTSTAT_SYNOPSIS$ table took about 10 secs out of 16 secs total runtime of statistics gathering for a rather small partition of above partitioned table. Here are some more figures from the test runs:

Timing comparison on an Exadata X2-8
(tests were performed as only user on the system)

Exadata X2-8 was 11.2.0.2 BP6, for comparison purposes a full rack V2 running 11.2.0.1.2 BP6(?) was used

The following relevant parameters were used in the call to DBMS_STATS:


dbms_stats.gather_table_stats(
ownname => ...
, tabname => ...,
, partname=>'<PARTNAME>'
, granularity=>'AUTO'
, estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE
, method_opt => 'FOR ALL COLUMNS SIZE 1'
, cascade => true
);


were <PARTNAME> is the name of the partition that was modified. Basically it was a simulation of a per-partition data load where the data is loaded into a separate segment and afterwards an exchange (sub)partition is performed with the main table.

After exchange partition the statistics were refreshed on the main table using above call.

Modification of a single partition of above Table 1, approx. 500,000 rows resp. 110 MB of data in this single partition.

INCREMENTAL => FALSE: 7-13 seconds
INCREMENTAL => TRUE : 16 seconds (the majority of time is spent on a DELETE from
SYS.WRI$_OPTSTAT_SYNOPSIS$)


Modification of a single subpartition of above Table 2, approx. 300,000 rows resp. 75 MB of data in this single subpartition.

INCREMENTAL => FALSE: 67 seconds
INCREMENTAL => TRUE : 11.2.0.2 390 (!) seconds
11.2.0.1 30 seconds
11.2.0.2 with fix_control=8917507:OFF: 70 seconds


So the overhead ratio depends largely on the time it actually takes to gather the statistics - for rather small partitions the meta data maintenance overhead will be enormous.

On an Exadata X2-8 11.2.0.2 using the non-incremental approach of gathering lowest level partition statistics plus partition plus global statistics for the 53GB table (629 range-list subpartitions, 104 columns), took almost the same time as it took the incremental statistics to gather statistics only on lowest level plus the meta data maintenance / aggregation overhead.

Of course you'll appreciate that the activity performed for those two operations is vastly different - the conventional statistics approach needs to throw all processing power of the X2-8 at this problem and any concurrent activity will have to share the CPU and I/O demand of that operation, while the mostly meta data based incremental statistics only allocate a single CPU and some I/O during the processing, leaving most of the I/O and CPU resources available for other concurrent tasks.

On a larger data volume and/or slower systems the Incremental Partition Statistics will probably easily outperform the non-incremental approach.

Furthermore it should be mentioned that the tests used the "FOR ALL COLUMNS SIZE 1" METHOD_OPT option that doesn't generate any histograms. The INCREMENTAL partition statistics feature is however capable of deriving upper level histograms from lower levels of statistics with histograms in place. This can mean a significant saving in processing time if histograms need to be maintained on upper levels since each histogram adds another pass to the DBMS_STATS processing. In fact the histograms generating by INCREMENTAL partition statistics might be even of better quality than those generated via explicit gathering because by default a quite low sample size is used for histogram generation in order to keep the overhead as small as possible.

Note that according to the description the APPROX_GLOBAL AND PARTITION granularity also supports aggregation of histograms, but I haven't looked in detail into this option yet.

As usual you'll have to test it yourself on your system and hardware, but the main point is that it doesn't come for free - it requires both significant space and runtime.

One idea that might make sense is limiting the column statistics to those columns that you are sure you'll use in predicates / group bys / order bys. Any columns that are only used for display purposes could be left without any column statistics. Depending on your data model this might allow to save some volume and processing time, but it needs to be maintained on a per table basis rather than a one size fits all approach.

Further Findings

Here are some further findings that I found relevant:

- INCREMENTAL => TRUE means that ESTIMATE_PERCENT will be ignored - the new approximate NDV algorithm that reads all data but doesn't add the grouping overhead of a conventional aggregation method is mandatory for the new feature. This means in case of very large data sets to analyze that former approaches using very low sample sizes will now take significantly longer (approx. the time it takes to sample 10% of the data with the former approach), however with the benefit of producing almost 100% accurate statistics. There is currently no way around this - if you want to use INCREMENTAL you have to process 100% of the data using the new NDV algorithm. Note that this applies to 11.2 - I haven't tested this on 11.1

- INCREMENTAL doesn't maintain any synopses for indexes, so in order to obtain higher level index statistics for partitioned indexes it always includes a gather global index statistics. However it resorts to a sample size and doesn't analyze the whole index. For very large indexes and/or a very large number of indexes the overhead can still be significant, so this is something to keep in mind: Even with incremental partition statistics there is a component that is dependent on the global volume, in this case the index volume

- In order to effectively use INCREMENTAL the meta data for the synopses needs to be created initially for all partitions, even for those where the data doesn't change any longer. So for very large (historic) data volumes this initial synopsis generation can represent a challenge that needs to be planned and considered how it will be approached. You need to be careful how incremental will be enabled: If you simply switch it on and use GRANULARITY=>AUTO as outlined in the manuals the next gather statistics call on the table will gather the meta data for all (sub-)partitions of the table - this might take very, very long. It might be more sensible to gather statistics with a different GRANULARITY. This still adds the meta data maintenance overhead but you are in control of which partitions are going to be analyzed, allowing for a step-wise approach.

- In 11.2.0.2 the underlying internal table structure has been changed significantly. In particular the table SYS.WRI$_OPTSTAT_SYNOPSIS$ has been changed from unpartitioned to composite partitioned. Interestingly it doesn't have a single index in 11.2.0.2 - it looks like having it composite-partitioned seemed to be sufficient to the developers. The change very likely has been introduced due to bug 9038395 that addresses the problem that deleting the statistics for a single table used to be dependent on the total amount of tables using incremental statistics. So that problem should be addressed now, but it still doesn't mean that the meta data maintenance overhead is now negligible

- There is a bug in 11.2.0.2 that basically rendered the incremental partition statistics unusable with composite partitioned tables used at that client. A particular recursive SQL statement got executed multiple thousand times. This means it took up to several minutes to complete the meta data operation (see above timings). This is tracked with bug 12833442. The behaviour can be changed by using fix control 8917507 - which helped in this case to arrive at reasonable runtimes although 11.2.0.1 was still twice as fast.

- INCREMENTAL => TRUE doesn't work with locked statistics, you'll always end up with an ORA-20005 Object Statistics are locked even when specifying the FORCE => TRUE option. This is tracked with bug 12369250 (according to MyOracleSupport fixed in the 11.2.0.3 patch set)

Footnote

All of the above applies to 11.2.0.2 resp. 11.2.0.1. I haven't had the chance yet to repeat those tests on 11.2.0.3.

7 comments:

Timur Akhmadeev said...

Hi Randolf

excellent post, thanks.
I think that you should also try to test it on a "normal" hardware, though. Exadata is such a monster that 53 GB is of no real life for it. And it would be nice to have a comparison of how much metadata is needed to store synopses as your tables grow - to see the trend.

Randolf said...

Hi Timur,

> I think that you should also try to test it on a "normal" hardware, though

That's why I mention this in the article:

"On a larger data volume and/or slower systems the Incremental Partition Statistics will probably easily outperform the non-incremental approach.
.
.
.
As usual you'll have to test it yourself on your system and hardware, but the main point is that it doesn't come for free - it requires both significant space and runtime."

This evaluation was done on a system having (almost) real-life data for one of my clients, hence the Exadata environment.

Randolf

Marcin Przepiorowski said...

Hi Randolf,

I was trying to use incremental stats with method_opt set to 'FOR ALL COLUMNS SIZE 1' but it never worked for me. I was able to gather incremental statistics with histograms only and as far as I know it is a another limitation of incremental statistics.

Answering Timur question I was able to reduce gather statistic time from 6 h for whole table with to 15 - 20 min using incremental statistics for 432 GB table with 10 columns with more less 1.3 billion rows on non-exadata machine using 11.2.0.2 with patches.

regards,
Marcin

Randolf said...

Hi Marcin,

> I was able to gather incremental statistics with histograms only and as far as I know it is a another limitation of incremental statistics

As you can see from my tests I didn't have any problems with the "FOR ALL COLUMNS SIZE 1" option - although the gain in runtime might not have been as much as one might have expected incremental statistics were definitely used.

I'm not aware of such a limitation, why incremental statistics should only work with histograms - doesn't sound plausible to me.

Randolf

Marcin Przepiorowski said...

Hello Randolf,

Thanks for your answer. I found again my old test case and now I found a MOS note as well "DBMS_STATS 'gather auto' and 'gather stale' and column histograms [ID 1073120.1]". It looks like I run into that bug when I was testing incremental statistics.

I was able to gather incremental stats manually without histograms but due to above bug it switch into histograms when stats were gathered by build in job.

regards,
Marcin

Randolf said...

Hi Marcin,

> I was able to gather incremental stats manually without histograms but due to above bug it switch into histograms when stats were gathered by build in job

Interesting bug that I wasn't aware of. It has nothing to do with incremental statistics per se I would say though. It was just bad luck that up to 11.1.0.6 the GATHER AUTO option used a hard-coded value of FOR ALL COLUMNS SIZE SKEWONLY for METHOD_OPT (It has switched since then to FOR ALL COLUMNS SIZE AUTO which still might generate histograms that are not necessarily beneficial).

Randolf

Dani Schnider said...

Hi Randolf

Thank you for this excellent overview. I spent a lot of time to realize that incremental statistics cannot be combined with ESTIMATE_PERCENT (and documented it in my blog). I still have an open Service Request at Oracle Support, but your summary contains the answer I expected to read from Oracle: Incremental statistics with a small value for ESTIMATE_PERCENT are currently not possible.

By the way: The behaviour is the same in Oracle 11.1.0.7.