This is the first part of a series of posts that cover one of the fundamentals of the cost based optimizer in 9i and later. Understanding how the different system statistics modes work is crucial in making the most out of the cost based optimizer, therefore I'll attempt to provide some detailed explanations and samples about the formulas and arithmetics used. Finally I'll show (again) that using multiple block sizes for "tuning" purposes is a bad idea in general, along with detailed examples why I think this is so.
One of the deficiencies of the traditional I/O based costing was that it simply counted the number of I/O requests making no differentation between single-block I/O and multi-block I/O.
System statistics were introduced in Oracle 9i to allow the cost based optimizer to take into account that single-block I/Os and multi-block I/Os should be treated differently in terms of costing and to include a CPU component in the cost calculation.
The system statistics tell the cost based optimizer (CBO) among other things the time it takes to perform a single block read request and a multi-block read request. Given this information the optimizer ought to be able to come to estimates that better fit the particular environment where the database is running on and additionally use an appropriate costing for multi-block read requests that usually take longer than single block read requests. Given the information about the time it takes to perform the read requests the cost calculated can be turned into a time estimate.
The cost calculated with system statistics is still expressed in the same units as with traditional I/O based costing, which is in units of single-block read requests.
Although the mode using system statistics is also known as "CPU costing" despite the name the system statistics have the most significant impact on the I/O costs calculated for full table scans due to the different measure MREADTIM used for multi-block read requests.
Starting with Oracle 10g you have actually the choice of three different modes of system statistics also known as CPU costing:
1. Default NOWORKLOAD system statistics
2. Gathered NOWORKLOAD system statistics
3. Gathered WORKLOAD system statistics
The important point to understand here is that starting with Oracle 10g system statistics are enabled by default (using the default NOWORKLOAD system statistics) and you can only disable them by either downgrading your optimizer (using the OPTIMIZER_FEATURES_ENABLE parameter) or using undocumented parameters or hints ("_optimizer_cost_model" respectively the CPU_COSTING and NOCPU_COSTING hints).
This initial part of the series will focus on the default NOWORKLOAD system statistics introduced with Oracle 10g.
Default NOWORKLOAD system statistics
The default NOWORKLOAD system statistics measure only the CPU speed (CPUSPEEDNW), the two other remaining values used for NOWORKLOAD system statistics IOSEEKTIM (seek time) and IOTFRSPEED (transfer speed) are using default values (10 milliseconds seek time and 4096 bytes per millisecond transfer speed).
Using these default values for the I/O part the SREADTIM (single-block I/O read time) and MREADTIM (multi-block I/O read time) values are synthesized for cost calculation by applying the following formula:
SREADTIM = IOSEEKTIM + db_block_size / IOTFRSPEED
MREADTIM = IOSEEKTIM + mbrc * db_block_size / IOTFRSPEED
where "db_block_size" represents your database standard block size in bytes and "mbrc" is either the value of "db_file_multiblock_read_count" if it has been set explicitly, or a default of 8 if left unset. From 10.2 on this is controlled internally by the undocumented parameter "_db_file_optimizer_read_count". This means that in 10.2 and later the "mbrc" used by the optimizer to calculate the cost can be different from the "mbrc" actually used at runtime when performing multi-block read requests. If you leave the "db_file_multiblock_read_count" unset in 10.2 and later then Oracle uses a default of 8 for cost calculation but uses the largest possible I/O request size depending on the platform, which is usually 1MB (e.g. 128 blocks when using a block size of 8KB). In 10.2 and later this is controlled internally by the undocumented parameter "_db_file_exec_read_count".
Assuming a default block size of 8KB (8192 bytes) and "db_file_multiblock_read_count" left unset, this results in the following calculation:
SREADTIM = 10 + 8192 / 4096 = 10 + 2 = 12ms
MREADTIM = 10 + 8 * 8192 / 4096 = 10 + 16 = 26ms
These values will then be used to calculate the I/O cost of single block and multi-block read requests according to the execution plan (number of single-block reads + number of multi-block reads * MREADTIM / SREADTIM), which means that the I/O cost with system statistics aka. CPU costing is expressed in units of single block reads.
You can derive from above formula that with system statistics the cost of a full table scan operation is going to be more expensive approximately by the factor MREADTIM / SREADTIM compared to the traditional I/O based costing used in pre-10g by default, therefore system statistics usually tend to favor index access a bit more.
Note that above factor MREADTIM / SREADTIM is not entirely correct since the traditional I/O costing introduces a efficiency reduction factor when using higher MBRC settings, presumably to reflect that the larger the number of blocks per I/O request the higher the possibility that it won't be possible to use that large number of blocks per I/O request due to blocks already being in the buffer cache or hitting extent boundaries.
So with a MBRC setting of 8 the adjusted MBRC used for calculation is actually 6.59. Using e.g. a very high setting of 128 for the MBRC will actually use 40.82 for calculation. So the higher the setting the more the MRBC used for calculation will be reduced.
The following test case shall demonstrate the difference between traditional I/O costing, CPU costing and the factor MREADTIM / SREADTIM when using different "db_file_multiblock_read_count" settings. The test case was run against 10.2.0.4 Win32.
Note that the test case removes your current system statistics so you should be cautious if you have non-default system statistics at present in your database.
Furthermore the test case assumes a 8KB database default block size, and a locally managed tablespace with 1MB uniform extent size using manual segment space management (no ASSM).
drop table t1;
-- Create a table consisting of 10,000 blocks / 1 row per block
-- in a 8KB tablespace with manual segment space management (no ASSM)
create table t1
pctfree 99
pctused 1
-- tablespace test_2k
-- tablespace test_4k
tablespace test_8k
-- tablespace test_16k
as
with generator as (
select --+ materialize
rownum id
from all_objects
where rownum <= 3000
)
select
/*+ ordered use_nl(v2) */
rownum id,
trunc(100 * dbms_random.normal) val,
rpad('x',100) padding
from
generator v1,
generator v2
where
rownum <= 10000
;
begin
dbms_stats.gather_table_stats(
user,
't1',
cascade => true,
estimate_percent => null,
method_opt => 'for all columns size 1'
);
end;
/
-- Use default NOWORKLOAD system statistics
-- for test but ignore CPU cost component
-- by using an artificially high CPU speed
begin
dbms_stats.delete_system_stats;
dbms_stats.set_system_stats('CPUSPEEDNW',1000000);
end;
/
-- In order to verify the formula against the
-- optimizer calculations
-- don't increase the table scan cost by one
-- which is done by default from 9i on
alter session set "_table_scan_cost_plus_one" = false;
alter session set db_file_multiblock_read_count = 8;
-- Assumption due to formula is that CPU costing
-- increases FTS cost by MREADTIM/SREADTIM, but
-- traditional I/O based costing introduces a
-- efficiency penalty the higher the MBRC is
-- therefore the factor is not MREADTIM/SREADTIM
-- but MREADTIM/SREADTIM/(MBRC/adjusted MBRC)
--
-- NOWORKLOAD synthesized SREADTIM = 12, MREADTIM = 26
-- MREADTIM/SREADTIM = 26/12 = 2.16
-- Factor CPU Costing / traditional I/O costing
-- 2,709/1,518 = 1.78
-- MBRC = 8, adjusted MBRC = 10,000 / 1,518 = 6.59
-- 8/6.59 = 1.21
-- 2.16 / 1.21 = 1.78
select /*+ nocpu_costing */ max(val)
from t1;
-----------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 1518 |
| 1 | SORT AGGREGATE | | 1 | 4 | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 1518 |
-----------------------------------------------------------
select /*+ cpu_costing */ max(val)
from t1;
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 2709 (0)| 00:00:33 |
| 1 | SORT AGGREGATE | | 1 | 4 | | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 2709 (0)| 00:00:33 |
---------------------------------------------------------------------------
alter session set db_file_multiblock_read_count = 16;
-- Assumption due to formula is that CPU costing
-- increases FTS cost by MREADTIM/SREADTIM, but
-- traditional I/O based costing introduces a
-- efficiency penalty the higher the MBRC is
-- therefore the factor is not MREADTIM/SREADTIM
-- but MREADTIM/SREADTIM/(MBRC/adjusted MBRC)
--
-- NOWORKLOAD synthesized SREADTIM = 12, MREADTIM = 42
-- MREADTIM/SREADTIM = 42/12 = 3.5
-- Factor CPU Costing / traditional I/O costing
-- 2,188/962 = 2.27
-- MBRC = 16, adjusted MBRC = 10,000 / 962 = 10.39
-- 16/10.39 = 1.54
-- 3.5 / 1.54 = 2.27
select /*+ nocpu_costing */ max(val)
from t1;
-----------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 962 |
| 1 | SORT AGGREGATE | | 1 | 4 | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 962 |
-----------------------------------------------------------
select /*+ cpu_costing */ max(val)
from t1;
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 2188 (0)| 00:00:27 |
| 1 | SORT AGGREGATE | | 1 | 4 | | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 2188 (0)| 00:00:27 |
---------------------------------------------------------------------------
alter session set db_file_multiblock_read_count = 32;
-- Assumption due to formula is that CPU costing
-- increases FTS cost by MREADTIM/SREADTIM, but
-- traditional I/O based costing introduces a
-- efficiency penalty the higher the MBRC is
-- therefore the factor is not MREADTIM/SREADTIM
-- but MREADTIM/SREADTIM/(MBRC/adjusted MBRC)
--
-- NOWORKLOAD synthesized SREADTIM = 12, MREADTIM = 74
-- MREADTIM/SREADTIM = 74/12 = 6.16
-- Factor CPU Costing / traditional I/O costing
-- 1,928/610 = 3.16
-- MBRC = 32, adjusted MBRC = 10,000 / 610 = 16.39
-- 32/16.39 = 1.95
-- 6.16 / 1.95 = 3.16
select /*+ nocpu_costing */ max(val)
from t1;
-----------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 610 |
| 1 | SORT AGGREGATE | | 1 | 4 | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 610 |
-----------------------------------------------------------
select /*+ cpu_costing */ max(val)
from t1;
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 1928 (0)| 00:00:24 |
| 1 | SORT AGGREGATE | | 1 | 4 | | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 1928 (0)| 00:00:24 |
---------------------------------------------------------------------------
alter session set db_file_multiblock_read_count = 64;
-- Assumption due to formula is that CPU costing
-- increases FTS cost by MREADTIM/SREADTIM, but
-- traditional I/O based costing introduces a
-- efficiency penalty the higher the MBRC is
-- therefore the factor is not MREADTIM/SREADTIM
-- but MREADTIM/SREADTIM/(MBRC/adjusted MBRC)
--
-- NOWORKLOAD synthesized SREADTIM = 12, MREADTIM = 138
-- MREADTIM/SREADTIM = 138/12 = 11.5
-- Factor CPU Costing / traditional I/O costing
-- 1,798/387 = 4.64
-- MBRC = 64, adjusted MBRC = 10,000 / 387 = 25.84
-- 64/25.84 = 2.48
-- 11.5 / 2.48 = 4.64
select /*+ nocpu_costing */ max(val)
from t1;
-----------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 387 |
| 1 | SORT AGGREGATE | | 1 | 4 | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 387 |
-----------------------------------------------------------
select /*+ cpu_costing */ max(val)
from t1;
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 1798 (0)| 00:00:22 |
| 1 | SORT AGGREGATE | | 1 | 4 | | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 1798 (0)| 00:00:22 |
---------------------------------------------------------------------------
alter session set db_file_multiblock_read_count = 128;
-- Assumption due to formula is that CPU costing
-- increases FTS cost by MREADTIM/SREADTIM, but
-- traditional I/O based costing introduces a
-- efficiency penalty the higher the MBRC is
-- therefore the factor is not MREADTIM/SREADTIM
-- but MREADTIM/SREADTIM/(MBRC/adjusted MBRC)
--
-- NOWORKLOAD synthesized SREADTIM = 12, MREADTIM = 266
-- MREADTIM/SREADTIM = 266/12 = 22.16
-- Factor CPU Costing / traditional I/O costing
-- 1,732/245 = 7.07
-- MBRC = 128, adjusted MBRC = 10,000 / 245 = 40.82
-- 128/40.82 = 3.13
-- 22.16 / 3.13 = 7.07
select /*+ nocpu_costing */ max(val)
from t1;
-----------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 245 |
| 1 | SORT AGGREGATE | | 1 | 4 | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 245 |
-----------------------------------------------------------
select /*+ cpu_costing */ max(val)
from t1;
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 1732 (0)| 00:00:21 |
| 1 | SORT AGGREGATE | | 1 | 4 | | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 1732 (0)| 00:00:21 |
---------------------------------------------------------------------------
So as you can see the I/O costs for a full table scan are significantly different when using default NOWORKLOAD system statistics. You can also see that the SREADTIM and MREADTIM values derived are quite different when using different "db_file_multiblock_read_count" settings. Furthermore the difference between traditional I/O based costing and the CPU costing is not the factor MREADTIM / SREADTIM as suggested by the formula, but is reduced by the adjustment applied to the MBRC when using traditional I/O costing.
The next part of the series will cover the remaining available System Statistics mode.