Sunday, January 17, 2016

DML Operations On Partitioned Tables Can Restart On Invalidation

It's probably not that well known that Oracle can actually rollback / re-start the execution of a DML statement should the cursor become invalidated. By rollback / re-start I mean that Oracle actually performs a statement level rollback (so any modification already performed by that statement until that point gets rolled back), performs another optimization phase of the statement on re-start (due to the invalidation) and begins the execution of the statement from scratch. Note that this can happen multiple times - actually it's possible to end up in a kind of infinite loop when this happens, leading to statements that can run for very, very long (I've seen statements on Production environments executing for several days although a single execution would only take minutes).

The pre-requisites to meet for this to happen are not that complex or exotic:

- The target table to manipulate needs to be partitioned

- The cursor currently executing gets invalidated - either by running DDL (typically think of partition related operations) - or simply by gathering statistics on one of the objects involved in the statement

- The DML statement hasn't touched yet one of the partitions of the target table but attempts to do so after the cursor got invalidated

When the last condition is met, the statement performs a rollback, and since it got invalidated - which is one of the conditions to be met - another optimization phase happens, meaning that it's also possible to get different execution plans for the different execution attempts. When the execution plan is ready the execution begins from scratch.

According to my tests the issue described here applies to both conventional and direct-path inserts, merge statements (insert / update / delete) as well as serial and parallel execution. I haven't explicitly tested UPDATE and DELETE statements, but the assumption is that they are affected, too.

The behaviour is documented in the following note on MOS: "Insert Statement On Partitioned Tables Is RE-Started After Invalidation (Doc ID 1462003.1)" which links to Bug "14102209 : INSERT STATEMENT' IS RESTARTING BY ITSELF AFTER INVALIDATION" where you can also find some more comments on this behaviour. The issue seems to be that Oracle at that point is no longer sure if the partition information compiled into the cursor for the partitioned target table is still correct or not (and internally raises and catches a corresponding error, like "ORA-14403: Cursor invalidation detected after getting DML partition lock", leading to the re-try), so it needs to refresh that information, hence the re-optimization and re-start of the cursor.

Note that this also means that the DML statement might already have performed modifications to other partitions but after being invalidated attempts to modify another partition it hasn't touched yet - it just needs an attempt to modify a partition not touched into yet by that statement.

It's also kind of nasty that the statement keeps running the potentially lengthy query part after being invalidated only to find out it needs to re-start after the first row is attempted to be applied to a target table partition not touched yet.

Note that applications typically run into this problem, when they behave like the following:

- There are longer running DML statements that take typically several seconds / minutes until they attempt to actually perform an modification to a partitioned target table

- They either use DBMS_STATS to gather stats on one of the involved tables, typically using NO_INVALIDATE=>FALSE, which leads to an immediate invalidation of all affected cursors

- And/Or they perform partition related operations on one of the tables involved, like truncating, creating or exchanging partitions. Note that it is important to point out that it doesn't matter which objects gets DDL / stats applied, so it's not limited to activity on the partitioned target table being modified - any object involved in the query can cause the cursor invalidation

In principle this is another variation of the general theme "Don't mix concurrent DDL with DML/queries on the same objects". Doing so is something that leads to all kinds of side effects, and the way the Oracle engine is designed means that it doesn't cope very well with doing so.

Here is a simple test case for reproducing the issue, using INSERTs in this case here (either via INSERT or MERGE statement):
create table t_target (
id number(*, 0) not null,
pkey number(*, 0) not null,
filler varchar2(500)
)
--segment creation immediate
partition by range (pkey) --interval (1)
(
  partition pkey_0 values less than (1)
, partition pkey_1 values less than (2)
, partition pkey_2 values less than (3)
, partition pkey_3 values less than (4)
);

create table t_source
compress
as
select 1 as id, rpad('x', 100) as filler
from
(select /*+ cardinality(1e3) */ null from dual connect by level <= 1e3),
(select /*+ cardinality(1e0) */ null from dual connect by level <= 1e0)
union all
select 1 as id, rpad('y', 100) as filler from dual;

-- Run this again once the DML statement below got started
exec dbms_stats.gather_table_stats(null, 't_source', no_invalidate=>false)

exec dbms_stats.gather_table_stats(null, 't_target', no_invalidate=>false)

----------------------------------------------------------------------------------------------------------------------------------
-- INSERT example                                                                                                               --
-- Run above DBMS_STATS calls or any other command that invalidates the cursor during execution to force re-start of the cursor --
----------------------------------------------------------------------------------------------------------------------------------

set echo on timing on time on

-- alter session set tracefile_identifier = 'insert_restart';

-- alter session set events '10046 trace name context forever, level 12';

-- exec sys.dbms_monitor.session_trace_enable(waits => true, binds => true/*, plan_stat => 'all_executions'*/)

insert /* append */ into t_target (id, pkey, filler)
select * from (
select /*+
           use_hash(a b)
           no_eliminate_oby
       */
       a.id, 1 as pkey, a.filler
from t_source a, t_source b
where a.id = b.id
and (
regexp_replace(a.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c') != regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c')
--or regexp_replace(a.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'i') != regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'i')
--or regexp_replace(a.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'm') != regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'm')
or (b.filler = rpad('y', 100) and a.filler = rpad('y', 100))
)
order by a.id
)
union all
select * from (
select /*+
           use_hash(a b)
           no_eliminate_oby
       */
       a.id, 2 as pkey, a.filler
from t_source a, t_source b
where a.id = b.id
and (
regexp_replace(a.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c') != regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c')
--or regexp_replace(a.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'i') != regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'i')
--or regexp_replace(a.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'm') != regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'm')
or (b.filler = rpad('y', 100) and a.filler = rpad('y', 100))
)
order by a.id
)
union all
select * from (
select /*+
           use_hash(a b)
           no_eliminate_oby
       */
       a.id, 3 as pkey, a.filler
from t_source a, t_source b
where a.id = b.id
and (
regexp_replace(a.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c') != regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c')
--or regexp_replace(a.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'i') != regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'i')
--or regexp_replace(a.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'm') != regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'm')
or (b.filler = rpad('y', 100) and a.filler = rpad('y', 100))
)
order by a.id
)
;

-- exec sys.dbms_monitor.session_trace_disable

----------------------------------------------------------------------------------------------------------------------------------
-- MERGE example                                                                                                                --
-- Run above DBMS_STATS calls or any other command that invalidates the cursor during execution to force re-start of the cursor --
----------------------------------------------------------------------------------------------------------------------------------

set echo on timing on time on

merge /* append */ into t_target t
using (
select * from (
select /*+
           use_hash(a b)
           no_eliminate_oby
       */
       a.id, 1 as pkey, a.filler
from t_source a, t_source b
where a.id = b.id
and (
regexp_replace(a.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c') != regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c')
--or regexp_replace(a.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'i') != regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'i')
--or regexp_replace(a.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'm') != regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'm')
or (b.filler = rpad('y', 100) and a.filler = rpad('y', 100))
)
order by a.id
)
union all
select * from (
select /*+
           use_hash(a b)
           no_eliminate_oby
       */
       a.id, 2 as pkey, a.filler
from t_source a, t_source b
where a.id = b.id
and (
regexp_replace(a.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c') != regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c')
--or regexp_replace(a.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'i') != regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'i')
--or regexp_replace(a.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'm') != regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'm')
or (b.filler = rpad('y', 100) and a.filler = rpad('y', 100))
)
order by a.id
)
union all
select * from (
select /*+
           use_hash(a b)
           no_eliminate_oby
       */
       a.id, 3 as pkey, a.filler
from t_source a, t_source b
where a.id = b.id
and (
regexp_replace(a.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c') != regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'c')
--or regexp_replace(a.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'i') != regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'i')
--or regexp_replace(a.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'm') != regexp_replace(b.filler, '^\s+([[:alnum:]]+)\s+$', lpad('\1', 10), 1, 1, 'm')
or (b.filler = rpad('y', 100) and a.filler = rpad('y', 100))
)
order by a.id
)
) s
on (s.id = t.id)
when not matched then
insert (id, pkey, filler) values (s.id, s.pkey, s.filler)
;
The idea of the test case is to maximise the time until each UNION ALL branch produces data to insert by performing an inefficient HASH JOIN (that in fact generates a Cartesian product and needs to apply a costly REGEXP filter on that huge intermediate result) and forcing a sort on the join result, so rows will only be handed over to the parent operations until all rows were processed in the join operation - and each branch generates data for a different partition of the target table. Typically it should take several seconds per branch to execute (if you need more time just un-comment the additional REGEXP_REPLACE filters), so you should have plenty of time to cause the invalidation from another session.

This means during the execution of each branch invalidating the cursor (for example by executing either of the two DBMS_STATS calls on the source or target table using NO_INVALIDATE=>FALSE) will lead to a re-start of the statement at the next attempt to write into a new target partition, possibly rolling back rows already inserted into other partitions.

Diagnostics


If you run the provided INSERT or MERGE statement on newer versions of Oracle that include the SQL_EXEC_START and SQL_EXEC_ID in V$ACTIVE_SESSION_HISTORY (or V$SESSION for that matter) and invalidate the cursor during execution and before a partition of the target table gets inserted for the first time then you can see that these entries change as the statement restarts.

In such cases the INVALIDATIONS and LOADS increase in V$SQL accordingly and the OBJECT_STATUS changes from INVALID_UNAUTH to VALID again with each re-start attempt. In newer versions where you can configure the "plan_stat" information for SQL trace to "all_executions" you'll find STAT lines for each execution attempt dumped to the trace file, but only a single final EXEC line, where the elapsed time covers all execution attempts.

The oldest version I've tested was 10.2.0.4, and that one showed already the re-start behaviour, although I would be inclined to think that this wasn't the case with older versions. So if anybody still runs older versions than 10.2.0.4 I would be interested to hear whether the behaviour reproduces or not.

1 comment:

Fairlie Rego said...

FYI...I can reproduce this behaviour on a 10.2.0.3 database

Cheers
Fairlie