Thursday, March 1, 2012

Join Views, ROWIDs And Query Transformations

Here is an odd little bug that was discussed a couple of weeks ago on the OTN forums.

It's about queries on join views by ROWID that fail with "ORA-01410: invalid ROWID" under certain circumstances. The bug can only be reproduced when using the 11.2 code base. In fact the same setup will cause an internal error in 11.1, but 10.2 will return correct results.

It's probably not a very common scenario but it is an interesting example of how features that work fine by themselves can cause problems when used together.

First of all (hopefully) some of you may ask: How is it possible to query from a join view by ROWID, since the view is based on multiple objects and hence doesn't have a simple one-to-one mapping to a ROWID of a single table?

The answer is: It requires "Key-Preserved Tables". This is a term that is frequently used by Oracle in their documentation, in particular in the context of updatable join views. This means that you cannot query every join view by ROWID - it has to meet certain prerequisites otherwise Oracle will throw an error "ORA-01445: cannot select ROWID from, or sample, a join view without a key-preserved table".

Obviously Oracle needs to be able to determine if there is at least one table in the view for which it is guaranteed that its rows can show up at most once in the result set of the view, which means that there need to be a minimum set of unique or primary key constraints at least if the view involves more than a single table. Furthermore Oracle needs to be able to determine this automatically, and depending on the complexity and expressions used, this is not always the case. See this post by Jonathan Lewis about a discussion of possible improvements in this regard. Other database vendors seem to allow at least to some degree more complex queries to be recognized as what Oracle calls "key-preserved".

Let's start with a simple example that creates a simple join view on three tables, furthermore appropriate constraints are created. The sample generates more constraints than strictly necessary to get a key-preserved table. The advantage of these additional constraints is that Oracle can perform a table or join elimination transformation when querying data only from a subset of the tables joined in the view.


drop table t1;

purge table t1;

drop table t2;

purge table t2;

drop table t3;

purge table t3;

drop view v;

drop view v_t1;

create table t1
as
select
rownum as id
, mod(rownum, 1000) + 1 as fk_id
, rpad('x', 100) as filler
from
dual
connect by
level <= 1000
;

exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1')

create table t2
as
select
rownum as id
, mod(rownum, 100) + 1 as fk_id
, rpad('x', 100) as filler
from
dual
connect by
level <= 1000
;

exec dbms_stats.gather_table_stats(null, 't2', method_opt => 'for all columns size 1')

create table t3
as
select
rownum as id
, mod(rownum, 10) + 1 as fk_id
, rpad('x', 100) as filler
from
dual
connect by
level <= 100
;

exec dbms_stats.gather_table_stats(null, 't3', method_opt => 'for all columns size 1')

alter table t1 add constraint t1_pk primary key (id);

alter table t2 add constraint t2_pk primary key (id);

alter table t3 add constraint t3_pk primary key (id);

alter table t1 add constraint t1_t2_fk foreign key (fk_id) references t2 (id);

alter table t2 add constraint t2_t3_fk foreign key (fk_id) references t3 (id);

alter table t1 modify fk_id not null;

alter table t2 modify fk_id not null;

alter table t3 modify fk_id not null;

create or replace view
v
as
select
t1.id as t1_id
, t1.fk_id as t1_fk_id
, t1.filler as t1_filler
, t2.id as t2_id
, t2.fk_id as t2_fk_id
, t2.filler as t2_filler
, t3.id as t3_id
, t3.fk_id as t3_fk_id
, t3.filler as t3_filler
from
t1
, t2
, t3
where
t1.fk_id = t2.id
and t2.fk_id = t3.id
;


So this is a simple join, and given the current setup the key-preserved table is T1.

We can verify this by this simple query:


SQL> select object_name from user_objects where object_id = (select sys.dbms_rowid.rowid_object(rowid) from v where rownum <= 1);

OBJECT_NAME
--------------------------------------------------------------------------------------------------------------------------------
T1


So Oracle picks automatically T1 as source for the ROWID.

Let's see the join table elimination feature in action by using the following sample:


SQL> column rid new_value rid
SQL>
SQL> select rowid as rid from v where rownum <= 1;

RID
------------------
AAAVTGAAIAAAAEBAAA

SQL>
SQL> select t1_id from v where rowid = '&rid';
old 1: select t1_id from v where rowid = '&rid'
new 1: select t1_id from v where rowid = 'AAAVTGAAIAAAAEBAAA'

T1_ID
----------
1

SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------
SQL_ID 5stwqhguqgjr0, child number 0
-------------------------------------
select t1_id from v where rowid = 'AAAVTGAAIAAAAEBAAA'

Plan hash value: 487051824

-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 1 (100)| |
| 1 | TABLE ACCESS BY USER ROWID| T1 | 1 | 20 | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------


13 rows selected.

SQL>
SQL> column rid clear


Oracle 11.2 reduces this query to the simplest possible form - the tables T2 and T3 do not even show up in the execution plan. The validated and non-deferrable foreign key constraints ensure that Oracle can safely ignore the join to these tables given that only data from T1 is accessed.

So the Query Transformation seems to work pretty well and everything is fine so far.

Now let's modify the setup a little bit. As you might have noticed I've crafted the T2 and T1 tables deliberately in such a way that I can define a 1:1 relationship between them: The T1.FK_ID column is also unique.

So let's declare a unique constraint on T1.FK_ID to tell the database about this 1:1 relationship. Furthermore I'll define the simplest possible view on top of T1 that does nothing else than simply a SELECT * FROM T1:


alter table t1 add constraint t1_uq unique (fk_id);

create or replace view v_t1 as select * from t1;

create or replace view
v
as
select
t1.id as t1_id
, t1.fk_id as t1_fk_id
, t1.filler as t1_filler
, t2.id as t2_id
, t2.fk_id as t2_fk_id
, t2.filler as t2_filler
, t3.id as t3_id
, t3.fk_id as t3_fk_id
, t3.filler as t3_filler
from
v_t1 t1
, t2
, t3
where
t1.fk_id = t2.id
and t2.fk_id = t3.id
;


Finally I've changed the view V to refer to the view V_T1 instead of the base table.

Let's repeat the check about the source of the ROWID now:


SQL> select object_name from user_objects where object_id = (select sys.dbms_rowid.rowid_object(rowid) from v where rownum <= 1);

OBJECT_NAME
--------------------------------------------------------------------------------------------------------------------------------
T2


Oracle now has selected T2 as the key-preserved table which is fine since T2 is now also guaranteed to be unique in the result set of the view.

Finally, let's repeat the query that I used to demonstrate the join elimination query transformation:


SQL> column rid clear
SQL> set echo on verify on
SQL>
SQL> column rid new_value rid
SQL>
SQL> select rowid as rid from v where rownum <= 1;

RID
------------------
AAAVTHAAIAAAAGBAAA

SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------
SQL_ID 8ns6u01cr94xa, child number 0
-------------------------------------
select rowid as rid from v where rownum <= 1

Plan hash value: 1420877628

--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 2 (100)| |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | NESTED LOOPS | | 1 | 20 | 2 (0)| 00:00:01 |
| 3 | INDEX FAST FULL SCAN| T1_UQ | 1000 | 4000 | 2 (0)| 00:00:01 |
|* 4 | INDEX UNIQUE SCAN | T2_PK | 1 | 16 | 0 (0)| |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM<=1)
4 - access("FK_ID"="T2"."ID")


22 rows selected.

SQL>
SQL> select t1_id from v where rowid = '&rid';
old 1: select t1_id from v where rowid = '&rid'
new 1: select t1_id from v where rowid = 'AAAVTHAAIAAAAGBAAA'
select t1_id from v where rowid = 'AAAVTHAAIAAAAGBAAA'
*
ERROR at line 1:
ORA-01410: invalid ROWID


SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------
SQL_ID cgvynnw6sthrw, child number 0
-------------------------------------
select t1_id from v where rowid = 'AAAVTHAAIAAAAGBAAA'

Plan hash value: 487051824

-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 1 (100)| |
| 1 | TABLE ACCESS BY USER ROWID| T1 | 1 | 20 | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------


13 rows selected.

SQL>
SQL> column rid clear


Oops, that doesn't look too good: Oracle applied the ROWID to the wrong table respectively eliminated T2 from the execution plan although it uses T2 to obtain the ROWID. As you can see from the execution plan of the initial query that fetches the first ROWID from the view, T2 is not eliminated in that case.

So the moral of the story: Simple View Merging, another Query Transformation together with Join Elimination causes Oracle 11.2 to apply the ROWID to the wrong table in case multiple possible candidates for key-preserved tables exist. You can see this from the optimizer trace file where suddenly a T1.ROWID = '...' predicate pops up.

When replacing the view with the base table the problem cannot be reproduced. Preventing one of the transformations (Join Elimination or View Merging) also prevents the issue.

Interestingly changing the optimizer features to something below 11g also allowed avoiding the bug in the OTN thread, but with the given test case here I can still reproduce the problem on 11.2.0.3 (but not 11.2.0.1 and 11.2.0.2) when setting the OPTIMIZER_FEATURES_ENABLE to 10.2.0.4 for example:


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> alter session set optimizer_features_enable = '10.2.0.4';

Session altered.

SQL> select object_name from user_objects where object_id = (select sys.dbms_rowid.rowid_object(rowid) from v where rownum <= 1);

OBJECT_NAME
--------------------------------------------------------------------------------------------------------------------------------
T2

SQL>
SQL> set echo on verify on
SQL>
SQL> column rid new_value rid
SQL>
SQL> select rowid as rid from v where rownum <= 1;

RID
------------------
AAAVAUAAEAAAEnTAAA

SQL>
SQL> select t1_id from v where rowid = '&rid';
old 1: select t1_id from v where rowid = '&rid'
new 1: select t1_id from v where rowid = 'AAAVAUAAEAAAEnTAAA'
select t1_id from v where rowid = 'AAAVAUAAEAAAEnTAAA'
*
ERROR at line 1:
ORA-01410: invalid ROWID


SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 72yy78z1ggn75, child number 1
-------------------------------------
select t1_id from v where rowid = 'AAAVAUAAEAAAEnTAAA'

Plan hash value: 396691268

-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 2 (100)| |
| 1 | NESTED LOOPS | | 1 | 27 | 2 (0)| 00:00:01 |
| 2 | TABLE ACCESS BY USER ROWID| T1 | 1 | 20 | 1 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN | T2_PK | 1 | | 0 (0)| |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("FK_ID"="T2"."ID")


20 rows selected.

SQL>
SQL> column rid clear


Although it can be seen that the lower optimizer features setting resulted in a different plan where T2 wasn't eliminated, the ROWID predicate was still applied to the wrong table, which is clearly a change in behaviour compared to previous releases, of course in particular 10.2.0.4. So this is again one of the cases where setting OPTIMIZER_FEATURES_ENABLE doesn't reproduce exactly the same plan.

So my preliminary analysis in the OTN thread wasn't entirely correct - the new Outer Join elimination Query Transformation introduced in 11g wasn't causing the problems in the original case - these additional transformations were just triggering the side effect: It needs both a join elimination and a view merging transformation. That is the reason why the test case includes a third table. When omitting the third table, or for example querying columns from the third table that prevent the elimination of T3 the error doesn't reproduce either.

Footnote: There are more variations of similar bugs. See for example "Bug 10129357: Assorted errors referencing ROWID from a view with subquery" where other kinds of query transformations result in invalid SQL throwing ROWID related errors.

3 comments:

  1. hi Randolf,
    the last time you helped me with that topic in the OTN.
    Unfortunatelly I got the message that this error occured again even with disabling the "outer join elimination" feature.
    Therefore I googled around and saw that you updated your blog with tests of that issue.
    Do you have an advice, how to disable both: Simple View Merging, and the Outer Join Elimination?
    For disabling the second one, I set the _fix_control parameter to "4967068:OFF"... how the best thing to disable the other one or will I have negative side effects when doing this?

    thanks again in advance for your advises and best regards
    Stefan

    ReplyDelete
  2. Hi Stefan,

    first of all I think it would require a detailed analysis of the particular case how SQL statement and the execution plan in question looks like.

    It is probably not a good idea to disable view merging in general, the side effects would potentially be huge.

    It might be more reasonable to see if a simple NO_MERGE hint for the particular statement failing is sufficient to prevent the problem without suffering too much performance.

    Potentially a re-write of the statement allows preventing the problem, too.

    Randolf

    ReplyDelete
  3. hi Randolf,

    thanks for your update on this.
    Even before I read your update, I modified the code of the view with that NO_MERGE hint ;-)

    ... and fortunatelly it worked well after that change!

    thanks again for your help and best regards Stefan

    ReplyDelete