Wednesday, April 17, 2013

ASM AU Size And LMT AUTOALLOCATE

When using Locally Managed Tablespaces (LMT) with variable, system managed extent sizes (AUTOALLOCATE) and data files residing in ASM the Allocation Unit (AU) size can make a significant difference to the algorithm that searches for free extents.

The corresponding free extent search algorithm when searching for free extents >= the AU size seems to only search for free extents on AU boundaries in order to avoid I/O splitting.

Furthermore the algorithm seems to use two extent sizes when searching for free extents: A "desired" (for example 8MB) and a "minimum acceptable" (for example 1MB) extent size - however when performing the search the "desired" size seems to be relevant when limiting the search to free extents on AU boundaries.

This can lead to some surprising side effects, in particular when using 4MB AUs.

It effectively means that although you might have plenty of space (I've seen cases with more than 90% free of a several hundred GB tablespace) processes inserting data might fail with "unable to extend segment" error messages when the "desired" extent size is >= 4MB and the AU size is 4MB, although there might be for example hundreds of (fragmented) free extents available of < 8MB size.

This is particularly relevant to Exadata systems because I believe the default ASM AU size is 4MB there.

This behaviour can be influenced by using event 60060, which disables the extent alignment to AU boundaries.

Here is a simple test case demonstrating the issue. It deliberately interleaves extents of two tables and then drops one of them to leave non-contiguous free space behind:

-----------------------------------------------------------------------
-- Exadata / ASM 4MB AUs LMT AUTOALLOCATE extent fragmentation issue --
-----------------------------------------------------------------------
--
-- Tablespaces using Locally Managed Extents and the AUTOALLOCATE option
-- seem to behave differently on Exadata storage / ASM 4MB AUs than on regular storage / ASM 1MB AUs
--
-- On regular storage / ASM 1MB AUs the algorithm has more flexibility when allocating a next extent
-- For example, if according to the rules an 8MB extent would be required but only 1MB extents
-- are available, it gracefully re-uses the smaller extents, at least to some degree,
-- if they are not too small (for example it won't start re-using 64K free extents if an 8MB extent should be allocated)
--
-- On Exadata / ASM 4MB AUs the same test case fails with an ORA-01652 since it obviously can't re-use the existing free extents
-- Since in the below example there is no suitable extent on AU boundary available the table creation fails
--
-- Given a corresponding usage pattern this can lead to huge space wastage or "unable to extend" errors with plenty of (fragmented) free space
-- in Exadata / ASM 4MB AUs tablespaces

set echo on timing on time on

-- Enable this EVENT to disable the extent alignment to AU boundaries
-- This will allow below table to get created also in a tablespace using 4MB AUs in ASM
-- alter session set events '60060 trace name context forever, level 1';

drop tablespace auto_alloc_test including contents and datafiles;

-- Create tablespace either on ASM / Exadata storage or outside on regular file system
-- by uncommenting the datafile name clause
create tablespace auto_alloc_test
datafile
--'auto_alloc.dbf'
size 400M
extent management local
autoallocate
segment space management auto;

-- Create two tables
begin
  for i in 1..2 loop
    execute immediate 'create table table'||i||'(col1 number,col2 number) /*segment creation immediate*/ tablespace auto_alloc_test';
  end loop;
end;
/

-- Interleave the extents until no space left (the ORA-01653 is expected)
-- This will generate lots of 1MB extents
begin
  for i in 1..1000 loop
    execute immediate 'alter table table'||(mod(i,2)+1)||' allocate extent';
  end loop;
end;
/

-- Free half of the tablespace, but free extents are fragmented, and max. free contiguous space is 1MB
-- If you drop TABLE2, no extents on 4MB AU boundaries will be available
drop table table2;
-- If you drop TABLE1, extents on 4MB AU boundaries will be available and below CREATE TABLE will be successful
--drop table table1;

select sum(bytes)/1024/1024 sum_free_mb, max(bytes)/1024/1024 max_free_mb from dba_free_space where tablespace_name='AUTO_ALLOC_TEST';

-- Create a table that fits into the free space (less than 200MB) but usually will request 8MB extents
--
-- When you drop TABLE2 above, this fails on Exadata storage/ ASM 4MB AUs because it attempts to find extents on AU boundaries that are not available
--
-- When you drop TABLE2 above, this succeeds on regular storage / ASM 1MB AUs because it gracefully re-uses the existing 1MB extents if no 8MB extents can be found among the free extents
--
-- This leads to ORA-01652 error messages on Exadata storage / ASM 4MB AUs with a suitable extent usage pattern although there is plenty of (fragmented) free space
create table test
tablespace auto_alloc_test
as
select rpad('x', 100) as col1 from
(select /*+ cardinality(1e5) */ * from dual connect by level <= 1e5),
(select /*+ cardinality(11) */ * from dual connect by level <= 11)
;

-- In case of success check segment size and extent layout
select bytes/1024/1024 as MB from dba_segments where segment_name = 'TEST' and owner = USER;

select extent_id, bytes, blocks from user_extents where segment_name = 'TEST';

-- In case of failure check AU boundaries of free extents
with au_size
as
(
  select allocation_unit_size from v$asm_diskgroup where name = (select substr(file_name, 2, instr(file_name, '/') - 2) from dba_data_files where tablespace_name = 'AUTO_ALLOC_TEST')
)
select count(*) from (
select block_id * 16384 / allocation_unit_size as AU_info, a.* from dba_free_space a, au_size where tablespace_name = 'AUTO_ALLOC_TEST'
)
where au_info = trunc(au_info);

If you run this test case on a tablespace using 4MB AU ASM data files, it will fail to create the last table, although it only requires approx. 127MB and there is approx. 200MB of (non-contiguous) free space (200 times 1MB extents), simply because the "desired" extent size is 8MB and no suitable free extents on "AU boundary" can be found.

The relevance of the "AU boundary" condition can easily be checked by changing the test case to drop the other table. This table's extents allocate all the "AU boundaries" and hence the table creation will succeed as a sufficient number of (1MB) extents on "AU boundaries" could be found.

If you repeat the same test case with the event set or using data files with a 1MB AU (or simply standard file system data files) the table will be created successfully (even with no extents on "AU boundaries" available), re-using the available 1MB extents. Although the "desired" extent size is still 8MB, the "minimum acceptable" extent size of 1MB allows the re-usage since the free extents don't need to be aligned on AU boundaries.

Performance Vs. Space Usage


So with the event you can choose between performance (larger extents, no I/O splitting for desired larger extents) and space usage (non-contiguous, smaller extents).

Another option in such cases is the usage of tablespaces with UNIFORM extent allocation, however it might be hard to find a good extent size when dealing with many segments of vastly differing sizes.

Footnote


This should only be relevant if the processes inserting into the tablespace manage to run at a high concurrency, therefore interleaving the extents (causing non-contiguous free extents when moving / dropping / truncating segments) in conjunction with extent sizes of vastly different sizes.

Interestingly I've observed this behaviour on an Exadata system running 11.2.0.3 BP14 and using a tablespace where mostly partitioned objects resided. Since partitioned objects are using an initial extent size of 8MB in recent versions you wouldn't expect this problem in such a scenario, but it looks like that mixing regular load processes with ALTER TABLE ... MOVE COMPRESS PARALLEL can lead to much smaller extent sizes than 8MB and therefore allowing the issue to occur.

Friday, April 5, 2013

Free Webinar

It's webinar time again.

Join me on Wednesday, May 8th at AllThingsOracle.com for an overview session on the specifics of Oracle Parallel Execution.

The session starts at 16:00 UK (17:00 Central European) time. The webinar is totally free and the recording will made available afterwards.

Here's the link to the official landing page where you can register and below is the official abstract:

Abstract


Oracle Parallel Execution, a feature of the Enterprise Edition, allows you to automatically distribute the processing of a SQL statement execution among multiple worker processes. This requires additional effort when analysing and troubleshooting such parallel executions, since it adds complexity that is simply not there with normal serial execution where only a single process is involved.

In this webinar I'll provide an overview of what these additional challenges are and how these can be approached. A live Q&A session will follow the presentation.