Oracle

Database

12c

Automatic Gather

Statistics

for Bulk Loads

 

 

 

 

 

Cost Based Optimizer to accurately determine the cost for an execution plan it must have information about all of the objects (tables and indexes) accessed in the SQL query. This necessary information is commonly referred to as Optimizer statistics. Optimizer statistics are a collection of data that describe the database, and the objects in the database. These statistics are used by the Optimizer to choose the best execution plan for each SQL statement. Gathering Statistics time could be very long. Oracle using different methods reduce statistics collection time. Started in Oracle 10g version gathering statistics of indexes, when indexes create or rebuild (Code Listing 1). You can determine when statistics for object were last collected time from LAST_ANALYZED column. This column in some tables, such as DBA_TABLES, DBA_INDEXES and DBA_TAB_COL_STATISTICS tables.

 

Code Listing 1: Gathering Statistics automatically for Index in Oracle Database 11g

 

SQL> conn / as sysdba

Connected.

SQL> select banner from v$version;

 

BANNER

--------------------------------------------------------------------------------

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

PL/SQL Release 11.2.0.4.0 - Production

CORE    11.2.0.4.0      Production

TNS for 64-bit Windows: Version 11.2.0.4.0 - Production

NLSRTL Version 11.2.0.4.0 - Production

 

SQL> conn usr/usr

Connected.

SQL> create table t as

  2   select level as n, 'Num :'||level as v from dual

  3   connect by level<=1e6;

 

Table created.

 

SQL> select table_name, to_char(last_analyzed,'dd.mm.yyyy hh24:mi:ss') as last_analyzed

  2  from dba_tables where owner='USR';

 

TABLE_NAME                     LAST_ANALYZED

------------------------------ --------------

T

 

SQL> select index_name, last_analyzed from dba_indexes where owner='USR';

 

no rows selected

 

SQL> create index idx_t on t(n);

 

Index created.

 

SQL> select index_name, to_char(last_analyzed,'dd.mm.yyyy hh24:mi:ss') as last_analyzed

  2  from user_indexes;

 

INDEX_NAME                     LAST_ANALYZED

------------------------------ -------------------

IDX_T                          19.10.2015 11:12:16

 

SQL> insert into t select * from t;

 

1000000 rows created.

 

SQL> commit;

 

Commit complete.

 

SQL> alter index idx_t rebuild;

 

Index altered.

 

SQL> select index_name, to_char(last_analyzed,'dd.mm.yyyy hh24:mi:ss') as last_analyzed

  2  from user_indexes;

 

INDEX_NAME                     LAST_ANALYZED

------------------------------ -------------------

IDX_T                          19.10.2015 11:13:29

 

When index created Oracle analyzed index and automatically collected statistics for that index (Last analyzed time: 19.10.2015 11:12:16). Also when index rebuild database compute statistics again (Last analyzed time: 19.10.2015 11:13:29).  Automatically gathering statistics enhanced within Oracle Database 12c Release 1. So, Oracle online gather table statistics after a bulk load operations. What is bulk loads operation? A bulk load operation one of as following statements:

 

• CTAS – create table as select ...

• Insert into ... select  using with --+ APPEND hint (Direct Path Insert)

• Parallel inserts

 

I will show examples for every bulk operation and check new feature of Oracle Database 12c. I installed Oracle Database 12c Release 1 (12.1.0.2) on my demo environment.

 

Code Listing 2: CTAS Gather Statistics Automatically

 

[oracle@oel62-ora12c2 ~]$ export ORACLE_SID=dbprm

[oracle@oel62-ora12c2 ~]$ sqlplus / as sysdba

 

SQL*Plus: Release 12.1.0.2.0 Production on Mon Oct 19 15:47:31 2015

 

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

 

Connected to:

Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production

With the Partitioning, OLAP, Advanced Analytics, Real Application Testing

and Unified Auditing options

 

SQL> select banner from v$version;

 

BANNER

--------------------------------------------------------------------------------

Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production

PL/SQL Release 12.1.0.2.0 - Production

CORE 12.1.0.2.0 Production

TNS for Linux: Version 12.1.0.2.0 - Production

NLSRTL Version 12.1.0.2.0 - Production

 

SQL>

SQL> create user usr identified by usr;

 

User created.

 

SQL> grant create session, resource, dba to usr;

 

Grant succeeded.

 

SQL> alter user usr quota unlimited on users;

 

User altered.

 

SQL> conn usr/usr

Connected.

SQL>

SQL>select table_name, to_char(last_analyzed,'dd.mm.yyyy hh24:mi:ss') as last_analyzed

  2 from dba_tables

  3 where owner='USR';

 

no rows selected

 

SQL> create table t as

  2   select level as n, 'Num :'||level as v from dual

  3   connect by level<=1e6;

 

Table created.

 

SQL>select table_name, num_rows, to_char(last_analyzed,'dd.mm.yyyy hh24:mi:ss')  as last_analyzed

  2 from dba_tables

  3 where owner = 'USR';

 

TABLE_NAME   NUM_ROWS     LAST_ANALYZED

----------- ------------ ---------------------

T      1000000         19.10.2015 18:04:50

 

SQL> select table_name, column_name, num_distinct, notes from dba_tab_col_statistics

  2  where table_name = 'T';

 

TABLE_NAME  COLUMN_NAME  NUM_DISTINCT    NOTES

------------ ---------------------------------------------------------------

T    N       1000000   STATS_ON_LOAD

T    V       998912   STATS_ON_LOAD

 

I created a table (T) with create table as select (CTAS) statement.  Oracle Database analyzed new created table and collected table statistics on load, shows NOTES column of DBA_TAB_COL_STATISTICS table: STATS_ON_LOAD. It is new feature of Oracle Database 12c Release 1.

 

I want load empty table use APPEND hint with insert into … select statement, in other word I’ll try direct-path inserts into the empty table. Quick note: if table is not empty then this operation, automatic collect stats operation is not working (Code Listing 2).

 

Code Listing 3: Direct-Path Inserts non-Empty Table

 

SQL> insert /*+ APPEND */ into t select * from t;

 

1000000 rows created.

 

SQL> commit;

 

Commit complete.

 

SQL>select table_name, num_rows, to_char(last_analyzed,'dd.mm.yyyy hh24:mi:ss')  as last_analyzed

  2 from dba_tables where owner = 'USR';

 

TABLE_NAME NUM_ROWS  LAST_ANALYZED

---------- ---------- -----------------------

T  1000000  19.10.2015 18:04:50

 

I inserted (1000000) rows to table T with Direct-Path insert, but statistics is not changed, because table is not empty. Trying direct-path insert with empty table.

 

Code Listing 4: Direct-path Inserts for Empty Table

 

SQL> create table t1 as select * from t where 1!=1;

 

Table created.

 

SQL>select table_name, num_rows, to_char(last_analyzed,'dd.mm.yyyy hh24:mi:ss')  as last_analyzed

  2 from dba_tables where owner = 'USR';

 

TABLE_NAME NUM_ROWS  LAST_ANALYZED

---------- ---------- -----------------------

T   1000000  19.10.2015 18:04:50

T1   0   19.10.2015 18:06:16

 

SQL> insert /*+ append */ into t1 select * from t;

 

2000000 rows created.

 

SQL> commit;

 

Commit complete.

 

SQL> select table_name, num_rows, to_char(last_analyzed,'dd.mm.yyyy hh24:mi:ss') as last_analyzed

  2 from dba_tables where owner = 'USR';

 

TABLE_NAME NUM_ROWS  LAST_ANALYZED

---------- ---------- -----------------------

T   1000000  19.10.2015 18:04:50

T1   2000000  19.10.2015 18:08:26

 

As you seen from output, Oracle collected table statistics when direct-path inserts operation. It is same also for parallel inserts on empty tables. Quick note: If you are using direct-path insert with values clause, it is not effective for automatic collect statistics.

 

Code Listing 5: Parallel Inserts

 

SQL> create table t2 as select * from t where 1!=1;

 

Table created.

 

SQL>select table_name, num_rows, to_char(last_analyzed,'dd.mm.yyyy hh24:mi:ss')  as last_analyzed

  2 from dba_tables where owner = 'USR';

 

TABLE_NAME NUM_ROWS  LAST_ANALYZED

---------- ---------- -----------------------

T   1000000  19.10.2015 18:04:50

T1   2000000  19.10.2015 18:08:26

T2   0  19.10.2015 18:10:31

 

SQL> insert /*+ enable_parallel_dml parallel */ into t2 select /*+ parallel */ * from t;

 

2000000 rows created.

 

SQL>select table_name, num_rows, to_char(last_analyzed,'dd.mm.yyyy hh24:mi:ss')  as last_analyzed

  2 from dba_tables where owner = 'USR';

 

TABLE_NAME NUM_ROWS  LAST_ANALYZED

---------- ------------- --------------------------

T  1000000  19.10.2015 18:04:50

T1  2000000  19.10.2015 18:08:26

T2  2000000  19.10.2015 18:35:01

 

This feature is working with partitioned tables with a little difference. When inserting into an empty partitioned table using direct path mode, the database will automatically gather global statistics, but partition-level statistics will not collected during the operation. If inserting directly into extended empty partition, then statistics collect for that partition.  Following code listing shows this feature.

 

Code Listing 6: Automatically Collected Statistics for Partitioned Table

 

SQL> CREATE TABLE d

  2(

  3  n, day

  4 )

  5 PARTITION BY RANGE (day)

  6 (

  7  PARTITION day_q2010 VALUES LESS THAN (TO_DATE('01/01/2011', 'DD/MM/YYYY')),

  8  PARTITION day_q2011 VALUES LESS THAN (TO_DATE('01/01/2012', 'DD/MM/YYYY')),

  9  PARTITION day_q2012 VALUES LESS THAN (TO_DATE('01/01/2013', 'DD/MM/YYYY')),

 10  PARTITION day_q2013 VALUES LESS THAN (TO_DATE('01/01/2014', 'DD/MM/YYYY')),

 11  PARTITION day_q2014 VALUES LESS THAN (TO_DATE('01/01/2015', 'DD/MM/YYYY'))

 12 )

 13 AS

 14 select to_number(to_char(to_date('01012010','ddmmyyyy') - 1 + level, 'YYYYMMDD')) AS n,

 15        to_date('01012010','ddmmyyyy') - 1 + level as day

 16  from dual

 17 connect by level <= to_date('31122014','ddmmyyyy') - to_date('01012010','ddmmyyyy') + 1;

 

Table created.

 

SQL> select table_name, num_rows, to_char(last_analyzed,'dd.mm.yyyy hh24:mi:ss')  as last_analyzed

  2  from dba_tables where owner = 'USR';

 

TABLE_NAME NUM_ROWS  LAST_ANALYZED

---------- ---------- -----------------------

T   1000000  19.10.2015 18:04:50

T1   2000000  19.10.2015 18:08:26

T2   2000000  19.10.2015 18:35:01

D         1826  20.10.2015 15:58:07

 

SQL> select table_name, partition_name,

  2          num_rows,to_char(last_analyzed,'dd.mm.yyyy hh24:mi:ss') as last_analyzed

  3  from dba_tab_partitions

  4  where table_name ='D';

 

TABLE_NAME  PARTITION_NAME  NUM_ROWS  LAST_ANALYZED

---------- ------------------- ---------- -----------------

D  DAY_Q2010

D  DAY_Q2011

D  DAY_Q2012

D  DAY_Q2013

D  DAY_Q2014

 

5 rows selected.

 

As you see from output, the database collected statistics for partitioned table, but not partition level. I add new partition and try direct-path insert rows to empty partition. The database collect automatically statistics for partition, which is loaded with direct-path inserts as below:

 

Code Listing 7: Automatically Collected Statistics for Partition Level

 

SQL> alter table d add partition day_q2015 values less than (TO_DATE('01/01/2016', 'DD/MM/YYYY'));

 

Table altered.

 

SQL> select table_name, partition_name,

  2          num_rows,to_char(last_analyzed,'dd.mm.yyyy hh24:mi:ss') as last_analyzed

  3 from dba_tab_partitions

  4 where table_name ='D';

 

 

TABLE_NAME  PARTITION_NAME    NUM_ROWS  LAST_ANALYZED

---------- ------------------- ---------- -----------------

D  DAY_Q2010

D  DAY_Q2011

D  DAY_Q2012

D  DAY_Q2013

D  DAY_Q2014

D  DAY_Q2015

 

6 rows selected.

 

SQL>insert /*+ APPEND */ into D Partition (day_q2015)

  2 select *

  3   from (select level + 2015000 as n, to_date('01012015','ddmmyyyy') - 1 + level from dual

  4              connect by level <= to_date('31122015','ddmmyyyy') - to_date('01012015','ddmmyyyy') + 1);

 

365 rows created.

 

SQL> commit;

 

Commit complete.

 

SQL>select table_name, partition_name,

  2          num_rows,to_char(last_analyzed,'dd.mm.yyyy hh24:mi:ss') as last_analyzed

  3 from dba_tab_partitions

  4 where table_name ='D';

 

TABLE_NAME  PARTITION_NAME  NUM_ROWS  LAST_ANALYZED

---------- ------------------- ---------- -----------------

D  DAY_Q2010

D  DAY_Q2011

D  DAY_Q2012

D  DAY_Q2013

D  DAY_Q2014

D  DAY_Q2015         365   20.10.2015 16:08:10

 

6 rows selected.

 

How to control Optimizer Gather Stats on Load?

 

Automatically gather statistics for tables in Oracle Database 12c controlling with hidden parameter _optimizer_gather_stats_on_load (Code Listing 8). The default value for this parameter is TRUE, it means automatically gather statistics on load enabled. If you don’t want to gather stats online on load, then you can set this parameter FALSE. It is strongly not recommended set hidden parameter.

 

Code Listing 8: Optimizer Gather Stats on Load parameter

 

SQL> select

  2  a.ksppinm ParameterName,

  3  b.ksppstvl SessionValue,

  4  c.ksppstvl InstanceValue,

  5  a.ksppdesc Description

  6 from x$ksppi a,

  7      x$ksppcv b,

  8   x$ksppsv c

  9 where a.indx = b.indx

  10  and  a.indx = c.indx

  11  and  a.ksppinm LIKE '/_optimizer_gather_stats_on_load' escape'/';

 

PARAMETERNAME       SESSIONVALUE   INSTANCEVALUE DESCRIPTION

--------------------------------- -------------- ------------- --------------------------------

_optimizer_gather_stats_on_load TRUE  TRUE enable/disable online statistics gathering

 

You can use GATHER_OPTIMIZER_STATISTICS hint to explicitly enable, also NO_GATHER_OPTIMIZER_STATISTICS hint to disable this feature. This feature will not work on any objects owned be SYS, also not work on nested tables, index-organized tables, external tables, tables with virtual column or global temporary tables that are defined as on commit delete rows.

 

Conclusion

Gathering Optimizer Statistics take more time in large databases. This feature is very helpful for reduce statistics gather times. Upgrade your database to Oracle Database 12c for using this and more new features.

 

 

advertisement

 

 

otech magazine - oracle winter 2015 issue  copyright otech magazine 2015

terms and conditions