Sunday, October 14, 2007

Oracle CPU Costing. More information for the optimizer.

We collect object statistics, so the optimizer has information about the data distribution to calculate execution plans. But, how the optimizer can get information about our system load ? The answer is CPU costing gathering system statistics in the desired time interval.

Since Oracle 9i we can collect system statistics manually but it's in 10g where there are two different system statistics options:
  • No workload (default option): when we don't collect system statistics, the optimizer use CPUSPEEDNW, IOSEEKTIM, IOTFRSPEED from AUX_STATS$ to calculate multiblock read time and single block read time.
  • Workload: when we collect system statistics, the AUX_STATS$ will have information of the system load to calculate multiblock read time, single block read time that reflects the system load.

We can see all of this with a simple example:

  • No workload: the AUX_STATS$ has these values stored in my system (10.2.0.3):

PNAME PVAL1
------------------------------ ----------
CPUSPEEDNW 524.368
IOSEEKTIM 10
IOTFRSPEED 4096
SREADTIM
MREADTIM
CPUSPEED
MBRC
MAXTHR
SLAVETHR

SQL> create table t1 as select * from all_objects;

Table created.

SQL> insert into t1 select * from all_objects;

41205 rows created.

SQL> R
1* insert into t1 select * from all_objects

41205 rows created.

SQL> R
1* insert into t1 select * from all_objects

41205 rows created.

SQL> R
1* insert into t1 select * from all_objects

41205 rows created.

SQL> R
1* insert into t1 select * from all_objects

41205 rows created.

SQL> commit;

Commit complete.

set autotrace traceonly explain statitics

SQL> select * from t1 where object_type = 'SYNONYM';

121176 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 123K| 11M| 795 (3)| 00:00:10 |
|* 1 | TABLE ACCESS FULL| T1 | 123K| 11M| 795 (3)| 00:00:10 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("OBJECT_TYPE"='SYNONYM')


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
11442 consistent gets
0 physical reads
0 redo size
5584938 bytes sent via SQL*Net to client
89254 bytes received via SQL*Net from client
8080 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
121176 rows processed


  • Workload: gathering system statistics in the time interval when we want to collect, the AUX_STATS$ has in my system these values stored:

PNAME PVAL1
------------------------------ ----------
CPUSPEEDNW 524.368
IOSEEKTIM 10
IOTFRSPEED 4096
SREADTIM 23.19
MREADTIM 39.333
CPUSPEED 2812
MBRC 14
MAXTHR 19456
SLAVETHR

SQL> alter system flush buffer_cache;

System altered.

SQL> alter system flush shared_pool;

System altered.

SQL>

SQL> select * from t1 where object_type = 'SYNONYM';

121176 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 123K| 11M| 432 (1)| 00:00:11 |
|* 1 | TABLE ACCESS FULL| T1 | 123K| 11M| 432 (1)| 00:00:11 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("OBJECT_TYPE"='SYNONYM')


Statistics
----------------------------------------------------------
591 recursive calls
0 db block gets
11551 consistent gets
3467 physical reads
0 redo size
5584938 bytes sent via SQL*Net to client
89254 bytes received via SQL*Net from client
8080 SQL*Net roundtrips to/from client
6 sorts (memory)
0 sorts (disk)
121176 rows processed


When we have collected system statistics in the time interval that we consider like representative of the our system load:

exec dbms_stats.gather_system_stats('START');

..... interval time

exec dbms_stats.gather_system_stats('STOP');

The optimizer has more information about the system load that it uses to calculate a more realistic cost on our system.

The recommendation is to use system statistics with workload mode to supply the optimizer with this strategic information to find the more appropiate execution plan.

I have done more testing about this topic following the comment of David Pujol. System statistics can change execution plans, we can see it in this testcase.

I have collected system statistics in a representative workload with:

exec DBMS_STATS.CREATE_STAT_TABLE ('SYS','STAT_TAB','USERS');
exec dbms_stats.gather_system_stats('START', NULL, 'STAT_TAB');

.... period of time of a typical workload

exec dbms_stats.gather_system_stats('STOP', NULL, 'STAT_TAB');

SQL> create table t9 as select * from all_objects;

Table created.

SQL> insert into t9 select * from all_objects;

41205 rows created.

SQL> R
1* insert into t9 select * from all_objects

41205 rows created.

SQL> R
1* insert into t9 select * from all_objects

41205 rows created.

SQL> R
1* insert into t9 select * from all_objects

41205 rows created.

SQL> R
1* insert into t9 select * from all_objects

41205 rows created.
...

SQL> commit;

Commit complete.

create index i1_t9 on t1(object_type) nologging;

exec DBMS_STATS.GATHER_TABLE_STATS (ownname=>USER, tabname=> 'T9',estimate_percent=>dbms_stats.auto_sample_size, method_opt=>'for columns size 20 OBJECT_TYPE',cascade=>true, granularity=>'all');

SQL> SELECT pname, pval1 FROM aux_stats$ WHERE sname = 'SYSSTATS_MAIN';

PNAME PVAL1
------------------------------ ----------
CPUSPEEDNW 803.781
IOSEEKTIM 10
IOTFRSPEED 4096
SREADTIM
MREADTIM
CPUSPEED
MBRC
MAXTHR
SLAVETHR

SQL> select owner, object_name, object_type from t9;

Execution Plan
----------------------------------------------------------
Plan hash value: 3973213776

--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3225K| 123M| 9861 (2)| 00:01:59 |
| 1 | TABLE ACCESS FULL| T9 | 3225K| 123M| 9861 (2)| 00:01:59 |
--------------------------------------------------------------------------

SQL> select owner, object_name, object_type from t9 where object_type='SCHEDULE';

Execution Plan
----------------------------------------------------------
Plan hash value: 533680601

-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 124K| 4846K| 7158 (1)| 00:01:26 |
| 1 | TABLE ACCESS BY INDEX ROWID| T9 | 124K| 4846K| 7158 (1)| 00:01:26 |
|* 2 | INDEX RANGE SCAN | I1_T9 | 124K| | 348 (1)| 00:00:05 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("OBJECT_TYPE"='SCHEDULE')


SQL> execute DBMS_STATS.IMPORT_SYSTEM_STATS('STAT_TAB');

PL/SQL procedure successfully completed.

SQL> SELECT pname, pval1 FROM aux_stats$ WHERE sname = 'SYSSTATS_MAIN';

PNAME PVAL1
------------------------------ ----------
CPUSPEEDNW 803.781
IOSEEKTIM 10
IOTFRSPEED 4096
SREADTIM 3.423
MREADTIM 4.243
CPUSPEED 1421
MBRC 14
MAXTHR 13312
SLAVETHR


SQL> alter system flush buffer_cache;

System altered.

SQL> alter system flush shared_pool;

System altered.


SQL> select owner, object_name, object_type from t9;

Execution Plan
----------------------------------------------------------
Plan hash value: 3973213776

--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3225K| 123M| 4176 (6)| 00:00:15 |
| 1 | TABLE ACCESS FULL| T9 | 3225K| 123M| 4176 (6)| 00:00:15 |
--------------------------------------------------------------------------

SQL> select owner, object_name, object_type from t9 where object_type='SCHEDULE';

Execution Plan
----------------------------------------------------------
Plan hash value: 3973213776

--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 124K| 4846K| 4209 (7)| 00:00:15 |
|* 1 | TABLE ACCESS FULL| T9 | 124K| 4846K| 4209 (7)| 00:00:15 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("OBJECT_TYPE"='SCHEDULE')


When we have system statistics in action (workload), the second query has changed the execution plan from INDEX RANGE SCAN to FULL SCAN. The system has more information about our system and can choose a more convenient execution plan.