Siebel Analytics Performance Tuning Guide > The Database Servers > Periodic Tuning Tasks >

Refresh Optimizer Statistics


Given a query, a database server generates a set of potential execution plans, and then executes the optimal plan. If cost-based optimization is enabled, the optimizer chooses the least costly plan. The server estimates the cost of each plan based on statistics stored in the database catalog, statistics that include characteristics of tables, indexes, aggregate tables, and other structures that determine access paths. If the statistics are not fresh, the optimizer can generate a suboptimal plan.

Cost-Based Optimization

To take advantage of cost-based optimizers, you must:

  1. Select and evaluate indexes for the database.
  2. Enable the cost-based optimizer.
  3. Generate statistics for database objects.
  4. Periodically update the statistics so they reflect the current situation.

Cost-based optimizers are well-suited for data warehousing environments.

CAUTION:  Do not assume that a technique that works well for one database will work similarly for others. Oracle, DB2, and SQL Server each have their own optimizers which generate query execution plans peculiar to their individual background assumptions. Query execution plans can differ markedly.

The Necessity of Current Statistics

Databases change over time, and as they change the statistics become stale and must be updated. Otherwise, the generated query plan may not be optimal.

To insure optimal query performance, the administrator must periodically calculate statistics for the database using tools provided by the database vendor. These tool vary by vendor. Statistics should be calculated for all tables and indexes which include those used by the server's query rewriter (aggregate tables in DB2 and materialized views in Oracle).

When You Refresh Statistics

To refresh statistics for a table, you submit a single Oracle DDL statement. This statement refreshes statistics for the table and all the indexes defined on the table. Oracle recursively splits up the analyze statement to analyze the table and its indexes.

When you execute the default statement, such as:

Analyze table w_person_d compute statistics ;

Oracle processes the entire analyze statement as an autonomous unit contained within a single undo transaction. This will also be held in a single instance of the SORT_AREA_SIZE in the PGA.

To speed up an analyze statement, use this sequence of steps:

Alter session set sort_area_size = 50000000
Analyze table w_person_d compute statistics for table;
Analyze index w_person_d_idx1 compute statistics ;

Fresh statistics for an index are critical to the cost-based optimizer.

Vendor Tools to Refresh Statistics

All three database vendors provide tools that calculate and update statistics for the cost-based optimizers. The table below lists these tools by vendor.

Database
Tool
Description
DB2
RUNSTATS
SQL command that calculates statistics for the physical characteristics of a table and its indexes, which include number of records, number of pages, and average record length.
For additional information see DB2 Administration Guide V7.1, Volume 3: Performance.
Oracle 8i and 9i
Pre-Oracle 8i
DBMS_STATS
ANALYZE
DBMS_STATS is a PL/SQL package that generates and manages statistics for cost-based optimization. Use this package to gather, modify, view, statistics on indexes, tables, columns, and partitions.
The SQL ANALYZE command updates statistics for pre-8i database tables.
For additional information, see Oracle9i Database Performance Guide and Reference.
SQL Server 2000
sp_createstats
A system stored procedure that creates indexes on all eligible columns in all user tables in the current database.
Microsoft recommends that the database option AUTO_CREATE_STATISTICS is set to ON, which is the default and which automatically updates statistics.
For additional information, see the Microsoft white paper RDBMS Performance Tuning Guide for Data Warehousing.

When To Update Statistics

The following events signal that statistics for the data warehouse should be updated.

  1. During the initial load of data into the data warehouse.
    1. Update statistics for the general tables (_G) immediately after they have been loaded and before the staging and dimension table loads.
    2. Update statistics for the dimension tables (_D) immediately after they have been loaded but before loading the fact tables.
    3. Update statistics for the fact tables (_F) immediately after they have been loaded but before running the remaining ETL.
    4. After the data warehouse has been loaded with the initial ETL and the query indexes have been created, update statistics for the indexes.
  2. After an incremental update or load.
  3. Any time when an index within the data warehouse is created or dropped.
  4. After an index has been reorganized.

NOTE:  For information on the Query indexes in the Siebel Data Warehouse refer to Siebel Analytics Server Administration Guide and Siebel Analytics Installation and Configuration Guide.


 Siebel Analytics Performance Tuning Guide 
 Published: 18 April 2003