Query Tuning, 2 of 5

Optimizing Queries with Statistics

Query optimization with statistics uses the collected statistics on the tables and indexes in a query to select an execution plan that can process the query in the most efficient manner. The optimizer attempts to choose the best execution plan based on the following parameters:

the selectivity on the CONTAINS predicate
the selectivity of other predicates in the query
the CPU and I/O costs of processing the CONTAINS predicates

The following sections describe how to use statistics with the extensible query optimizer. Optimizing with statistics allows for a more accurate estimation of the selectivity and costs of the CONTAINS predicate and thus a better execution plan.

Collecting Statistics

By default, Oracle uses the cost-based optimizer to determine the best execution plan for a query. To allow the optimizer to better estimate costs, you can calculate the statistics on the table you query. To do so, issue the following statement:

ANALYZE TABLE <table_name> COMPUTE STATISTICS;

Alternatively, you can estimate the statistics on a sample of the table as follows:

ANALYZE TABLE <table_name> ESTIMATE STATISTICS 1000 ROWS;

ANALYZE TABLE <table_name> ESTIMATE STATISTICS 50 PERCENT;

These statements collect statistics on all the objects associated with table_name including the table columns and any indexes (b-tree, bitmap, or Text domain) associated with the table. To re-collect the statistics on a table, you can issue the ANALYZE command as many times as necessary or use the DBMS_STATS package

See Also:

Oracle9i SQL Reference and Oracle9i Database Performance Guide and Reference for more information about the ANALYZE command.

By collecting statistics on the Text domain index, the Oracle cost-based optimizer is able to do the following:

estimate the selectivity of the CONTAINS predicate
estimate the I/O and CPU costs of using the Text index, that is, the cost of processing the CONTAINS predicate using the domain index
estimate the I/O and CPU costs of each invocation of CONTAINS

Knowing the selectivity of a CONTAINS predicate is useful for queries that contain more than one predicate, such as in structured queries. This way the cost-based optimizer can better decide whether to use the domain index to evaluate CONTAINS or to apply the CONTAINS predicate as a post filter.

Example

Consider the following structured query:

select score(1) from tab where contains(txt, 'freedom', 1) > 0 and author = 
'King' and year > 1960;

Assume the author column is of type VARCHAR2 and the year column is of type NUMBER. Assume that there is a b-tree index on the author column.

Also assume that the structured author predicate is highly selective with respect to the CONTAINS predicate and the year predicate. That is, the structured predicate (author = 'King') returns a much smaller number of rows with respect to the year and CONTAINS predicates individually, say 5 rows versus 1000 and 1500 rows respectively.

In this situation, Oracle can execute this query more efficiently by first doing a b-tree index range scan on the structured predicate (author = 'King'), followed by a table access by rowid, and then applying the other two predicates to the rows returned from the b-tree table access.

Note:

When statistics are not collected for a Text index, the cost-based optimizer assumes low selectivity and index costs for the CONTAINS predicate.

Re-Collecting Statistics

After synchronizing your index, you can re-collect statistics on a single index to update the cost estimates. To do so, you can issue any of the following statements:

ANALYZE INDEX <index_name> COMPUTE STATISTICS;
or 

ANALYZE INDEX <index_name> ESTIMATE STATISTICS SAMPLE 1000 ROWS;

or

ANALYZE INDEX <index_name> ESTIMATE STATISTICS SAMPLE 50 PERCENT;

Deleting Statistics

You can delete the statistics associated with a table by issuing:

ANALYZE TABLE <table_name> DELETE STATISTICS;

You can delete statistics on one index by issuing the following statement:

ANALYZE INDEX <index_name> DELETE STATISTICS;