Oracle8i interMedia Text Reference
Release 8.1.5

A67843-01

Library

Product

Contents

Index

Prev Next

A
Working with the Extensible Query Optimizer

This appendix discusses how to use the extensible query optimizer to optimize queries with CONTAINS predicates.

The following topics are covered:

Optimizing Queries with Statistics

Query optimization with statistics involves using the collected statistics on the tables and indexes involved in a query to select an execution plan that can process the query in the most efficient manner. The optimizer attempts to choose the best execution plan based on the following parameters:

The following sections describe how to use statistics with the extensible query optimizer. Optimizing with statistics allows for a more accurate estimation of the selectivity and costs of the CONTAINS predicate and thus a better execution plan.

Collecting Statistics

By default, the extensible query optimizer is enabled. To use the extensible optimizer, you must calculate the statistics on the table you query. To do so, issue the following statement:

ANALYZE TABLE <table_name> COMPUTE STATISTICS; 

This statement collects statistics on all the objects associated with table_name including the table columns and any indexes (b-tree, bitmap or Text domain) associated with the table. You can issue the above ANALYZE command as many times as necessary to re-collect the statistics on a table.

See Also:

For more information on the ANALYZE command, see Oracle8 SQL Reference and Oracle8 Tuning.  

By collecting statistics on the Text domain index, the extensible query optimizer is able to do the following:

Knowing the selectivity of a CONTAINS predicate is useful for queries that contain more than one predicate, such as in structured queries. This way the extensible query optimizer can better decide whether to use the domain index to evaluate CONTAINS or to apply the CONTAINS predicate as a post filter.

Example

Consider the following structured query:

select score(1) from tab where contains(txt, 'freedom', 1) > 0 and author = 
'King' and year > 1960;

Assume the author column is of type VARCHAR2 and the year column is of type NUMBER. Assume that there is a b-tree index on the author column.

Also assume that the structured author predicate is highly selective with respect to the CONTAINS predicate and the year predicate; that is, the structured predicate (author = 'King') returns a much smaller number of rows with respect to the year and CONTAINS predicates individually, say 5 rows versus 1000 and 1500 rows respectively.

In this situation, Oracle can execute this query more efficiently by first doing a b-tree index range scan on the structured predicate (author = 'King'), followed by a table access by rowid, and then applying the other two predicates to the rows returned from the b-tree table access.

Without associating a statistics type with indextype context, the extensible query optimizer will always choose to process the CONTAINS() predicate using the text domain index.


Note:

When the statistics are not collected for a Text index, the behavior is the same as not enabling the extensible query optimizer.  


Re-Collecting Statistics

You can re-collect statistics on a single index by issuing:

ANALYZE INDEX <index_name> COMPUTE STATISTICS;

Deleting Statistics

You can delete the statistics associated with a table by issuing:

ANALYZE TABLE <table_name> DELETE STATISTICS;

You can delete statistics on one index by issuing the following statement:

ANALYZE INDEX <index_name> DELETE STATISTICS;

Disabling and Enabling the Extensible Query Optimizer

By default the extensible query optimizer is enabled. To disable the extensible query optimizer, issue the following statements:

DISASSOCIATE STATISTICS FROM INDEXTYPES ConText;
DISASSOCIATE STATISTICS FROM PACKAGES ctx_contains;

After disabling the extensible query optimizer, you can re-enable it. To do so, issue the following SQL statements as CTXSYS:

ASSOCIATE STATISTICS WITH INDEXTYPES ConText USING textoptstats;
ASSOCIATE STATISTICS WITH PACKAGES ctx_contains USING textoptstats;

Optimizing Queries for Response Time

By default, Oracle optimizes queries for throughput. This results in queries returning all rows in shortest time possible.

However, in many cases, especially in a web-application scenario, queries must be optimized for response time, when you are only interested in obtaining the first n hits of a potentially large hitlist in the shortest time possible.

The following sections describe how to optimize Text queries for response time. You can do so in two ways:

Better Response Time with FIRST_ROWS

You can change the default query optimizer mode to optimize for response time using the FIRST_ROWS hint. When queries are optimized for response time, Oracle returns the first n rows in the shortest time possible.

For example, consider the following PL/SQL block that uses a cursor to retrieve the first 20 hits of a query and uses the FIRST_ROWS hint to optimize the response time:

declare
cursor c is 
select /*+ FIRST_ROWS */ pk, score(1), col from ctx_tab 
            where contains(txt_col, 'test', 1) > 0 order by score(1) desc;
begin
for i in c
loop
insert into t_s values(i.pk, i.col);
exit when c%rowcount > 21;
end loop;
end;
/

The cursor c is a SELECT statement that returns the rowids that contain the word test in sorted order. The code loops through the cursor to extract the first 20 rows. These rows are stored in the temporary table t_s.

With the FIRST_ROWS hint, Oracle instructs the Text index to return rowids in score-sorted order, if possible.

Without the hint, Oracle sorts the rowids after the Text index has returned all the rows in unsorted order that satisfy the CONTAINS predicate. Retrieving the entire result set as such takes time.

Since only the first 20 hits are needed in this query, using the hint results in better performance.


Note:

Use the FIRST_ROWS hint when you need only the first few hits of a query. When you need the entire result set, do not use this hint as it might result in poorer performance.

In addition, the FIRST_ROWS hint can be used with or without enabling the extensible query optimizer.  


Other Behavior with FIRST_ROWS

Besides instructing the Text index to return hits in score-sorted order, the FIRST_ROWS hint also tries to avoid blocking operations when optimizing queries for response time. Blocking operations include merge joins, hash joins and bitmap operations.

As a result, using the FIRST_ROWS hint to optimize for response time might result in a different execution plan than using CHOOSE with DOMAIN_INDEX_SORT, which also optimizes for response time.

You can examine query execution plans using the EXPLAIN PLAN command in SQL.

See Also:

For more information about the query optimizer and using hints such as FIRST_ROWS and CHOOSE, see Oracle8i Concepts and Oracle8i Tuning.

For more information about the EXPLAIN PLAN command, see Oracle8i SQL Reference  

Better Response Time with CHOOSE

When you use the CHOOSE or ALL_ROWS optimizer hints, the query is optimized for throughput. This is the default optimizer mode. In this mode, Oracle does not instruct the Text domain index to return score-sorted rows, choosing instead to sort all the rows fetched from the Text index.

To optimize for fast response time under CHOOSE or ALL_ROWS modes, you can use the DOMAIN_INDEX_SORT hint as follows:

declare
cursor c is 
select /*+ CHOOSE DOMAIN_INDEX_SORT */ pk, score(1), col from ctx_tab 
            where contains(txt_col, 'test', 1) > 0 order by score(1) desc;
begin
for i in c
loop
insert into t_s values(i.pk, i.col);
exit when c%rowcount > 21;
end loop;
end;
/


Note:

Although you can optimize fro response with this method as well as with FIRST _ROWS by itself, the actual execution plans of the two methods obtained with EXPLAIN PLAN might be different for a given query.  


See Also:

For more information about the query optimizer and using hints such as FIRST_ROWS and CHOOSE, see Oracle8i Concepts and Oracle8i Tuning.

For more information about the EXPLAIN PLAN command, see Oracle8i SQL Reference  

Optimizing Queries for Throughput

CHOOSE and ALL ROWS Modes

By default, queries are optimized for throughput under the CHOOSE and ALL_ROWS modes. When queries are optimized for throughput, Oracle returns all rows in the shortest time possible.

FIRST_ROWS Mode

In FIRST_ROWS mode, the extensible query optimizer optimizes for fast response time by having the Text domain index return score-sorted rows, if possible. This is the default behavior when you use the FIRST_ROWS hint.

If you want to optimize for better throughput under FIRST_ROWS, you can use the DOMAIN_INDEX_NO_SORT hint. Better throughput means you are interested in getting all the rows to a query in the shortest time.

The following example achieves better throughput by not using the Text domain index to return score-sorted rows. Instead, Oracle sorts the rows after all the rows that satisfy the CONTAINS predicate are retrieved from the index:

select /*+ FIRST_ROWS DOMAIN_INDEX_NO_SORT */ pk, score(1), col from ctx_tab 
            where contains(txt_col, 'test', 1) > 0 order by score(1) desc;

See Also:

For more information about the query optimizer and using hints such as FIRST_ROWS and CHOOSE, see Oracle8i Concepts and Oracle8i Tuning.  




Prev

Next
Oracle
Copyright © 1999 Oracle Corporation.

All Rights Reserved.

Library

Product

Contents

Index