Oracle 8i Data Cartridge Developer's Guide
Release 2 (8.1.6)

A76937-01

Library

Product

Contents

Index

Prev Up Next

Query Optimization, 2 of 7


Overview

Query Optimization is the process of choosing the most efficient way to execute a SQL statement. When the cost-based optimizer was offered for the first time with Oracle7, Oracle supported only standard relational data. The introduction of objects with Oracle8i extended the supported datatypes and functions. The Extensible Indexing feature discussed in the previous chapter, introduces user-defined access methods.


For more information regarding optimization see:

 

The extensible optimizer feature allows authors of user-defined functions and indexes to create statistics collection, selectivity, and cost functions that are used by the optimizer in choosing a query plan. The optimizer cost model is extended to integrate information supplied by the user. Specifically, you now can:

Please note that only the cost-based optimizer has been enhanced; Oracle has not altered the operation of the rule-based optimizer.

The optimizer generates an execution plan for a SQL DML statement: SELECT, INSERT, UPDATE, or DELETE. For simplicity, we describe the generation of an execution plan in terms of a SELECT statement, but the same process applies to other DML statements.

An execution plan includes an access method for each table in the FROM clause, and an ordering, called the join order, of the tables in the FROM clause. System-defined access methods include indexes, hash clusters, and table scans. The optimizer chooses a plan by generating a set of join orders, or permutations, by computing the cost of each, and then by selecting the process with the lowest cost. For each table in the join order, the optimizer computes the cost of each possible access method and join method and chooses the one with the lowest cost. The cost of the join order is the sum of the access method and join method costs. The costs are calculated using algorithms which together comprise the cost model. The cost model includes varying level of detail about the physical environment in which the query is executed.

The optimizer uses statistics about the objects referenced in the query to compute the selectivity and costs. The statistics are gathered using the ANALYZE command. The selectivity of a predicate is the fraction of rows in a table that is chosen by the predicate. It is a number between 0 and 1.

The Extensible Indexing feature allows users to define new operators, indextypes, and domain indexes. For user-defined operators and domain indexes, the Extensible Optimizer feature allows you to control the three main components used by the optimizer to select an execution plan:

In the following sections, we describe each of these components in greater detail.

Statistics

Statistics are collected using the ANALYZE command. Statistics can be collected for tables and indexes. In general, the more accurate the statistics, the better the execution plan generated by the optimizer. We call the statistics generated by the current ANALYZE command standard statistics. However, with the addition of user-defined domain indexes in Oracle8i.release 8.1.5, the standard ANALYZE statement cannot generate any statistics on the domain index since the database does not know the index storage structure.

User-Defined Statistics

The Extensible Optimizer feature lets you define statistics collection functions for domain indexes and columns. The extension to the ANALYZE command has the effect that whenever a domain index is analyzed, a call is made to the user-specified statistics collection function. The database does not know the representation and meaning of the user-collected statistics.

In addition to domain indexes, Oracle supports user-defined statistics collection functions for individual columns of a table, and for user-defined datatypes. In the former case, whenever a column is analyzed, the user-defined statistics collection function is called to collect statistics in addition to any standard statistics that the database collects. If a statistics collection function exists for a datatype, it is called for each column of the table being analyzed that has the required type.

Thus, the Extensible Optimizer feature extends ANALYZE to allow user-defined statistics collection functions for domain indexes, indextypes, datatypes, and individual table columns.

The cost of evaluating a user-defined function depends on the algorithm and the statistical properties of its arguments. It is not practical to store statistics for all possible combinations of columns that could be used as arguments for all functions. Therefore, Oracle maintains only statistics on individual columns. It is also possible that function costs depend on the different statistical properties of each argument. Every column could require statistics for every argument position of every applicable function. Oracle does not support such a proliferation of statistics and cost functions because it would decrease performance.

A user-defined function to drop statistics is required whenever there is a user-defined statistics collection function; it is called by ANALYZE DELETE.

User-Defined Statistics for Partitioned Objects

Since domain indexes cannot be partitioned in Oracle8i, release 8.1.5, a user-defined statistics collection function collects only global statistics on the non-partitioned index.

When an ANALYZE command specifies a list of partitions, this information is not passed to user-defined statistics collection functions.

Selectivity

The optimizer uses statistics to calculate the selectivity of predicates. The selectivity is the fraction of rows in a table that is chosen by the predicate. It is a number between 0 and 1. The selectivity of a predicate is used to estimate the cost of a particular access method; it is also used to determine the optimal join order. A poor choice of join order by the optimizer could result in a very expensive execution plan.

Currently, the optimizer uses a standard algorithm to estimate the selectivity of selection and join predicates. However, the algorithm does not always work well in cases in which predicates contain functions or type methods. In addition, in Oracle8i, release 8.1.5, 1 predicates can contain user-defined operators about which the optimizer does not have any information. In that case the optimizer cannot compute an accurate selectivity.

User-defined Selectivity

For greater control over the optimizer's selectivity estimation, this feature lets you specify user-defined selectivity functions for predicates containing user-defined operators, stand-alone functions, package functions, or type methods. The user-defined selectivity function is called by the optimizer whenever it encounters a predicate with one of the following forms:

operator(...) relational_operator <constant>

<constant> relational_operator operator(...)

operator(...) LIKE <constant>

where

For such cases, users can define selectivity functions associated with operator(...). The arguments to operator can be columns, constants, bind variables, or attribute references. When optimizer encounters such a predicate, it calls the user-defined selectivity function and passes the entire predicate as an argument (including the operator, function, or type method and its arguments, the relational operator relational_operator, and the constant expression or bind variable). The return value of the user-defined selectivity function must be expressed as a percent, and be between 0 and 100 inclusive; the optimizer ignores values outside this range.

Wherever possible, the optimizer uses user-defined selectivity values. However, this is not possible in the following cases:

In each of these cases, the optimizer uses heuristics to estimate the selectivity.

Cost

The optimizer estimates the cost of various access paths to choose an optimal plan. For example, it computes the cost of using an index and a full table scan to choose between the two. However, with regard to domain indexes, the optimizer does not know the internal storage structure of the index, and so it cannot compute a good estimate of the cost of a domain index.

User-Defined Cost

For greater flexibility, the cost model has been extended to let you define costs for domain indexes and user-defined stand-alone functions, package functions, and type methods. The user-defined costs can be in the form of default costs that the optimizer looks up, or they can be full-fledged cost functions which the optimizer calls to compute the cost.

As is the case with user-defined selectivity, user-defined cost is optional. If no user-defined cost is available, the optimizer uses heuristics to compute an estimate. However, in the absence of sufficient useful information about the storage structures in user-defined domain indexes and functions, such estimates can be very inaccurate and result in the choice of a sub-optimal execution plan.

User-defined cost functions for domain indexes are called by the optimizer only if a domain index is a valid access path for a user-defined operator (for details regarding when this is true, see the discussion of user-defined indexing in the previous chapter). User-defined cost functions for functions, methods and domain indexes are only called when a predicate has one of the following forms:

operator(...) relational_operator <constant>

<constant> relational_operator operator(...)

operator(...) LIKE <constant>

where

This is, of course, identical to the conditions for user-defined selectivity functions.

User-defined cost functions can return three cost values, each value representing the cost of a single execution of a function or domain index implementation:

The optimizer computes a composite cost from these cost values.

Optimizer Parameters

The cost of a query is a function of the cost values discussed above. These values can be combined in one of three ways depending on the settings of optimizer initialization parameters. The setting determines the function that is minimized.


Prev Up Next
Oracle
Copyright © 1999 Oracle Corporation.

All Rights Reserved.

Library

Product

Contents

Index