|Oracle 8i Data Cartridge Developer's Guide
Release 2 (8.1.6)
Part Number A76937-01
Design Considerations, 6 of 8
In Orace8i release 8.1.5 only the CPU and I/O costs are considered.
The cost of executing a C function can be determined using common profilers or tools. For SQL queries, an explain plan of the query would give a rough estimate of the cost of the query. In addition the
tkprof utility can be used to gather information about the CPU and the I/O cost involved in the operation. The cost of executing a callout could also be determined by using it in a SQL query which "selects from dual" and then estimating its cost from the
The cost of the index is a function of the selectivity of the predicate (which is passed as an argument to the cost function) * the total number of data blocks in the index structures. Hence the index cost function should be one which increases with the increase in selectivity of the predicate. With a selectivity of 100%, the cost of accessing the index should be the cost of accessing all the data in all the structures that comprise the domain index.
The total cost of accessing the index is the cost of performing the
ODCIIndexStart, N *
ODCIIndexClose operators, where N is the number of times the
ODCIIndexFetch routine will be called based on the selectivity of the predicate. The cost of
ODCIIndexClose functions can be determined as discussed above.
The selectivity of a predicate is the percentage of rows returned by the predicate divided by the total number of rows in the table(s).
The selectivity function should use the statistics collected for the table to determine what percentage of rows of the table will be returned by the predicate with the given list of arguments. For example, to compute the selectivity of a predicate
SelectedImage) which determines the images that are greater than the
SelectedImage, a histogram of the sizes of the images in the database can be a useful statistics to compute the selectivity.
Statistics can affect the calculation of selectivity for predicates and also the cost of domain indexes.
The statistics collected for a table can affect the computation of selectivity of a predicate. So statistics that can help the user make a better judgement about the selectivity of a predicate should be collected for a table/column. Knowing the predicates that would operate on the data will be helpful to determine what statistics would be good to collect.
Some example of statistics that can be useful in spatial domain for example could be the average/min/max number of elements in a
VARRAY that contains the nodes of the spatial objects.
Note that standard statistics are collected in addition to the user defined statistics when the
ANALYZE command is invoked.
When a domain index is analyzed statistics for the underlying objects which constitute the domain index should be analyzed. For example if the domain index is comprised of tables, the statistics collection function should
ANALYZE the tables when the domain index is analyzed. The cost of accessing the domain index can be influenced by the statistics that have been collected for the index. For example the cost of accessing a domain index could be approximated to the selectivity * the total number of data blocks (in the various tables) being accessed when the domain index is accessed.
To accurately define cost, selectivity and statistics functions, a good understanding of the domain is required. The above guidelines are meant to help you understand some of the issues you need to take into account while working on the cost, selectivity and statistics functions. In general it may be a good idea to start of by using the default cost and selectivity and observe how the queries of interest behave.