12
Data Access Methods

This chapter provides an overview of data access methods that can enhance performance, and it warns of situations to avoid. This chapter also explains how to use hints to force various approaches.

This chapter contains the following sections:

Using Indexes

This section describes:

When to Create Indexes

Indexes improve the performance of queries that select a small percentage of rows from a table. As a general guideline, create indexes on tables that are queried for less than 2% or 4% of the table's rows. This value may be higher in situations where all data can be retrieved from an index, or where the indexed columns and expressions can be used for joining to other tables.

This guideline is based on the following assumptions:

Rows with the same value for the key on which the query is based are uniformly distributed throughout the data blocks allocated to the table.
Rows in the table are randomly ordered with respect to the key on which the query is based.
The table contains a relatively small number of columns.
Most queries on the table have relatively simple WHERE clauses.
The cache hit ratio is low and there is no operating system cache.

If these assumptions do not describe the data in your table and the queries that access it, then an index may only be helpful if your queries typically access at least 25% of the table's rows.

Tuning the Logical Structure

Although cost-based optimization helps avoid the use of nonselective indexes within query execution, the SQL engine must continue to maintain all indexes defined against a table regardless of whether they are used. Index maintenance can present a significant CPU and I/O resource demand in any I/O intensive application. Put another way, building indexes "just in case" is not a good practice; indexes should not be built until required.

To maintain optimal performance with indexes, drop indexes that your application is not using. You can find indexes that are not referenced in execution plans by processing all of your application SQL through EXPLAIN PLAN and capturing the resulting plans. Unused indexes are typically, though not necessarily, nonselective.

Indexes within an application sometimes have uses that are not immediately apparent from a survey of statement execution plans. In particular, Oracle uses "pins" (nontransactional locks) on foreign key indexes to avoid using shared locks on the child table when enforcing foreign key constraints.

In many applications, a foreign key index never, or rarely, supports a query. In the example shown in Figure 12-1, the need to locate all of the order lines for a given product may never arise. However, when no index exists with LINES(PCODE) as its leading portion (as described in "Choosing Composite Indexes"), then Oracle places a share lock on the LINES table each time PRODUCTS(PCODE) is updated or deleted. Such a share lock is a problem only if the PRODUCTS table is subject to frequent DML.

If this contention arises, then to remove it, the application must either:

Accept the additional load of maintaining the index.
Accept the risk of running with the constraint disabled.

Figure 12-1 Foreign Key Constraint

Choosing Columns and Expressions to Index

A key is a column or expression on which you can build an index. Follow these guidelines for choosing index keys to index:

Consider indexing keys that are frequently used in WHERE clauses.
Consider indexing keys that are frequently used to join tables in SQL statements. For more information on optimizing joins, see the section "Using Hash Clusters".
Index keys that have high selectivity. The selectivity of an index is the percentage of rows in a table having the same value for the indexed key. An index's selectivity is optimal if few rows have the same value.

Note:
Oracle automatically creates indexes, or uses existing indexes, on the keys and expressions of unique and primary keys that you define with integrity constraints.

You can determine the selectivity of an index by dividing the number of rows in the table by the number of distinct indexed values. You can obtain these values using the ANALYZE statement. Selectivity calculated in this manner should be interpreted as a percentage.

Indexes with low selectivity can be helpful if the data distribution is skewed so that one or two values occur much less often than the others. If these values appear frequently in WHERE clauses, and if column statistics are gathered so that the optimizer knows which values are rare, then the index can be useful.
Do not use standard B*-tree indexes on keys or expressions with few distinct values. Such keys or expressions usually have poor selectivity and therefore do not optimize performance unless the frequently selected key values appear less frequently than the other key values. You can use bitmap indexes effectively in such cases, unless a high concurrency OLTP application is involved.
Do not index columns that are frequently modified. UPDATE statements that modify indexed columns and INSERT and DELETE statements that modify indexed tables take longer than if there were no index. Such SQL statements must modify data in indexes as well as data in tables. They also generate additional undo and redo information.
Do not index keys that appear only in WHERE clauses with functions or operators. A WHERE clause that uses a function (other than MIN or MAX) or an operator with an indexed key does not make available the access path that uses the index.
Consider indexing foreign keys of referential integrity constraints in cases in which a large number of concurrent INSERT, UPDATE, and DELETE statements access the parent and child tables. Such an index allows UPDATEs and DELETEs on the parent table without share locking the child table.
When choosing to index a key, consider whether the performance gain for queries is worth the performance loss for INSERTs, UPDATEs, and DELETEs and the use of the space required to store the index. You may want to experiment by comparing the processing times of your SQL statements with and without indexes. You can measure processing time with the SQL trace facility.

See Also:
For more information on the effects of foreign keys on locking, see Oracle8i Application Developer's Guide - Fundamentals.

Choosing Composite Indexes

A composite index contains more than one key column. Composite indexes can provide additional advantages over single-column indexes:

Improved selectivity

Sometimes two or more columns or expressions, each with poor selectivity, can be combined to form a composite index with more accurate selectivity.

Reduced I/O

If all columns selected by a query are in a composite index, then Oracle can return these values from the index without accessing the table.

A SQL statement can use an access path involving a composite index if the statement contains constructs that use a leading portion of the index. A leading portion of an index is a set of one or more columns that were specified first and consecutively in the list of columns in the CREATE INDEX statement that created the index. Consider this CREATE INDEX statement:

CREATE INDEX comp_ind 
ON tab1(x, y, z);

These combinations of columns are leading portions of the index: x, xy, and xyz. These combinations of columns are not leading portions of the index: yz, y, and z.

Follow these guidelines for choosing keys for composite indexes:

Consider creating a composite index on keys that are frequently used together in WHERE clause conditions combined with AND operators, especially if their combined selectivity is better than the selectivity of either key individually.
If several queries select the same set of keys based on one or more key values, then consider creating a composite index containing all of these keys.

Of course, consider the guidelines associated with the general performance advantages and trade-offs of indexes described in the previous sections. Follow these guidelines for ordering keys in composite indexes:

Create the index so the keys used in WHERE clauses make up a leading portion.
If some keys are used in WHERE clauses more frequently, then be sure to create the index so that the more frequently selected keys make up a leading portion to allow the statements that use only these keys to use the index.
If all keys are used in WHERE clauses equally often, then ordering these keys from most selective to least selective in the CREATE INDEX statement best improves query performance.
If all keys are used in the WHERE clauses equally often but the data is physically ordered on one of the keys, then place that key first in the composite index.

Writing Statements that Use Indexes

Even after you create an index, the optimizer cannot use an access path that uses the index simply because the index exists. The optimizer can choose such an access path for a SQL statement only if it contains a construct that makes the access path available.

To be sure that a SQL statement can use an access path that uses an index, be sure that the statement contains a construct that makes such an access path available. If you are using the cost-based approach, then also generate statistics for the index. After you have made the access path available for the statement, the optimizer may or may not choose to use the access path, based on the availability of other access paths.

If you create new indexes to tune statements, then you can also use the EXPLAIN PLAN statement to determine whether the optimizer will choose to use these indexes when the application is run. If you create new indexes to tune a statement that is currently parsed, then Oracle invalidates the statement. When the statement is next executed, the optimizer automatically chooses a new execution plan that could potentially use the new index. If you create new indexes on a remote database to tune a distributed statement, then the optimizer considers these indexes when the statement is next parsed.

Also keep in mind that the way you tune one statement may affect the optimizer's choice of execution plans for others. For example, if you create an index to be used by one statement, then the optimizer may choose to use that index for other statements in your application as well. For this reason, you should re-examine your application's performance and rerun the SQL trace facility after you have tuned those statements that you initially identified for tuning.

Writing Statements that Avoid Using Indexes

In some cases, you may want to prevent a SQL statement from using an access path that uses an existing index. You may want to do this if you know that the index is not very selective and that a full table scan would be more efficient. If the statement contains a construct that makes such an index access path available, then you can force the optimizer to use a full table scan through one of these methods:

You can use the NO_INDEX hint to give the CBO maximum flexibility while disallowing the use of a certain index.
You can use the FULL hint to force the optimizer to choose a full table scan instead of an index scan.
You can use the INDEX, INDEX_COMBINE, or AND_EQUAL hints to force the optimizer to use one index or a set of listed indexes instead of another.

Assessing the Value of Indexes

To determine whether an index is good, you must first create it, then analyze it, and use EXPLAIN PLAN on your query to see if the optimizer uses it. If it does, then keep the index, unless it is expensive to maintain. You can compare the optimizer cost (in the first row of EXPLAIN PLAN output) of the plans with and without the index.

Parallel execution uses indexes effectively. It does not perform parallel index range scans, but it does perform parallel index lookups for parallel nested loop join execution. If an index is very selective (there are few rows per index entry), then it may be better to use sequential index lookup than parallel table scan.

Using Fast Full Index Scans

The fast full index scan is an alternative to a full table scan when there is an index that contains all the keys that are needed for the query. A fast full scan is faster than a normal full index scan in that it can use multiblock I/O and can be parallelized just like a table scan. Unlike regular index scans, however, you cannot use keys and the rows will not necessarily come back in sorted order. The following query and plan illustrate this feature.

SELECT COUNT(*) 
FROM t1, t2
WHERE t1.c1 > 50 

AND t1.c2 = t2.c1;

The plan is as follows:

SELECT STATEMENT

SORT AGGREGATE


HASH JOIN


TABLE ACCESS t1 FULL
INDEX t2_c1_idx FAST FULL SCAN

Because index t2_c1_idx contains all columns needed from table t2, the optimizer uses a fast full index scan on that index.

Restrictions

Fast full index scans have the following restrictions:

At least one indexed column of the table must have the NOT NULL constraint.
Fast full index scans are not possible against bitmap indexes.
There must be a parallel clause on the index if you want to perform fast full index scan in parallel. The parallel degree of the index is set independently. The index does not inherit the degree of parallelism of the table.
Make sure that you have analyzed the index; otherwise, the optimizer may decide not to use it.

Fast full scan has a special index hint, INDEX_FFS, which has the same format and arguments as the regular INDEX hint.

Re-creating Indexes

You may want to re-create an index to compact it and minimize fragmented space, or to change the index's storage characteristics. When creating a new index that is a subset of an existing index, or when rebuilding an existing index with new storage characteristics, Oracle may use the existing index instead of the base table to improve performance.

However, there are cases where it may be beneficial to use the base table instead of the existing index. Consider an index on a table on which a lot of DML has been performed. Because of the DML, the size of the index may increase to the point where each block is only 50% full, or even less. If the index refers to most of the columns in the table, then the index could actually be larger than the table. In this case, it is faster to use the base table rather than the index to re-create the index. Another option is to create a new index on a subset of the columns of the original index.

For example, you have a table named cust with columns name, custid, phone, addr, balance, and an index named i_cust_custinfo on table columns name, custid and balance. To create a new index named i_cust_custno on columns custid and name, you would enter:

CREATE INDEX i_cust_custno ON cust(custid, name);

Oracle automatically uses the existing index (i_cust_custinfo) to create the new index rather than accessing the entire table. The syntax used is the same as if the index i_cust_custinfo did not exist.

Similarly, if you have an index on the empno and mgr columns of the emp table, and if you want to change the storage characteristics of that composite index, then Oracle can use the existing index to create the new index.

Use the ALTER INDEX ... REBUILD statement to reorganize or compact an existing index or to change its storage characteristics. The REBUILD statement uses the existing index as the basis for the new one. All index storage statements are supported, such as STORAGE (for extent allocation), TABLESPACE (to move the index to a new tablespace), and INITRANS (to change the initial number of entries).

ALTER INDEX ... REBUILD is usually faster than dropping and re-creating an index, because this statement uses the fast full scan feature. It reads all the index blocks using multiblock I/O then discards the branch blocks. A further advantage of this approach is that the old index is still available for queries while the rebuild is in progress.

Compacting Indexes

You can coalesce leaf blocks of an index using the ALTER INDEX statement with the COALESCE option. This allows you to combine leaf levels of an index to free blocks for re-use. You can also rebuild the index online.

See Also:
For more information about the syntax for this statement, see Oracle8i SQL Reference and Oracle8i Administrator's Guide.

Using Nonunique Indexes to Enforce Uniqueness

You can use an existing nonunique index on a table to enforce uniqueness, either for UNIQUE constraints or the unique aspect of a PRIMARY KEY constraint. The advantage of this approach is that the index remains available and valid when the constraint is disabled. Therefore, enabling a disabled UNIQUE or PRIMARY KEY constraint does not require rebuilding the unique index associated with the constraint. This can yield significant time savings on enable operations for large tables.

Using a nonunique index to enforce uniqueness also lets you eliminate redundant indexes. You do not need a unique index on a primary key column if that column already is included as the prefix of a composite index. You can use the existing index to enable and enforce the constraint. You also save significant space by not duplicating the index. However, if the existing index is partitioned, then the partitioning key of the index must also be a subset of the UNIQUE key; otherwise, Oracle creates an additional unique index to enforce the constraint.

Using Enabled Novalidated Constraints

An enabled novalidated constraint behaves similarly to an enabled validated constraint. Placing a constraint in the enabled novalidated state signifies that any new data entered into the table must conform to the constraint. Existing data is not checked. Placing a constraint in the enabled novalidated state allows you to enable the constraint without locking the table.

If you change a constraint from disabled to enabled, then the table must be locked. No new DML, queries, or DDL can occur because there is no mechanism to ensure that operations on the table conform to the constraint during the enable operation. The enabled novalidated state prevents operations violating the constraint from being performed on the table.

An enabled novalidated constraint can be validated with a parallel, consistent-read query of the table to determine whether any data violates the constraint. No locking is performed and the enable operation does not block readers or writers to the table. In addition, enabled novalidated constraints can be validated in parallel: multiple constraints can be validated at the same time and each constraint's validity check can be determined using parallel query.

Use the following approach to create tables with constraints and indexes:

Create the tables with the constraints. NOT NULL constraints may be unnamed and should be created enabled and validated. All other constraints (CHECK, UNIQUE, PRIMARY KEY, and FOREIGN KEY) should be named and should be "created disabled".

Note:
By default, constraints are created in the ENABLED state.
Load old data into the tables.
Create all indexes including indexes needed for constraints.
Enable novalidate all constraints. Do this to primary keys before foreign keys.
Allow users to query and modify data.

With a separate ALTER TABLE statement for each constraint, validate all constraints. Do this to primary keys before foreign keys. For example,

CREATE TABLE t (a NUMBER CONSTRAINT apk PRIMARY KEY DISABLE, 
b NUMBER NOT NULL);
CREATE TABLE x (c NUMBER CONSTRAINT afk REFERENCES t DISABLE);

At this point, use Import or Fast Loader to load data into t.

CREATE UNIQUE INDEX tai ON t (a); 
CREATE INDEX tci ON x (c); 
ALTER TABLE t MODIFY CONSTRAINT apk ENABLE NOVALIDATE;  
ALTER TABLE x MODIFY CONSTRAINT afk ENABLE NOVALIDATE;

Now, users can start performing inserts, updates, deletes, and selects on t.

ALTER TABLE t ENABLE CONSTRAINT apk; 
ALTER TABLE x ENABLE CONSTRAINT afk;

Now, the constraints are enabled and validated.

See Also:
For a complete discussion of integrity constraints, see Oracle8i Concepts.

Using Function-based Indexes

A function-based index is an index on an expression. Oracle strongly recommends using function-based indexes whenever possible. Define function-based indexes anywhere that you use an index on a column, except for columns with LOBs or REFs. Nested table columns and object types cannot contain these columns.

You can create function-based indexes for any repeatable SQL function. Oracle recommends using function-based indexes for range scans and for functions in ORDER BY clauses.

Note:
You must set the QUERY_REWRITE_ENABLED session parameter to true to enable function-based indexes for queries. If QUERY_REWRITE_ENABLED is false, then function-based indexes are not used for obtaining the values of an expression in the function-based index. However, function-based indexes can still be used for obtaining values in real columns. QUERY_REWRITE_ENABLED is a session-level and also an instance-level parameter.

Function-based indexes are an efficient mechanism for evaluating statements that contain functions in WHERE clauses. You can create a function-based index to materialize computational-intensive expressions in the index. This permits Oracle to bypass computing the value of the expression when processing SELECT and DELETE statements. When processing INSERT and UPDATE statements, however, Oracle evaluates the function to process the statement.

For example, if you create the following index:

CREATE INDEX idx ON table_1 (a + b * (c - 1), a, b);

Then, Oracle can use it when processing queries such as:

SELECT a 
FROM table_1 
WHERE a + b * (c - 1) < 100;

Function-based indexes defined with the UPPER(column_name) or LOWER(column_name) keywords allow case-insensitive searches. For example, the following index:

CREATE INDEX uppercase_idx ON emp (UPPER(empname));

Facilitates processing queries such as:

SELECT * 
FROM emp 
WHERE UPPER(empname) = 'MARK';

You can also use function-based indexes for NLS sort indexes that provide efficient linguistic collation in SQL statements.

Oracle treats descending indexes as function-based indexes. The columns marked DESC are sorted in descending order.

See Also:
For more information on the CREATE INDEX statement, see Oracle8i SQL Reference.

Function-based Indexes and Index Organized Tables

Use index organized tables (IOTs) on tables with large, non-key columns to speed data retrieval. Because IOTs can store key column values in the indexes and non-key values in the lower leaves of the tree, applications such as those retrieving large text files, coded with a short key value, like an ISBN, might make use of the IOT feature.

The secondary index on an IOT can be a function-based index.

Using Bitmap Indexes

This section describes:

When to Use Bitmap Indexes

This section describes three aspects of indexing that you must evaluate when deciding whether to use bitmap indexing on a given table:

Performance Considerations

Bitmap indexes can substantially improve performance of queries with the following characteristics:

The WHERE clause contains multiple predicates on low- or medium-cardinality columns.
The individual predicates on these low- or medium-cardinality columns select a large number of rows.
Bitmap indexes have been created on some or all of these low- or medium-cardinality columns.
The tables being queried contain many rows.

You can use multiple bitmap indexes to evaluate the conditions on a single table. Bitmap indexes are thus highly advantageous for complex ad hoc queries that contain lengthy WHERE clauses. Bitmap indexes can also provide optimal performance for aggregate queries and for optimizing joins in star schemas.

Storage Considerations

Bitmap indexes can provide considerable storage savings over the use of B*-tree indexes. In databases containing only B*-tree indexes, you must anticipate the columns that would commonly be accessed together in a single query, and create a composite B*-tree index on these columns.

Not only would this B*-tree index require a large amount of space, it would also be ordered. That is, a B*-tree index on (marital_status, region, gender) is useless for queries that only access region and gender. To completely index the database, you must create indexes on the other permutations of these columns. For the simple case of three low-cardinality columns, there are six possible composite B*-tree indexes. You must consider the trade-offs between disk space and performance needs when determining which composite B*-tree indexes to create.

Bitmap indexes solve this dilemma. Bitmap indexes can be efficiently combined during query execution, so three small single-column bitmap indexes can do the job of six three-column B*-tree indexes.

Bitmap indexes are much more efficient than B*-tree indexes, especially in data warehousing environments. Bitmap indexes are created not only for efficient space usage, but also for efficient execution, and the latter is somewhat more important.

Do not create bitmap indexes on unique key columns. However, for columns where each value is repeated hundreds or thousands of times, a bitmap index typically is less than 25% of the size of a regular B*-tree index. The bitmaps themselves are stored in compressed format.

Simply comparing the relative sizes of B*-tree and bitmap indexes is not an accurate measure of effectiveness, however. Because of their different performance characteristics, you should keep B*-tree indexes on high-cardinality columns, while creating bitmap indexes on low-cardinality columns.

Maintenance Considerations

Bitmap indexes benefit data warehousing applications, but they are not appropriate for OLTP applications with a heavy load of concurrent INSERTs, UPDATEs, and DELETEs. In a data warehousing environment, data is usually maintained by way of bulk inserts and updates. Index maintenance is deferred until the end of each DML operation. For example, if you insert 1000 rows, then the inserted rows are placed into a sort buffer, and then the updates of all 1000 index entries are batched. (This is why SORT_AREA_SIZE must be set properly for good performance with inserts and updates on bitmap indexes.) Thus, each bitmap segment is updated only once per DML operation, even if more than one row in that segment changes.

Note:
The sorts described above are regular sorts and use the regular sort area, determined by SORT_AREA_SIZE. The BITMAP_MERGE_AREA_SIZE and CREATE_BITMAP_AREA_SIZE parameters described in "Initialization Parameters for Bitmap Indexing" only affect the specific operations indicated by the parameter names.

DML and DDL statements, such as UPDATE, DELETE, DROP TABLE, affect bitmap indexes the same way they do traditional indexes: the consistency model is the same. A compressed bitmap for a key value is made up of one or more bitmap segments, each of which is at most half a block in size (but may be smaller). The locking granularity is one such bitmap segment. This may affect performance in environments where many transactions make simultaneous updates. If numerous DML operations have caused increased index size and decreasing performance for queries, then you can use the ALTER INDEX ... REBUILD statement to compact the index and restore efficient performance.

A B*-tree index entry contains a single rowid. Therefore, when the index entry is locked, a single row is locked. With bitmap indexes, an entry can potentially contain a range of rowids. When a bitmap index entry is locked, the entire range of rowids is locked. The number of rowids in this range affects concurrency. As the number of rowids increases in a bitmap segment, concurrency decreases.

Locking issues affect DML operations, and may affect heavy OLTP environments. Locking issues do not, however, affect query performance. As with other types of indexes, updating bitmap indexes is a costly operation. Nonetheless, for bulk inserts and updates where many rows are inserted or many updates are made in a single statement, performance with bitmap indexes can be better than with regular B*-tree indexes.

Creating Bitmap Indexes

To create a bitmap index, use the BITMAP keyword in the CREATE INDEX statement:

CREATE BITMAP INDEX ...

Multi-column (concatenated) bitmap indexes are supported. They can be defined over no more than 30 columns. Other SQL statements concerning indexes, such as DROP, ANALYZE, ALTER, and so on, can refer to bitmap indexes without any extra keyword.

See Also:
For information on bitmap index restrictions, see Oracle8i SQL Reference.

Index Type

System index views USER_INDEXES, ALL_INDEXES, and DBA_INDEXES indicate bitmap indexes by the word BITMAP appearing in the TYPE column. A bitmap index cannot be declared as UNIQUE. A bitmap index on a unique key is useless.

Using Hints

The INDEX hint works with bitmap indexes in the same way as with traditional indexes.

The INDEX_COMBINE hint identifies the most cost effective indexes for the optimizer. The optimizer recognizes all indexes that can potentially be combined, given the predicates in the WHERE clause. However, it may not be cost effective to use all of them. Oracle recommends using INDEX_COMBINE rather than INDEX for bitmap indexes, because it is a more versatile hint.

In deciding which of these hints to use, the optimizer includes non-hinted indexes that appear cost effective, as well as indexes named in the hint. If certain indexes are given as arguments for the hint, then the optimizer tries to use some combination of those particular bitmap indexes.

If the hint does not name indexes, then all indexes are considered hinted. Hence, the optimizer tries to combine as many as is possible given the WHERE clause, without regard to cost effectiveness. The optimizer always tries to use hinted indexes in the plan regardless of whether it considers them cost effective.

See Also:
For more information on the INDEX_COMBINE hint, see Chapter 7, "Using Optimizer Hints".

Performance and Storage Tips

To get optimal performance and disk space usage with bitmap indexes, consider the following tips:

Large block sizes improve the efficiency of storing, and hence, retrieving, bitmap indexes.
To make compressed bitmaps as small as possible, declare NOT NULL constraints on all columns that cannot contain null values.
Fixed-length datatypes are more amenable to a compact bitmap representation than variable length datatypes.

This is because Oracle needs to consider the theoretical maximum number of rows that will fit in a data block when creating bitmap indexes.

Efficient Mapping of Bitmaps to Rowids

Use SQL statements with the ALTER TABLE syntax to optimize the mapping of bitmaps to rowids. The MINIMIZE RECORDS_PER_BLOCK clause enables this optimization and the NOMINIMIZE RECORDS_PER_BLOCK clause disables it.

When enabled, Oracle scans the table and determines the maximum number of records in any block and restricts this table to this maximum number. This enables bitmap indexes to allocate fewer bits per block and results in smaller bitmap indexes. The block and record allocation restrictions this statement places on the table are only beneficial to bitmap indexes. Therefore, Oracle does not recommend using this mapping on tables that are not heavily indexed with bitmap indexes.

Indexing Null Values

Bitmap indexes index nulls, whereas all other index types do not. Consider, for example, a table with STATE and PARTY columns, on which you want to perform the following query:

SELECT COUNT(*) 
FROM people 
WHERE state='CA' 

AND party !='D';

Indexing nulls enables a bitmap minus plan where bitmaps for party equal to D and NULL are subtracted from state bitmaps equal to CA. The EXPLAIN PLAN output would look like this:

SELECT STATEMENT 
  SORT AGGREGATE 
    BITMAP CONVERSION COUNT 
      BITMAP MINUS 
        BITMAP MINUS 
          BITMAP INDEX SINGLE VALUE STATE_BM 
          BITMAP INDEX SINGLE VALUE PARTY_BM 
        BITMAP INDEX SINGLE VALUE PARTY_BM

If a NOT NULL constraint existed on party, then the second minus operation (where party is null) would be left out because it is not needed.

Initialization Parameters for Bitmap Indexing

The following initialization parameters have an effect on performance:

CREATE_BITMAP_AREA_SIZE

This parameter determines the amount of memory allocated for bitmap creation. The default value is 8MB. A larger value may lead to faster index creation. If cardinality is very small, then you can set a small value for this parameter. For example, if cardinality is only 2, then the value can be on the order of kilobytes rather than megabytes. As a general rule, the higher the cardinality, the more memory is needed for optimal performance. You cannot dynamically alter this parameter at the system or session level.

BITMAP_MERGE_AREA_SIZE

This parameter determines the amount of memory used to merge bitmaps retrieved from a range scan of the index. The default value is 1 MB. A larger value should improve performance because the bitmap segments must be sorted before being merged into a single bitmap. You cannot dynamically alter this parameter at the system or session level.

SORT_AREA_SIZE

This parameter must be set properly for good performance with inserts and updates on bitmap indexes. Thus, each bitmap segment is updated only once per DML operation, even if more than one row in that segment changes.

Using Bitmap Access Plans on Regular B*-tree Indexes

If there is at least one bitmap index on the table, then the optimizer considers using a bitmap access path using regular B*-tree indexes for that table. This access path may involve combinations of B*-tree and bitmap indexes, but may not involve any bitmap indexes at all. However, the optimizer will not generate a bitmap access path using a single B*-tree index unless instructed to do so by a hint.

To use bitmap access paths for B*-tree indexes, the rowids stored in the indexes must be converted to bitmaps. After such a conversion, the various Boolean operations available for bitmaps can be used. As an example, consider the following query, where there is a bitmap index on column c1, and regular B*-tree indexes on columns c2 and c3.

EXPLAIN PLAN FOR
SELECT COUNT(*) 
FROM t
WHERE c1 = 2 AND c2 = 6 
OR c3 BETWEEN 10 AND 20;

SELECT STATEMENT 
     SORT AGGREGATE
       BITMAP CONVERSION COUNT
         BITMAP OR
           BITMAP AND
             BITMAP INDEX c1_ind SINGLE VALUE
             BITMAP CONVERSION FROM ROWIDS
               INDEX c2_ind RANGE SCAN
           BITMAP CONVERSION FROM ROWIDS
             SORT ORDER BY
               INDEX c3_ind RANGE SCAN

Note:
This statement is executed by accessing indexes only, so no table access is necessary.

Here, a COUNT option for the BITMAP CONVERSION row source counts the number of rows matching the query. There are also conversions FROM rowids in the plan to generate bitmaps from the rowids retrieved from the B*-tree indexes. The occurrence of the ORDER BY sort in the plan is due to the fact that the conditions on column c3 result in more than one list of rowids being returned from the B*-tree index. These lists are sorted before they can be converted into a bitmap.

Estimating Bitmap Index Size

Although it is not possible to precisely size a bitmap index, you can estimate its size. This section describes how to determine the size of a bitmap index for a table using the computed size of a B*-tree index. It also illustrates how cardinality, NOT NULL constraints, and the number of distinct values affect bitmap size.

To estimate the size of a bitmap index for a given table, extrapolate the size of a B*-tree index for the table. Use the following approach:

Use the standard formula described in Oracle8i Concepts to compute the size of a B*-tree index for the table.
Determine the cardinality of the table data.
From the cardinality value, extrapolate the size of a bitmap index, according to the graph in Figure 12-2 or Figure 12-3.

For a 1 million row table, Figure 12-2 shows index size on columns with different numbers of distinct values for B*-tree indexes and bitmap indexes. Using Figure 12-2, you can estimate the size of a bitmap index relative to that of a B*-tree index for the table. Sizing is not exact: results vary somewhat from table to table.

Randomly distributed data was used to generate the graph. If, in your data, particular values tend to cluster close together, then you may generate considerably smaller bitmap indexes than indicated by the graph. Also, bitmap indexes may be slightly smaller than those in the graph if columns contain NOT NULL constraints.

Figure 12-3 shows similar data for a table with 5 million rows. When cardinality exceeds 100,000, bitmap index size does not increase as fast as it does in Figure 12-2. For a table with more rows, there are more repeating values for a given cardinality.

Figure 12-2 Extrapolating Bitmap Index Size: 1 Million Row Table

Figure 12-3 Extrapolating Bitmap Index Size: 5 Million Row Table

Bitmap Index Restrictions

Bitmap indexes have the following restrictions:

For bitmap indexes with direct load, the SORTED_INDEX flag does not apply.
Bitmap indexes are not considered by the rule-based optimizer.
Bitmap indexes cannot be used for referential integrity checking.

Using Domain Indexes

Domain indexes are built using the indexing logic supplied by a user-defined indextype. An indextype provides an efficient mechanism to access data that satisfy certain operator predicates. Typically, the user-defined indextype is part of an Oracle option, like the Spatial option.

For example, the SpatialIndextype allows efficient search and retrieval of spatial data that overlap a given bounding box.

The cartridge determines the parameters you can specify in creating and maintaining the domain index. Similarly, the performance and storage characteristics of the domain index are presented in the specific cartridge documentation.

Refer to the appropriate cartridge documentation for information such as:

What datatypes can be indexed?
What indextypes are provided?
What operators does the indextype support?
How can the domain index be created and maintained?
How do we efficiently use the operator in queries?
What are the performance characteristics?

Note:
You can also create index types with the CREATE INDEXTYPE SQL statement.
See Also:
For information about the SpatialIndextype, see the Oracle Spatial User's Guide and Reference.

Using Clusters

A cluster is a group of tables that share the same data blocks because they share common columns and are often used together.

See Also:
For more information on clusters, see Oracle8i Concepts.

Follow these guidelines when deciding whether to cluster tables:

Cluster tables that are often accessed by your application in join statements.
Do not cluster tables if your application joins them only occasionally or modifies their common column values frequently. Modifying a row's cluster key value takes longer than modifying the value in an unclustered table, because Oracle may have to migrate the modified row to another block to maintain the cluster.
Do not cluster tables if your application often performs full table scans of only one of the tables. A full table scan of a clustered table can take longer than a full table scan of an unclustered table. Oracle is likely to read more blocks because the tables are stored together.
Cluster master-detail tables if you often select a master record and then the corresponding detail records. Detail records are stored in the same data block(s) as the master record, so they are likely still to be in memory when you select them, requiring Oracle to perform less I/O.
Store a detail table alone in a cluster if you often select many detail records of the same master. This measure improves the performance of queries that select detail records of the same master but does not decrease the performance of a full table scan on the master table.
Do not cluster tables if the data from all tables with the same cluster key value exceeds more than one or two Oracle blocks. To access a row in a clustered table, Oracle reads all blocks containing rows with that value. If these rows take up multiple blocks, then accessing a single row could require more reads than accessing the same row in an unclustered table.
Do not cluster tables when the number of rows per cluster key value varies significantly. This causes space wastage for the low cardinality key value and collisions for the high cardinality key values. Collisions degrade performance.

Consider the benefits and drawbacks of clusters with respect to the needs of your application. For example, you may decide that the performance gain for join statements outweighs the performance loss for statements that modify cluster key values. You may want to experiment and compare processing times with your tables both clustered and stored separately. To create a cluster, use the CREATE CLUSTER statement.

Using Hash Clusters

Hash clusters group table data by applying a hash function to each row's cluster key value. All rows with the same cluster key value are stored together on disk. Consider the benefits and drawbacks of hash clusters with respect to the needs of your application. You may want to experiment and compare processing times with a particular table as it is stored in a hash cluster, and as it is stored alone with an index. This section describes:

When to Use Hash Clusters

Follow these guidelines for choosing when to use hash clusters:

Use hash clusters to store tables often accessed by SQL statements with WHERE clauses if the WHERE clauses contain equality conditions that use the same column or combination of columns. Designate this column or combination of columns as the cluster key.
Store a table in a hash cluster if you can determine how much space is required to hold all rows with a given cluster key value, including rows to be inserted immediately as well as rows to be inserted in the future.
Do not use hash clusters if space in your database is scarce, and you cannot afford to allocate additional space for rows to be inserted in the future.
Do not use a hash cluster to store a constantly growing table if the process of occasionally creating a new, larger hash cluster to hold that table is impractical.
Do not store a table in a hash cluster if your application often performs full table scans, and you must allocate a great deal of space to the hash cluster in anticipation of the table growing. Such full table scans must read all blocks allocated to the hash cluster, even though some blocks may contain few rows. Storing the table alone would reduce the number of blocks read by full table scans.
Do not store a table in a hash cluster if your application frequently modifies the cluster key values. Modifying a row's cluster key value can take longer than modifying the value in an unclustered table, because Oracle may have to migrate the modified row to another block to maintain the cluster.
Storing a single table in a hash cluster can be useful, regardless of whether the table is often joined with other tables, provided that hashing is appropriate for the table based on the previous points in this list.

Creating Hash Clusters

To create a hash cluster, use the CREATE CLUSTER statement with the HASHKEYS parameter.

When you create a hash cluster, you must use the HASHKEYS parameter of the CREATE CLUSTER statement to specify the number of hash values for the hash cluster. For best performance of hash scans, choose a HASHKEYS value that is at least as large as the number of cluster key values. Such a value reduces the chance of collisions, or multiple cluster key values resulting in the same hash value. Collisions force Oracle to test the rows in each block for the correct cluster key value after performing a hash scan. Collisions reduce the performance of hash scans.

Oracle always rounds up the HASHKEYS value that you specify to the nearest prime number to obtain the actual number of hash values. This rounding is designed to reduce collisions.

Improved selectivity	Sometimes two or more columns or expressions, each with poor selectivity, can be combined to form a composite index with more accurate selectivity.
Reduced I/O	If all columns selected by a query are in a composite index, then Oracle can return these values from the index without accessing the table.

12 Data Access Methods