This section discusses guidelines for managing indexes and contains the following topics:
Oracle Database Concepts for conceptual information about indexes and indexing, including descriptions of the various indexing schemes offered by Oracle
Oracle Database Data Cartridge Developer's Guide for information about defining domain-specific operators and indexing schemes and integrating them into the Oracle Database server
Data is often inserted or loaded into a table using either the SQL*Loader or an import utility. It is more efficient to create an index for a table after inserting or loading the data. If you create one or more indexes before loading data, the database then must update every index as each row is inserted.
Creating an index on a table that already has data requires sort space. Some sort space comes from memory allocated for the index creator. The amount for each user is determined by the initialization parameter
SORT_AREA_SIZE. The database also swaps sort information to and from temporary segments that are only allocated during the index creation in the users temporary tablespace.
Under certain conditions, data can be loaded into a table with SQL*Loader direct-path load and an index can be created as data is loaded.
See Also:Oracle Database Utilities for information about using SQL*Loader for direct-path load
Create an index if you frequently want to retrieve less than 15% of the rows in a large table. The percentage varies greatly according to the relative speed of a table scan and how the distribution of the row data in relation to the index key. The faster the table scan, the lower the percentage; the more clustered the row data, the higher the percentage.
To improve performance on joins of multiple tables, index columns used for joins.
Note:Primary and unique keys automatically have indexes, but you might want to create an index on a foreign key.
Small tables do not require indexes. If a query is taking too long, then the table might have grown from small to large.
Values are relatively unique in the column.
There is a wide range of values (good for regular indexes).
There is a small range of values (good for bitmap indexes).
The column contains many nulls, but queries often select all rows having a value. In this case, use the following phrase:
WHERE COL_X > -9.99 * power(10,125)
Using the preceding phrase is preferable to:
WHERE COL_X IS NOT NULL
This is because the first uses an index on
COL_X (assuming that
COL_X is a numeric column).
Columns with the following characteristics are less suitable for indexing:
There are many nulls in the column and you do not search on the not null values.
RAW columns cannot be indexed.
The order of columns in the
INDEX statement can affect query performance. In general, specify the most frequently used columns first.
If you create a single index across columns to speed up queries that access, for example,
col3; then queries that access just
col1, or that access just
col2, are also speeded up. But a query that accessed just
col3, or just
col3 does not use the index.
A table can have any number of indexes. However, the more indexes there are, the more overhead is incurred as the table is modified. Specifically, when rows are inserted or deleted, all indexes on the table must be updated as well. Also, when a column is updated, all indexes that contain the column must be updated.
Thus, there is a trade-off between the speed of retrieving data from a table and the speed of updating the table. For example, if a table is primarily read-only, having more indexes can be useful; but if a table is heavily updated, having fewer indexes could be preferable.
Consider dropping an index if:
It does not speed up queries. The table could be very small, or there could be many rows in the table but very few index entries.
The queries in your applications do not use the index.
The index must be dropped before being rebuilt.
See Also:"Monitoring Index Usage"
Estimating the size of an index before creating one can facilitate better disk space planning and management. You can use the combined estimated size of indexes, along with estimates for tables, the undo tablespace, and redo log files, to determine the amount of disk space that is required to hold an intended database. From these estimates, you can make correct hardware purchases and other decisions.
Use the estimated size of an individual index to better manage the disk space that the index uses. When an index is created, you can set appropriate storage parameters and improve I/O performance of applications that use the index. For example, assume that you estimate the maximum size of an index before creating it. If you then set the storage parameters when you create the index, fewer extents are allocated for the table data segment, and all of the index data is stored in a relatively contiguous section of disk space. This decreases the time necessary for disk I/O operations involving this index.
The maximum size of a single index entry is approximately one-half the data block size.
Storage parameters of an index segment created for the index used to enforce a primary key or unique key constraint can be set in either of the following ways:
INDEX clause of the
STORAGE clause of the
Indexes can be created in any tablespace. An index can be created in the same or different tablespace as the table it indexes. If you use the same tablespace for a table and its index, it can be more convenient to perform database maintenance (such as tablespace or file backup) or to ensure application availability. All the related data is always online together.
Using different tablespaces (on different disks) for a table and its index produces better performance than storing the table and index in the same tablespace. Disk contention is reduced. But, if you use different tablespaces for a table and its index and one tablespace is offline (containing either data or index), then the statements referencing that table are not guaranteed to work.
You can parallelize index creation, much the same as you can parallelize table creation. Because multiple processes work together to create the index, the database can create the index more quickly than if a single server process created the index sequentially.
When creating an index in parallel, storage parameters are used separately by each query server process. Therefore, an index created with an
INITIAL value of 5M and a parallel degree of 12 consumes at least 60M of storage during index creation.
Oracle Database Data Warehousing Guide for information about utilizing parallel execution in a data warehousing environment
Note:Because indexes created using
NOLOGGINGare not archived, perform a backup after you create the index.
Creating an index with
NOLOGGING has the following benefits:
Space is saved in the redo log files.
The time it takes to create the index is decreased.
Performance improves for parallel creation of large indexes.
In general, the relative performance improvement is greater for larger indexes created without
LOGGING than for smaller ones. Creating small indexes without
LOGGING has little effect on the time it takes to create an index. However, for larger indexes the performance improvement can be significant, especially when you are also parallelizing the index creation.
Improper sizing or increased growth can produce index fragmentation. To eliminate or reduce fragmentation, you can rebuild or coalesce the index. But before you perform either task weigh the costs and benefits of each option and choose the one that works best for your situation. Table 19-1 is a comparison of the costs and benefits associated with rebuilding and coalescing indexes.
|Rebuild Index||Coalesce Index|
Quickly moves index to another tablespace
Cannot move index to another tablespace
Higher costs: requires more disk space
Lower costs: does not require more disk space
Creates new tree, shrinks height if applicable
Coalesces leaf blocks within same branch of tree
Enables you to quickly change storage and tablespace parameters without having to drop the original index.
Quickly frees up index leaf blocks for use.
In situations where you have B-tree index leaf blocks that can be freed up for reuse, you can merge those leaf blocks using the following statement:
ALTER INDEX vmoore COALESCE;
Figure 19-1 illustrates the effect of an
ALTER INDEX COALESCE on the index
vmoore. Before performing the operation, the first two leaf blocks are 50% full. This means you have an opportunity to reduce fragmentation and completely fill the first block, while freeing up the second.
Because unique and primary keys have associated indexes, you should factor in the cost of dropping and creating indexes when considering whether to disable or drop a
PRIMARY KEY constraint. If the associated index for a
UNIQUE key or
PRIMARY KEY constraint is extremely large, you can save time by leaving the constraint enabled rather than dropping and re-creating the large index. You also have the option of explicitly specifying that you want to keep or drop the index when dropping or disabling a
PRIMARY KEY constraint.
See Also:"Managing Integrity Constraints"