Selecting Indexes
Before determining whether the index is appropriate, you need to know how the database engine selects indexes. In general, consider these basic rules:
-
The columns in the "where" clause are used when deciding on an index.
On most platforms, the database engine takes equality statements, "like" statements, and less-than and greater-than statements into consideration. For example, in the statement "where business_unit = 'NEWGN' and accounting_period >= 1 and <= 12", the application engine uses both the business_unit and accounting period when it accesses the data. With a "like" statement, if a specific value is passed, the system uses it to select an index; however, if the field contains a wild card ('%'), the system ranks the column lower in priority when it determines an index.
-
Each platform has specific index limitations.
For example, SQLServer and Oracle platform indexes do not perform well with "NOT = " or "!=" statements. In DB2, any column after a range (>, <) is not used when the system selects an index. Consult your platform system administration documentation for the specific index limitations on your system.
-
The system looks at cardinality, which refers to the number of unique values in a column.
For example, if you only have one business unit in your organization, the business_unit column in the ledger record only has one value in it—very low cardinality. In the demo database, the account column is always entered and has many unique values, so the cardinality is fairly high.
To determine cardinality on a particular ChartField, issue a SQL statement that selects count(*) from the table in question. The value returned is the number of entries in the record. In general, high cardinality fields should be included in the index.
-
The columns that are used to join records should generally be included in an index.
These are the fields in a "where" statement used to join one record to another. These columns tend toward low cardinality, and the optimizers do not rate equality to another column nearly as high as equality to a bind variable. For these reasons, columns used to join tables are usually in the unique index but generally are not included in all other indexes.
-
The system only uses an index up to the point that a column in the index is not included in the "where" clause.
For example, if the Journal Line record has an index that includes business unit, journal ID, and journal date, but the "where" clause includes only business unit and journal date, the index is only effective for the business unit. The journal date provided is ignored because the journal ID information is not included in the "where" clause. For this reason, the sequence of the fields in the index is very important.
-
The system uses the size of the record and the selectiveness of the index to determine whether the index or full-table scan is more efficient.
This is sometimes referred to as the filter factor. The effective filter factor for an index is the combined cardinalities for the index columns actually used in a particular access.
For example, if an index is built over FISCAL_YEAR, LEDGER and ACCOUNT, and the table contains four years, five ledgers, and 800 accounts, the potential filter factor is 1/(4*5*800), or 1/16000, or 0.0000625. (In a real-world data distribution, the filter factor would not be this good, but it would still be quite good unless the data is very skewed.) However, if the ACCOUNT field in the index could not be used because of the nature of the criteria for it, the filter factor would be only 1/20, which is not very selective. In general, an index should point to around 10% - 15% of a record in order to be efficient.