Indexes are used to find rows with specific column values quickly. Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows. The larger the table, the more this costs. If the table has an index for the columns in question, MySQL can quickly determine the position to seek to in the middle of the data file without having to look at all the data. If a table has 1,000 rows, this is at least 100 times faster than reading sequentially. If you need to access most of the rows, it is faster to read sequentially, because this minimizes disk seeks.
Most MySQL indexes (PRIMARY KEY,
UNIQUE, INDEX, and
FULLTEXT) are stored in B-trees. Exceptions
are that indexes on spatial data types use R-trees, and that
MEMORY tables also support hash indexes.
Strings are automatically prefix- and end-space compressed. See
Section 13.1.8, “CREATE INDEX Syntax”.
In general, indexes are used as described in the following
discussion. Characteristics specific to hash indexes (as used in
MEMORY tables) are described at the end of
this section.
MySQL uses indexes for these operations:
To find the rows matching a WHERE clause
quickly.
To eliminate rows from consideration. If there is a choice between multiple indexes, MySQL normally uses the index that finds the smallest number of rows.
To retrieve rows from other tables when performing joins.
MySQL can use indexes on columns more efficiently if they
are declared as the same type and size. In this context,
VARCHAR and
CHAR are considered the same
if they are declared as the same size. For example,
VARCHAR(10) and
CHAR(10) are the same size, but
VARCHAR(10) and
CHAR(15) are not.
Comparison of dissimilar columns may prevent use of indexes
if values cannot be compared directly without conversion.
Suppose that a numeric column is compared to a string
column. For a given value such as 1 in
the numeric column, it might compare equal to any number of
values in the string column such as '1',
' 1', '00001', or
'01.e1'. This rules out use of any
indexes for the string column.
To find the MIN() or
MAX() value for a specific
indexed column key_col. This is
optimized by a preprocessor that checks whether you are
using WHERE on all key
parts that occur before key_part_N =
constantkey_col
in the index. In this case, MySQL does a single key lookup
for each MIN() or
MAX() expression and replaces
it with a constant. If all expressions are replaced with
constants, the query returns at once. For example:
SELECT MIN(key_part2),MAX(key_part2) FROMtbl_nameWHEREkey_part1=10;
To sort or group a table if the sorting or grouping is done
on a leftmost prefix of a usable key (for example,
ORDER BY ). If all key
parts are followed by key_part1,
key_part2DESC, the key is
read in reverse order. See
Section 8.3.1.11, “ORDER BY Optimization”, and
Section 8.3.1.12, “GROUP BY Optimization”.
In some cases, a query can be optimized to retrieve values without consulting the data rows. If a query uses only columns from a table that are numeric and that form a leftmost prefix for some key, the selected values may be retrieved from the index tree for greater speed:
SELECTkey_part3FROMtbl_nameWHEREkey_part1=1
Suppose that you issue the following
SELECT statement:
mysql> SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2;
If a multiple-column index exists on col1 and
col2, the appropriate rows can be fetched
directly. If separate single-column indexes exist on
col1 and col2, the
optimizer will attempt to use the Index Merge optimization (see
Section 8.3.1.4, “Index Merge Optimization”), or attempt to find
the most restrictive index by deciding which index finds fewer
rows and using that index to fetch the rows.
If the table has a multiple-column index, any leftmost prefix of
the index can be used by the optimizer to find rows. For
example, if you have a three-column index on (col1,
col2, col3), you have indexed search capabilities on
(col1), (col1, col2), and
(col1, col2, col3).
MySQL cannot use an index if the columns do not form a leftmost
prefix of the index. Suppose that you have the
SELECT statements shown here:
SELECT * FROMtbl_nameWHERE col1=val1; SELECT * FROMtbl_nameWHERE col1=val1AND col2=val2; SELECT * FROMtbl_nameWHERE col2=val2; SELECT * FROMtbl_nameWHERE col2=val2AND col3=val3;
If an index exists on (col1, col2, col3),
only the first two queries use the index. The third and fourth
queries do involve indexed columns, but
(col2) and (col2, col3)
are not leftmost prefixes of (col1, col2,
col3).
A B-tree index can be used for column comparisons in expressions
that use the =,
>,
>=,
<,
<=,
or BETWEEN operators. The index
also can be used for LIKE
comparisons if the argument to LIKE
is a constant string that does not start with a wildcard
character. For example, the following
SELECT statements use indexes:
SELECT * FROMtbl_nameWHEREkey_colLIKE 'Patrick%'; SELECT * FROMtbl_nameWHEREkey_colLIKE 'Pat%_ck%';
In the first statement, only rows with 'Patrick' <=
are
considered. In the second statement, only rows with
key_col < 'Patricl''Pat' <= are considered.
key_col <
'Pau'
The following SELECT statements
do not use indexes:
SELECT * FROMtbl_nameWHEREkey_colLIKE '%Patrick%'; SELECT * FROMtbl_nameWHEREkey_colLIKEother_col;
In the first statement, the LIKE
value begins with a wildcard character. In the second statement,
the LIKE value is not a constant.
If you use ... LIKE
'% and
string%'string is longer than three
characters, MySQL uses the Turbo Boyer-Moore
algorithm to initialize the pattern for the string
and then uses this pattern to perform the search more quickly.
A search using employs indexes if
col_name IS
NULLcol_name is indexed.
Any index that does not span all
AND levels in the
WHERE clause is not used to optimize the
query. In other words, to be able to use an index, a prefix of
the index must be used in every AND
group.
The following WHERE clauses use indexes:
... WHEREindex_part1=1 ANDindex_part2=2 ANDother_column=3 /*index= 1 ORindex= 2 */ ... WHEREindex=1 OR A=10 ANDindex=2 /* optimized like "index_part1='hello'" */ ... WHEREindex_part1='hello' ANDindex_part3=5 /* Can use index onindex1but not onindex2orindex3*/ ... WHEREindex1=1 ANDindex2=2 ORindex1=3 ANDindex3=3;
These WHERE clauses do
not use indexes:
/*index_part1is not used */ ... WHEREindex_part2=1 ANDindex_part3=2 /* Index is not used in both parts of the WHERE clause */ ... WHEREindex=1 OR A=10 /* No index spans all rows */ ... WHEREindex_part1=1 ORindex_part2=10
Sometimes MySQL does not use an index, even if one is available.
One circumstance under which this occurs is when the optimizer
estimates that using the index would require MySQL to access a
very large percentage of the rows in the table. (In this case, a
table scan is likely to be much faster because it requires fewer
seeks.) However, if such a query uses LIMIT
to retrieve only some of the rows, MySQL uses an index anyway,
because it can much more quickly find the few rows to return in
the result.
Hash indexes have somewhat different characteristics from those just discussed:
They are used only for equality comparisons that use the
= or <=>
operators (but are very fast). They are
not used for comparison operators such as
< that find a range of values.
The optimizer cannot use a hash index to speed up
ORDER BY operations. (This type of index
cannot be used to search for the next entry in order.)
MySQL cannot determine approximately how many rows there are
between two values (this is used by the range optimizer to
decide which index to use). This may affect some queries if
you change a MyISAM table to a
hash-indexed MEMORY table.
Only whole keys can be used to search for a row. (With a B-tree index, any leftmost prefix of the key can be used to find rows.)