MySQL 8.0 Reference Manual Including MySQL NDB Cluster 8.0
          The range access method
          uses a single index to retrieve a subset of table rows that
          are contained within one or several index value intervals. It
          can be used for a single-part or multiple-part index. The
          following sections describe conditions under which the
          optimizer uses range access.
        
            For a single-part index, index value intervals can be
            conveniently represented by corresponding conditions in the
            WHERE clause, denoted as
            range conditions
            rather than “intervals.”
          
The definition of a range condition for a single-part index is as follows:
                For both BTREE and
                HASH indexes, comparison of a key
                part with a constant value is a range condition when
                using the
                =,
                <=>,
                IN(), IS
                NULL, or IS NOT
                NULL operators.
              
                Additionally, for BTREE indexes,
                comparison of a key part with a constant value is a
                range condition when using the
                >,
                <,
                >=,
                <=,
                BETWEEN,
                !=,
                or
                <>
                operators, or LIKE
                comparisons if the argument to
                LIKE is a constant string
                that does not start with a wildcard character.
              
                For all index types, multiple range conditions combined
                with OR or
                AND form a range condition.
              
“Constant value” in the preceding descriptions means one of the following:
            Here are some examples of queries with range conditions in
            the WHERE clause:
          
SELECT * FROM t1 WHEREkey_col> 1 ANDkey_col< 10; SELECT * FROM t1 WHEREkey_col= 1 ORkey_colIN (15,18,20); SELECT * FROM t1 WHEREkey_colLIKE 'ab%' ORkey_colBETWEEN 'bar' AND 'foo';
Some nonconstant values may be converted to constants during the optimizer constant propagation phase.
            MySQL tries to extract range conditions from the
            WHERE clause for each of the possible
            indexes. During the extraction process, conditions that
            cannot be used for constructing the range condition are
            dropped, conditions that produce overlapping ranges are
            combined, and conditions that produce empty ranges are
            removed.
          
            Consider the following statement, where
            key1 is an indexed column and
            nonkey is not indexed:
          
SELECT * FROM t1 WHERE (key1 < 'abc' AND (key1 LIKE 'abcde%' OR key1 LIKE '%b')) OR (key1 < 'bar' AND nonkey = 4) OR (key1 < 'uux' AND key1 > 'z');
            The extraction process for key key1 is as
            follows:
          
                Start with original WHERE clause:
              
(key1 < 'abc' AND (key1 LIKE 'abcde%' OR key1 LIKE '%b')) OR (key1 < 'bar' AND nonkey = 4) OR (key1 < 'uux' AND key1 > 'z')
                Remove nonkey = 4 and key1
                LIKE '%b' because they cannot be used for a
                range scan. The correct way to remove them is to replace
                them with TRUE, so that we do not
                miss any matching rows when doing the range scan.
                Replacing them with TRUE yields:
              
(key1 < 'abc' AND (key1 LIKE 'abcde%' OR TRUE)) OR (key1 < 'bar' AND TRUE) OR (key1 < 'uux' AND key1 > 'z')
Collapse conditions that are always true or false:
                    (key1 LIKE 'abcde%' OR TRUE) is
                    always true
                  
                    (key1 < 'uux' AND key1 >
                    'z') is always false
                  
Replacing these conditions with constants yields:
(key1 < 'abc' AND TRUE) OR (key1 < 'bar' AND TRUE) OR (FALSE)
                Removing unnecessary TRUE and
                FALSE constants yields:
              
(key1 < 'abc') OR (key1 < 'bar')
Combining overlapping intervals into one yields the final condition to be used for the range scan:
(key1 < 'bar')
            In general (and as demonstrated by the preceding example),
            the condition used for a range scan is less restrictive than
            the WHERE clause. MySQL performs an
            additional check to filter out rows that satisfy the range
            condition but not the full WHERE clause.
          
            The range condition extraction algorithm can handle nested
            AND/OR
            constructs of arbitrary depth, and its output does not
            depend on the order in which conditions appear in
            WHERE clause.
          
            MySQL does not support merging multiple ranges for the
            range access method for
            spatial indexes. To work around this limitation, you can use
            a UNION with identical
            SELECT statements, except
            that you put each spatial predicate in a different
            SELECT.
          
Range conditions on a multiple-part index are an extension of range conditions for a single-part index. A range condition on a multiple-part index restricts index rows to lie within one or several key tuple intervals. Key tuple intervals are defined over a set of key tuples, using ordering from the index.
            For example, consider a multiple-part index defined as
            key1(, and the
            following set of key tuples listed in key order:
          key_part1,
            key_part2,
            key_part3)
key_part1key_part2key_part3NULL 1 'abc' NULL 1 'xyz' NULL 2 'foo' 1 1 'abc' 1 1 'xyz' 1 2 'abc' 2 1 'aaa'
            The condition  defines this interval:
          key_part1
            = 1
(1,-inf,-inf) <= (key_part1,key_part2,key_part3) < (1,+inf,+inf)
The interval covers the 4th, 5th, and 6th tuples in the preceding data set and can be used by the range access method.
            By contrast, the condition
             does not define a single interval and cannot
            be used by the range access method.
          key_part3 =
            'abc'
The following descriptions indicate how range conditions work for multiple-part indexes in greater detail.
                For HASH indexes, each interval
                containing identical values can be used. This means that
                the interval can be produced only for conditions in the
                following form:
              
key_part1cmpconst1ANDkey_part2cmpconst2AND ... ANDkey_partNcmpconstN;
                Here, const1,
                const2, … are
                constants, cmp is one of the
                =,
                <=>,
                or IS NULL comparison
                operators, and the conditions cover all index parts.
                (That is, there are N
                conditions, one for each part of an
                N-part index.) For example,
                the following is a range condition for a three-part
                HASH index:
              
key_part1= 1 ANDkey_part2IS NULL ANDkey_part3= 'foo'
For the definition of what is considered to be a constant, see Range Access Method for Single-Part Indexes.
                For a BTREE index, an interval might
                be usable for conditions combined with
                AND, where each condition
                compares a key part with a constant value using
                =,
                <=>,
                IS NULL,
                >,
                <,
                >=,
                <=,
                !=,
                <>,
                BETWEEN, or
                LIKE
                ' (where
                pattern''
                does not start with a wildcard). An interval can be used
                as long as it is possible to determine a single key
                tuple containing all rows that match the condition (or
                two intervals if
                pattern'<>
                or !=
                is used).
              
                The optimizer attempts to use additional key parts to
                determine the interval as long as the comparison
                operator is
                =,
                <=>,
                or IS NULL. If the operator
                is
                >,
                <,
                >=,
                <=,
                !=,
                <>,
                BETWEEN, or
                LIKE, the
                optimizer uses it but considers no more key parts. For
                the following expression, the optimizer uses
                = from
                the first comparison. It also uses
                >=
                from the second comparison but considers no further key
                parts and does not use the third comparison for interval
                construction:
              
key_part1= 'foo' ANDkey_part2>= 10 ANDkey_part3> 10
The single interval is:
('foo',10,-inf) < (key_part1,key_part2,key_part3) < ('foo',+inf,+inf)
                It is possible that the created interval contains more
                rows than the initial condition. For example, the
                preceding interval includes the value ('foo',
                11, 0), which does not satisfy the original
                condition.
              
                If conditions that cover sets of rows contained within
                intervals are combined with
                OR, they form a condition
                that covers a set of rows contained within the union of
                their intervals. If the conditions are combined with
                AND, they form a condition
                that covers a set of rows contained within the
                intersection of their intervals. For example, for this
                condition on a two-part index:
              
(key_part1= 1 ANDkey_part2< 2) OR (key_part1> 5)
The intervals are:
(1,-inf) < (key_part1,key_part2) < (1,2) (5,-inf) < (key_part1,key_part2)
                In this example, the interval on the first line uses one
                key part for the left bound and two key parts for the
                right bound. The interval on the second line uses only
                one key part. The key_len column in
                the EXPLAIN output
                indicates the maximum length of the key prefix used.
              
                In some cases, key_len may indicate
                that a key part was used, but that might be not what you
                would expect. Suppose that
                key_part1 and
                key_part2 can be
                NULL. Then the
                key_len column displays two key part
                lengths for the following condition:
              
key_part1>= 1 ANDkey_part2< 2
But, in fact, the condition is converted to this:
key_part1>= 1 ANDkey_part2IS NOT NULL
For a description of how optimizations are performed to combine or eliminate intervals for range conditions on a single-part index, see Range Access Method for Single-Part Indexes. Analogous steps are performed for range conditions on multiple-part indexes.
            Consider these expressions, where
            col_name is an indexed column:
          
col_nameIN(val1, ...,valN)col_name=val1OR ... ORcol_name=valN
            Each expression is true if
            col_name is equal to any of
            several values. These comparisons are equality range
            comparisons (where the “range” is a single
            value). The optimizer estimates the cost of reading
            qualifying rows for equality range comparisons as follows:
          
                If there is a unique index on
                col_name, the row estimate
                for each range is 1 because at most one row can have the
                given value.
              
                Otherwise, any index on
                col_name is nonunique and the
                optimizer can estimate the row count for each range
                using dives into the index or index statistics.
              
            With index dives, the optimizer makes a dive at each end of
            a range and uses the number of rows in the range as the
            estimate. For example, the expression
             has three equality ranges and the optimizer
            makes two dives per range to generate a row estimate. Each
            pair of dives yields an estimate of the number of rows that
            have the given value.
          col_name IN (10, 20,
            30)
Index dives provide accurate row estimates, but as the number of comparison values in the expression increases, the optimizer takes longer to generate a row estimate. Use of index statistics is less accurate than index dives but permits faster row estimation for large value lists.
            The
            eq_range_index_dive_limit
            system variable enables you to configure the number of
            values at which the optimizer switches from one row
            estimation strategy to the other. To permit use of index
            dives for comparisons of up to N
            equality ranges, set
            eq_range_index_dive_limit
            to N + 1. To disable use of
            statistics and always use index dives regardless of
            N, set
            eq_range_index_dive_limit
            to 0.
          
            To update table index statistics for best estimates, use
            ANALYZE TABLE.
          
            Prior to MySQL 8.0, there is no way of skipping
            the use of index dives to estimate index usefulness, except
            by using the
            eq_range_index_dive_limit
            system variable. In MySQL 8.0, index dive
            skipping is possible for queries that satisfy all these
            conditions:
          
The query is for a single table, not a join on multiple tables.
                A single-index FORCE INDEX index hint
                is present. The idea is that if index use is forced,
                there is nothing to be gained from the additional
                overhead of performing dives into the index.
              
                The index is nonunique and not a
                FULLTEXT index.
              
No subquery is present.
                No DISTINCT, GROUP
                BY, or ORDER BY clause is
                present.
              
            For EXPLAIN FOR
            CONNECTION, the output changes as follows if index
            dives are skipped:
          
                For traditional output, the rows and
                filtered values are
                NULL.
              
                For JSON output,
                rows_examined_per_scan and
                rows_produced_per_join do not appear,
                skip_index_dive_due_to_force is
                true, and cost calculations are not
                accurate.
              
            Without FOR CONNECTION,
            EXPLAIN output does not
            change when index dives are skipped.
          
            After execution of a query for which index dives are
            skipped, the corresponding row in the Information Schema
            OPTIMIZER_TRACE table contains
            an index_dives_for_range_access value of
            skipped_due_to_force_index.
          
Consider the following scenario:
CREATE TABLE t1 (f1 INT NOT NULL, f2 INT NOT NULL, PRIMARY KEY(f1, f2)); INSERT INTO t1 VALUES (1,1), (1,2), (1,3), (1,4), (1,5), (2,1), (2,2), (2,3), (2,4), (2,5); INSERT INTO t1 SELECT f1, f2 + 5 FROM t1; INSERT INTO t1 SELECT f1, f2 + 10 FROM t1; INSERT INTO t1 SELECT f1, f2 + 20 FROM t1; INSERT INTO t1 SELECT f1, f2 + 40 FROM t1; ANALYZE TABLE t1; EXPLAIN SELECT f1, f2 FROM t1 WHERE f2 > 40;
            To execute this query, MySQL can choose an index scan to
            fetch all rows (the index includes all columns to be
            selected), then apply the f2 > 40
            condition from the WHERE clause to
            produce the final result set.
          
            A range scan is more efficient than a full index scan, but
            cannot be used in this case because there is no condition on
            f1, the first index column. However, as
            of MySQL 8.0.13, the optimizer can perform multiple range
            scans, one for each value of f1, using a
            method called Skip Scan that is similar to Loose Index Scan
            (see Section 10.2.1.17, “GROUP BY Optimization”):
          
                Skip between distinct values of the first index part,
                f1 (the index prefix).
              
                Perform a subrange scan on each distinct prefix value
                for the f2 > 40 condition on the
                remaining index part.
              
For the data set shown earlier, the algorithm operates like this:
                Get the first distinct value of the first key part
                (f1 = 1).
              
                Construct the range based on the first and second key
                parts (f1 = 1 AND f2 > 40).
              
Perform a range scan.
                Get the next distinct value of the first key part
                (f1 = 2).
              
                Construct the range based on the first and second key
                parts (f1 = 2 AND f2 > 40).
              
Perform a range scan.
Using this strategy decreases the number of accessed rows because MySQL skips the rows that do not qualify for each constructed range. This Skip Scan access method is applicable under the following conditions:
                Table T has at least one compound index with key parts
                of the form ([A_1, ..., A_k,]
                B_1, ..., B_m, C [, D_1, ...,
                D_n]). Key parts A and D may
                be empty, but B and C must be nonempty.
              
The query references only one table.
                The query does not use GROUP BY or
                DISTINCT.
              
The query references only columns in the index.
                The predicates on A_1, ...,
                A_k must be equality
                predicates and they must be constants. This includes the
                IN() operator.
              
                The query must be a conjunctive query; that is, an
                AND of OR
                conditions:
                (
              cond1(key_part1)
                OR
                cond2(key_part1))
                AND
                (cond1(key_part2)
                OR ...) AND ...
There must be a range condition on C.
Conditions on D columns are permitted. Conditions on D must be in conjunction with the range condition on C.
            Use of Skip Scan is indicated in EXPLAIN
            output as follows:
          
                Using index for skip scan in the
                Extra column indicates that the loose
                index Skip Scan access method is used.
              
                If the index can be used for Skip Scan, the index should
                be visible in the possible_keys
                column.
              
            Use of Skip Scan is indicated in optimizer trace output by a
            "skip scan" element of this form:
          
"skip_scan_range": {
  "type": "skip_scan",
  "index": index_used_for_skip_scan,
  "key_parts_used_for_access": [key_parts_used_for_access],
  "range": [range]
}
            You may also see a
            "best_skip_scan_summary" element. If Skip
            Scan is chosen as the best range access variant, a
            "chosen_range_access_summary" is written.
            If Skip Scan is chosen as the overall best access method, a
            "best_access_path" element is present.
          
            Use of Skip Scan is subject to the value of the
            skip_scan flag of the
            optimizer_switch system
            variable. See Section 10.9.2, “Switchable Optimizations”. By
            default, this flag is on. To disable it,
            set skip_scan to
            off.
          
            In addition to using the
            optimizer_switch system
            variable to control optimizer use of Skip Scan session-wide,
            MySQL supports optimizer hints to influence the optimizer on
            a per-statement basis. See
            Section 10.9.3, “Optimizer Hints”.
          
The optimizer is able to apply the range scan access method to queries of this form:
SELECT ... FROM t1 WHERE ( col_1, col_2 ) IN (( 'a', 'b' ), ( 'c', 'd' ));
Previously, for range scans to be used, it was necessary to write the query as:
SELECT ... FROM t1 WHERE ( col_1 = 'a' AND col_2 = 'b' ) OR ( col_1 = 'c' AND col_2 = 'd' );
For the optimizer to use a range scan, queries must satisfy these conditions:
                On the left side of the
                IN() predicate, the row
                constructor contains only column references.
              
                On the right side of the
                IN() predicate, row
                constructors contain only runtime constants, which are
                either literals or local column references that are
                bound to constants during execution.
              
                On the right side of the
                IN() predicate, there is
                more than one row constructor.
              
For more information about the optimizer and row constructors, see Section 10.2.1.22, “Row Constructor Expression Optimization”
            To control the memory available to the range optimizer, use
            the
            range_optimizer_max_mem_size
            system variable:
          
A value of 0 means “no limit.”
                With a value greater than 0, the optimizer tracks the
                memory consumed when considering the range access
                method. If the specified limit is about to be exceeded,
                the range access method is abandoned and other methods,
                including a full table scan, are considered instead.
                This could be less optimal. If this happens, the
                following warning occurs (where
                N is the current
                range_optimizer_max_mem_size
                value):
              
Warning    3170    Memory capacity of N bytes for
                   'range_optimizer_max_mem_size' exceeded. Range
                   optimization was not done for this query.
                For UPDATE and
                DELETE statements, if the
                optimizer falls back to a full table scan and the
                sql_safe_updates system
                variable is enabled, an error occurs rather than a
                warning because, in effect, no key is used to determine
                which rows to modify. For more information, see
                Using Safe-Updates Mode (--safe-updates).
              
            For individual queries that exceed the available range
            optimization memory and for which the optimizer falls back
            to less optimal plans, increasing the
            range_optimizer_max_mem_size
            value may improve performance.
          
To estimate the amount of memory needed to process a range expression, use these guidelines:
                For a simple query such as the following, where there is
                one candidate key for the range access method, each
                predicate combined with OR
                uses approximately 230 bytes:
              
SELECT COUNT(*) FROM t
WHERE a=1 OR a=2 OR a=3 OR .. . a=N;
                Similarly for a query such as the following, each
                predicate combined with AND
                uses approximately 125 bytes:
              
SELECT COUNT(*) FROM t
WHERE a=1 AND b=1 AND c=1 ... N;
                For a query with IN()
                predicates:
              
SELECT COUNT(*) FROM t WHERE a IN (1,2, ...,M) AND b IN (1,2, ...,N);
                Each literal value in an
                IN() list counts as a
                predicate combined with OR.
                If there are two IN()
                lists, the number of predicates combined with
                OR is the product of the
                number of literal values in each list. Thus, the number
                of predicates combined with
                OR in the preceding case is
                M ×
                N.