MySQL 8.4 Reference Manual Including MySQL NDB Cluster 8.4
The persistent optimizer statistics feature improves plan stability by storing statistics to disk and making them persistent across server restarts so that the optimizer is more likely to make consistent choices each time for a given query.
        Optimizer statistics are persisted to disk when
        innodb_stats_persistent=ON or
        when individual tables are defined with
        STATS_PERSISTENT=1.
        innodb_stats_persistent is
        enabled by default.
      
Formerly, optimizer statistics were cleared when restarting the server and after some other types of operations, and recomputed on the next table access. Consequently, different estimates could be produced when recalculating statistics leading to different choices in query execution plans and variation in query performance.
        Persistent statistics are stored in the
        mysql.innodb_table_stats and
        mysql.innodb_index_stats tables. See
        Section 17.8.10.1.5, “InnoDB Persistent Statistics Tables”.
      
If you prefer not to persist optimizer statistics to disk, see Section 17.8.10.2, “Configuring Non-Persistent Optimizer Statistics Parameters”
          The innodb_stats_auto_recalc
          variable, which is enabled by default, controls whether
          statistics are calculated automatically when a table undergoes
          changes to more than 10% of its rows. You can also configure
          automatic statistics recalculation for individual tables by
          specifying the STATS_AUTO_RECALC clause
          when creating or altering a table.
        
          Because of the asynchronous nature of automatic statistics
          recalculation, which occurs in the background, statistics may
          not be recalculated instantly after running a DML operation
          that affects more than 10% of a table, even when
          innodb_stats_auto_recalc is
          enabled. Statistics recalculation can be delayed by few
          seconds in some cases. If up-to-date statistics are required
          immediately, run ANALYZE TABLE
          to initiate a synchronous (foreground) recalculation of
          statistics.
        
          If innodb_stats_auto_recalc
          is disabled, you can ensure the accuracy of optimizer
          statistics by executing the ANALYZE
          TABLE statement after making substantial changes to
          indexed columns. You might also consider adding
          ANALYZE TABLE to setup scripts
          that you run after loading data, and running
          ANALYZE TABLE on a schedule at
          times of low activity.
        
          When an index is added to an existing table, or when a column
          is added or dropped, index statistics are calculated and added
          to the innodb_index_stats table regardless
          of the value of
          innodb_stats_auto_recalc.
        
          For a histogram with AUTO UPDATE enabled
          (see
          Histogram Statistics Analysis),
          automatic recalculation of persistent statistics also causes
          the histogram to be updated.
        
          innodb_stats_persistent,
          innodb_stats_auto_recalc, and
          innodb_stats_persistent_sample_pages
          are global variables. To override these system-wide settings
          and configure optimizer statistics parameters for individual
          tables, you can define STATS_PERSISTENT,
          STATS_AUTO_RECALC, and
          STATS_SAMPLE_PAGES clauses in
          CREATE TABLE or
          ALTER TABLE statements.
        
              STATS_PERSISTENT specifies whether to
              enable
              persistent
              statistics for an InnoDB table.
              The value DEFAULT causes the persistent
              statistics setting for the table to be determined by the
              innodb_stats_persistent
              setting. A value of 1 enables
              persistent statistics for the table, while a value of
              0 disables the feature. After enabling
              persistent statistics for an individual table, use
              ANALYZE TABLE to calculate
              statistics after table data is loaded.
            
              STATS_AUTO_RECALC specifies whether to
              automatically recalculate
              persistent
              statistics. The value DEFAULT
              causes the persistent statistics setting for the table to
              be determined by the
              innodb_stats_auto_recalc
              setting. A value of 1 causes statistics
              to be recalculated when 10% of table data has changed. A
              value 0 prevents automatic
              recalculation for the table. When using a value of 0, use
              ANALYZE TABLE to
              recalculate statistics after making substantial changes to
              the table.
            
              STATS_SAMPLE_PAGES specifies the number
              of index pages to sample when cardinality and other
              statistics are calculated for an indexed column, by an
              ANALYZE TABLE operation,
              for example.
            
          All three clauses are specified in the following
          CREATE TABLE example:
        
CREATE TABLE `t1` ( `id` int(8) NOT NULL auto_increment, `data` varchar(255), `date` datetime, PRIMARY KEY (`id`), INDEX `DATE_IX` (`date`) ) ENGINE=InnoDB, STATS_PERSISTENT=1, STATS_AUTO_RECALC=1, STATS_SAMPLE_PAGES=25;
          The optimizer uses estimated
          statistics about key
          distributions to choose the indexes for an execution plan,
          based on the relative
          selectivity of the
          index. Operations such as ANALYZE
          TABLE cause InnoDB to sample
          random pages from each index on a table to estimate the
          cardinality of the
          index. This sampling technique is known as a
          random dive.
        
          The
          innodb_stats_persistent_sample_pages
          controls the number of sampled pages. You can adjust the
          setting at runtime to manage the quality of statistics
          estimates used by the optimizer. The default value is 20.
          Consider modifying the setting when encountering the following
          issues:
        
              Statistics are not accurate enough and the
              optimizer chooses suboptimal plans, as shown in
              EXPLAIN output. You can
              check the accuracy of statistics by comparing the actual
              cardinality of an index (determined by running
              SELECT
              DISTINCT on the index columns) with the
              estimates in the
              mysql.innodb_index_stats table.
            
              If it is determined that statistics are not accurate
              enough, the value of
              innodb_stats_persistent_sample_pages
              should be increased until the statistics estimates are
              sufficiently accurate. Increasing
              innodb_stats_persistent_sample_pages
              too much, however, could cause
              ANALYZE TABLE to run
              slowly.
            
              ANALYZE TABLE is
              too slow. In this case
              innodb_stats_persistent_sample_pages
              should be decreased until ANALYZE
              TABLE execution time is acceptable. Decreasing
              the value too much, however, could lead to the first
              problem of inaccurate statistics and suboptimal query
              execution plans.
            
              If a balance cannot be achieved between accurate
              statistics and ANALYZE
              TABLE execution time, consider decreasing the
              number of indexed columns in the table or limiting the
              number of partitions to reduce
              ANALYZE TABLE complexity.
              The number of columns in the table's primary key is also
              important to consider, as primary key columns are appended
              to each nonunique index.
            
For related information, see Section 17.8.10.3, “Estimating ANALYZE TABLE Complexity for InnoDB Tables”.
          By default, InnoDB reads uncommitted data
          when calculating statistics. In the case of an uncommitted
          transaction that deletes rows from a table, delete-marked
          records are excluded when calculating row estimates and index
          statistics, which can lead to non-optimal execution plans for
          other transactions that are operating on the table
          concurrently using a transaction isolation level other than
          READ UNCOMMITTED. To avoid
          this scenario,
          innodb_stats_include_delete_marked
          can be enabled to ensure that delete-marked records are
          included when calculating persistent optimizer statistics.
        
          When
          innodb_stats_include_delete_marked
          is enabled, ANALYZE TABLE
          considers delete-marked records when recalculating statistics.
        
          innodb_stats_include_delete_marked
          is a global setting that affects all InnoDB
          tables, and it is only applicable to persistent optimizer
          statistics.
        
          The persistent statistics feature relies on the internally
          managed tables in the mysql database, named
          innodb_table_stats and
          innodb_index_stats. These tables are set up
          automatically in all install, upgrade, and build-from-source
          procedures.
        
Table 17.6 Columns of innodb_table_stats
| Column name | Description | 
|---|---|
database_name | 
              Database name | 
table_name | 
              Table name, partition name, or subpartition name | 
last_update | 
              A timestamp indicating the last time that InnoDB
                updated this row | 
            
n_rows | 
              The number of rows in the table | 
clustered_index_size | 
              The size of the primary index, in pages | 
sum_of_other_index_sizes | 
              The total size of other (non-primary) indexes, in pages | 
Table 17.7 Columns of innodb_index_stats
| Column name | Description | 
|---|---|
database_name | 
              Database name | 
table_name | 
              Table name, partition name, or subpartition name | 
index_name | 
              Index name | 
last_update | 
              A timestamp indicating the last time the row was updated | 
stat_name | 
              The name of the statistic, whose value is reported in the
                stat_value column | 
            
stat_value | 
              The value of the statistic that is named in stat_name
                column | 
            
sample_size | 
              The number of pages sampled for the estimate provided in the
                stat_value column | 
            
stat_description | 
              Description of the statistic that is named in the
                stat_name column | 
            
          The innodb_table_stats and
          innodb_index_stats tables include a
          last_update column that shows when index
          statistics were last updated:
        
mysql> SELECT * FROM innodb_table_stats \G
*************************** 1. row ***************************
           database_name: sakila
              table_name: actor
             last_update: 2014-05-28 16:16:44
                  n_rows: 200
    clustered_index_size: 1
sum_of_other_index_sizes: 1
...mysql> SELECT * FROM innodb_index_stats \G
*************************** 1. row ***************************
   database_name: sakila
      table_name: actor
      index_name: PRIMARY
     last_update: 2014-05-28 16:16:44
       stat_name: n_diff_pfx01
      stat_value: 200
     sample_size: 1
     ...
          The innodb_table_stats and
          innodb_index_stats tables can be updated
          manually, which makes it possible to force a specific query
          optimization plan or test alternative plans without modifying
          the database. If you manually update statistics, use the
          FLUSH TABLE
           statement to
          load the updated statistics.
        tbl_name
          Persistent statistics are considered local information,
          because they relate to the server instance. The
          innodb_table_stats and
          innodb_index_stats tables are therefore not
          replicated when automatic statistics recalculation takes
          place. If you run ANALYZE TABLE
          to initiate a synchronous recalculation of statistics, the
          statement is replicated (unless you suppressed logging for
          it), and recalculation takes place on replicas.
        
          The innodb_table_stats table contains one
          row for each table. The following example demonstrates the
          type of data collected.
        
          Table t1 contains a primary index (columns
          a, b) secondary index
          (columns c, d), and
          unique index (columns e,
          f):
        
CREATE TABLE t1 ( a INT, b INT, c INT, d INT, e INT, f INT, PRIMARY KEY (a, b), KEY i1 (c, d), UNIQUE KEY i2uniq (e, f) ) ENGINE=INNODB;
          After inserting five rows of sample data, table
          t1 appears as follows:
        
mysql> SELECT * FROM t1;
+---+---+------+------+------+------+
| a | b | c    | d    | e    | f    |
+---+---+------+------+------+------+
| 1 | 1 |   10 |   11 |  100 |  101 |
| 1 | 2 |   10 |   11 |  200 |  102 |
| 1 | 3 |   10 |   11 |  100 |  103 |
| 1 | 4 |   10 |   12 |  200 |  104 |
| 1 | 5 |   10 |   12 |  100 |  105 |
+---+---+------+------+------+------+
          To immediately update statistics, run
          ANALYZE TABLE (if
          innodb_stats_auto_recalc is
          enabled, statistics are updated automatically within a few
          seconds assuming that the 10% threshold for changed table rows
          is reached):
        
mysql> ANALYZE TABLE t1;
+---------+---------+----------+----------+
| Table   | Op      | Msg_type | Msg_text |
+---------+---------+----------+----------+
| test.t1 | analyze | status   | OK       |
+---------+---------+----------+----------+
          Table statistics for table t1 show the last
          time InnoDB updated the table statistics
          (2014-03-14 14:36:34), the number of rows
          in the table (5), the clustered index size
          (1 page), and the combined size of the
          other indexes (2 pages).
        
mysql> SELECT * FROM mysql.innodb_table_stats WHERE table_name like 't1'\G
*************************** 1. row ***************************
           database_name: test
              table_name: t1
             last_update: 2014-03-14 14:36:34
                  n_rows: 5
    clustered_index_size: 1
sum_of_other_index_sizes: 2
          The innodb_index_stats table contains
          multiple rows for each index. Each row in the
          innodb_index_stats table provides data
          related to a particular index statistic which is named in the
          stat_name column and described in the
          stat_description column. For example:
        
mysql>SELECT index_name, stat_name, stat_value, stat_descriptionFROM mysql.innodb_index_stats WHERE table_name like 't1';+------------+--------------+------------+-----------------------------------+ | index_name | stat_name | stat_value | stat_description | +------------+--------------+------------+-----------------------------------+ | PRIMARY | n_diff_pfx01 | 1 | a | | PRIMARY | n_diff_pfx02 | 5 | a,b | | PRIMARY | n_leaf_pages | 1 | Number of leaf pages in the index | | PRIMARY | size | 1 | Number of pages in the index | | i1 | n_diff_pfx01 | 1 | c | | i1 | n_diff_pfx02 | 2 | c,d | | i1 | n_diff_pfx03 | 2 | c,d,a | | i1 | n_diff_pfx04 | 5 | c,d,a,b | | i1 | n_leaf_pages | 1 | Number of leaf pages in the index | | i1 | size | 1 | Number of pages in the index | | i2uniq | n_diff_pfx01 | 2 | e | | i2uniq | n_diff_pfx02 | 5 | e,f | | i2uniq | n_leaf_pages | 1 | Number of leaf pages in the index | | i2uniq | size | 1 | Number of pages in the index | +------------+--------------+------------+-----------------------------------+
          The stat_name column shows the following
          types of statistics:
        
              size: Where
              stat_name=size, the
              stat_value column displays the total
              number of pages in the index.
            
              n_leaf_pages: Where
              stat_name=n_leaf_pages,
              the stat_value column displays the
              number of leaf pages in the index.
            
              n_diff_pfx:
              Where
              NNstat_name=n_diff_pfx01,
              the stat_value column displays the
              number of distinct values in the first column of the
              index. Where
              stat_name=n_diff_pfx02,
              the stat_value column displays the
              number of distinct values in the first two columns of the
              index, and so on. Where
              stat_name=n_diff_pfx,
              the NNstat_description column shows a
              comma separated list of the index columns that are
              counted.
            
          To further illustrate the
          n_diff_pfx
          statistic, which provides cardinality data, consider once
          again the NNt1 table example that was
          introduced previously. As shown below, the
          t1 table is created with a primary index
          (columns a, b), a
          secondary index (columns c,
          d), and a unique index (columns
          e, f):
        
CREATE TABLE t1 ( a INT, b INT, c INT, d INT, e INT, f INT, PRIMARY KEY (a, b), KEY i1 (c, d), UNIQUE KEY i2uniq (e, f) ) ENGINE=INNODB;
          After inserting five rows of sample data, table
          t1 appears as follows:
        
mysql> SELECT * FROM t1;
+---+---+------+------+------+------+
| a | b | c    | d    | e    | f    |
+---+---+------+------+------+------+
| 1 | 1 |   10 |   11 |  100 |  101 |
| 1 | 2 |   10 |   11 |  200 |  102 |
| 1 | 3 |   10 |   11 |  100 |  103 |
| 1 | 4 |   10 |   12 |  200 |  104 |
| 1 | 5 |   10 |   12 |  100 |  105 |
+---+---+------+------+------+------+
          When you query the index_name,
          stat_name, stat_value,
          and stat_description, where
          stat_name LIKE 'n_diff%', the following
          result set is returned:
        
mysql>SELECT index_name, stat_name, stat_value, stat_descriptionFROM mysql.innodb_index_statsWHERE table_name like 't1' AND stat_name LIKE 'n_diff%';+------------+--------------+------------+------------------+ | index_name | stat_name | stat_value | stat_description | +------------+--------------+------------+------------------+ | PRIMARY | n_diff_pfx01 | 1 | a | | PRIMARY | n_diff_pfx02 | 5 | a,b | | i1 | n_diff_pfx01 | 1 | c | | i1 | n_diff_pfx02 | 2 | c,d | | i1 | n_diff_pfx03 | 2 | c,d,a | | i1 | n_diff_pfx04 | 5 | c,d,a,b | | i2uniq | n_diff_pfx01 | 2 | e | | i2uniq | n_diff_pfx02 | 5 | e,f | +------------+--------------+------------+------------------+
          For the PRIMARY index, there are two
          n_diff% rows. The number of rows is equal
          to the number of columns in the index.
        
            For nonunique indexes, InnoDB appends the
            columns of the primary key.
          
              Where
              index_name=PRIMARY
              and
              stat_name=n_diff_pfx01,
              the stat_value is 1,
              which indicates that there is a single distinct value in
              the first column of the index (column
              a). The number of distinct values in
              column a is confirmed by viewing the
              data in column a in table
              t1, in which there is a single distinct
              value (1). The counted column
              (a) is shown in the
              stat_description column of the result
              set.
            
              Where
              index_name=PRIMARY
              and
              stat_name=n_diff_pfx02,
              the stat_value is 5,
              which indicates that there are five distinct values in the
              two columns of the index (a,b). The
              number of distinct values in columns a
              and b is confirmed by viewing the data
              in columns a and b
              in table t1, in which there are five
              distinct values: (1,1),
              (1,2), (1,3),
              (1,4) and (1,5). The
              counted columns (a,b) are shown in the
              stat_description column of the result
              set.
            
          For the secondary index (i1), there are
          four n_diff% rows. Only two columns are
          defined for the secondary index (c,d) but
          there are four n_diff% rows for the
          secondary index because InnoDB suffixes all
          nonunique indexes with the primary key. As a result, there are
          four n_diff% rows instead of two to account
          for the both the secondary index columns
          (c,d) and the primary key columns
          (a,b).
        
              Where index_name=i1
              and
              stat_name=n_diff_pfx01,
              the stat_value is 1,
              which indicates that there is a single distinct value in
              the first column of the index (column
              c). The number of distinct values in
              column c is confirmed by viewing the
              data in column c in table
              t1, in which there is a single distinct
              value: (10). The counted column
              (c) is shown in the
              stat_description column of the result
              set.
            
              Where index_name=i1
              and
              stat_name=n_diff_pfx02,
              the stat_value is 2,
              which indicates that there are two distinct values in the
              first two columns of the index (c,d).
              The number of distinct values in columns
              c an d is confirmed
              by viewing the data in columns c and
              d in table t1, in
              which there are two distinct values:
              (10,11) and (10,12).
              The counted columns (c,d) are shown in
              the stat_description column of the
              result set.
            
              Where index_name=i1
              and
              stat_name=n_diff_pfx03,
              the stat_value is 2,
              which indicates that there are two distinct values in the
              first three columns of the index
              (c,d,a). The number of distinct values
              in columns c, d, and
              a is confirmed by viewing the data in
              column c, d, and
              a in table t1, in
              which there are two distinct values:
              (10,11,1) and
              (10,12,1). The counted columns
              (c,d,a) are shown in the
              stat_description column of the result
              set.
            
              Where index_name=i1
              and
              stat_name=n_diff_pfx04,
              the stat_value is 5,
              which indicates that there are five distinct values in the
              four columns of the index (c,d,a,b).
              The number of distinct values in columns
              c, d,
              a and b is confirmed
              by viewing the data in columns c,
              d, a, and
              b in table t1, in
              which there are five distinct values:
              (10,11,1,1),
              (10,11,1,2),
              (10,11,1,3),
              (10,12,1,4), and
              (10,12,1,5). The counted columns
              (c,d,a,b) are shown in the
              stat_description column of the result
              set.
            
          For the unique index (i2uniq), there are
          two n_diff% rows.
        
              Where
              index_name=i2uniq
              and
              stat_name=n_diff_pfx01,
              the stat_value is 2,
              which indicates that there are two distinct values in the
              first column of the index (column e).
              The number of distinct values in column
              e is confirmed by viewing the data in
              column e in table
              t1, in which there are two distinct
              values: (100) and
              (200). The counted column
              (e) is shown in the
              stat_description column of the result
              set.
            
              Where
              index_name=i2uniq
              and
              stat_name=n_diff_pfx02,
              the stat_value is 5,
              which indicates that there are five distinct values in the
              two columns of the index (e,f). The
              number of distinct values in columns e
              and f is confirmed by viewing the data
              in columns e and f
              in table t1, in which there are five
              distinct values: (100,101),
              (200,102),
              (100,103),
              (200,104), and
              (100,105). The counted columns
              (e,f) are shown in the
              stat_description column of the result
              set.
            
          You can retrieve the index size for tables, partitions, or
          subpartitions can using the
          innodb_index_stats table. In the following
          example, index sizes are retrieved for table
          t1. For a definition of table
          t1 and corresponding index statistics, see
          Section 17.8.10.1.6, “InnoDB Persistent Statistics Tables Example”.
        
mysql>SELECT SUM(stat_value) pages, index_name,SUM(stat_value)*@@innodb_page_size sizeFROM mysql.innodb_index_stats WHERE table_name='t1'AND stat_name = 'size' GROUP BY index_name;+-------+------------+-------+ | pages | index_name | size | +-------+------------+-------+ | 1 | PRIMARY | 16384 | | 1 | i1 | 16384 | | 1 | i2uniq | 16384 | +-------+------------+-------+
          For partitions or subpartitions, you can use the same query
          with a modified WHERE clause to retrieve
          index sizes. For example, the following query retrieves index
          sizes for partitions of table t1:
        
mysql>SELECT SUM(stat_value) pages, index_name,SUM(stat_value)*@@innodb_page_size sizeFROM mysql.innodb_index_stats WHERE table_name like 't1#P%'AND stat_name = 'size' GROUP BY index_name;