MySQL 5.7 Reference Manual Including MySQL NDB Cluster 7.5 and NDB Cluster 7.6
Storage engines collect statistics about tables for use by the optimizer. Table statistics are based on value groups, where a value group is a set of rows with the same key prefix value. For optimizer purposes, an important statistic is the average value group size.
MySQL uses the average value group size in the following ways:
            To estimate how many rows must be read for each
            ref access
          
To estimate how many rows a partial join produces; that is, the number of rows that an operation of this form produces:
(...) JOINtbl_nameONtbl_name.key=expr
As the average value group size for an index increases, the index is less useful for those two purposes because the average number of rows per lookup increases: For the index to be good for optimization purposes, it is best that each index value target a small number of rows in the table. When a given index value yields a large number of rows, the index is less useful and MySQL is less likely to use it.
        The average value group size is related to table cardinality,
        which is the number of value groups. The
        SHOW INDEX statement displays a
        cardinality value based on N/S, where
        N is the number of rows in the table
        and S is the average value group
        size. That ratio yields an approximate number of value groups in
        the table.
      
        For a join based on the <=> comparison
        operator, NULL is not treated differently
        from any other value: NULL <=> NULL,
        just as  for any other
        N <=>
        NN.
      
        However, for a join based on the = operator,
        NULL is different from
        non-NULL values:
         is not true when
        expr1 =
        expr2expr1 or
        expr2 (or both) are
        NULL. This affects
        ref accesses for comparisons
        of the form : MySQL does not access
        the table if the current value of
        tbl_name.key =
        exprexpr is NULL,
        because the comparison cannot be true.
      
        For = comparisons, it does not matter how
        many NULL values are in the table. For
        optimization purposes, the relevant value is the average size of
        the non-NULL value groups. However, MySQL
        does not currently enable that average size to be collected or
        used.
      
        For InnoDB and MyISAM
        tables, you have some control over collection of table
        statistics by means of the
        innodb_stats_method and
        myisam_stats_method system
        variables, respectively. These variables have three possible
        values, which differ as follows:
      
            When the variable is set to nulls_equal,
            all NULL values are treated as identical
            (that is, they all form a single value group).
          
            If the NULL value group size is much
            higher than the average non-NULL value
            group size, this method skews the average value group size
            upward. This makes index appear to the optimizer to be less
            useful than it really is for joins that look for
            non-NULL values. Consequently, the
            nulls_equal method may cause the
            optimizer not to use the index for
            ref accesses when it
            should.
          
            When the variable is set to
            nulls_unequal, NULL
            values are not considered the same. Instead, each
            NULL value forms a separate value group
            of size 1.
          
            If you have many NULL values, this method
            skews the average value group size downward. If the average
            non-NULL value group size is large,
            counting NULL values each as a group of
            size 1 causes the optimizer to overestimate the value of the
            index for joins that look for non-NULL
            values. Consequently, the nulls_unequal
            method may cause the optimizer to use this index for
            ref lookups when other
            methods may be better.
          
            When the variable is set to
            nulls_ignored, NULL
            values are ignored.
          
        If you tend to use many joins that use
        <=> rather than =,
        NULL values are not special in comparisons
        and one NULL is equal to another. In this
        case, nulls_equal is the appropriate
        statistics method.
      
        The innodb_stats_method system
        variable has a global value; the
        myisam_stats_method system
        variable has both global and session values. Setting the global
        value affects statistics collection for tables from the
        corresponding storage engine. Setting the session value affects
        statistics collection only for the current client connection.
        This means that you can force a table's statistics to be
        regenerated with a given method without affecting other clients
        by setting the session value of
        myisam_stats_method.
      
        To regenerate MyISAM table statistics, you
        can use any of the following methods:
      
            Change the table to cause its statistics to go out of date
            (for example, insert a row and then delete it), and then set
            myisam_stats_method and
            issue an ANALYZE TABLE
            statement
          
        Some caveats regarding the use of
        innodb_stats_method and
        myisam_stats_method:
      
            You can force table statistics to be collected explicitly,
            as just described. However, MySQL may also collect
            statistics automatically. For example, if during the course
            of executing statements for a table, some of those
            statements modify the table, MySQL may collect statistics.
            (This may occur for bulk inserts or deletes, or some
            ALTER TABLE statements, for
            example.) If this happens, the statistics are collected
            using whatever value
            innodb_stats_method or
            myisam_stats_method has at
            the time. Thus, if you collect statistics using one method,
            but the system variable is set to the other method when a
            table's statistics are collected automatically later,
            the other method is used.
          
There is no way to tell which method was used to generate statistics for a given table.
            These variables apply only to InnoDB and
            MyISAM tables. Other storage engines have
            only one method for collecting table statistics. Usually it
            is closer to the nulls_equal method.