MySQL 8.4 Reference Manual Including MySQL NDB Cluster 8.4
        Partitioning by HASH is used primarily to
        ensure an even distribution of data among a predetermined number
        of partitions. With range or list partitioning, you must specify
        explicitly which partition a given column value or set of column
        values should be stored in; with hash partitioning, this
        decision is taken care of for you, and you need only specify a
        column value or expression based on a column value to be hashed
        and the number of partitions into which the partitioned table is
        to be divided.
      
        To partition a table using HASH partitioning,
        it is necessary to append to the CREATE
        TABLE statement a PARTITION BY HASH
        ( clause, where
        expr)expr is an expression that returns an
        integer. This can simply be the name of a column whose type is
        one of MySQL's integer types. In addition, you most likely
        want to follow this with PARTITIONS
        , where
        numnum is a positive integer
        representing the number of partitions into which the table is to
        be divided.
      
For simplicity, the tables in the examples that follow do not use any keys. You should be aware that, if a table has any unique keys, every column used in the partitioning expression for this table must be part of every unique key, including the primary key. See Section 26.6.1, “Partitioning Keys, Primary Keys, and Unique Keys”, for more information.
        The following statement creates a table that uses hashing on the
        store_id column and is divided into 4
        partitions:
      
CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT,
    store_id INT
)
PARTITION BY HASH(store_id)
PARTITIONS 4;
        If you do not include a PARTITIONS clause,
        the number of partitions defaults to 1; using
        the PARTITIONS keyword without a number
        following it results in a syntax error.
      
        You can also use an SQL expression that returns an integer for
        expr. For instance, you might want to
        partition based on the year in which an employee was hired. This
        can be done as shown here:
      
CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT,
    store_id INT
)
PARTITION BY HASH( YEAR(hired) )
PARTITIONS 4;
        expr must return a nonconstant,
        nonrandom integer value (in other words, it should be varying
        but deterministic), and must not contain any prohibited
        constructs as described in
        Section 26.6, “Restrictions and Limitations on Partitioning”. You should also keep
        in mind that this expression is evaluated each time a row is
        inserted or updated (or possibly deleted); this means that very
        complex expressions may give rise to performance issues,
        particularly when performing operations (such as batch inserts)
        that affect a great many rows at one time.
      
The most efficient hashing function is one which operates upon a single table column and whose value increases or decreases consistently with the column value, as this allows for “pruning” on ranges of partitions. That is, the more closely that the expression varies with the value of the column on which it is based, the more efficiently MySQL can use the expression for hash partitioning.
        For example, where date_col is a column of
        type DATE, then the expression
        TO_DAYS(date_col) is said to vary
        directly with the value of date_col, because
        for every change in the value of date_col,
        the value of the expression changes in a consistent manner. The
        variance of the expression
        YEAR(date_col) with respect to
        date_col is not quite as direct as that of
        TO_DAYS(date_col), because not
        every possible change in date_col produces an
        equivalent change in
        YEAR(date_col). Even so,
        YEAR(date_col) is a good
        candidate for a hashing function, because it varies directly
        with a portion of date_col and there is no
        possible change in date_col that produces a
        disproportionate change in
        YEAR(date_col).
      
        By way of contrast, suppose that you have a column named
        int_col whose type is
        INT. Now consider the expression
        POW(5-int_col,3) + 6. This would
        be a poor choice for a hashing function because a change in the
        value of int_col is not guaranteed to produce
        a proportional change in the value of the expression. Changing
        the value of int_col by a given amount can
        produce widely differing changes in the value of the expression.
        For example, changing int_col from
        5 to 6 produces a change
        of -1 in the value of the expression, but
        changing the value of int_col from
        6 to 7 produces a change
        of -7 in the expression value.
      
        In other words, the more closely the graph of the column value
        versus the value of the expression follows a straight line as
        traced by the equation
        y= where
        cxc is some nonzero constant, the
        better the expression is suited to hashing. This has to do with
        the fact that the more nonlinear an expression is, the more
        uneven the distribution of data among the partitions it tends to
        produce.
      
In theory, pruning is also possible for expressions involving more than one column value, but determining which of such expressions are suitable can be quite difficult and time-consuming. For this reason, the use of hashing expressions involving multiple columns is not particularly recommended.
        When PARTITION BY HASH is used, the storage
        engine determines which partition of
        num partitions to use based on the
        modulus of the result of the expression. In other words, for a
        given expression expr, the partition
        in which the record is stored is partition number
        N, where
        N =
        MOD(expr,
        num)t1 is defined as follows, so that it has 4
        partitions:
      
CREATE TABLE t1 (col1 INT, col2 CHAR(5), col3 DATE)
    PARTITION BY HASH( YEAR(col3) )
    PARTITIONS 4;
        If you insert a record into t1 whose
        col3 value is
        '2005-09-15', then the partition in which it
        is stored is determined as follows:
      
MOD(YEAR('2005-09-01'),4)
=  MOD(2005,4)
=  1
        MySQL 8.4 also supports a variant of
        HASH partitioning known as
        linear hashing which
        employs a more complex algorithm for determining the placement
        of new rows inserted into the partitioned table. See
        Section 26.2.4.1, “LINEAR HASH Partitioning”, for a description of
        this algorithm.
      
The user-supplied expression is evaluated each time a record is inserted or updated. It may also—depending on the circumstances—be evaluated when records are deleted.