18.6.4 NDB Cluster Replication Schema and Tables

Replication in NDB Cluster makes use of a number of dedicated tables in the mysql database on each MySQL Server instance acting as an SQL node in both the cluster being replicated and in the replica. This is true regardless of whether the replica is a single server or a cluster. These tables are created during the MySQL installation process by the mysql_install_db script, and include a table for storing the binary log's indexing data. Since the ndb_binlog_index table is local to each MySQL server and does not participate in clustering, it uses the MyISAM storage engine. This means that it must be created separately on each mysqld participating in the source cluster. (The binary log itself contains updates from all MySQL servers in the cluster to be replicated.) This table is defined as follows:

CREATE TABLE `ndb_binlog_index` (
    `Position` BIGINT(20) UNSIGNED NOT NULL,
    `File` VARCHAR(255) NOT NULL,
    `epoch` BIGINT(20) UNSIGNED NOT NULL,
    `inserts` INT(10) UNSIGNED NOT NULL,
    `updates` INT(10) UNSIGNED NOT NULL,
    `deletes` INT(10) UNSIGNED NOT NULL,
    `schemaops` INT(10) UNSIGNED NOT NULL,
    `orig_server_id` INT(10) UNSIGNED NOT NULL,
    `orig_epoch` BIGINT(20) UNSIGNED NOT NULL,
    `gci` INT(10) UNSIGNED NOT NULL,
    `next_position` bigint(20) unsigned NOT NULL,
    `next_file` varchar(255) NOT NULL,
    PRIMARY KEY (`epoch`,`orig_server_id`,`orig_epoch`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

The size of this table is dependent on the number of epochs per binary log file and the number of binary log files. The number of epochs per binary log file normally depends on the amount of binary log generated per epoch and the size of the binary log file, with smaller epochs resulting in more epochs per file. You should be aware that empty epochs produce inserts to the ndb_binlog_index table, even when the --ndb-log-empty-epochs option is OFF, meaning that the number of entries per file depends on the length of time that the file is in use; this relationship can be represented by the formula shown here:

[number of epochs per file] = [time spent per file] / TimeBetweenEpochs

A busy NDB Cluster writes to the binary log regularly and presumably rotates binary log files more quickly than a quiet one. This means that a “quiet” NDB Cluster with --ndb-log-empty-epochs=ON can actually have a much higher number of ndb_binlog_index rows per file than one with a great deal of activity.

When mysqld is started with the --ndb-log-orig option, the orig_server_id and orig_epoch columns store, respectively, the ID of the server on which the event originated and the epoch in which the event took place on the originating server, which is useful in NDB Cluster replication setups employing multiple sources. The SELECT statement used to find the closest binary log position to the highest applied epoch on the replica in a multi-source setup (see Section 18.6.10, “NDB Cluster Replication: Bidrectional and Circular Replication”) employs these two columns, which are not indexed. This can lead to performance issues when trying to fail over, since the query must perform a table scan, especially when the source has been running with --ndb-log-empty-epochs=ON. You can improve multi-source failover times by adding an index to these columns, as shown here:

ALTER TABLE mysql.ndb_binlog_index
    ADD INDEX orig_lookup USING BTREE (orig_server_id, orig_epoch);

Adding this index provides no benefit when replicating from a single source to a single replica, since the query used to get the binary log position in such cases makes no use of orig_server_id or orig_epoch.

See Section 18.6.8, “Implementing Failover with NDB Cluster Replication”, for more information about using the next_position and next_file columns.

The following figure shows the relationship of the NDB Cluster replication source server, its binary log injector thread, and the mysql.ndb_binlog_index table.

Figure 18.26 The Replication Source Cluster

Most concepts are described in the surrounding text. This complex image has three main areas. The top area is divided into three sections: MySQL Server (mysqld), NDBCLUSTER table handler, and mutex. A connection thread connects these, and receiver and injector threads connect the NDBCLUSTER table handler and mutex. The bottom area shows four data nodes (ndbd). They all produce events represented by arrows pointing to the receiver thread, and the receiver thread also points to the connection and injector threads. One node sends and receives to the mutex area. The arrow representing the injector thread points to a binary log as well as the ndb_binlog_index table, which is described in the surrounding text.

An additional table, named ndb_apply_status, is used to keep a record of the operations that have been replicated from the source to the replica. Unlike the case with ndb_binlog_index, the data in this table is not specific to any one SQL node in the (replica) cluster, and so ndb_apply_status can use the NDBCLUSTER storage engine, as shown here:

CREATE TABLE `ndb_apply_status` (
    `server_id`   INT(10) UNSIGNED NOT NULL,
    `epoch`       BIGINT(20) UNSIGNED NOT NULL,
    `log_name`    VARCHAR(255) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
    `start_pos`   BIGINT(20) UNSIGNED NOT NULL,
    `end_pos`     BIGINT(20) UNSIGNED NOT NULL,
    PRIMARY KEY (`server_id`) USING HASH
) ENGINE=NDBCLUSTER   DEFAULT CHARSET=latin1;

The ndb_apply_status table is populated only on replicas, which means that, on the source, this table never contains any rows; thus, there is no need to allot any DataMemory or IndexMemory to ndb_apply_status there.

Because this table is populated from data originating on the source, it should be allowed to replicate; any replication filtering or binary log filtering rules that inadvertently prevent the replica from updating ndb_apply_status, or that prevent the source from writing into the binary log may prevent replication between clusters from operating properly. For more information about potential problems arising from such filtering rules, see Replication and binary log filtering rules with replication between NDB Clusters.

The ndb_binlog_index and ndb_apply_status tables are created in the mysql database because they should not be explicitly replicated by the user. User intervention is normally not required to create or maintain either of these tables, since both are maintained by the NDB binary log (binlog) injector thread. This keeps the source mysqld process updated to changes performed by the NDB storage engine. The NDB binlog injector thread receives events directly from the NDB storage engine. The NDB injector is responsible for capturing all the data events within the cluster, and ensures that all events which change, insert, or delete data are recorded in the ndb_binlog_index table. The replica I/O thread transfers the events from the source's binary log to the replica's relay log.

Even though ndb_binlog_index and ndb_apply_status are created and maintained automatically, it is advisable to check for the existence and integrity of these tables as an initial step in preparing an NDB Cluster for replication. It is possible to view event data recorded in the binary log by querying the mysql.ndb_binlog_index table directly on the source. This can be also be accomplished using the SHOW BINLOG EVENTS statement on either the source or replica SQL node. (See Section 13.7.5.3, “SHOW BINLOG EVENTS Statement”.)

You can also obtain useful information from the output of SHOW ENGINE NDB STATUS.

Note

When performing schema changes on NDB tables, applications should wait until the ALTER TABLE statement has returned in the MySQL client connection that issued the statement before attempting to use the updated definition of the table.

If the ndb_apply_status table does not exist on the replica, ndb_restore re-creates it.

Conflict resolution for NDB Cluster Replication requires the presence of an additional mysql.ndb_replication table. Currently, this table must be created manually. For information about how to do this, see Section 18.6.11, “NDB Cluster Replication Conflict Resolution”.