18.6.4 MySQL Cluster Replication Schema and Tables

Replication in MySQL Cluster makes use of a number of dedicated tables in the mysql database on each MySQL Server instance acting as an SQL node in both the cluster being replicated and the replication slave (whether the slave is a single server or a cluster). These tables are created during the MySQL installation process by the mysql_install_db script, and include a table for storing the binary log's indexing data. Since the ndb_binlog_index table is local to each MySQL server and does not participate in clustering, it uses the MyISAM storage engine. This means that it must be created separately on each mysqld participating in the master cluster. (However, the binary log itself contains updates from all MySQL servers in the cluster to be replicated.) This table is defined as follows:

CREATE TABLE `ndb_binlog_index` (
    `Position` BIGINT(20) UNSIGNED NOT NULL,
    `File` VARCHAR(255) NOT NULL,
    `epoch` BIGINT(20) UNSIGNED NOT NULL,
    `inserts` INT(10) UNSIGNED NOT NULL,
    `updates` INT(10) UNSIGNED NOT NULL,
    `deletes` INT(10) UNSIGNED NOT NULL,
    `schemaops` INT(10) UNSIGNED NOT NULL,
    `orig_server_id` INT(10) UNSIGNED NOT NULL,
    `orig_epoch` BIGINT(20) UNSIGNED NOT NULL,
    `gci` INT(10) UNSIGNED NOT NULL,
    `next_position` bigint(20) unsigned NOT NULL,
    `next_file` varchar(255) NOT NULL,
    PRIMARY KEY (`epoch`,`orig_server_id`,`orig_epoch`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

The size of this table is dependent on the number of epochs per binary log file and the number of binary log files. The number of epochs per binary log file normally depends on the amount of binary log generated per epoch and the size of the binary log file, with smaller epochs resulting in more epochs per file. It should be noted that empty epochs produce result in inserts to the ndb_binlog_index table, even when the --ndb-log-empty-epochs option is OFF, meaning that the number of entries per file depends on the length of time that the file is in use; that is,

[number of epochs per file] = [time spent per file] / TimeBetweenEpochs  

A busy MySQL Cluster writes to the binary log regularly and presumably rotates binary log files more quickly than a quiet one. This means that a quiet MySQL Cluster with --ndb-log-empty-epochs=ON can actually have a much higher number of ndb_binlog_index rows per file than one with a great deal of activity.

When mysqld is started with the --ndb-log-orig option, the orig_server_id and orig_epoch columns store, respectively, the ID of the server on which the event originated and the epoch in which the event took place on the originating server, which is useful in MySQL Cluster replication setups employing multiple masters. The SELECT statement used to find the closest binlog position to the highest applied epoch on the slave in a multi-master setup (see Section 18.6.10, “MySQL Cluster Replication: Multi-Master and Circular Replication”) employs these two columns, which are not indexed. This can lead to performance issues when trying to fail over, since the query must perform a table scan, especially when the master has been running with --ndb-log-empty-epochs=ON. You can improve multi-master failover times by adding an index to these columns, as shown here:

ALTER TABLE mysql.ndb_binlog_index 
    ADD INDEX orig_lookup USING BTREE (orig_server_id, orig_epoch);

Adding this index provides no benefit when replicating from a single master to a single slave, since the query used to get the binary log position in such cases makes no use of orig_server_id or orig_epoch.

See Section 18.6.8, “Implementing Failover with MySQL Cluster Replication”, for more information about using the next_position and next_file columns.

The following figure shows the relationship of the MySQL Cluster replication master server, its binlog injector thread, and the mysql.ndb_binlog_index table.

The replication master cluster, the binlog injector thread, and the ndb_binlog_index table

An additional table, named ndb_apply_status, is used to keep a record of the operations that have been replicated from the master to the slave. Unlike the case with ndb_binlog_index, the data in this table is not specific to any one SQL node in the (slave) cluster, and so ndb_apply_status can use the NDBCLUSTER storage engine, as shown here:

CREATE TABLE `ndb_apply_status` (
    `server_id`   INT(10) UNSIGNED NOT NULL,
    `epoch`       BIGINT(20) UNSIGNED NOT NULL,
    `log_name`    VARCHAR(255) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
    `start_pos`   BIGINT(20) UNSIGNED NOT NULL,
    `end_pos`     BIGINT(20) UNSIGNED NOT NULL,
    PRIMARY KEY (`server_id`) USING HASH
) ENGINE=NDBCLUSTER   DEFAULT CHARSET=latin1;

The ndb_apply_status table is populated only on slaves, which means that, on the master, this table never contains any rows; thus, there is no need to allow for DataMemory or IndexMemory to be allotted to ndb_apply_status there.

Because this table is populated from data originating on the master, it should be allowed to replicate; any replication filtering or binary log filtering rules that inadvertently prevent the slave from updating ndb_apply_status or the master from writing into the binary log may prevent replication between clusters from operating properly. For more information about potential problems arising from such filtering rules, see Replication and binary log filtering rules with replication between MySQL Clusters.

The ndb_binlog_index and ndb_apply_status tables are created in the mysql database because they should not be explicitly replicated by the user. User intervention is normally not required to create or maintain either of these tables, since both ndb_binlog_index and the ndb_apply_status are maintained by the NDB binary log (binlog) injector thread. This keeps the master mysqld process updated to changes performed by the NDB storage engine. The NDB binlog injector thread receives events directly from the NDB storage engine. The NDB injector is responsible for capturing all the data events within the cluster, and ensures that all events which change, insert, or delete data are recorded in the ndb_binlog_index table. The slave I/O thread transfers the events from the master's binary log to the slave's relay log.

However, it is advisable to check for the existence and integrity of these tables as an initial step in preparing a MySQL Cluster for replication. It is possible to view event data recorded in the binary log by querying the mysql.ndb_binlog_index table directly on the master. This can be also be accomplished using the SHOW BINLOG EVENTS statement on either the replication master or slave MySQL servers. (See Section 13.7.5.3, “SHOW BINLOG EVENTS Syntax”.)

You can also obtain useful information from the output of SHOW ENGINE NDB STATUS.

The ndb_schema table is used to track schema changes made to NDB tables. It is defined as shown here:

CREATE TABLE ndb_schema (
    `db` VARBINARY(63) NOT NULL,
    `name` VARBINARY(63) NOT NULL,
    `slock` BINARY(32) NOT NULL,
    `query` BLOB NOT NULL,
    `node_id` INT UNSIGNED NOT NULL,
    `epoch` BIGINT UNSIGNED NOT NULL,
    `id` INT UNSIGNED NOT NULL,
    `version` INT UNSIGNED NOT NULL,
    `type` INT UNSIGNED NOT NULL,
    PRIMARY KEY USING HASH (db,name)
) ENGINE=NDB   DEFAULT CHARSET=latin1;

Unlike the two tables previously mentioned in this section, the ndb_schema table is not visible either to MySQL SHOW statements, or in any INFORMATION_SCHEMA tables; however, it can be seen in the output of ndb_show_tables, as shown here:

shell> ndb_show_tables -t 2
id    type                 state    logging database     schema   name
4     UserTable            Online   Yes     mysql        def      ndb_apply_status
5     UserTable            Online   Yes     ndbworld     def      City
6     UserTable            Online   Yes     ndbworld     def      Country
3     UserTable            Online   Yes     mysql        def      NDB$BLOB_2_3
7     UserTable            Online   Yes     ndbworld     def      CountryLanguage
2     UserTable            Online   Yes     mysql        def      ndb_schema

NDBT_ProgramExit: 0 - OK

It is also possible to SELECT from this table in mysql and other MySQL client applications, as shown here:

mysql> SELECT * FROM mysql.ndb_schema WHERE name='City' \G
*************************** 1. row ***************************
     db: ndbworld
   name: City
  slock:
  query: alter table City engine=ndb
node_id: 4
  epoch: 0
     id: 0
version: 0
   type: 7
1 row in set (0.00 sec)

This can sometimes be useful when debugging applications.

Note

When performing schema changes on NDB tables, applications should wait until the ALTER TABLE statement has returned in the MySQL client connection that issued the statement before attempting to use the updated definition of the table.

If the ndb_apply_status table or the ndb_schema table does not exist on the slave, ndb_restore re-creates the missing table or tables (Bug #14612).

Conflict resolution for MySQL Cluster Replication requires the presence of an additional mysql.ndb_replication table. Currently, this table must be created manually. For information about how to do this, see Section 18.6.11, “MySQL Cluster Replication Conflict Resolution”.