MySQL 8.0 Reference Manual Including MySQL NDB Cluster 8.0

21.6.5 Working with InnoDB ReplicaSet

You work with an InnoDB ReplicaSet in much the same way as you would work with an InnoDB cluster. For example as seen in Adding Instances to a ReplicaSet, you assign a ReplicaSet object to a variable and call operations that administer the ReplicaSet, such as ReplicaSet.addInstance() to add instances, which is the equivalent of Cluster.addInstance() in InnoDB cluster. Thus, much of the documentation at Section 21.5, “Working with InnoDB Cluster” also applies to InnoDB ReplicaSet. The following operations are supported by ReplicaSet objects:

For more information, see the linked InnoDB cluster sections.

The following operations are specific to InnoDB ReplicaSet and can only be called against a ReplicaSet object:

Planned Changes of the Replica Set Primary

Use the ReplicaSet.setPrimaryInstance() operation to safely perform a change of the primary of a ReplicaSet to another instance. The current primary is demoted to a secondary and made read-only, while the promoted instance becomes the new primary and is made read-write. All other secondary instances are updated to replicate from the new primary. MySQL Router instances which have been bootstrapped against the ReplicaSet automatically start redirecting read-write clients to the new primary.

For a safe change of the primary to be possible, all replica set instances must be reachable by MySQL Shell and have consistent GTID_EXECUTED sets. If the primary is not available, and there is no way to restore it, a forced failover might be the only option instead, see Forcing the Primary Instance in a Replica Set.

During a change of primary, the promoted instance is synchronized with the old primary, ensuring that all transactions present on the primary are applied before the topology change is committed. If this synchronization step takes too long or is not possible on any of the secondary instances, the operation is aborted. In such a situation, these problematic secondary instances must be either repaired or removed from the ReplicaSet for the fail over to be possible.

Forcing the Primary Instance in a Replica Set

Unlike InnoDB cluster, which supports automatic failover in the event of an unexpected failure of the primary, InnoDB ReplicaSet does not have automatic failure detection or a consensus based protocol such as that provided by Group Replication. If the primary is not available, a manual failover is required. An InnoDB ReplicaSet which has lost its primary is effectively read-only, and for any write changes to be possible a new primary must be chosen. In the event that you cannot connect to the primary, and you cannot use ReplicaSet.setPrimaryInstance() to safely perform a switchover to a new primary as described at Planned Changes of the Replica Set Primary, use the ReplicaSet.forcePrimaryInstance() operation to perform a forced failover of the primary. This is a last resort operation that must only be used in a disaster type scenario where the current primary is unavailable and cannot be restored in any way.

Warning

A forced failover is a potentially destructive action and must be used with caution.

If a target instance is not given (or is null), the most up-to-date instance is automatically selected and promoted to be the new primary. If a target instance is provided, it is promoted to a primary, while other reachable secondary instances are switched to replicate from the new primary. The target instance must have the most up-to-date GTID_EXECUTED set among reachable instances, otherwise the operation fails.

A failover is different from a planned primary change because it promotes a secondary instance without synchronizing with or updating the old primary. That has the following major consequences:

  • Any transactions that had not yet been applied by a secondary at the time the old primary failed are lost.

  • If the old primary is actually still running and processing transactions, there is a split-brain and the datasets of the old and new primaries diverge.

If the last known primary is still reachable, the ReplicaSet.forcePrimary() operation fails, to reduce the risk of split-brain situations. But it is the administrator's responsibility to ensure that the old primary it is not reachable by the other instances to prevent or minimize such scenarios.

After a forced failover, the old primary is considered invalid by the new primary and can no longer be part of the replica set. If at a later date you find a way to recover the instance, it must be removed from the ReplicaSet and re-added as a new instance. If there were any secondary instances that could not be switched to the new primary during the failover, they are also considered invalid.

Data loss is possible after a failover, because the old primary might have had transactions that were not yet replicated to the secondary being promoted. Moreover, if the instance that was presumed to have failed is still able to process transactions, for example because the network where it is located is still functioning but unreachable from MySQL Shell, it continues diverging from the promoted instances. Recovering once transaction sets on instances have diverged requires manual intervention and could not be possible in some situations, even if the failed instances can be recovered. In many cases, the fastest and simplest way to recover from a disaster that required a forced failover is by discarding such diverged transactions and re-provisioning a new instance from the newly promoted primary.

InnoDB ReplicaSet Locking

From version 8.0.20, AdminAPI uses a locking mechanism to avoid different operations from performing changes on an InnoDB ReplicaSet simultaneously. Previously, different instances of MySQL Shell could connect to an InnoDB ReplicaSet at the same time and execute AdminAPI operations simultaneously. This could lead to inconsistent instance states and errors, for example if ReplicaSet.addInstance() and ReplicaSet.setPrimary() were executed in parallel.

The InnoDB ReplicaSet operations have the following locking:

  • dba.upgradeMetadata() and dba.createReplicaSet() are globally exclusive operations. This means that if MySQL Shell executes these operations on an InnoDB ReplicaSet, no other operations can be executed against the InnoDB ReplicaSet or any of its instances.

  • ReplicaSet.forcePrimaryInstance() and ReplicaSet.setPrimaryInstance() are operations that change the primary. This means that if MySQL Shell executes these operations against an InnoDB ReplicaSet, no other operations which change the primary, or instance change operations can be executed until the first operation completes.

  • ReplicaSet.addInstance(), ReplicaSet.rejoinInstance(), and ReplicaSet.removeInstance() are operations that change an instance. This means that if MySQL Shell executes these operations on an instance, the instance is locked for any further instance change operations. However, this lock is only at the instance level and multiple instances in an InnoDB ReplicaSet can each execute one of this type of operation simultaneously. In other words, at most one instance change operation can be executed at a time, per instance in the InnoDB ReplicaSet.

  • dba.getReplicaSet() and ReplicaSet.status() are InnoDB ReplicaSet read operations and do not require any locking.

In practice, if you try to execute an InnoDB ReplicaSet related operation while another operation that cannot be executed concurrently is still running, you get an error indicating that a lock on a needed resource failed to be acquired. In this case, you should wait for the running operation which holds the lock to complete, and only then try to execute the next operation. For example:

mysql-js> rs.addInstance("admin@rs2:3306");

ERROR: The operation cannot be executed because it failed to acquire the lock on
instance 'rs1:3306'. Another operation requiring exclusive access to the
instance is still in progress, please wait for it to finish and try again.

ReplicaSet.addInstance: Failed to acquire lock on instance 'rs1:3306' (MYSQLSH
51400)

In this example, the ReplicaSet.addInstance() operation failed because the lock on the primary instance (rs1:3306) could not be acquired, for example because a ReplicaSet.setPrimaryInstance() operation (or other similar operation) was still running.