MySQL Shell 8.0

8.9 InnoDB ClusterSet Repair and Rejoin

Use this information if you need to repair a cluster in an InnoDB ClusterSet deployment.

Section 8.6, “InnoDB ClusterSet Status and Topology” explains how to check the status of an InnoDB Cluster and of the whole InnoDB ClusterSet deployment, and the situations in which a cluster might need repair. You can identify these situations from the output of the clusterSet.status() command.
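As an illustration of reading that output, the following minimal sketch picks out the clusters whose global status is anything other than OK. The helper name clustersNeedingAttention is hypothetical and not part of AdminAPI. The sample object is hand-written for illustration and assumes the status result exposes a clusters map whose entries carry clusterRole and globalStatus fields, and that it can be traversed like a plain JavaScript object.

```javascript
// Hypothetical helper (not part of AdminAPI): given an object shaped like the
// result of clusterSet.status(), return the clusters whose globalStatus is
// anything other than "OK".
function clustersNeedingAttention(status) {
  const clusters = status.clusters || {};
  return Object.keys(clusters)
    .filter((name) => clusters[name].globalStatus !== "OK")
    .map((name) => ({
      name: name,
      role: clusters[name].clusterRole,
      globalStatus: clusters[name].globalStatus
    }));
}

// Hand-written sample for illustration only; not captured from a real deployment.
const sampleStatus = {
  clusters: {
    clusterone: { clusterRole: "PRIMARY", globalStatus: "OK" },
    clustertwo: { clusterRole: "REPLICA", globalStatus: "OK_NOT_CONSISTENT" },
    clusterthree: { clusterRole: "REPLICA", globalStatus: "INVALIDATED" }
  }
};

const flaggedClusters = clustersNeedingAttention(sampleStatus);
```

In a MySQL Shell session, the same function could be given the value returned by myclusterset.status() instead of the sample object.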

If the cluster is the primary cluster in the InnoDB ClusterSet deployment, before repairing it, you might need to carry out a controlled switchover or an emergency failover to demote it to a replica cluster. After that, you can take the cluster offline if necessary to repair it, and the InnoDB ClusterSet will remain available during that time.
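The choice between the two operations can be sketched as follows. AdminAPI provides clusterSet.setPrimaryCluster() for a controlled switchover and clusterSet.forcePrimaryCluster() for an emergency failover. The wrapper function demoteCurrentPrimary, its primaryReachable flag, and the stand-in object used to exercise it are hypothetical and exist only for illustration. In practice, the decision to fail over should follow the checks described in the switchover and failover sections, not a single flag.

```javascript
// Hypothetical wrapper: chooses which AdminAPI operation moves the primary
// role to another cluster.
//   setPrimaryCluster()   -> controlled switchover (current primary reachable)
//   forcePrimaryCluster() -> emergency failover (current primary unreachable)
function demoteCurrentPrimary(clusterSet, newPrimaryName, primaryReachable) {
  if (primaryReachable) {
    clusterSet.setPrimaryCluster(newPrimaryName);
    return "switchover";
  }
  clusterSet.forcePrimaryCluster(newPrimaryName);
  return "failover";
}

// Stand-in object for illustration only; it records calls and does not
// contact any server. In MySQL Shell, pass the real ClusterSet object.
const recordedCalls = [];
const fakeClusterSet = {
  setPrimaryCluster: (name) => recordedCalls.push(["setPrimaryCluster", name]),
  forcePrimaryCluster: (name) => recordedCalls.push(["forcePrimaryCluster", name])
};

const reachableResult = demoteCurrentPrimary(fakeClusterSet, "clustertwo", true);
const unreachableResult = demoteCurrentPrimary(fakeClusterSet, "clustertwo", false);
```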

Follow this procedure to repair an InnoDB Cluster that is part of an InnoDB ClusterSet deployment:

  1. Using MySQL Shell, connect to any member server in the primary cluster or in one of the replica clusters, using an InnoDB Cluster administrator account (created with cluster.setupAdminAccount()). You may also use the InnoDB Cluster server configuration account, which also has the required permissions. When the connection is established, get the ClusterSet object using a dba.getClusterSet() or cluster.getClusterSet() command. It is important to use an InnoDB Cluster administrator account or server configuration account so that the default user account stored in the ClusterSet object has the correct permissions. For example:

    mysql-js> \connect admin2@127.0.0.1:4410
    Creating a session to 'admin2@127.0.0.1:4410'
    Please provide the password for 'admin2@127.0.0.1:4410': ********
    Save password for 'admin2@127.0.0.1:4410'? [Y]es/[N]o/Ne[v]er (default No):
    Fetching schema names for autocompletion... Press ^C to stop.
    Closing old connection...
    Your MySQL connection id is 42
    Server version: 8.0.27-commercial MySQL Enterprise Server - Commercial
    No default schema selected; type \use <schema> to set one.
    <ClassicSession:admin2@127.0.0.1:4410>
    mysql-js> myclusterset = dba.getClusterSet()
    <ClusterSet:testclusterset>
    
  2. Check the status of the whole deployment using AdminAPI's clusterSet.status() command in MySQL Shell. Use the extended option to see exactly where and what the issues are. For example:

    mysql-js> myclusterset.status({extended: 1})
    

    For an explanation of the output, see Section 8.6, “InnoDB ClusterSet Status and Topology”.

  3. Still using an InnoDB Cluster administrator account (created with cluster.setupAdminAccount()) or InnoDB Cluster server configuration account, get the Cluster object using dba.getCluster(). You can either connect to any member server in the cluster you are repairing, or connect to any member of the InnoDB ClusterSet and use the name parameter on dba.getCluster() to specify the cluster you want. For example:

    mysql-js> cluster2 = dba.getCluster()
    <Cluster:clustertwo>
    
  4. Check the status of the cluster using AdminAPI's cluster.status() command in MySQL Shell. Use the extended option to get the most details about the cluster. For example:

    mysql-js> cluster2.status({extended: 2})
    

    For an explanation of the output, see Checking a cluster's Status with Cluster.status().

  5. Following an emergency failover, if there is a risk of the transaction sets differing between parts of the ClusterSet, you must fence the cluster from either write traffic or all traffic. Section 8.9.1, “Fencing Clusters in an InnoDB ClusterSet” explains how to fence and unfence a cluster. Fencing is available from MySQL Shell 8.0.28.

  6. If the set of transactions (the GTID set) on the cluster is inconsistent, fix this first. The clusterSet.status() command warns you if a replica cluster's GTID set is inconsistent with the GTID set on the primary cluster in the InnoDB ClusterSet. A replica cluster in this state has the global status OK_NOT_CONSISTENT. You also need to check the GTID set on a former primary cluster, or a replica cluster, that has been marked as invalidated during a controlled switchover or emergency failover procedure. A cluster with extra transactions compared to the other clusters in the ClusterSet can continue to function acceptably in the ClusterSet while it stays active. However, a cluster with extra transactions cannot rejoin the ClusterSet. Section 8.9.2, “Inconsistent Transaction Sets (GTID Sets) in InnoDB ClusterSet Clusters” explains how to check for and resolve issues with the transactions on a server.

  7. If there is a technical issue with a member server in the cluster, or with the overall membership of the cluster (such as insufficient fault tolerance or a loss of quorum), you can work with individual member servers or adjust the cluster membership to resolve this. Section 8.9.3, “Repairing Member Servers and Clusters in an InnoDB ClusterSet” explains what operations are available to work with the member servers in a cluster.

  8. If you cannot repair a cluster, you can remove it from the InnoDB ClusterSet using a clusterSet.removeCluster() command. For instructions to do this, see Section 8.9.4, “Removing a Cluster from an InnoDB ClusterSet”. A removed InnoDB Cluster cannot be added back into an InnoDB ClusterSet deployment. If you want to use the server instances in the deployment again, you will need to set up a new cluster using them.

  9. When you have repaired a cluster or carried out the required maintenance, you can rejoin it to the InnoDB ClusterSet using a clusterSet.rejoinCluster() command. This command validates that the cluster is able to rejoin, updates and starts the ClusterSet replication channel, and removes any invalidated status from the cluster. For instructions to do this, see Section 8.9.5, “Rejoining a Cluster to an InnoDB ClusterSet”.
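The last two steps of the procedure can be sketched as a single decision. The AdminAPI operations involved are clusterSet.rejoinCluster() and clusterSet.removeCluster(). The wrapper finishRepair, its repaired flag, and the stand-in object are hypothetical and shown only to illustrate the flow. Because a removed cluster cannot be added back, a real session would normally run the removal by hand after confirming that repair is not possible.

```javascript
// Hypothetical wrapper for the end of the repair procedure:
// rejoin the cluster if it was repaired, otherwise remove it.
// Removal is permanent: a removed cluster cannot be added back.
function finishRepair(clusterSet, clusterName, repaired) {
  if (repaired) {
    clusterSet.rejoinCluster(clusterName);
    return "rejoined";
  }
  clusterSet.removeCluster(clusterName);
  return "removed";
}

// Stand-in object for illustration only; it records calls and does not
// contact any server. In MySQL Shell, pass the real ClusterSet object.
const repairCalls = [];
const stubClusterSet = {
  rejoinCluster: (name) => repairCalls.push(["rejoinCluster", name]),
  removeCluster: (name) => repairCalls.push(["removeCluster", name])
};

const repairedOutcome = finishRepair(stubClusterSet, "clustertwo", true);
const unrepairedOutcome = finishRepair(stubClusterSet, "clusterthree", false);
```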