MySQL 8.0 Reference Manual Including MySQL NDB Cluster 8.0

5.6.7.8 Remote Cloning Operation Failure Handling

This section describes failure handing at different stages of a cloning operation.

  1. Prerequisites are checked (see Remote Cloning Prerequisites).

    • If a failure occurs during the prerequisite check, the CLONE INSTANCE operation reports an error.

  2. A backup lock is taken to block DDL operations.

    • If the cloning operation is unable to obtain a DDL lock within the time limit specified by the clone_ddl_timeout variable, an error is reported.

  3. User-created data (schemas, tables, tablespaces) and binary logs on the recipient are removed before data is cloned to the recipient data directory.

    • When user created data is removed from the recipient during a remote cloning operation, existing data in the recipient data directory is not saved and may be lost if a failure occurs. If the data to be replaced on the recipient is of importance, a backup should be taken before initiating a remote cloning operation.

      For informational purposes, warnings are printed to the server error log to specify when data removal starts and finishes:

      [Warning] [MY-013453] [InnoDB] Clone removing all user data for provisioning:
      Started...
      
      [Warning] [MY-013453] [InnoDB] Clone removing all user data for provisioning:
      Finished
      

      If a failure occurs while removing data, the recipient may be left with a partial set of schemas, tables, and tablespaces that existed before the cloning operation. Any time during the execution of a cloning operation or after a failure, the server is always in a consistent state.

  4. Data is cloned from the donor. User-created data, dictionary metadata, and other system data are cloned.

    • If a failure occurs while cloning data, the cloning operation is rolled back and all cloned data removed. At this stage, the previously existing data on the recipient has also been removed, which leaves the recipient with no user data.

      Should this scenario occur, you can either rectify the cause of the failure and re-execute the cloning operation, or forgo the cloning operation and restore the recipient data from a backup taken before the cloning operation.

  5. The server is restarted automatically (applies to remote cloning operations that do not clone to a named directory). During startup, typical server startup tasks are performed.

    • If the automatic server restart fails, you can restart the server manually to complete the cloning operation.

Before MySQL 8.0.24, if a network error occurs during a cloning operation, the operation resumes if the error is resolved within five minutes. From MySQL 8.0.24, the operation resumes if the error is resolved within the time specified by the clone_donor_timeout_after_network_failure variable. The clone_donor_timeout_after_network_failure default setting is 5 minutes but a range of 0 to 30 minutes is supported. If the operation does not resume within the allotted time, it aborts and returns an error. Configuring a longer timeout allows more time for resolving network issues but also increases the size of the delta on the donor instance, which can increases clone recovery time as well as the replication lag if the clone is to become a replica or replication group member.

Prior to MySQL 8.0.24, donor threads use the MySQL Server wait_timeout setting when listening for Clone protocol commands. As a result, a low wait_timeout setting could cause a long running remote cloning operation to timeout. From MySQL 8.0.24, the Clone idle timeout is set to the default wait_timeout setting, which is 28800 seconds (8 hours).