17.1.1.8 Setting Up Replication with Existing Data

When setting up replication with existing data, you must decide how best to get the data from the source to the replica before starting the replication service.

The basic process for setting up replication with existing data is as follows:

With the MySQL source running, create a user to be used by the replica when connecting to the source during replication. See Section 17.1.1.3, “Creating a User for Replication”.
If you have not already configured the server_id system variable and enabled binary logging on the source server, you must shut it down to configure these options. See Section 17.1.1.1, “Setting the Replication Source Configuration”.
If you have to shut down your source server, this is a good opportunity to take a snapshot of its databases. You should obtain the source status (see Section 17.1.1.4, “Obtaining the Replication Source Binary Log Coordinates”) before taking down the source, updating the configuration and taking a snapshot. For information on how to create a snapshot using raw data files, see Section 17.1.1.6, “Creating a Data Snapshot Using Raw Data Files”.
If your source server is already correctly configured, obtain its status (see Section 17.1.1.4, “Obtaining the Replication Source Binary Log Coordinates”) and then use mysqldump to take a snapshot (see Section 17.1.1.5, “Creating a Data Snapshot Using mysqldump”) or take a raw snapshot of the live server using the guide in Section 17.1.1.6, “Creating a Data Snapshot Using Raw Data Files”.
Update the configuration of the replica. See Section 17.1.1.2, “Setting the Replica Configuration”.
The next step depends on how you created the snapshot of data on the source.
If you used mysqldump:
1. Start the replica, using the --skip-slave-start option so that replication does not start.
2. Import the dump file:
```
shell> mysql < fulldb.dump
```
If you created a snapshot using the raw data files:
1. Extract the data files into the replica's data directory. For example:
```
shell> tar xvf dbdump.tar
```
  You may need to set permissions and ownership on the files so that the replica server can access and modify them.
2. Start the replica, using the --skip-slave-start option so that replication does not start.
Configure the replica with the replication coordinates from the source. This tells the replica the binary log file and position within the file where replication needs to start. Also, configure the replica with the login credentials and host name of the source. For more information on the CHANGE MASTER TO statement required, see Section 17.1.1.10, “Setting the Source Configuration on the Replica”.
Start the replication threads:
```
mysql> START SLAVE;
```

After you have performed this procedure, the replica should connect to the source and catch up on any updates that have occurred since the snapshot was taken.

If you have forgotten to set the server_id system variable for the source, replicas cannot connect to it.

If you have forgotten to set the server_id system variable for the replica, you get the following error in the replica's error log:

Warning: You should set server-id to a non-0 value if master_host
is set; we will force server id to 2, but this MySQL server will
not act as a slave.

You also find error messages in the replica's error log if it is not able to replicate for any other reason.

The replica uses information stored in its connection metadata repository to keep track of how much of the source's binary log it has processed. The repository can be in the form of files or a table, as determined by the value set for the master_info_repository system variable. When a replica runs with master_info_repository=FILE, you can find in its data directory two files, named master.info and relay-log.info. If master_info_repository=TABLE instead, this information is saved in the table master_slave_info in the mysql system database. In either case, do not remove or edit the files or table unless you know exactly what you are doing and fully understand the implications. Even in that case, it is preferred that you use the CHANGE MASTER TO statement to change replication parameters. The replica can use the values specified in the statement to update the status files automatically. See Section 17.2.2, “Relay Log and Replication Metadata Repositories”, for more information.

Note

The contents of the connection metadata repository override some of the server options specified on the command line or in my.cnf. See Section 17.1.4, “Replication and Binary Logging Options and Variables”, for more details.

A single snapshot of the source suffices for multiple replicas. To set up additional replicas, use the same source snapshot and follow the replica portion of the procedure just described.