Chapter 6 Configuring Geo-replication

Geo-replication mirrors data across geographically distributed clusters for disaster recovery. Mirroring is asynchronous: it runs periodically, and only files that have been modified are replicated. This feature can be used across a LAN, a WAN, or the Internet. Geo-replication follows a master-slave model, where changes on the master volume are replicated to a slave volume. In the event of a disaster, data can be restored from the slave volume.

This chapter describes how geo-replication works, the minimum requirements to implement it, and how to configure your environment to use it. Some general usage commands are also covered.

6.1 About Geo-replication

The primary use case for geo-replication is disaster recovery. It asynchronously copies data from one Gluster volume to another volume hosted in a separate, geographically distinct cluster. Copies are performed by a backend rsync process over an SSH connection.

Geo-replication assumes that there is a master cluster where data is hosted in a configured volume. Changes made to data on the master cluster are periodically copied to a volume configured on a slave cluster.
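The idea of copying only changed data on each periodic pass can be illustrated with a small Python sketch. This is a simplified model for illustration only and does not represent how Gluster detects changes internally. The function name `files_to_replicate`, the file names, and the use of modification timestamps are all assumptions made for the example.

```python
def files_to_replicate(master_mtimes, last_sync_time):
    """Return the paths modified after the previous replication pass.

    master_mtimes maps a file path to its modification time (seconds).
    This is a toy model; it is not Gluster's change-detection mechanism.
    """
    return sorted(
        path for path, mtime in master_mtimes.items() if mtime > last_sync_time
    )


# Hypothetical state of a master volume: path -> modification time.
master_mtimes = {"a.txt": 100.0, "b.txt": 250.0, "c.txt": 300.0}

# Only files changed since the previous pass (at t=200) are selected.
print(files_to_replicate(master_mtimes, last_sync_time=200.0))  # ['b.txt', 'c.txt']
```

A file that has not changed since the previous pass, such as `a.txt` here, is skipped, which keeps each pass proportional to the amount of change rather than to the size of the volume.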

In the event of failure of the master cluster, data can be restored from the slave, or the slave can be promoted to replace the master cluster.

Gluster is agnostic about the actual physical location or network between separate clusters where geo-replication is configured. You can configure geo-replication on the same LAN, on a WAN, across the Internet, or even between virtual machines hosted on the same system. As long as the clusters involved can access each other via SSH across a network, geo-replication can be configured.
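As a rough pre-flight check of the SSH requirement, the following Python sketch tests whether a TCP connection can be opened to the SSH port on a node in the other cluster. The function name `ssh_port_reachable` is hypothetical, and port 22 is assumed because it is the default SSH port. A successful connection shows only that the port is open; it does not confirm that SSH authentication will succeed.

```python
import socket


def ssh_port_reachable(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and attempts a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `ssh_port_reachable("slave-node1.example.com")` could be run from a node in the master cluster, where the host name is a placeholder for one of your own slave nodes.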

Geo-replication also supports multi-site mirroring, where data from a master cluster is copied to multiple slave clusters, and cascading mirroring, where data that is replicated to a slave cluster is in turn mirrored to another slave cluster.
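The multi-site and cascading arrangements described above can be pictured as a directed graph of clusters. The following Python sketch is only a conceptual model; the cluster names and the `downstream_clusters` helper are hypothetical and are not part of Gluster. It lists every cluster that eventually receives data from a given source.

```python
from collections import deque

# Hypothetical topology: each key is a cluster, and each value lists the
# slave clusters that it replicates to directly.
TOPOLOGY = {
    "master": ["slave-a", "slave-b"],  # multi-site: one master, two slaves
    "slave-a": ["slave-c"],            # cascading: slave-a feeds slave-c
    "slave-b": [],
    "slave-c": [],
}


def downstream_clusters(topology, source):
    """Return all clusters that receive data from source, breadth-first."""
    seen = []
    queue = deque(topology.get(source, []))
    while queue:
        cluster = queue.popleft()
        if cluster in seen:
            continue  # already visited through another path
        seen.append(cluster)
        queue.extend(topology.get(cluster, []))
    return seen


print(downstream_clusters(TOPOLOGY, "master"))  # ['slave-a', 'slave-b', 'slave-c']
```

In this model, `slave-a` and `slave-b` illustrate multi-site mirroring from the master, while `slave-c` illustrates cascading because it receives its data from `slave-a` rather than from the master directly.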

In this documentation, we focus on a basic, tested configuration where data on a standard Gluster volume is geo-replicated to a volume hosted on a single slave cluster.

For more information on geo-replication, see the upstream documentation at https://docs.gluster.org/en/latest/Administrator%20Guide/Geo%20Replication/