Chapter 7 Configuring Geo-replication

Geo-replication provides a facility to mirror data across geographically distributed clusters for the purpose of disaster recovery. The mirroring process is asynchronous, in that it is run periodically and only files that have been modified are replicated. This feature can be used across a LAN, WAN or across the Internet. Since replication is asynchronous, the architecture follows a primary-secondary storage model, where changes on the primary volume are replicated to a secondary volume. In the event of a disaster, data can be restored from a secondary volume.

This chapter describes geo-replication processes, the minimum requirements, and the steps, including commands used, to configure your environment to deploy the feature.

7.1 About Geo-replication

The primary use case for geo-replication is disaster recovery. It asynchronously copies data from one Gluster volume to another volume that is hosted in a separate, geographically distinct cluster. Copies are performed by a backend rsync process over an SSH connection.

Geo-replication assumes that a primary cluster hosts data in a configured volume. Changes made to data on the primary cluster are copied, periodically, to a volume configured on a secondary cluster.

In the event of failure of the primary cluster, data can be restored from the secondary cluster, which can also be promoted to replace the primary cluster.

Gluster is agnostic about the actual physical location or network between separate clusters where geo-replication is configured. You can configure geo-replication on the same LAN, on a WAN, across the Internet, or even between virtual machines hosted on the same system. Provided that the clusters involved can access each other through SSH across a network, geo-replication can be configured.

Geo-replication also facilitates the possibility of multi-site mirroring, where data from a primary cluster is copied to multiple secondary clusters, and also cascading mirroring, where data that is replicated to a secondary cluster can in turn be mirrored to another secondary cluster.

This documentation focuses on a basic, tested configuration where data on a standard Gluster volume is geo-replicated to a volume hosted on a single secondary cluster.

For more information on geo-replication, see the upstream documentation at https://docs.gluster.org/en/latest/Administrator%20Guide/Geo%20Replication/

7.2 General Requirements for Geo-replication

Geo-replication requires at least two Gluster clusters, each configured to the point where a volume is accessible on the cluster. One cluster is treated as the primary cluster, and the second is treated as the secondary cluster. The primary cluster may already be in use before the secondary cluster is configured.

Clusters must be set up and configured following the requirements set out in Chapter 2, Installing Gluster Storage for Oracle Linux. Notably, all nodes in both clusters must have network-synchronized time provided by an NTP service, such as chrony.

Geo-replication is performed over SSH between primary and secondary clusters. Access should be available for all nodes in all clusters that take part in geo-replication, so that failover between nodes is handled appropriately. Networking and firewall configuration must facilitate this access.

Ensure that the secondary cluster volume that you use for mirroring contains no data and has sufficient capacity to mirror the content of the primary cluster.

Install the glusterfs-geo-replication package on all nodes in both clusters. For example:

sudo yum install glusterfs-geo-replication

Other general software requirements are met by default on most Oracle Linux systems, but you should check that you have the latest available versions of the following packages:

  • rsync

  • openssh

  • glusterfs

  • glusterfs-fuse
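
One way to confirm that these packages are present on a node is to query the RPM database for each of them. The following is a minimal sketch: the function name missing_packages is hypothetical, and the list simply combines glusterfs-geo-replication with the packages named above. It reports only whether a package is installed, not whether it is the latest available version.

```shell
#!/bin/sh
# Sketch: list any required package that is not installed on this node.
# missing_packages is a hypothetical helper name, not part of Gluster.
missing_packages() {
    for pkg in glusterfs-geo-replication rsync openssh glusterfs glusterfs-fuse; do
        # rpm -q exits with a nonzero status when the package is not installed.
        rpm -q "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

missing_packages
```

If the sketch prints nothing, every listed package is installed on the node. Run it on each node in both clusters.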

Important

When using geo-replication in conjunction with the Gluster snapshot functionality, ensure that snapshots are not out of order on the primary and secondary nodes during restoration. Therefore, when performing snapshot operations, always pause geo-replication first. After the session is paused, either take a snapshot of a volume or restore a previous snapshot. These operations must be done on both primary and secondary nodes. Then, resume the geo-replication session.
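
The ordering described in this note can be sketched as a short plan for the case of taking a snapshot. The sketch below only prints the commands in the order they should be run and names the cluster on which each one belongs; it executes nothing. The helper name snapshot_plan and the snapshot name snap1 are hypothetical, and the volume and host names are the example values used elsewhere in this chapter.

```shell
#!/bin/sh
# Sketch: print, in order, the commands for taking a snapshot while
# geo-replication is paused. Nothing is executed; review the output and run
# each command on the cluster indicated.
# snapshot_plan and the snapshot name are hypothetical.
snapshot_plan() {
    primary_vol=$1                 # volume on the primary cluster
    target=$2                      # user@host::volume of the secondary
    snap=$3                        # snapshot name of your choice
    secondary_vol=${target##*::}   # text after "::" is the secondary volume

    echo "primary:   sudo gluster volume geo-replication $primary_vol $target pause"
    echo "primary:   sudo gluster snapshot create $snap $primary_vol"
    echo "secondary: sudo gluster snapshot create $snap $secondary_vol"
    echo "primary:   sudo gluster volume geo-replication $primary_vol $target resume"
}

snapshot_plan myvolume georep@secondary-node1.example.com::secondaryvol snap1
```

Printing the plan rather than running it keeps the sketch safe to try, because the commands span two clusters and cannot be run from a single node without further setup.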

7.3 Setting Up Secondary Nodes for Geo-replication

If the SSH connections that are used to synchronize the data must connect as the root user, the geo-replication service can connect from the primary cluster to the secondary cluster by using that root account. However, exposing SSH access for the root user is discouraged for security reasons. On production systems, you should instead create a user and group specifically for handling these connections on each of the secondary node systems.

sudo useradd georep

In this example, georep is the user that you intend to use for connections between nodes. You can add this user to an existing group that has been created for this purpose. Alternatively, you can create a separate group to which you add the new user.

You must set the password for this user on at least one secondary host so that you can copy an SSH key to the host later in the configuration.
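
As an alternative to the plain useradd command, the following sketch creates the group explicitly, creates the user with that group as its primary group, and then sets the password. It is one possible arrangement, not a required one. The run wrapper, the DRY_RUN variable, and the function name create_georep_account are hypothetical helpers. With DRY_RUN left at its default of 1, the commands are only printed.

```shell
#!/bin/sh
# Sketch: create a dedicated group and user for geo-replication on a
# secondary node. DRY_RUN=1 (the default here) only prints the commands;
# set DRY_RUN=0 to execute them.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# create_georep_account is a hypothetical helper name.
create_georep_account() {
    user=$1
    group=$2
    run sudo groupadd "$group"
    run sudo useradd -g "$group" "$user"   # -g sets the user's primary group
    run sudo passwd "$user"                # prompts interactively for the password
}

create_georep_account georep georep
```

Run the account creation on each secondary node. The password needs to be set on at least the one host to which you later copy the SSH key.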

You can now configure the gluster mountbroker to automatically handle mounting the volume that you intend to use for mirroring with the appropriate permissions. For this task, run the gluster-mountbroker command on any single secondary node.

  1. Set up the gluster mountbroker for the new account that you have created:

    sudo gluster-mountbroker setup /var/mountbroker-root georep

    This command sets up a root folder for all volumes handled by the mountbroker. Typically, the folder is set to /var/mountbroker-root, but you can define any location on your secondary nodes. The command creates the directory on all nodes. Substitute georep with the group that has access permissions to this folder. Typically, the group name matches the username that you previously created. However, if more than one user can access this data, you might need to define a broader group.

  2. Add a volume on your existing secondary cluster to the mountbroker.

    sudo gluster-mountbroker add secondaryvol georep

    Here, secondaryvol is the volume that you intend to use on your secondary cluster.

  3. Check the status of the mountbroker to determine whether everything is set up correctly:

    sudo gluster-mountbroker status

  4. Restart the glusterd service on all of the secondary cluster nodes:

    sudo systemctl restart glusterd

7.4 Configuring the Geo-replication Session

This procedure creates the SSH key that is then copied to the authorized_keys file of each node of the secondary cluster.

  1. On a node in the primary cluster, create an initial SSH key.

    sudo ssh-keygen

  2. At the passphrase prompt, press Enter.

  3. Copy the created key to a node in your secondary cluster for the georep user.

    sudo ssh-copy-id georep@secondary-node1.example.com

  4. On the same primary cluster node where you created the SSH key, run the following command:

    sudo gluster-georep-sshkey generate

    The command creates a separate key that is copied to all of the primary nodes. The key is then used for subsequent communications with any secondary nodes during geo-replication.

  5. Still on the same node, create the geo-replication session.

    sudo gluster volume geo-replication myvolume georep@secondary-node1.example.com::secondaryvol create push-pem

    myvolume

    Volume on the primary cluster to be replicated.

    secondary-node1.example.com

    FQDN or IP address of the node on the secondary cluster.

    secondaryvol

    Volume on the secondary cluster to be used for replication.

    The command copies the public keys for the primary cluster nodes to the node of the secondary cluster.

  6. On the secondary node, configure the environment to use these keys for the georep user.

    sudo /usr/libexec/glusterfs/set_geo_rep_pem_keys.sh georep myvolume secondaryvol
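
To show how the placeholders in this procedure fit together, the following sketch gathers the primary-node steps into one function and builds the user@host::volume target from its parts. The run wrapper, the DRY_RUN variable, the variable names, and the function name setup_session are hypothetical. The values are the example names used in this chapter. With DRY_RUN left at its default of 1, the commands are only printed.

```shell
#!/bin/sh
# Sketch: the primary-node steps of this procedure with the placeholders
# gathered into variables. DRY_RUN=1 (the default here) only prints the
# commands; set DRY_RUN=0 to execute them.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Example values from this chapter; replace them with your own.
PRIMARY_VOL=myvolume
GEO_USER=georep
SECONDARY_HOST=secondary-node1.example.com
SECONDARY_VOL=secondaryvol

# setup_session is a hypothetical helper name.
setup_session() {
    # The secondary is addressed as user@host::volume.
    target="${GEO_USER}@${SECONDARY_HOST}::${SECONDARY_VOL}"
    run sudo ssh-keygen
    run sudo ssh-copy-id "${GEO_USER}@${SECONDARY_HOST}"
    run sudo gluster-georep-sshkey generate
    run sudo gluster volume geo-replication "$PRIMARY_VOL" "$target" create push-pem
}

setup_session
```

The final step of the procedure is not included in the sketch, because it runs on the secondary node rather than on the primary node.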

7.5 Starting, Stopping and Checking Status of Geo-replication

After configuring the geo-replication session, start the replication with the following command:

sudo gluster volume geo-replication myvolume georep@secondary-node1.example.com::secondaryvol start

Replace start with stop to stop geo-replication for the volume.

To see the status of all running geo-replication sessions, run this command:

sudo gluster volume geo-replication status

The command output shows which cluster nodes in each cluster are being used for the geo-replication synchronization. The status of a geo-replication session can be any of the following:

  • Initializing: The session is starting and the first connections are being made.

  • Created: The geo-replication session is created, but not started.

  • Active: The gsync daemon in this node is active and syncing the data. Only one replica pair on the primary cluster is ever in Active state and handling data synchronization. If the Active node fails, a Passive node is promoted to Active.

  • Passive: A replica pair of the active node. Passive nodes in the cluster are on standby to be promoted to Active status if the currently Active node fails.

  • Faulty: Faulty status can indicate a temporary or critical problem in the geo-replication session. In some cases, the Faulty status might clear by itself as the geo-replication service attempts to restart on a node. If a Faulty status persists, check the log files to try to resolve the issue. You can determine which log file to look at by using the config command, for example:

    sudo gluster volume geo-replication myvolume georep@secondary-node1.example.com::secondaryvol config log-file

  • Stopped: The geo-replication session is stopped.
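
When you check status regularly, it can help to reduce the output to a single number. The following sketch counts the lines of status output that report a Faulty worker. The helper name count_faulty is hypothetical, and the sketch assumes only that the word Faulty appears on the line for each affected node. It does not depend on the column layout of the output.

```shell
#!/bin/sh
# Sketch: count the lines of geo-replication status output that report a
# Faulty worker. count_faulty is a hypothetical helper; it reads the status
# text on standard input and matches "Faulty" as a whole word.
count_faulty() {
    grep -cw 'Faulty'
}

# Run the check only on systems where the gluster command is available.
if command -v gluster >/dev/null 2>&1; then
    sudo gluster volume geo-replication status | count_faulty
fi
```

A result of 0 means that no node is reported as Faulty. Any other value is a prompt to look at the log file for the affected session.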