This section provides an example of a disaster recovery scenario and actions an administrator might perform.
Company X has two geographically separated clusters: cluster-paris in Paris and cluster-newyork in New York. These clusters are configured as partner clusters. The cluster in Paris is configured as the primary cluster and the cluster in New York as the secondary cluster.
The cluster-paris cluster fails temporarily as a result of power outages during a windstorm. An administrator can expect the following events:
The heartbeat communication is lost between cluster-paris and cluster-newyork. Because heartbeat notification was configured during the creation of the partnership, a heartbeat-loss notification email is sent to the administrator.
For information about configuring partnerships and heartbeat notification, see Creating and Modifying a Partnership.
The administrator receives the notification email and follows the company procedure to verify that the disconnect occurred because of a situation that requires a takeover by the secondary cluster. Because a takeover might take a long time, depending on the requirements of the applications being protected, Company X does not allow takeovers unless the primary cluster cannot be repaired within two hours.
For information about verifying a disconnect on a system, see one of the following data replication guides:
Because the cluster-paris cluster cannot be brought online again for at least another day, the administrator runs a geopg takeover command on a node in cluster-newyork. This command starts the protection group on the secondary cluster, cluster-newyork.
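The decision and the command in this step can be sketched as a small shell helper. This is an illustration only, not part of the product: the protection group name example-pg, the TAKEOVER_THRESHOLD_MIN variable, and both functions are hypothetical, and the sketch assumes that geopg takeover takes the protection group name as its argument. The helper prints the command for review instead of running it, so check the geopg man page for the exact syntax before you run anything on a cluster node.

```shell
#!/bin/sh
# Sketch under stated assumptions: only the `geopg takeover` command name
# comes from the scenario. The protection group name, the threshold
# variable, and these helper functions are hypothetical.

TAKEOVER_THRESHOLD_MIN=120   # Company X policy: two hours, in minutes

# Succeeds (exit status 0) only when the estimated repair time in minutes
# exceeds the threshold, that is, when the primary cluster cannot be
# repaired within the allowed window.
takeover_allowed() {
    [ "$1" -gt "$TAKEOVER_THRESHOLD_MIN" ]
}

# Prints the takeover command for the given protection group when policy
# allows it. Nothing is executed; the administrator runs the printed
# command manually on a node of the secondary cluster.
plan_takeover() {
    pg_name=$1
    est_repair_min=$2
    if takeover_allowed "$est_repair_min"; then
        echo "geopg takeover $pg_name"
    else
        echo "takeover not permitted: repair expected within ${TAKEOVER_THRESHOLD_MIN} minutes"
        return 1
    fi
}
```

In this scenario the outage lasts at least a day (1440 minutes), so `plan_takeover example-pg 1440` prints `geopg takeover example-pg`. An estimate of 90 minutes would print the refusal message and return a nonzero status.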
For information about performing a takeover on a system, see one of the following data replication guides:
After the takeover, the secondary cluster, cluster-newyork, becomes the new primary cluster. The failed cluster in Paris is still configured to be the primary cluster. Therefore, when the cluster-paris cluster restarts, it detects that it, the configured primary cluster, was down and lost contact with the partner cluster. The cluster-paris cluster then enters an error state that requires administrative action to clear. You might also be required to recover and resynchronize data on the cluster.
For information about recovering data after a takeover, see one of the following data replication guides: