Sun Cluster Geographic Edition System Administration Guide

Chapter 1 Introduction to Administering the Sun Cluster Geographic Edition Software

Sun Cluster Geographic Edition software protects applications from unexpected disruptions by using multiple clusters that are geographically separated. These clusters contain identical copies of the Sun Cluster Geographic Edition infrastructure, which manages data replication between the clusters. Sun Cluster Geographic Edition software is a layered extension of the Sun Cluster software.

This chapter contains the following topics:

  Sun Cluster Geographic Edition Administration Tasks
  Sun Cluster Geographic Edition Administration Tools
  Overview of Disaster Recovery Administration

Sun Cluster Geographic Edition Administration Tasks

Familiarize yourself with the planning information in the Sun Cluster Geographic Edition Installation Guide and the Sun Cluster Geographic Edition Overview before beginning administration tasks. This guide describes the standard tasks that you use to administer and maintain Sun Cluster Geographic Edition configurations.

For general Sun Cluster, data service, and hardware administration tasks, refer to the Sun Cluster documentation.

You can perform all administration tasks on a cluster that is running the Sun Cluster Geographic Edition software without causing any nodes or the cluster to fail. The Sun Cluster Geographic Edition software can be installed, configured, started, used, stopped, and uninstalled on an operational cluster.


Note –

You might be required to take nodes or the cluster offline for preparatory actions, such as installing data replication software and performing Sun Cluster administrative tasks. Refer to the appropriate product documentation for administration restrictions.


Sun Cluster Geographic Edition Administration Tools

You can perform administrative tasks on a cluster that is running Sun Cluster Geographic Edition software by using a graphical user interface (GUI) or the command-line interface (CLI).

The procedures in this guide describe how to perform administrative tasks using the CLI.

Graphical User Interface

Sun Cluster software supports SunPlex(TM) Manager, a GUI tool that you can use to perform various administrative tasks on your cluster. For specific information about how to use SunPlex Manager, see the Sun Cluster online help.


Note –

To administer Sun Cluster Geographic Edition software using the GUI, the root passwords must be the same on all nodes of both clusters in the partnership.


You can use the GUI to administer the Sun Cluster Geographic Edition software only after the infrastructure has been enabled by using the geoadm start command. Use the CLI to issue the geoadm start and geoadm stop commands. For information about enabling and disabling the Sun Cluster Geographic Edition infrastructure, see Chapter 3, Administering the Sun Cluster Geographic Edition Infrastructure.
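
For example, you enable the infrastructure by running geoadm start from a node of the local cluster and disable it by running geoadm stop. The following sketch assumes an illustrative node name, phys-paris-1; see the geoadm(1M) man page for the exact syntax and options:

    phys-paris-1# geoadm start
    phys-paris-1# geoadm stop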

The GUI does not support creating custom heartbeats outside of a partnership. If you want to specify a custom heartbeat in a partnership join operation, use the CLI to execute the geops join-partnership command.
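
For example, joining an existing partnership while naming a previously created custom heartbeat might look like the following sketch. The remote cluster name cluster-paris, partnership name paris-newyork-ps, and heartbeat name paris-to-newyork are illustrative, and the use of the -h option to name the custom heartbeat should be verified in the geops(1M) man page:

    phys-newyork-1# geops join-partnership -h paris-to-newyork cluster-paris paris-newyork-ps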

Command-Line Interface

Table 1–1 lists the commands that you can use to administer the Sun Cluster Geographic Edition software. For more information about each command, refer to the Sun Cluster Geographic Edition Reference Manual.

Table 1–1 Sun Cluster Geographic Edition CLI

Command   Description

geoadm    Enables or disables the Sun Cluster Geographic Edition software on the
          local cluster and prints the runtime status of the local cluster

geohb     Configures and manages the heartbeat mechanism that is provided with
          the Sun Cluster Geographic Edition software

geops     Creates and manages the partnerships between clusters

geopg     Configures and manages protection groups
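
As an informal illustration of how these commands are typically invoked, the following sketch shows read-only invocations that you can run from a cluster node: geoadm status prints the runtime status of the local cluster, and the list subcommands display the configured partnerships, protection groups, and heartbeats. The status and list subcommands are assumed here; verify them in the Sun Cluster Geographic Edition Reference Manual:

    # geoadm status
    # geops list
    # geopg list
    # geohb list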

Overview of Disaster Recovery Administration

This section provides an example of a disaster recovery scenario and the actions that an administrator might perform.

Company X has two geographically separated clusters: cluster-paris in Paris and cluster-newyork in New York. These clusters are configured as partner clusters. The cluster in Paris is configured as the primary cluster, and the cluster in New York is configured as the secondary cluster.

The cluster-paris cluster fails temporarily as a result of power outages during a windstorm. From the administrator's perspective, the following events occur:

  1. The heartbeat communication is lost between cluster-paris and cluster-newyork. Because heartbeat notification was configured during the creation of the partnership, a heartbeat-loss notification email is sent to the administrator.

    For information about configuring partnerships and heartbeat notification, see Creating and Modifying a Partnership. A sketch of setting the notification property follows this list.

  2. The administrator receives the notification email and follows the company procedure to verify that the disconnect occurred because of a situation that requires a takeover by the secondary cluster. Because a takeover is expensive, Company X does not allow takeovers unless the primary cluster cannot be repaired within two hours.

    For information about verifying a disconnect on a system that uses Sun StorEdge Availability Suite 3.2.1, see Detecting Cluster Failure on a System That Uses Sun StorEdge Availability Suite 3.2.1 Data Replication.

    For information about verifying a disconnect on a system that uses Hitachi TrueCopy, see Detecting Cluster Failure on a System That Uses Hitachi TrueCopy Data Replication.

  3. Because the cluster-paris cluster cannot be brought online again for at least another day, the administrator runs the geopg takeover command on a node of cluster-newyork, which starts the protection group on the secondary cluster in New York (a command sketch follows this list).

    For information about performing a takeover on a system that uses Sun StorEdge Availability Suite 3.2.1 data replication, see Forcing a Takeover on Systems That Use Sun StorEdge Availability Suite 3.2.1. For information about performing a takeover on a system that uses Hitachi TrueCopy data replication, see Forcing a Takeover on a System That Uses Hitachi TrueCopy Data Replication.

  4. After the takeover, the secondary cluster cluster-newyork becomes the new primary cluster. The failed cluster in Paris is still configured to be primary, so when cluster-paris restarts, the cluster detects that it was down and lost contact with the partner cluster. Then, cluster-paris enters an error state that requires administrative action to repair. The cluster might also need to recover and resynchronize data.

    For information about recovering data after a takeover on a system that uses Sun StorEdge Availability Suite 3.2.1 data replication, see Recovering Sun StorEdge Availability Suite 3.2.1 Data After a Takeover. For information about recovering data after a takeover on a system that uses Hitachi TrueCopy data replication, see Failback of Services to the Original Primary Cluster on a System That Uses Hitachi TrueCopy Replication.
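
The heartbeat-loss notification described in step 1 is controlled by a partnership property. The following sketch shows one way the property might be set; the partnership name paris-newyork-ps and the address ops@example.com are illustrative, and the Notification_emailaddrs property name and geops set-prop syntax should be confirmed in the geops(1M) man page and in Creating and Modifying a Partnership:

    phys-paris-1# geops set-prop -p Notification_emailaddrs=ops@example.com paris-newyork-ps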
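
The takeover in step 3 is run from a node of the secondary cluster. Assuming an illustrative protection group name of sales-pg, the command might look like the following sketch; the -f option, which is assumed here to suppress the confirmation prompt, and the exact syntax should be verified in the geopg(1M) man page:

    phys-newyork-1# geopg takeover -f sales-pg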