Sun Cluster Geographic Edition Overview

Recovering From a Disaster

Disaster tolerance is the ability of a system to restore an application on a secondary cluster when the primary cluster fails. Disaster tolerance is based on data replication and failover. The Sun Cluster Geographic Edition software enables disaster tolerance by redundantly deploying the following:

Data replication is the process of continuously copying data from the primary cluster to the secondary cluster. Through data replication, the secondary cluster has a recent copy of the data on the primary cluster. The secondary cluster can be geographically separated from the primary cluster.
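The relationship between the two copies can be illustrated with a small model. The following is a minimal sketch, not the product's replication mechanism: the `Cluster` and `Replicator` classes and their methods are hypothetical names introduced for illustration only. The sketch assumes asynchronous, in-order shipping of writes, so the secondary cluster holds a recent copy that is not necessarily identical to the primary's.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical stand-in for a cluster's data store."""
    name: str
    data: list = field(default_factory=list)


@dataclass
class Replicator:
    """Toy asynchronous replicator: writes queue up, then ship in order."""
    primary: Cluster
    secondary: Cluster
    pending: list = field(default_factory=list)

    def write(self, record):
        # The write lands on the primary first and is queued for shipping.
        self.primary.data.append(record)
        self.pending.append(record)

    def replicate(self, max_records=None):
        # Ship up to max_records queued writes to the secondary, in order.
        count = len(self.pending) if max_records is None else max_records
        batch, self.pending = self.pending[:count], self.pending[count:]
        self.secondary.data.extend(batch)
        return len(batch)

    def lag(self):
        # Number of writes the secondary has not yet received.
        return len(self.pending)


rep = Replicator(Cluster("primary"), Cluster("secondary"))
for i in range(5):
    rep.write(f"record-{i}")
rep.replicate(max_records=3)
print(rep.secondary.data)  # ['record-0', 'record-1', 'record-2']
print(rep.lag())           # 2
```

In this model, the value returned by `lag()` is the amount of data that would be at risk if the primary cluster failed at that moment.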

Failover is the automatic relocation of a resource group or device group from a primary cluster to a secondary cluster. If the primary cluster fails, the application and the data are immediately available on the secondary cluster.

The Sun Cluster Geographic Edition software supports two types of service migration: switchover and takeover. A switchover is a planned migration of services from the primary cluster to the secondary cluster. During a switchover, the primary cluster is connected to the secondary cluster and coordinates the migration of services with it. This coordination allows data replication to complete and ensures that services can be transferred from the primary cluster to the secondary cluster without loss or corruption of data.

A takeover is an emergency migration of services from the primary cluster to the secondary cluster. A system administrator can initiate a takeover to recover from a disaster. Unlike during a switchover, the primary cluster is not connected to the secondary cluster during a takeover. Therefore, the primary cluster cannot coordinate with the secondary cluster to migrate the services. Because of this lack of coordination, the risk of data loss and data corruption is higher in a takeover than in a switchover. The Sun Cluster Geographic Edition software uses dedicated recovery procedures during a takeover to minimize data loss and data corruption.
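The difference between the two migration types can be sketched as a toy model. This is an illustration under stated assumptions, not the product's implementation: the `ProtectedService` class, its methods, and the `replicated` flag are hypothetical, and the dedicated recovery procedures that the software runs during a takeover are not modeled.

```python
class ProtectedService:
    """Toy model of a service replicated between two clusters."""

    def __init__(self):
        self.primary_data = []
        self.secondary_data = []
        self.active = "primary"

    def write(self, record, replicated=True):
        # replicated=False models a write that has not yet reached
        # the secondary cluster when the migration begins.
        self.primary_data.append(record)
        if replicated:
            self.secondary_data.append(record)

    def unreplicated(self):
        # Records present on the primary but missing on the secondary.
        return [r for r in self.primary_data if r not in self.secondary_data]

    def switchover(self):
        # Planned: both clusters are connected, so replication is
        # completed before the roles change.
        self.secondary_data.extend(self.unreplicated())
        self.active = "secondary"
        return []  # nothing is lost

    def takeover(self):
        # Emergency: the primary is unreachable, so the secondary
        # starts from whatever data it already holds.
        lost = self.unreplicated()
        self.active = "secondary"
        return lost


def demo(migrate):
    svc = ProtectedService()
    svc.write("order-1")
    svc.write("order-2")
    svc.write("order-3", replicated=False)
    lost = getattr(svc, migrate)()
    return svc.secondary_data, lost


print(demo("switchover"))  # (['order-1', 'order-2', 'order-3'], [])
print(demo("takeover"))    # (['order-1', 'order-2'], ['order-3'])
```

In the sketch, the only difference between the two paths is whether the outstanding writes can be drained before the secondary cluster becomes active, which is why the takeover path reports a lost record.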