Sun Cluster Geographic Edition Data Replication Guide for EMC Symmetrix Remote Data Facility

Recovering From a Switchover Failure on a System That Uses EMC Symmetrix Remote Data Facility Replication

Basic Sun Cluster Geographic Edition operations, such as geopg switchover, perform a symrdf swap operation at the EMC Symmetrix Remote Data Facility data replication level. In EMC Symmetrix Remote Data Facility terminology, a switchover is called a swap. The symrdf swap operation requires significantly more time for static RDF than for dynamic RDF. Therefore, you might need to increase the value of the timeout property of the protection group when using static RDF.
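
As a rough sketch, the timeout could be raised with the geopg set-prop subcommand. The helper name timeout_cmd is hypothetical, the value 7200 is an arbitrary illustration rather than a recommendation, and the property is assumed to be spelled Timeout as in the geopg(1M) man page. The helper only prints the command so that it can be reviewed before it is run on a cluster node.

```shell
# Hypothetical helper: print (do not run) the geopg command that would
# set the protection group timeout, in seconds.
timeout_cmd() {
    # $1 = timeout in seconds, $2 = protection group name
    printf 'geopg set-prop -p Timeout=%s %s\n' "$1" "$2"
}

# 7200 is an assumed, illustrative value only.
timeout_cmd 7200 srdfpg
```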

If all of the EMC Symmetrix Remote Data Facility commands return a value of 0, the switchover is successful. In some cases, a command might return an error code (a value other than 0). These cases are considered switchover failures.
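
The return-value rule above can be illustrated with a small shell sketch. The function name run_step is hypothetical, and the true and false utilities stand in for real replication commands so that the example runs on any system. This is not how the Sun Cluster Geographic Edition software itself is implemented.

```shell
# Run one command and report a non-zero exit status as a failure,
# mirroring the rule that any value other than 0 is an error.
run_step() {
    status=0
    "$@" || status=$?
    if [ "$status" -ne 0 ]; then
        echo "FAILED (exit $status): $*"
        return "$status"
    fi
    echo "OK: $*"
}

# Stand-ins for real commands: 'true' exits 0, 'false' exits non-zero.
run_step true
run_step false || echo "treat this as a switchover failure"
```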

If a switchover failure occurs, the secondary volumes might not be fully synchronized with the primary volumes. Sun Cluster Geographic Edition software does not start the applications on the new intended primary cluster in a switchover failure scenario.

The remainder of this section describes the initial conditions that lead to a switchover failure and how to recover from a switchover failure.

Switchover Failure Conditions

This section describes a switchover failure scenario. In this scenario, cluster-paris is the original primary cluster and cluster-newyork is the original secondary cluster.

A switchover switches the services from cluster-paris to cluster-newyork as follows:


phys-newyork-1# geopg switchover -f -m cluster-newyork srdfpg

While processing the geopg switchover command, the symrdf swap command runs and returns errors for the EMC Symmetrix Remote Data Facility device group, devgroup1. As a result, the geopg switchover command returns the following failure message:


Processing operation.... this may take a while ....
"Switchover" failed for the following reason:
			Switchover failed for SRDF DG devgroup1

After this failure message has been issued, the two clusters are in the following states:


cluster-paris:
		srdfpg role: Secondary
cluster-newyork:
		srdfpg role: Secondary

phys-newyork-1# symdg list

                          D E V I C E      G R O U P S                       

                                                             Number of
    Name               Type     Valid  Symmetrix ID  Devs   GKs  BCVs  VDEVs

    devgroup1         RDF1    Yes    000187401215     2     0     0      0
    devgroup2         RDF2    Yes    000187401215     6     0     0      0
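
When you inspect a listing like the one above, the Type column shows which role each device group currently holds. The following sketch extracts that column for one device group. The function name dg_type is hypothetical, and the sample rows are copied from the listing above so that the example runs without the symdg command installed.

```shell
# Print the RDF type (RDF1 or RDF2) of one device group, reading
# 'symdg list' style output on standard input.
dg_type() {
    # $1 = device group name; column 1 is Name, column 2 is Type
    awk -v dg="$1" '$1 == dg { print $2 }'
}

# Sample rows copied from the listing above. On a cluster node the
# input would instead come from: symdg list | dg_type devgroup1
dg_type devgroup1 <<'EOF'
    devgroup1         RDF1    Yes    000187401215     2     0     0      0
    devgroup2         RDF2    Yes    000187401215     6     0     0      0
EOF
```

With the sample rows, the call prints RDF1.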

Recovering From Switchover Failure

This section describes procedures to recover from the failure scenario described in the previous section. These procedures bring the application online on the appropriate cluster.

  1. Place the EMC Symmetrix Remote Data Facility device group, devgroup1, in the Split state.

    Use the symrdf split commands to place the device groups that are in the protection group on both cluster-paris and cluster-newyork in the Split state.


    phys-newyork-1# symrdf -g devgroup1 split
    
  2. Make one of the clusters Primary for the protection group.

    Make the original primary cluster, cluster-paris, Primary for the protection group if you intend to start the application on the original primary cluster. The application uses the current data on the original primary cluster.

    Make the original secondary cluster, cluster-newyork, Primary for the protection group if you intend to start the application on the original secondary cluster. The application uses the current data on the original secondary cluster.


    Caution:

    Because the symrdf swap command did not perform a swap, the data volumes on cluster-newyork might not be synchronized with the data volumes on cluster-paris. If you intend to start the application with the same data as appears on the original primary cluster, you must not make the original secondary cluster Primary.


How to Make the Original Primary Cluster Primary for an EMC Symmetrix Remote Data Facility Protection Group

  1. Deactivate the protection group on the original primary cluster.


    phys-paris-1# geopg stop -e Local srdfpg
    
  2. Resynchronize the configuration of the protection group.

    This command updates the configuration of the protection group on cluster-paris with the configuration information of the protection group on cluster-newyork.


    phys-paris-1# geopg update srdfpg
    

    After the geopg update command runs successfully, srdfpg has the following role on each cluster:


    cluster-paris:
    		srdfpg role: Primary
    cluster-newyork:
    		srdfpg role: Secondary
  3. Determine whether the device group has the RDF1 role on the original primary cluster.


    phys-paris-1# symdg list | grep devgroup1 
    
  4. If the device group does not have the RDF1 role on the original primary cluster, run the symrdf failover and symrdf swap commands so that the device group, devgroup1, resumes the RDF1 role.


    phys-paris-1# symrdf -g devgroup1 failover
    
    phys-paris-1# symrdf -g devgroup1 swap
    

    Confirm that the swap was successful by using the symdg list command to view the device group information.


    phys-paris-1# symdg list
                            D E V I C E      G R O U P S                       
    
                                                                 Number of
        Name               Type     Valid  Symmetrix ID  Devs   GKs  BCVs  VDEVs
    
        devgroup1         RDF1    Yes    000187401215     6     0     0      0
        devgroup2         RDF1    Yes    000187401215     2     0     0      0
  5. Activate the protection group on both clusters in the partnership.


    phys-paris-1# geopg start -e Global srdfpg
    

    This command starts the application on cluster-paris. Data replication starts from cluster-paris to cluster-newyork.
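
The decision in steps 3 and 4 of this procedure can be sketched as a dry run. The function name plan_rdf1 is hypothetical. It takes the device group type that you read from the symdg list output and prints the commands that would still be needed, without running them.

```shell
# Hypothetical helper: given the current RDF type of a device group,
# print the commands from step 4 that would still be needed to give
# it the RDF1 role. Nothing is executed.
plan_rdf1() {
    # $1 = device group name, $2 = current type (RDF1 or RDF2)
    if [ "$2" = "RDF1" ]; then
        echo "# $1 already has the RDF1 role; nothing to do"
    else
        echo "symrdf -g $1 failover"
        echo "symrdf -g $1 swap"
    fi
}

plan_rdf1 devgroup1 RDF2
```

Printing the commands first lets you review them before you type them on phys-paris-1.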

How to Make the Original Secondary Cluster Primary for an EMC Symmetrix Remote Data Facility Protection Group

  1. Resynchronize the configuration of the protection group.

    This command updates the configuration of the protection group on cluster-newyork with the configuration information of the protection group on cluster-paris.


    phys-newyork-1# geopg update srdfpg
    

    After the geopg update command runs successfully, srdfpg has the following role on each cluster:


    cluster-paris:
    		srdfpg role: Secondary
    cluster-newyork:
    		srdfpg role: Primary
  2. Run the symrdf failover and symrdf swap commands so that the device group, devgroup2, has the RDF2 role.


    phys-paris-1# symrdf -g devgroup2 failover
    
    phys-paris-1# symrdf -g devgroup2 swap
    

    Confirm that the swap was successful by using the symdg list command to view the device group information.


    phys-paris-1# symdg list
    
                              D E V I C E      G R O U P S                       
    
                                                                 Number of
        Name               Type     Valid  Symmetrix ID  Devs   GKs  BCVs  VDEVs
    
        devgroup1        RDF2    Yes    000187401215     6     0     0      0
        devgroup2        RDF2    Yes    000187401215     2     0     0      0
  3. Activate the protection group on both clusters in the partnership.


    phys-newyork-1# geopg start -e Global srdfpg
    

    This command starts the application on cluster-newyork. Data replication starts from cluster-newyork to cluster-paris.


    Caution:

    This command overwrites the data on cluster-paris.