Sun Cluster Geographic Edition Data Replication Guide for EMC Symmetrix Remote Data Facility

Chapter 3 Migrating Services That Use EMC Symmetrix Remote Data Facility Data Replication

This chapter provides information about migrating services for maintenance or as a result of cluster failure. This chapter contains the following sections:

Detecting Cluster Failure on a System That Uses EMC Symmetrix Remote Data Facility Data Replication

This section describes the internal processes that occur when failure is detected on a primary or a secondary cluster.

Detecting Primary Cluster Failure

When the primary cluster for a protection group fails, the secondary cluster in the partnership detects the failure. The cluster that fails might be a member of more than one partnership, resulting in multiple failure detections.

The following actions take place when a primary cluster failure occurs. During a failure, the appropriate protection groups are in the Unknown state on the cluster that failed.

Detecting Secondary Cluster Failure

When a secondary cluster for a protection group fails, a cluster in the same partnership detects the failure. The cluster that failed might be a member of more than one partnership, resulting in multiple failure detections.

During failure detection, the following actions take place:

Migrating Services That Use EMC Symmetrix Remote Data Facility Data Replication With a Switchover

Perform a switchover of an EMC Symmetrix Remote Data Facility protection group when you want to migrate services to the partner cluster in an orderly fashion. Basic Sun Cluster Geographic Edition operations, such as geopg switchover, perform a symrdf swap operation. The symrdf swap operation requires significantly more time for static RDF than for dynamic RDF. Therefore, you might need to increase the value of the timeout property of the protection group when using static RDF.

A switchover consists of the following:


Note –

You cannot perform personality swaps if you are running EMC Symmetrix Remote Data Facility/Asynchronous data replication.


This section contains information about the following topics:

Validations That Occur Before a Switchover

When a switchover is initiated by using the geopg switchover command, the data replication subsystem runs several validations on both clusters. The switchover is performed only if the validation step succeeds on both clusters.

First, the replication subsystem checks that the EMC Symmetrix Remote Data Facility device group is in a valid aggregate RDF pair state. Then, it checks that the local device group type on the target primary cluster, cluster-newyork, is RDF2. The symrdf -g device-group-name -query command returns the local device group's state. These values correspond to an RDF1 or RDF2 state. The following table describes the EMC Symmetrix Remote Data Facility command that is run on the new primary cluster, cluster-newyork.

Table 3–1 EMC Symmetrix Remote Data Facility Switchover Validations on the New Primary Cluster

RDF Pair State                    EMC Symmetrix Remote Data Facility Switchover Command That Is Run on cluster-newyork
--------------------------------  ------------------------------------------------------------------------------------
Synchronized                      Suspends the RDF link.
R1Updated, Failedover, Suspended  The symrdf swap command switches the role.
Other RDF pair states             No command is run.
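
The mapping in Table 3–1 can be sketched as a small shell function. This is only an illustration of the decision logic, not the product's implementation: the function name switchover_action and the labels suspend, swap, and none are hypothetical and are not part of Sun Cluster Geographic Edition or the symrdf command.

```shell
#!/bin/sh
# Illustrative only: mirrors Table 3-1. The function name and the
# returned labels are assumptions made for this sketch.
switchover_action() {
    case "$1" in
        Synchronized)
            echo "suspend" ;;   # the RDF link is suspended
        R1Updated|Failedover|Suspended)
            echo "swap" ;;      # symrdf swap switches the role
        *)
            echo "none" ;;      # other pair states: no command is run
    esac
}

switchover_action "Synchronized"   # prints: suspend
```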

Results of a Switchover From a Replication Perspective

After a successful switchover, at the data replication level the roles of the primary and secondary volumes have been switched. The pre-switchover RDF1 volumes become the RDF2 volumes. The pre-switchover RDF2 volumes become the RDF1 volumes. Data replication continues from the new RDF1 volumes to the new RDF2 volumes.

The Local-role property of the protection group is also switched regardless of whether the application could be brought online on the new primary cluster as part of the switchover operation. On the cluster on which the protection group had a Local role of Secondary, the Local-role property of the protection group becomes Primary. On the cluster on which the protection group had a Local-role of Primary, the Local-role property of the protection group becomes Secondary.

How to Switch Over an EMC Symmetrix Remote Data Facility Protection Group From Primary to Secondary

Before You Begin

For a successful switchover, data replication must be active between the primary and the secondary clusters and data volumes on the two clusters must be synchronized.

Before you switch over a protection group from the primary cluster to the secondary cluster, ensure that the following conditions are met:


Caution –

If you have configured the Cluster_dgs property, only applications that belong to the protection group can write to the device groups specified in the Cluster_dgs property.


  1. Log in to a cluster node.

    You must be assigned the Geo Management RBAC rights profile to complete this procedure. For more information about RBAC, see Sun Cluster Geographic Edition Software and RBAC in Sun Cluster Geographic Edition System Administration Guide.

  2. Initiate the switchover.

    The application resource groups that are a part of the protection group are stopped and started during the switchover.


    # geopg switchover [-f] -m newprimarycluster protectiongroupname
    
    -f

    Forces the command to perform the operation without asking you for confirmation

    -m newprimarycluster

    Specifies the name of the cluster that is to be the new primary cluster for the protection group

    protectiongroupname

    Specifies the name of the protection group


Example 3–1 Forcing a Switchover From Primary to Secondary

This example performs a switchover to the secondary cluster.


# geopg switchover -f -m cluster-newyork srdfpg
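
When switchovers are scripted, it can help to assemble the command in one place and print it for review before running it. The following sketch assumes a hypothetical helper named switchover_cmd. It only prints the command line and uses no options other than the -f and -m options described above.

```shell
#!/bin/sh
# Hypothetical helper: prints a geopg switchover command line for review.
# It does not run geopg.
# Usage: switchover_cmd [-f] newprimarycluster protectiongroupname
switchover_cmd() {
    force=""
    if [ "$1" = "-f" ]; then
        force=" -f"
        shift
    fi
    if [ $# -ne 2 ]; then
        echo "usage: switchover_cmd [-f] newprimarycluster protectiongroupname" >&2
        return 1
    fi
    echo "geopg switchover${force} -m $1 $2"
}

switchover_cmd -f cluster-newyork srdfpg
# prints: geopg switchover -f -m cluster-newyork srdfpg
```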

Forcing a Takeover on a System That Uses EMC Symmetrix Remote Data Facility Data Replication

Perform a takeover when applications need to be brought online on the secondary cluster regardless of whether the data is completely consistent between the primary volume and the secondary volume. The information in this section assumes that the protection group has been started.

The following steps occur after a takeover is initiated:

For more details about takeover and the effects of the geopg takeover command, see Overview of Disaster Recovery Administration in Sun Cluster Geographic Edition System Administration Guide.

For details about the possible conditions of the primary and secondary cluster before and after takeover, see Appendix C, Takeover Postconditions, in Sun Cluster Geographic Edition System Administration Guide.

The following sections describe the steps you must perform to force a takeover by a secondary cluster.

Validations That Occur Before a Takeover

When a takeover is initiated by using the geopg takeover command, the data replication subsystem runs several validations on both clusters. These steps are conducted on the original primary cluster only if the primary cluster can be reached. If validation on the original primary cluster fails, the takeover still occurs.

First, the replication subsystem checks that the EMC Symmetrix Remote Data Facility device group is in a valid aggregate RDF pair state. The EMC Symmetrix Remote Data Facility commands that are used for the takeover are described in the following table.

Table 3–2 EMC Symmetrix Remote Data Facility Takeover Validations on the New Primary Cluster

Aggregate RDF Pair State                          Protection Group Local Role  EMC Symmetrix Remote Data Facility Takeover Commands That Are Run on cluster-newyork
------------------------------------------------  ---------------------------  ------------------------------------------------------------------------------------
FailedOver                                        Primary                      symrdf $option $dg write_disable r2
                                                                               symrdf -g dg suspend
                                                                               symrdf $option $dg rw_enable r1
FailedOver                                        Secondary                    No command is run.
Synchronized, Suspended, R1 Updated, Partitioned  All                          symrdf -g dg failover
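
The selection logic in Table 3–2 can be sketched as a shell function that prints the commands for a given pair state and local role. The function name takeover_commands is hypothetical, and the sketch prints the command text from the table without running anything. The table does not list other pair states, so the sketch prints nothing for them.

```shell
#!/bin/sh
# Illustrative only: mirrors Table 3-2. takeover_commands is a hypothetical
# helper; it prints the commands from the table and runs nothing.
# Usage: takeover_commands "<aggregate RDF pair state>" "<local role>"
takeover_commands() {
    state=$1
    role=$2
    case "$state" in
        FailedOver)
            if [ "$role" = "Primary" ]; then
                # $option and $dg are placeholders copied from the table.
                echo 'symrdf $option $dg write_disable r2'
                echo 'symrdf -g dg suspend'
                echo 'symrdf $option $dg rw_enable r1'
            fi
            ;;   # FailedOver with a Secondary role: no command is run
        Synchronized|Suspended|"R1 Updated"|Partitioned)
            echo 'symrdf -g dg failover' ;;   # applies to all local roles
    esac
}

takeover_commands "Partitioned" "Secondary"   # prints: symrdf -g dg failover
```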

Results of a Takeover From a Replication Perspective

From a replication perspective, after a successful takeover, the Local-role property of the protection group is changed to reflect the new role, regardless of whether the application could be brought online on the new primary cluster as part of the takeover operation. On cluster-newyork, where the protection group had a Local-role of Secondary, the Local-role property of the protection group becomes Primary. On cluster-paris, where the protection group had a Local-role of Primary, the following might occur:

If the takeover is successful, the applications are brought online. You do not need to run a separate geopg start command.


Caution –

After a successful takeover, data replication between the new primary cluster, cluster-newyork, and the old primary cluster, cluster-paris, is stopped. If you want to run a geopg start command, you must use the -n option to prevent replication from resuming.


How to Force Immediate Takeover of EMC Symmetrix Remote Data Facility Services by a Secondary Cluster

Before You Begin

Before you force the secondary cluster to assume the activity of the primary cluster, ensure that the following conditions are met:

  1. Log in to a node in the secondary cluster.

    You must be assigned the Geo Management RBAC rights profile to complete this procedure. For more information about RBAC, see Sun Cluster Geographic Edition Software and RBAC in Sun Cluster Geographic Edition System Administration Guide.

  2. Initiate the takeover.


    # geopg takeover [-f] protectiongroupname
    
    -f

    Forces the command to perform the operation without your confirmation

    protectiongroupname

    Specifies the name of the protection group


Example 3–2 Forcing a Takeover by a Secondary Cluster

This example forces the takeover of srdfpg by the secondary cluster cluster-newyork.

The phys-newyork-1 node is the first node of the secondary cluster. For a reminder of which node is phys-newyork-1, see Example Sun Cluster Geographic Edition Cluster Configuration in Sun Cluster Geographic Edition System Administration Guide.


phys-newyork-1# geopg takeover -f srdfpg

Next Steps

For information about the state of the primary and secondary clusters after a takeover, see Appendix C, Takeover Postconditions, in Sun Cluster Geographic Edition System Administration Guide.

Recovering Services to a Cluster on a System That Uses EMC Symmetrix Remote Data Facility Replication

After a successful takeover operation, the secondary cluster, cluster-newyork, becomes the primary for the protection group and the services are online on the secondary cluster. After the recovery of the original primary cluster, cluster-paris, the services can be brought online again on the original primary by using a process called failback.

Sun Cluster Geographic Edition software supports the following two kinds of failback:

If you want to leave the new primary, cluster-newyork, as the primary cluster and the original primary cluster, cluster-paris, as the secondary after the original primary restarts, you can resynchronize and revalidate the protection group configuration without performing a switchover or takeover.

This section contains information about the following topics:

How to Resynchronize and Revalidate the Protection Group Configuration

Use this procedure to resynchronize and revalidate data on the original primary cluster, cluster-paris, with the data on the current primary cluster, cluster-newyork.

Before You Begin

Before you resynchronize and revalidate the protection group configuration, a takeover has occurred on cluster-newyork. The clusters now have the following roles:

  1. Resynchronize the original primary cluster, cluster-paris, with the current primary cluster, cluster-newyork.

    cluster-paris forfeits its own configuration and replicates the cluster-newyork configuration locally. Resynchronize both the partnership and protection group configurations.

    1. On cluster-paris, resynchronize the partnership.


      phys-paris-1# geops update partnershipname
      
      partnershipname

      Specifies the name of the partnership


      Note –

      You need to perform this step only once, even if you are resynchronizing multiple protection groups.


      For more information about synchronizing partnerships, see Resynchronizing a Partnership in Sun Cluster Geographic Edition System Administration Guide.

    2. On cluster-paris, resynchronize each protection group.

      Because the role of the protection group on cluster-newyork is primary, this step ensures that the role of the protection group on cluster-paris is secondary.


      phys-paris-1# geopg update protectiongroupname
      
      protectiongroupname

      Specifies the name of the protection group

      For more information about synchronizing protection groups, see Resynchronizing an EMC Symmetrix Remote Data Facility Protection Group.

  2. On cluster-paris, validate the cluster configuration for each protection group.


    phys-paris-1# geopg validate protectiongroupname 
    
    protectiongroupname

    Specifies a unique name that identifies a single protection group

    For more information, see How to Validate an EMC Symmetrix Remote Data Facility Protection Group.

  3. On cluster-paris, activate each protection group.

    Because the protection group on cluster-paris has a role of secondary, the geopg start command does not restart the application on cluster-paris.


    phys-paris-1# geopg start -n -e local protectiongroupname
    
    -e local

    Specifies the scope of the command.

    By specifying a local scope, the command operates on the local cluster only.

    -n

    Specifies that data replication should not be used for this protection group. If this option is omitted, data replication starts at the same time as the protection group.

    protectiongroupname

    Specifies the name of the protection group.

    Because the protection group has a role of secondary, the data is synchronized from the current primary, cluster-newyork, to the current secondary, cluster-paris.

    For more information about the geopg start command, see How to Activate an EMC Symmetrix Remote Data Facility Protection Group.

  4. Confirm that the protection group configuration is OK.

    First, confirm that the state of the protection group on cluster-newyork is OK. The protection group has a local state of OK when the EMC Symmetrix Remote Data Facility device groups on cluster-newyork have a Synchronized EMC Symmetrix Remote Data Facility pair state.


    phys-newyork-1# geoadm status
    

    Refer to the Protection Group section of the output.

    Next, confirm that all resources in the replication resource group, protectiongroupname-rep-rg, report a status of OK.


    phys-newyork-1# clresource status -g protectiongroupname-rep-rg
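
Steps 1 and 2 of this procedure distinguish work that is done once per partnership from work that is repeated for each protection group. A minimal sketch of that ordering follows. The helper name resync_plan, the partnership name paris-newyork-ps, and the second protection group name srdfpg2 are hypothetical. The helper only prints the commands, which you would then run on a node of cluster-paris.

```shell
#!/bin/sh
# Hypothetical helper: prints the resynchronize and validate commands in
# the order this procedure uses them. Nothing is executed.
# Usage: resync_plan partnershipname protectiongroupname...
resync_plan() {
    partnership=$1
    shift
    echo "geops update $partnership"      # once, even for several groups
    for pg in "$@"; do
        echo "geopg update $pg"           # once per protection group
    done
    for pg in "$@"; do
        echo "geopg validate $pg"         # validate each protection group
    done
}

resync_plan paris-newyork-ps srdfpg srdfpg2
```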
    

How to Perform a Failback-Switchover on a System That Uses EMC Symmetrix Remote Data Facility Replication

Use this procedure to restart an application on the original primary cluster, cluster-paris, after the data on this cluster has been resynchronized with the data on the current primary cluster, cluster-newyork.


Note –

The failback procedures apply only to clusters in a partnership. You need to perform the following procedure only once per partnership.


Before You Begin

Before you perform a failback-switchover, a takeover has occurred on cluster-newyork. The clusters have the following roles:

  1. Resynchronize the original primary cluster, cluster-paris, with the current primary cluster, cluster-newyork.

    cluster-paris forfeits its own configuration and replicates the cluster-newyork configuration locally. Resynchronize both the partnership and protection group configurations.

    1. On cluster-paris, resynchronize the partnership.


      phys-paris-1# geops update partnershipname
      
      partnershipname

      Specifies the name of the partnership


      Note –

      You need to perform this step only once per partnership, even if you are performing a failback-switchover for multiple protection groups in the partnership.


      For more information about synchronizing partnerships, see Resynchronizing a Partnership in Sun Cluster Geographic Edition System Administration Guide.

    2. Determine whether the protection group on the original primary cluster, cluster-paris, is active.


      phys-paris-1# geoadm status
      
    3. If the protection group on the original primary cluster is active, stop it.


      phys-paris-1# geopg stop -e local protectiongroupname
      
    4. Verify that the protection group is stopped.


      phys-paris-1# geoadm status
      
    5. On cluster-paris, resynchronize each protection group.

      Because the local role of the protection group on cluster-newyork is now primary, this step ensures that the role of the protection group on cluster-paris becomes secondary.


      phys-paris-1# geopg update protectiongroupname
      
      protectiongroupname

      Specifies the name of the protection group

      For more information about synchronizing protection groups, see Resynchronizing an EMC Symmetrix Remote Data Facility Protection Group.

  2. On cluster-paris, validate the cluster configuration for each protection group.

    Ensure that the protection group is not in an error state. A protection group cannot be started when it is in an error state.


    phys-paris-1# geopg validate protectiongroupname 
    
    protectiongroupname

    Specifies a unique name that identifies a single protection group

    For more information, see How to Validate an EMC Symmetrix Remote Data Facility Protection Group.

  3. On cluster-paris, activate each protection group.

    Because the protection group on cluster-paris has a role of secondary, the geopg start command does not restart the application on cluster-paris.


    phys-paris-1# geopg start -e local protectiongroupname
    
    -e local

    Specifies the scope of the command.

    By specifying a local scope, the command operates on the local cluster only.

    protectiongroupname

    Specifies the name of the protection group.


    Note –

    Do not use the -n option when performing a failback-switchover because the data needs to be synchronized from the current primary, cluster-newyork, to the current secondary, cluster-paris.


    Because the protection group has a role of secondary, the data is synchronized from the current primary, cluster-newyork, to the current secondary, cluster-paris.

    For more information about the geopg start command, see How to Activate an EMC Symmetrix Remote Data Facility Protection Group.

  4. Confirm that the data is completely synchronized.

    The data is completely synchronized when the state of the protection group on cluster-newyork is OK. The protection group has a local state of OK when the EMC Symmetrix Remote Data Facility device groups on cluster-newyork have a Synchronized RDF pair state.

    To confirm that the state of the protection group on cluster-newyork is OK, use the following command:


    phys-newyork-1# geoadm status
    

    Refer to the Protection Group section of the output.

  5. On both partner clusters, ensure that the protection group is activated.


    # geoadm status
    
  6. On either cluster, perform a switchover from cluster-newyork to cluster-paris for each protection group.


    # geopg switchover [-f] -m cluster-paris protectiongroupname
    

    For more information, see How to Switch Over an EMC Symmetrix Remote Data Facility Protection Group From Primary to Secondary.

    cluster-paris resumes its original role as primary cluster for the protection group.

  7. Ensure that the switchover was performed successfully.

    Verify that the protection group is now primary on cluster-paris and secondary on cluster-newyork and that the state for “Data replication” and “Resource groups” is OK on both clusters.


    # geoadm status
    

    Check the runtime status of the application resource group and data replication for each EMC Symmetrix Remote Data Facility protection group.


    # clresourcegroup status -v protectiongroupname
    

    Refer to the Status and Status Message fields that are presented for the data replication device group you want to check. For more information about these fields, see Table 2–1.

    For more information about the runtime status of data replication, see Checking the Runtime Status of EMC Symmetrix Remote Data Facility Data Replication.
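
The per-protection-group commands of this procedure can be summarized as a printed plan. This is a sketch only: the helper name failback_switchover_plan is hypothetical, it prints commands rather than running them, and it omits the geops update and geoadm status steps. The first four commands are meant for a node of cluster-paris.

```shell
#!/bin/sh
# Hypothetical helper: prints the geopg commands of a failback-switchover
# for one protection group, in order. Nothing is executed.
failback_switchover_plan() {
    pg=$1
    echo "geopg stop -e local $pg"       # only if the group is active on cluster-paris
    echo "geopg update $pg"              # cluster-paris takes the secondary role
    echo "geopg validate $pg"            # the group must not be in an error state
    echo "geopg start -e local $pg"      # no -n: data must resynchronize
    echo "geopg switchover -m cluster-paris $pg"   # after geoadm status shows OK
}

failback_switchover_plan srdfpg
```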

How to Perform a Failback-Takeover on a System That Uses EMC Symmetrix Remote Data Facility Replication

Use this procedure to restart an application on the original primary cluster, cluster-paris, and use the current data on the original primary cluster. Any updates that occurred on the secondary cluster, cluster-newyork, while it was acting as primary are discarded.

The failback procedures apply only to clusters in a partnership. You need to perform the following procedure only once per partnership.


Note –

To resume using the data on the original primary, cluster-paris, you must not have replicated data from the new primary, cluster-newyork, to the original primary cluster, cluster-paris, at any point after the takeover operation on cluster-newyork. To prevent data replication between the new primary and the original primary, you must have used the -n option whenever you used the geopg start command.


Before You Begin

Ensure that the clusters have the following roles:

  1. Resynchronize the original primary cluster, cluster-paris, with the original secondary cluster, cluster-newyork.

    cluster-paris forfeits its own configuration and replicates the cluster-newyork configuration locally.

    1. On cluster-paris, resynchronize the partnership.


      phys-paris-1# geops update partnershipname
      
      partnershipname

      Specifies the name of the partnership


      Note –

      You need to perform this step only once per partnership, even if you are performing a failback-takeover for multiple protection groups in the partnership.


      For more information about synchronizing partnerships, see Resynchronizing a Partnership in Sun Cluster Geographic Edition System Administration Guide.

    2. Determine whether the protection group on the original primary cluster, cluster-paris, is active.


      phys-paris-1# geoadm status
      
    3. If the protection group on the original primary cluster is active, stop it.


      phys-paris-1# geopg stop -e local protectiongroupname
      
    4. Verify that the protection group is stopped.


      phys-paris-1# geoadm status
      
    5. On cluster-paris, resynchronize each protection group.

      Because the local role of the protection group on cluster-newyork is now primary, this step ensures that the role of the protection group on cluster-paris becomes secondary.


      phys-paris-1# geopg update protectiongroupname
      
      protectiongroupname

      Specifies the name of the protection group

      For more information about resynchronizing protection groups, see How to Resynchronize a Protection Group.

  2. On cluster-paris, validate the configuration for each protection group.

    Ensure that the protection group is not in an error state. A protection group cannot be started when it is in an error state.


    phys-paris-1# geopg validate protectiongroupname 
    
    protectiongroupname

    Specifies a unique name that identifies a single protection group

    For more information, see How to Validate an EMC Symmetrix Remote Data Facility Protection Group.

  3. On cluster-paris, activate each protection group in the secondary role without data replication.

    Because the protection group on cluster-paris has a role of secondary, the geopg start command does not restart the application on cluster-paris.


    Note –

    You must use the -n option, which specifies that data replication should not be used for this protection group. If this option is omitted, data replication starts at the same time as the protection group.



    phys-paris-1# geopg start -e local -n protectiongroupname
    
    -e local

    Specifies the scope of the command.

    By specifying a local scope, the command operates on the local cluster only.

    -n

    Specifies that data replication should not be used for this protection group. If this option is omitted, data replication starts at the same time as the protection group.

    protectiongroupname

    Specifies the name of the protection group

    For more information, see How to Activate an EMC Symmetrix Remote Data Facility Protection Group.

    Replication from cluster-newyork to cluster-paris is not started because the -n option is used on cluster-paris.

  4. On cluster-paris, initiate a takeover for each protection group.


    phys-paris-1# geopg takeover [-f] protectiongroupname
    
    -f

    Forces the command to perform the operation without your confirmation

    protectiongroupname

    Specifies the name of the protection group

    For more information about the geopg takeover command, see How to Force Immediate Takeover of EMC Symmetrix Remote Data Facility Services by a Secondary Cluster.

    The protection group on cluster-paris now has the primary role, and the protection group on cluster-newyork has the role of secondary. The application services are now online on cluster-paris.

  5. On cluster-newyork, activate each protection group.

    At the end of step 4, the local state of the protection group on cluster-newyork is Offline. To start monitoring the local state of the protection group, you must activate the protection group on cluster-newyork.

    Because the protection group on cluster-newyork has a role of secondary, the geopg start command does not restart the application on cluster-newyork.


    phys-newyork-1# geopg start -e local [-n] protectiongroupname
    
    -e local

    Specifies the scope of the command.

    By specifying a local scope, the command operates on the local cluster only.

    -n

    Prevents the start of data replication at protection group startup.

    If you omit this option, the data replication subsystem starts at the same time as the protection group.

    protectiongroupname

    Specifies the name of the protection group.

    For more information about the geopg start command, see How to Activate an EMC Symmetrix Remote Data Facility Protection Group.

  6. Ensure that the takeover was performed successfully.

    Verify that the protection group is now primary on cluster-paris and secondary on cluster-newyork and that the state for “Data replication” and “Resource groups” is OK on both clusters.


    # geoadm status
    

    Note –

    If you used the -n option in step 5 to prevent data replication from starting, the “Data replication” status will not be in the OK state.


    Check the runtime status of the application resource group and data replication for each EMC Symmetrix Remote Data Facility protection group.


    # clresourcegroup status -v protectiongroupname
    

    Refer to the Status and Status Message fields that are presented for the data replication device group you want to check. For more information about these fields, see Table 2–1.

    For more information about the runtime status of data replication, see Checking the Runtime Status of EMC Symmetrix Remote Data Facility Data Replication.
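
The per-protection-group commands of this procedure can be summarized as a printed plan that shows which cluster each command belongs to. This is a sketch only: the helper name failback_takeover_plan and the "cluster: command" output format are hypothetical, nothing is executed, and the geops update and geoadm status steps are omitted.

```shell
#!/bin/sh
# Hypothetical helper: prints the geopg commands of a failback-takeover
# for one protection group, each prefixed with the cluster it runs on.
failback_takeover_plan() {
    pg=$1
    echo "cluster-paris: geopg stop -e local $pg"       # only if the group is active
    echo "cluster-paris: geopg update $pg"
    echo "cluster-paris: geopg validate $pg"
    echo "cluster-paris: geopg start -e local -n $pg"   # -n is required here
    echo "cluster-paris: geopg takeover $pg"
    echo "cluster-newyork: geopg start -e local $pg"    # add -n to hold replication
}

failback_takeover_plan srdfpg
```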

Recovering From a Switchover Failure on a System That Uses EMC Symmetrix Remote Data Facility Replication

Basic Sun Cluster Geographic Edition operations, such as geopg switchover, perform a symrdf swap operation at the EMC Symmetrix Remote Data Facility data replication level. In EMC Symmetrix Remote Data Facility terminology, a switchover is called a swap. The symrdf swap operation requires significantly more time for static RDF than for dynamic RDF. Therefore, you might need to increase the value of the timeout property of the protection group when using static RDF.

If all of the EMC Symmetrix Remote Data Facility commands return a value of 0, the switchover is successful. In some cases, a command might return an error code (a value other than 0). These cases are considered switchover failures.

If a switchover failure occurs, the secondary volumes might not be fully synchronized with the primary volumes. Sun Cluster Geographic Edition software does not start the applications on the new intended primary cluster in a switchover failure scenario.
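
The success rule described above, where every command must return 0, can be illustrated with a generic shell wrapper. This is not how Sun Cluster Geographic Edition is implemented internally. The function name run_all is hypothetical, and the sketch uses the stand-in commands true and false rather than symrdf invocations.

```shell
#!/bin/sh
# Hypothetical wrapper: runs each command string in order and stops at the
# first one that returns a nonzero exit status.
run_all() {
    for cmd in "$@"; do
        if ! sh -c "$cmd"; then
            echo "failed: $cmd" >&2
            return 1
        fi
    done
    return 0
}

# Stand-in commands; on a cluster node these would be symrdf invocations.
run_all "true" "true" && echo "all commands returned 0"
```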

The remainder of this section describes the initial conditions that lead to a switchover failure and how to recover from a switchover failure.

This section contains information about the following topics:

Switchover Failure Conditions

This section describes a switchover failure scenario. In this scenario, cluster-paris is the original primary cluster and cluster-newyork is the original secondary cluster.

A switchover switches the services from cluster-paris to cluster-newyork as follows:


phys-newyork-1# geopg switchover -f -m cluster-newyork srdfpg

While processing the geopg switchover command, the symrdf swap command runs and returns errors for the EMC Symmetrix Remote Data Facility device group, devgroup1. As a result, the geopg switchover command returns the following failure message:


Processing operation.... this may take a while ....
"Switchover" failed for the following reason:
			Switchover failed for SRDF DG devgroup1

After this failure message has been issued, the two clusters are in the following states:


cluster-paris:
		srdfpg role: Secondary
cluster-newyork:
		srdfpg role: Secondary

phys-newyork-1# symdg list

                          D E V I C E      G R O U P S                       

                                                             Number of
    Name               Type     Valid  Symmetrix ID  Devs   GKs  BCVs  VDEVs

    devgroup1         RDF1    Yes    000187401215     2     0     0      0
    devgroup2         RDF2    Yes    000187401215     6     0     0      0

Recovering From Switchover Failure

This section describes procedures to recover from the failure scenario described in the previous section. These procedures bring the application online on the appropriate cluster.

  1. Place the EMC Symmetrix Remote Data Facility device group, devgroup1, in the Split state.

    Use the symrdf split command to place the device groups that are in the protection group on both cluster-paris and cluster-newyork in the Split state.


    phys-newyork-1# symrdf -g devgroup1 split
    
  2. Make one of the clusters Primary for the protection group.

    Make the original primary cluster, cluster-paris, Primary for the protection group if you intend to start the application on the original primary cluster. The application uses the current data on the original primary cluster.

    Make the original secondary cluster, cluster-newyork, Primary for the protection group if you intend to start the application on the original secondary cluster. The application uses the current data on the original secondary cluster.


    Caution –

    Because the symrdf swap command did not perform a swap, the data volumes on cluster-newyork might not be synchronized with the data volumes on cluster-paris. If you intend to start the application with the same data as appears on the original primary cluster, you must not make the original secondary cluster Primary.


How to Make the Original Primary Cluster Primary for an EMC Symmetrix Remote Data Facility Protection Group

  1. Deactivate the protection group on the original primary cluster.


    phys-paris-1# geopg stop -e Local srdfpg
    
  2. Resynchronize the configuration of the protection group.

    This command updates the configuration of the protection group on cluster-paris with the configuration information of the protection group on cluster-newyork.


    phys-paris-1# geopg update srdfpg
    

    After the geopg update command runs successfully, srdfpg has the following role on each cluster:


    cluster-paris:
    		srdfpg role: Primary
    cluster-newyork:
    		srdfpg role: Secondary
  3. Determine whether the device group has the RDF1 role on the original primary cluster.


    phys-paris-1# symdg list | grep devgroup1 
    
  4. If the device group does not have the RDF1 role on the original primary cluster, run the symrdf swap command so that the device group, devgroup1, resumes the RDF1 role.


    phys-paris-1# symrdf -g devgroup1 failover
    
    phys-paris-1# symrdf -g devgroup1 swap
    

    Confirm that the swap was successful by using the symdg list command to view the device group information.


    phys-paris-1# symdg list
                            D E V I C E      G R O U P S                       
    
                                                                 Number of
        Name               Type     Valid  Symmetrix ID  Devs   GKs  BCVs  VDEVs
    
        devgroup1         RDF1    Yes    000187401215     6     0     0      0
        devgroup2         RDF1    Yes    000187401215     2     0     0      0
  5. Activate the protection group on both clusters in the partnership.


    phys-paris-1# geopg start -e Global srdfpg
    

    This command starts the application on cluster-paris. Data replication starts from cluster-paris to cluster-newyork.
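
The role check in steps 3 and 4 can be scripted. The following is a minimal sketch, not part of the product: dg_type is a hypothetical helper name, and the sketch assumes that the symdg list output has the column layout shown in the preceding example, with the group name in the first column and the type in the second.

```shell
# Hypothetical helper, not a Sun Cluster or EMC command.
# Reads `symdg list` output on standard input and prints the Type
# column (for example RDF1 or RDF2) of the named device group.
# Exits with a nonzero status if the group does not appear in the listing.
# Assumes the column layout shown in this guide: Name first, Type second.
dg_type() {
    awk -v dg="$1" '
        $1 == dg { print $2; found = 1; exit }
        END      { exit !found }
    '
}
```

For example, running symdg list | dg_type devgroup1 on phys-paris-1 would print the current type of devgroup1. If the result is not RDF1, continue with the symrdf failover and symrdf swap commands in step 4.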

How to Make the Original Secondary Cluster Primary for an EMC Symmetrix Remote Data Facility Protection Group

  1. Resynchronize the configuration of the protection group.

    This command updates the configuration of the protection group on cluster-newyork with the configuration information of the protection group on cluster-paris.


    phys-newyork-1# geopg update srdfpg
    

    After the geopg update command runs successfully, srdfpg has the following role on each cluster:


    cluster-paris:
    		srdfpg role: Secondary
    cluster-newyork:
    		srdfpg role: Primary
  2. Run the symrdf swap command so that the device group, devgroup2, has the RDF2 role.


    phys-paris-1# symrdf -g devgroup2 failover
    
    phys-paris-1# symrdf -g devgroup2 swap
    

    Confirm that the swap was successful by using the symdg list command to view the device group information.


    phys-paris-1# symdg list
    
                              D E V I C E      G R O U P S                       
    
                                                                 Number of
        Name               Type     Valid  Symmetrix ID  Devs   GKs  BCVs  VDEVs
    
        devgroup1        RDF2    Yes    000187401215     6     0     0      0
        devgroup2        RDF2    Yes    000187401215     2     0     0      0
  3. Activate the protection group on both clusters in the partnership.


    phys-newyork-1# geopg start -e Global srdfpg
    

    This command starts the application on cluster-newyork. Data replication starts from cluster-newyork to cluster-paris.


    Caution –

    This command overwrites the data on cluster-paris.


Recovering From an EMC Symmetrix Remote Data Facility Data Replication Error

When an error occurs at the data replication level, the error is reflected in the status of the resource in the replication resource group of the relevant device group. This changed status appears in the Data Replication status field in the output of the geoadm status command for that protection group.

This section describes how to detect data replication errors and how to recover from them.

How to Detect Data Replication Errors

  1. Check the status of the replication resources by using the clresource status command.


    # clresource status -v sc_geo_dr-SRDF-protectiongroupname-srdfdgname
    

    For information about how different Resource status values map to actual replication pair states, see Table 2–4.

    Running the clresource status command might return the following:


    …
    -- Resources --
    
                Resource Name       Node Name           State     Status Message
                -------------       ---------           -----     --------------
      Resource: sc_geo_dr-SRDF-srdfpg-devgroup1 pemc1  Online    Online - Partitioned
      Resource: sc_geo_dr-SRDF-srdfpg-devgroup1 pemc2  Offline   Offline
    …
  2. Display the aggregate resource status for all device groups in the protection group by using the geoadm status command.

    For example, the output of the clresource status command in the preceding example indicates that the EMC Symmetrix Remote Data Facility device group, devgroup1, is in the Suspended state on cluster-paris. Table 2–4 indicates that the Suspended state corresponds to a resource status of FAULTED. So, the data replication state of the protection group is also FAULTED. This state is reflected in the output of the geoadm status command, which displays the state of the protection group as Error.


    phys-paris-1# geoadm status
    Cluster: cluster-paris
    
    Partnership "paris-newyork-ps"  : OK
       Partner clusters             : cluster-newyork
       Synchronization              : OK      
       ICRM Connection              : OK
    
       Heartbeat "paris-to-newyork" monitoring "cluster-newyork": OK 
          Heartbeat plug-in "ping_plugin"             : Inactive
          Heartbeat plug-in "tcp_udp_plugin"          : OK
    
    Protection group "srdfpg"   : Error
          Partnership         : paris-newyork-ps
          Synchronization     : OK
    
          Cluster cluster-paris    : Error
             Role                  : Primary
             PG activation state   : Activated
             Configuration         : OK
             Data replication      : Error
             Resource groups       : OK 
       
          Cluster cluster-newyork  : Error
             Role                  : Secondary
             PG activation state   : Activated
             Configuration         : OK
             Data replication      : Error
             Resource groups       : OK
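
When several protection groups or clusters are involved, it can help to reduce the geoadm status output to the data replication state for each cluster. The following is a minimal sketch under stated assumptions: dr_states and dr_all_ok are hypothetical helper names, the parsing relies on the output layout shown in the preceding example, and OK is assumed to be the healthy value of the Data replication field.

```shell
# Hypothetical helpers, not Sun Cluster commands.
# dr_states reads `geoadm status` output on standard input and prints
# one "cluster state" pair for each "Data replication" line.
# Assumes the layout shown in this guide, where each per-cluster block
# starts with a line of the form "Cluster <name> : <state>".
dr_states() {
    awk '
        $1 == "Cluster" && $3 == ":"        { cluster = $2 }
        $1 == "Data" && $2 == "replication" { print cluster, $NF }
    '
}

# dr_all_ok succeeds only if at least one data replication state was
# found and every state is OK (assumed to be the healthy value).
dr_all_ok() {
    dr_states | awk '
        { n++ }
        $2 != "OK" { bad = 1 }
        END { exit (bad || n == 0) }
    '
}
```

For example, geoadm status | dr_states lists one line per cluster, and geoadm status | dr_all_ok can serve as a condition in a monitoring script.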
     

How to Recover From an EMC Symmetrix Remote Data Facility Data Replication Error

To recover from an error state, you might perform some or all of the steps in the following procedure.

  1. Use the procedures in the EMC Symmetrix Remote Data Facility documentation to determine the causes of the FAULTED state.

  2. Recover from the faulted state by using the EMC Symmetrix Remote Data Facility procedures.

    If the recovery procedures change the state of the device group, this state is automatically detected by the resource and is reported as a new protection group state.

  3. Revalidate the protection group configuration.


    phys-paris-1# geopg validate protectiongroupname 
    
    protectiongroupname

    Specifies the name of the EMC Symmetrix Remote Data Facility protection group.

    If the geopg validate command determines that the configuration is valid, the state of the protection group changes to reflect that fact. If the configuration is not valid, geopg validate returns a failure message.

  4. Review the status of the protection group configuration.


    phys-paris-1# geopg list protectiongroupname 
    
    protectiongroupname

    Specifies the name of the EMC Symmetrix Remote Data Facility protection group.

  5. Review the runtime status of the protection group.


    phys-paris-1# geoadm status
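
Steps 3 through 5 can be chained so that the later checks run only if validation succeeds. The following is a minimal sketch: recheck_pg is a hypothetical wrapper name, and the sketch assumes that geopg and geoadm exit with a nonzero status when they report a failure, which this guide does not state.

```shell
# Hypothetical wrapper, not a Sun Cluster command.
# Runs the validate, list, and status steps of this procedure in order
# for the protection group named in the first argument.
# Assumes geopg and geoadm exit nonzero on failure.
recheck_pg() {
    geopg validate "$1" || {
        echo "geopg validate failed for $1" >&2
        return 1
    }
    geopg list "$1" || return 1
    geoadm status
}
```

For example, recheck_pg srdfpg revalidates srdfpg and then prints its configuration and the runtime status. Run the wrapper only after the EMC Symmetrix Remote Data Facility recovery in steps 1 and 2 is complete.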