Oracle Solaris Cluster Geographic Edition Data Replication Guide for EMC Symmetrix Remote Data Facility, Oracle Solaris Cluster 4.1
Recovering Services to a Cluster on a System That Uses SRDF Replication
After a successful takeover operation, the secondary cluster, cluster-newyork, becomes the primary for the protection group and the services are online on the secondary cluster. After the recovery of the original primary cluster, cluster-paris, the services can be brought online again on the original primary by using a process called failback.
Geographic Edition software supports the following two kinds of failback:
Failback-switchover. During a failback-switchover, applications are brought online again on the original primary cluster, cluster-paris, after the data on the original primary cluster has been resynchronized with the data on the secondary cluster, cluster-newyork.
For a reminder of which clusters are cluster-paris and cluster-newyork, see Example Geographic Edition Cluster Configuration in Oracle Solaris Cluster Geographic Edition System Administration Guide.
Failback-takeover. During a failback-takeover, applications are brought online again on the original primary cluster, cluster-paris, and use the current data on the original primary cluster. Any updates that occurred on the secondary cluster, cluster-newyork, while it was acting as primary, are discarded.
If you want to leave the new primary, cluster-newyork, as the primary cluster and the original primary cluster, cluster-paris, as the secondary after the original primary restarts, you can resynchronize and revalidate the protection group configuration without performing a switchover or takeover.
This section contains information about the following topics:
How to Resynchronize and Revalidate the Protection Group Configuration
How to Perform a Failback-Switchover on a System That Uses SRDF Replication
How to Perform a Failback-Takeover on a System That Uses SRDF Replication
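The choice among the three recovery paths described above can be read as a small decision table. The following sketch is illustrative only: choose_recovery and its argument values are hypothetical names, not Geographic Edition commands, and the function only prints the name of the matching procedure.

```shell
#!/bin/sh
# choose_recovery is a hypothetical helper, not a Geographic Edition command.
# $1: cluster that should end up primary: "paris" (original primary)
#     or "newyork" (current primary after the takeover)
# $2: what happens to updates made on cluster-newyork: "keep" or "discard"
choose_recovery() {
    final_primary=$1
    newyork_updates=$2
    case "$final_primary:$newyork_updates" in
        newyork:keep)  echo "resynchronize-and-revalidate" ;;
        paris:keep)    echo "failback-switchover" ;;
        paris:discard) echo "failback-takeover" ;;
        *)             echo "unsupported"; return 1 ;;
    esac
}

choose_recovery paris keep
choose_recovery paris discard
choose_recovery newyork keep
```

The combination newyork/discard is reported as unsupported because none of the three procedures in this section describes it.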
How to Resynchronize and Revalidate the Protection Group Configuration
Use this procedure to resynchronize and revalidate data on the original primary cluster, cluster-paris, with the data on the current primary cluster, cluster-newyork.
Before You Begin
Before you resynchronize and revalidate the protection group configuration, a takeover has occurred on cluster-newyork. The clusters now have the following roles:
If the original primary cluster, cluster-paris, has been down, confirm that the cluster is booted and that the Geographic Edition infrastructure is enabled on the cluster. For more information about booting a cluster, see Booting a Cluster in Oracle Solaris Cluster Geographic Edition System Administration Guide.
The protection group on cluster-newyork has the primary role.
The protection group on cluster-paris has either the primary role or secondary role, depending on whether the protection group could be reached during the takeover.
cluster-paris forfeits its own configuration and replicates the cluster-newyork configuration locally. Resynchronize both the partnership and protection group configurations.
phys-paris-1# geops update partnershipname
partnershipname: Specifies the name of the partnership.
Note - You need to perform this step only once, even if you are resynchronizing multiple protection groups.
For more information about synchronizing partnerships, see Resynchronizing a Partnership in Oracle Solaris Cluster Geographic Edition System Administration Guide.
Because the role of the protection group on cluster-newyork is primary, this step ensures that the role of the protection group on cluster-paris is secondary.
phys-paris-1# geopg update protectiongroupname
protectiongroupname: Specifies the name of the protection group.
For more information about synchronizing protection groups, see Resynchronizing an SRDF Protection Group.
phys-paris-1# geopg validate protectiongroupname
protectiongroupname: Specifies a unique name that identifies a single protection group.
For more information, see How to Validate an SRDF Protection Group.
Because the protection group on cluster-paris has a role of secondary, the geopg start command does not restart the application on cluster-paris.
phys-paris-1# geopg start -n -e local protectiongroupname
-e local: Specifies the scope of the command. By specifying a local scope, the command operates on the local cluster only.
-n: Specifies that data replication should not be used for this protection group. If this option is omitted, data replication starts at the same time as the protection group.
protectiongroupname: Specifies the name of the protection group.
Because the protection group has a role of secondary, the data is synchronized from the current primary, cluster-newyork, to the current secondary, cluster-paris.
For more information about the geopg start command, see How to Activate an SRDF Protection Group.
First, confirm that the state of the protection group on cluster-newyork is OK. The protection group has a local state of OK when the SRDF device groups on cluster-newyork have a Synchronized SRDF pair state.
phys-newyork-1# geoadm status
Refer to the Protection Group section of the output.
Next, confirm that all resources in the replication resource group, protectiongroupname-rep-rg, report a status of OK.
phys-newyork-1# clresource status -g protectiongroupname-rep-rg
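The two verification commands above depend on the replication resource group name, which follows the protectiongroupname-rep-rg pattern. The following is a minimal dry-run sketch that derives that name and prints the commands without running them. The function names and the protection group name examplepg are hypothetical placeholders.

```shell
#!/bin/sh
# rep_rg_name is a hypothetical helper; it only applies the
# protectiongroupname-rep-rg naming pattern quoted in the procedure.
rep_rg_name() {
    printf '%s-rep-rg\n' "$1"
}

# Print (do not run) the two verification commands for one protection group.
print_verify_commands() {
    pg=$1
    echo "geoadm status"
    echo "clresource status -g $(rep_rg_name "$pg")"
}

# examplepg is a placeholder protection group name.
print_verify_commands examplepg
```

Printing the commands instead of executing them keeps the sketch usable on a workstation where the cluster tools are not installed.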
How to Perform a Failback-Switchover on a System That Uses SRDF Replication
Use this procedure to restart an application on the original primary cluster, cluster-paris, after the data on this cluster has been resynchronized with the data on the current primary cluster, cluster-newyork.
Note - The failback procedures apply only to clusters in a partnership. You need to perform the following procedure only once per partnership.
Before You Begin
Before you perform a failback-switchover, a takeover has occurred on cluster-newyork. The clusters have the following roles:
If the original primary cluster, cluster-paris, has been down, confirm that the cluster is booted and that the Geographic Edition infrastructure is enabled on the cluster. For more information about booting a cluster, see Booting a Cluster in Oracle Solaris Cluster Geographic Edition System Administration Guide.
The protection group on cluster-newyork has the primary role.
The protection group on cluster-paris has either the primary role or secondary role, depending on whether cluster-paris can be reached during the takeover from cluster-newyork.
This task is necessary to finish recovery if the cluster had experienced a complete site failure.
phys-paris-1# symrdf -g devicegroup query
phys-paris-1# symrdf -g devicegroup failover
cluster-paris forfeits its own configuration and replicates the cluster-newyork configuration locally. Resynchronize both the partnership and protection group configurations.
phys-paris-1# geops update partnershipname
partnershipname: Specifies the name of the partnership.
Note - You need to perform this step only once per partnership, even if you are performing a failback-switchover for multiple protection groups in the partnership.
For more information about synchronizing partnerships, see Resynchronizing a Partnership in Oracle Solaris Cluster Geographic Edition System Administration Guide.
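Check whether the protection group on cluster-paris is active, stop it locally if it is, and then check the status again to confirm that it is stopped.
phys-paris-1# geoadm status
phys-paris-1# geopg stop -e local protectiongroupname
phys-paris-1# geoadm status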
Because the local role of the protection group on cluster-newyork is now primary, this step ensures that the role of the protection group on cluster-paris becomes secondary.
phys-paris-1# geopg update protectiongroupname
protectiongroupname: Specifies the name of the protection group.
For more information about synchronizing protection groups, see Resynchronizing an SRDF Protection Group.
Ensure that the protection group is not in an error state. A protection group cannot be started when it is in an error state.
phys-paris-1# geopg validate protectiongroupname
protectiongroupname: Specifies a unique name that identifies a single protection group.
For more information, see How to Validate an SRDF Protection Group.
Because the protection group on cluster-paris has a role of secondary, the geopg start command does not restart the application on cluster-paris.
phys-paris-1# geopg start -e local protectiongroupname
-e local: Specifies the scope of the command. By specifying a local scope, the command operates on the local cluster only.
protectiongroupname: Specifies the name of the protection group.
Note - Do not use the -n option when performing a failback-switchover because the data needs to be synchronized from the current primary, cluster-newyork, to the current secondary, cluster-paris.
Because the protection group has a role of secondary, the data is synchronized from the current primary, cluster-newyork, to the current secondary, cluster-paris.
For more information about the geopg start command, see How to Activate an SRDF Protection Group.
The data is completely synchronized when the state of the protection group on cluster-newyork is OK. The protection group has a local state of OK when the SRDF device groups on cluster-newyork have a Synchronized RDF pair state.
To confirm that the state of the protection group on cluster-newyork is OK, use the following command:
phys-newyork-1# geoadm status
Refer to the Protection Group section of the output.
# geoadm status
# geopg switchover [-f] -m cluster-paris protectiongroupname
For more information, see How to Switch Over an SRDF Protection Group From Primary to Secondary.
cluster-paris resumes its original role as primary cluster for the protection group.
Verify that the protection group is now primary on cluster-paris and secondary on cluster-newyork and that the state for “Data replication” and “Resource groups” is OK on both clusters.
# geoadm status
Check the runtime status of the application resource group and data replication for each SRDF protection group.
# clresourcegroup status -v protectiongroupname
Refer to the Status and Status Message fields that are presented for the data replication device group you want to check. For more information about these fields, see Table 2-1.
For more information about the runtime status of data replication, see Checking the Runtime Status of SRDF Data Replication.
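The Geographic Edition commands in the failback-switchover procedure can be collected into a dry-run checklist. The sketch below only prints the commands, in the order and with the prompts that they have in this procedure; it does not execute anything. The function name failback_switchover_plan and the names exampleps and examplepg are hypothetical. The symrdf commands, which apply only after a complete site failure, and the optional -f flag are left out.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands, never executes them.
# failback_switchover_plan is a hypothetical name, not a product command.
# $1: partnership name, $2: protection group name
failback_switchover_plan() {
    partnership=$1
    pg=$2
    # No -n on "geopg start": the data must be synchronized from
    # cluster-newyork to cluster-paris before the switchover.
    cat <<EOF
phys-paris-1# geops update $partnership
phys-paris-1# geoadm status
phys-paris-1# geopg stop -e local $pg
phys-paris-1# geoadm status
phys-paris-1# geopg update $pg
phys-paris-1# geopg validate $pg
phys-paris-1# geopg start -e local $pg
phys-newyork-1# geoadm status
# geoadm status
# geopg switchover -m cluster-paris $pg
# geoadm status
# clresourcegroup status -v $pg
EOF
}

# exampleps and examplepg are placeholder names.
failback_switchover_plan exampleps examplepg
```

Between the geopg start line and the switchover line, wait until the protection group on cluster-newyork reports a state of OK, as described above; the sketch does not automate that check.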
How to Perform a Failback-Takeover on a System That Uses SRDF Replication
Use this procedure to restart an application on the original primary cluster, cluster-paris, and to use the current data on the original primary cluster. Any updates that occurred on the secondary cluster, cluster-newyork, while it was acting as primary are discarded.
The failback procedures apply only to clusters in a partnership. You need to perform the following procedure only once per partnership.
Note - To resume using the data on the original primary, cluster-paris, you must not have replicated data from the new primary, cluster-newyork, to the original primary cluster, cluster-paris, at any point after the takeover operation on cluster-newyork. To prevent data replication between the new primary and the original primary, you must have used the -n option whenever you used the geopg start command.
Before You Begin
Ensure that the clusters have the following roles:
If the original primary cluster, cluster-paris, has been down, confirm that the cluster is booted and that the Geographic Edition infrastructure is enabled on the cluster. For more information about booting a cluster, see Booting a Cluster in Oracle Solaris Cluster Geographic Edition System Administration Guide.
The protection group on cluster-newyork has the primary role.
The protection group on cluster-paris has either the primary role or secondary role, depending on whether cluster-paris can be reached during the takeover from cluster-newyork.
This task is necessary to finish recovery if the cluster had experienced a complete site failure.
phys-paris-1# symrdf -g devicegroup query
phys-paris-1# symrdf -g devicegroup failover
cluster-paris forfeits its own configuration and replicates the cluster-newyork configuration locally.
phys-paris-1# geops update partnershipname
partnershipname: Specifies the name of the partnership.
Note - You need to perform this step only once per partnership, even if you are performing a failback-takeover for multiple protection groups in the partnership.
For more information about synchronizing partnerships, see Resynchronizing a Partnership in Oracle Solaris Cluster Geographic Edition System Administration Guide.
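Check whether the protection group on cluster-paris is active, stop it locally if it is, and then check the status again to confirm that it is stopped.
phys-paris-1# geoadm status
phys-paris-1# geopg stop -e local protectiongroupname
phys-paris-1# geoadm status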
Because the local role of the protection group on cluster-newyork is now primary, this step ensures that the role of the protection group on cluster-paris becomes secondary.
phys-paris-1# geopg update protectiongroupname
protectiongroupname: Specifies the name of the protection group.
For more information about resynchronizing protection groups, see How to Resynchronize a Protection Group.
Ensure that the protection group is not in an error state. A protection group cannot be started when it is in an error state.
phys-paris-1# geopg validate protectiongroupname
protectiongroupname: Specifies a unique name that identifies a single protection group.
For more information, see How to Validate an SRDF Protection Group.
Because the protection group on cluster-paris has a role of secondary, the geopg start command does not restart the application on cluster-paris.
Note - You must use the -n option which specifies that data replication should not be used for this protection group. If this option is omitted, data replication starts at the same time as the protection group.
phys-paris-1# geopg start -e local -n protectiongroupname
-e local: Specifies the scope of the command. By specifying a local scope, the command operates on the local cluster only.
-n: Specifies that data replication should not be used for this protection group. If this option is omitted, data replication starts at the same time as the protection group.
protectiongroupname: Specifies the name of the protection group.
For more information, see How to Activate an SRDF Protection Group.
Replication from cluster-newyork to cluster-paris is not started because the -n option is used on cluster-paris.
phys-paris-1# geopg takeover [-f] protectiongroupname
-f: Forces the command to perform the operation without your confirmation.
protectiongroupname: Specifies the name of the protection group.
For more information about the geopg takeover command, see How to Force Immediate Takeover of SRDF Services by a Secondary Cluster.
The protection group on cluster-paris now has the primary role, and the protection group on cluster-newyork has the role of secondary. The application services are now online on cluster-paris.
At the end of step 4, the local state of the protection group on cluster-newyork is Offline. To start monitoring the local state of the protection group, you must activate the protection group on cluster-newyork.
Because the protection group on cluster-newyork has a role of secondary, the geopg start command does not restart the application on cluster-newyork.
phys-newyork-1# geopg start -e local [-n] protectiongroupname
-e local: Specifies the scope of the command. By specifying a local scope, the command operates on the local cluster only.
-n: Prevents the start of data replication at protection group startup. If you omit this option, the data replication subsystem starts at the same time as the protection group.
protectiongroupname: Specifies the name of the protection group.
For more information about the geopg start command, see How to Activate an SRDF Protection Group.
Verify that the protection group is now primary on cluster-paris and secondary on cluster-newyork and that the state for “Data replication” and “Resource groups” is OK on both clusters.
# geoadm status
Note - If you used the -n option in step 5 to prevent data replication from starting, the “Data replication” status will not be in the OK state.
Check the runtime status of the application resource group and data replication for each SRDF protection group.
# clresourcegroup status -v protectiongroupname
Refer to the Status and Status Message fields that are presented for the data replication device group you want to check. For more information about these fields, see Table 2-1.
For more information about the runtime status of data replication, see Checking the Runtime Status of SRDF Data Replication.
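The Geographic Edition commands in the failback-takeover procedure can likewise be collected into a dry-run checklist. The sketch below only prints the commands with the prompts used in this procedure and executes nothing. The function name failback_takeover_plan, its optional third argument, and the names exampleps and examplepg are hypothetical. The symrdf commands and the optional -f flag are left out.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands, never executes them.
# failback_takeover_plan is a hypothetical name, not a product command.
# $1: partnership name
# $2: protection group name
# $3: "off" to add -n when activating on cluster-newyork (default "on")
failback_takeover_plan() {
    partnership=$1
    pg=$2
    newyork_replication=${3:-on}
    newyork_flags="-e local"
    if [ "$newyork_replication" = "off" ]; then
        newyork_flags="-e local -n"
    fi
    # On cluster-paris, -n is mandatory so that no data is replicated
    # from cluster-newyork before the takeover.
    cat <<EOF
phys-paris-1# geops update $partnership
phys-paris-1# geoadm status
phys-paris-1# geopg stop -e local $pg
phys-paris-1# geoadm status
phys-paris-1# geopg update $pg
phys-paris-1# geopg validate $pg
phys-paris-1# geopg start -e local -n $pg
phys-paris-1# geopg takeover $pg
phys-newyork-1# geopg start $newyork_flags $pg
# geoadm status
# clresourcegroup status -v $pg
EOF
}

# exampleps and examplepg are placeholder names.
failback_takeover_plan exampleps examplepg
```

Passing off as the third argument corresponds to using the optional -n flag on cluster-newyork, in which case the "Data replication" status is not expected to be OK afterward.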