How to Perform a Failback Switchover or Failback Takeover

Follow this procedure to restart an application on the original primary cluster, cluster-paris.

Note - You can also accomplish some steps in this procedure by using the Oracle Solaris Cluster Manager browser interface. Click Partnerships, click the partnership name, highlight the protection group name, and click the button for the action you want to perform. For more information about Oracle Solaris Cluster Manager, see Chapter 13, Using the Oracle Solaris Cluster Manager Browser Interface in Oracle Solaris Cluster 4.3 System Administration Guide.

The failback procedures apply only to clusters in a partnership. Perform the following procedure only once for each partnership.

Before You Begin

Ensure that the clusters have the following roles:

The protection group on cluster-newyork is assigned the primary role.
The protection group on cluster-paris has either the primary role or the secondary role, depending on whether the protection group could be reached during the takeover.

If the original primary cluster, cluster-paris, failed, confirm that the cluster is restarted and that the Geographic Edition infrastructure is enabled on the cluster.
For more information about restarting a cluster, see Booting a Cluster in Oracle Solaris Cluster 4.3 Geographic Edition System Administration Guide.
For HA for Oracle Database, on the original primary cluster, verify that the SUNW.oracle_server resource is in a healthy state.
If HA for Oracle Database is not running on the original primary cluster, omit this step.
- If the resource is not in a faulted state, unmonitor the database resource.
```
# clresource unmonitor oracle_server-rs
```
- If the resource is in a faulted state or repeatedly restarts, perform the following steps:
  1. Disable the HA for Oracle Database resource or resource group.
    - If the dataguard_role property is set to STANDBY, disable the HA for Oracle Database resource.
      A STANDBY value is set if the old primary was running when the takeover was performed.
```
phys-paris-1# clresource disable oracle_server-rs
```
    - If the dataguard_role property is set to PRIMARY, disable the HA for Oracle Database resource group.
      A PRIMARY value is set if the old primary was down when the takeover was performed.
```
phys-paris-1# clresourcegroup quiesce -k oracle_server-rg
phys-paris-1# clresource disable oracle_server-rs
phys-paris-1# clresourcegroup offline oracle_server-rg
phys-paris-1# clresourcegroup online oracle_server-rg
```
      Note - When the cluster restarts, an attempt is made to start a database that needs to be reinstated. Therefore, you must disable the resource as soon as possible. You might need to quiesce the HA for Oracle Database resource group if the RGM has already attempted to bring it online.
      If the Oracle Database resource is in the stop_failed state, clear the stop_failed flag by using the following command.
      
      phys-paris-1# clresource clear oracle_server-rs
  2. Determine whether the database is shut down on the cluster nodes.
  3. If the database is not shut down, become the Oracle user on that node and stop the database by using one of the following methods:
```
First method:
phys-paris-1$ srvctl stop database -d database_name

Second method:
phys-paris-1$ ORACLE_SID=db_SID; export ORACLE_SID
phys-paris-1$ sqlplus /nolog
SQL> connect sys/sysdba_password as sysdba
SQL> shutdown immediate
SQL> exit
```
  4. Start and mount the database.
```
phys-paris-1$ sqlplus /nolog
SQL> connect sys/sysdba_password as sysdba
SQL> startup mount
…
SQL> exit
```
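The choice of disable commands in step 1 depends only on the value of the dataguard_role property. The following is a minimal sketch of that decision, not part of the product. The function name disable_plan is hypothetical, and the resource and resource group names are the example names used in this procedure. The function only prints the commands so that you can review them before running them on a cluster node.

```shell
#!/bin/sh
# Sketch only: print (do not run) the disable commands from step 1
# for a given dataguard_role value. disable_plan is a hypothetical helper;
# oracle_server-rs and oracle_server-rg are this procedure's example names.
disable_plan() {
    role=$1
    case "$role" in
        STANDBY)
            # Old primary was running at takeover: disable the resource only.
            echo "clresource disable oracle_server-rs"
            ;;
        PRIMARY)
            # Old primary was down at takeover: quiesce the group first.
            echo "clresourcegroup quiesce -k oracle_server-rg"
            echo "clresource disable oracle_server-rs"
            echo "clresourcegroup offline oracle_server-rg"
            echo "clresourcegroup online oracle_server-rg"
            ;;
        *)
            echo "unexpected dataguard_role: $role" >&2
            return 1
            ;;
    esac
}
```

For example, disable_plan PRIMARY prints the four commands of the PRIMARY case in the order in which this procedure lists them.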
Reinstate the old Oracle Data Guard primary database to become the standby for the current primary database.
If you issue the dgmgrl command from the old primary cluster, include the new primary's database service name in the connection string.
```
phys-newyork-1$ dgmgrl
DGMGRL> connect sys/password[@new_primary_service_name]
DGMGRL> reinstate database old_primary_database_name
...
DGMGRL> exit
```
Note - If the database cannot be reinstated, you might need to re-create it or otherwise recover the database by using an appropriate method. For instructions, refer to Using Flashback Database After a Failover in Oracle Data Guard Concepts and Administration.
To perform a failback takeover instead of a failback switchover, flash back your primary database to the point at which the original takeover occurred.

For HA for Oracle Database, update and re-enable the HA for Oracle Database resource on the original primary cluster.

If HA for Oracle Database is not running on the original primary cluster, omit this step.

phys-paris-1# clresource set -p dataguard_role=STANDBY oracle_server-rs
phys-paris-1# clresource enable oracle_server-rs
Restore monitoring if monitoring of the resource was previously disabled:
phys-paris-1# clresource monitor oracle_server-rs
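The re-enable step can be wrapped in a small script that prints the commands by default, which might be useful for a review before you change the resource. This is a sketch under stated assumptions: the run and reenable_standby functions and the DRY_RUN variable are hypothetical names introduced here, and only the clresource commands shown in this step are used.

```shell
#!/bin/sh
# Sketch only. DRY_RUN=1 (the default) prints each command;
# DRY_RUN=0 executes it. Both names are hypothetical, not product features.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# reenable_standby RESOURCE [yes|no]
# Sets dataguard_role to STANDBY, enables the resource, and restores
# monitoring only when the second argument is "yes".
reenable_standby() {
    rs=$1
    remonitor=${2:-no}
    run clresource set -p dataguard_role=STANDBY "$rs" || return 1
    run clresource enable "$rs" || return 1
    if [ "$remonitor" = "yes" ]; then
        run clresource monitor "$rs" || return 1
    fi
}
```

Call reenable_standby oracle_server-rs yes if you unmonitored the resource earlier in this procedure. Omit the second argument otherwise. The function stops at the first command that fails.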

If the original primary cluster was down at the point of failure, update the original primary cluster to be the secondary.
1. From a node of the original primary cluster, stop the protection group.
  If the original primary cluster was down at the time of takeover, the protection group should already be stopped.
```
phys-paris-1# geopg stop -e local protection-group
```
  -e local
  
  Specifies the scope of the command. By specifying a local scope, the command operates on the local cluster only.
  
  protection-group
  
  Specifies the name of the protection group.
2. Verify that the protection group is stopped.
```
phys-paris-1# geoadm status
```
3. Update the protection group.
```
phys-paris-1# geopg update protection-group
```
  The roles are now correct, but both clusters are marked as deactivated.
  
  For more information about synchronizing protection groups, see Resynchronizing a Protection Group in Oracle Solaris Cluster 4.3 Geographic Edition System Administration Guide.
From one node in each cluster, locally validate the configuration for each protection group.
Note - Ensure that the protection group is not in an Error state. You cannot start a protection group when it is in an Error state.
```
phys-paris-1# geopg validate protection-group
phys-newyork-1# geopg validate protection-group
```
For more information, see How to Validate an Oracle Data Guard Protection Group.
From one node in either cluster, globally activate the protection group on both clusters.
```
phys-node-n# geopg start -e global protection-group
```
From one node in either cluster, switch over the protection group to the original primary.
```
phys-node-n# geopg switchover -f -m cluster-paris protection-group
```
For more information, see Migrating Replication Services by Switching Over Protection Groups in Oracle Solaris Cluster 4.3 Geographic Edition System Administration Guide.

The cluster-paris cluster resumes its original role as primary cluster for the protection group.
Ensure that the switchover was performed successfully.
```
phys-node-n# geoadm status
```
Verify that the protection group is now primary on cluster-paris and secondary on cluster-newyork, and that the Data replication and Resource groups properties show OK on both clusters.
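
The check of the Data replication and Resource groups properties can be partly scripted. The following sketch assumes that the status output lists each property on its own line in the form "Label : Value". That layout is an assumption, so confirm it against the output on your clusters. The function name replication_and_rgs_ok is hypothetical. The function reads text on standard input and does not check the roles, which you still verify by reading the output.

```shell
#!/bin/sh
# Sketch only. Succeeds when every "Data replication" and "Resource groups"
# line on stdin reports OK and at least one such line is present.
# ASSUMPTION: each property is printed on its own line as "Label : Value".
replication_and_rgs_ok() {
    awk -F: '
        /^[ \t]*(Data replication|Resource groups)[ \t]*:/ {
            value = $2
            gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim surrounding blanks
            seen++
            if (value != "OK") bad++
        }
        END { exit (seen > 0 && bad == 0) ? 0 : 1 }
    '
}
```

A possible use is geoadm status | replication_and_rgs_ok && echo "properties OK". A nonzero exit status means that a property is not OK or that no matching line was found.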