Oracle Solaris Cluster Geographic Edition Data Replication Guide for Oracle Data Guard     Oracle Solaris Cluster 4.0

Recovering Oracle Data Guard Data After a Takeover

After a successful takeover operation, the standby cluster, cluster-newyork, becomes the primary for the protection group, and the services are online on the standby cluster. After the recovery of the original primary cluster, the services can be brought online again on the original primary cluster by using a process called failback.

Geographic Edition software supports two kinds of failback: a failback switchover and a failback takeover.

If you want to leave the new primary, cluster-newyork, as the primary cluster and the original primary cluster, cluster-paris, as the standby cluster after the original primary cluster starts again, you can resynchronize and revalidate the protection group configuration. You can do so without performing a switchover or takeover.

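This section describes how to perform the following procedures:

  • How to Resynchronize and Revalidate the Protection Group Configuration
  • How to Perform a Failback Switchover or Failback Takeover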

How to Resynchronize and Revalidate the Protection Group Configuration

Follow this procedure to resynchronize and revalidate data on the original primary cluster, cluster-paris, with the data on the current primary cluster, cluster-newyork.

Before You Begin

Before you resynchronize and revalidate the protection group configuration, a takeover has occurred on cluster-newyork. The clusters now have the following roles:

  1. If the original primary cluster, cluster-paris, has been down, confirm that the cluster is booted and that the Geographic Edition infrastructure is enabled on the cluster.

    For more information about booting a cluster, see Booting a Cluster in Oracle Solaris Cluster Geographic Edition System Administration Guide.

  2. When HA for Oracle is configured, ensure that the Oracle database is not restarted prematurely.
    1. Disable the HA for Oracle resource or resource group.
      • If the dataguard_role property is set to STANDBY, disable the HA for Oracle resource.

        A STANDBY value is set if the old primary was running at the time of the takeover.

        # clresource disable oracle_server-rs
      • If the dataguard_role property is set to PRIMARY, disable the HA for Oracle resource group.

        A PRIMARY value is set if the old primary was down at the time of the takeover.

        # clresourcegroup quiesce -k oracle_server-rg
        # clresource disable oracle_server-rs
        # clresourcegroup offline oracle_server-rg
        # clresourcegroup online oracle_server-rg

        When the cluster restarts, an attempt is made to start a database that needs to be reinstated. Therefore, you must disable the resource as soon as possible. You might need to quiesce the HA for Oracle resource group if the RGM has already attempted to bring it online.

      If the RGM has already attempted to start the Oracle server resource and failed, you might need to clear the start_failed flag by using the following command.

      # clresource clear -f start_failed oracle_server-rs
    2. Verify that the database is shut down on the cluster nodes.

      If the database is not shut down, become the Oracle user on that node and stop the database by using one of the following methods:

      First method:
      $ srvctl stop database -d database_name

      Second method:
      $ ORACLE_SID=db_SID; export ORACLE_SID
      $ sqlplus /nolog
      SQL> connect sys/sysdba_password as sysdba
      SQL> shutdown immediate
      SQL> exit
    3. Restart and reinstate the database.
      $ sqlplus /nolog
      SQL> connect sys/sysdba_password as sysdba
      SQL> startup mount
      …
      SQL> exit
      
      $ dgmgrl
      If you issue the command from the old primary, include the new primary's database service name in the connection string.
      DGMGRL> connect sys/password[@new_primary_service_name]
      DGMGRL> reinstate database old_primary_database_name
      ...
      DGMGRL> exit

      If the database cannot be reinstated, you might need to re-create it or otherwise recover the database using an appropriate method.

    4. Update and re-enable the HA for Oracle resource.
      # clresource set -p dataguard_role=STANDBY oracle_server-rs
      # clresource enable oracle_server-rs
  3. Resynchronize the original primary cluster, cluster-paris, with the current primary cluster, cluster-newyork.

    The cluster cluster-paris forfeits its own configuration and replicates the cluster-newyork configuration locally. Resynchronize both the partnership and protection group configurations.

    1. On cluster-paris, deactivate the protection group on the local cluster.
      phys-paris-1# geopg stop -e local protectiongroupname
      -e local

      Specifies the scope of the command.

      By specifying a local scope, the command operates on the local cluster only.


      Note - The property values, such as global and local, are not case sensitive.


      protectiongroupname

      Specifies the name of the protection group.

      If the protection group is already deactivated, the state of the resource group in the protection group is probably Error because the application resource groups are managed and offline.

      If you deactivate the protection group, the application resource groups are no longer managed, clearing the Error state.

    2. On cluster-paris, resynchronize the partnership.
      phys-paris-1# geops update partnershipname

      Note - You need to perform this step only once, even if you are resynchronizing multiple protection groups.


      For more information about synchronizing partnerships, see Resynchronizing a Partnership in Oracle Solaris Cluster Geographic Edition System Administration Guide.

    3. On cluster-paris, resynchronize each protection group.

      Because the role of the protection group on cluster-newyork is primary, this step ensures that the role of the protection group on cluster-paris is secondary.

      phys-paris-1# geopg update protectiongroupname

      For more information about synchronizing protection groups, see Resynchronizing an Oracle Data Guard Protection Group.

  4. On cluster-paris, validate the configuration for each protection group.
    phys-paris-1# geopg validate protectiongroupname

    For more information, see How to Validate an Oracle Data Guard Protection Group.

  5. On cluster-paris, activate each protection group.

    When you activate a protection group, the protection group's application resource groups are also brought online.

    phys-paris-1# geopg start -e global protectiongroupname
    -e global

    Specifies the scope of the command.

    By specifying a global scope, the command operates on both clusters where the protection group is located.


    Note - The property values, such as global and local, are not case sensitive.


    protectiongroupname

    Specifies the name of the protection group.


    Caution - Do not use the -n option because the data needs to be synchronized from the current primary cluster, cluster-newyork, to the current standby cluster, cluster-paris.

    Because the protection group has a role of secondary, the data is synchronized from the current primary cluster, cluster-newyork, to the current standby cluster, cluster-paris.

    For more information about the geopg start command, see How to Activate an Oracle Data Guard Protection Group.


  6. Confirm that all data is synchronized.
    1. Confirm that the state of the protection group on cluster-newyork is OK.
      phys-newyork-1# geoadm status

      Refer to the Protection Group section of the output.

    2. Confirm that all resources in the replication resource group, ODGprotectiongroupname-odg-rep-rg, report a status of OK.
      phys-newyork-1# clresource status -v ODGprotectiongroupname-odg-rep-rs
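
The command sequence in steps 3 through 5 can be sketched as a small shell helper. This is only an illustration, not part of the Geographic Edition software: the names DRY_RUN, run_step, and resync_protection_group are hypothetical, and the partnership and protection group names are supplied by the caller. By default the helper only prints the commands so that they can be reviewed before anything is run on cluster-paris.

```shell
#!/bin/sh
# Sketch only. DRY_RUN, run_step, and resync_protection_group are
# hypothetical names introduced for this illustration.

DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only; 0 = execute them

# Print the command in dry-run mode; otherwise execute it.
run_step() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage: resync_protection_group partnershipname protectiongroupname
# Intended to be run on a node of the original primary cluster.
resync_protection_group() {
    ps_name=$1
    pg_name=$2
    run_step geopg stop -e local "$pg_name"   || return 1  # step 3.1: deactivate locally
    run_step geops update "$ps_name"          || return 1  # step 3.2: resynchronize partnership
    run_step geopg update "$pg_name"          || return 1  # step 3.3: resynchronize protection group
    run_step geopg validate "$pg_name"        || return 1  # step 4: validate configuration
    run_step geopg start -e global "$pg_name" || return 1  # step 5: activate globally, without -n
}
```

For example, `resync_protection_group paris-newyork-ps sales-pg` (both names are placeholders) prints the five commands in order. The helper stops at the first command that fails, so when executing for real you might need to adapt it if the protection group is already deactivated, and the partnership update is needed only once even when several protection groups are resynchronized.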

How to Perform a Failback Switchover or Failback Takeover

Follow this procedure to restart an application on the original primary cluster, cluster-paris.

The failback procedures apply only to clusters in a partnership. Perform the following procedure only once for each partnership.

Before You Begin

Ensure that the clusters have the following roles:

  1. If the original primary cluster, cluster-paris, failed, confirm that the cluster is restarted and that the Geographic Edition infrastructure is enabled on the cluster.

    For more information about restarting a cluster, see Booting a Cluster in Oracle Solaris Cluster Geographic Edition System Administration Guide.

  2. For HA for Oracle, on the original primary cluster, verify that the SUNW.oracle_server resource is in a healthy state.

    If HA for Oracle is not running on the original primary cluster, omit this step.

    If the resource is in a faulted state or repeatedly restarts, perform the following steps:

    1. Disable the HA for Oracle resource or resource group.
      • If the dataguard_role property is set to STANDBY, disable the HA for Oracle resource.

        A STANDBY value is set if the old primary was running at the time of the takeover.

        # clresource disable oracle_server-rs
      • If the dataguard_role property is set to PRIMARY, disable the HA for Oracle resource group.

        A PRIMARY value is set if the old primary was down at the time of the takeover.

        # clresourcegroup quiesce -k oracle_server-rg
        # clresource disable oracle_server-rs
        # clresourcegroup offline oracle_server-rg
        # clresourcegroup online oracle_server-rg

        Note - When the cluster restarts, an attempt is made to start a database that needs to be reinstated. Therefore, you must disable the resource as soon as possible. You might need to quiesce the HA for Oracle resource group if the RGM has already attempted to bring it online.

        If the RGM has already attempted to start the Oracle server resource and failed, you might need to clear the start_failed flag by using the following command.

        # clresource clear -f start_failed oracle_server-rs

    2. Determine whether the database is shut down on the cluster nodes.
    3. If the database is not shut down, become the Oracle user on that node and stop the database by using one of the following methods:
      First method:
      $ srvctl stop database -d database_name
      
      Second method:
      $ ORACLE_SID=db_SID; export ORACLE_SID
      $ sqlplus /nolog
      SQL> connect sys/sysdba_password as sysdba
      SQL> shutdown immediate
      SQL> exit
    4. Restart and reinstate the database.
      $ sqlplus /nolog
      SQL> connect sys/sysdba_password as sysdba
      SQL> startup mount
      …
      SQL> exit
  3. Reinstate the old Oracle Data Guard primary database to become the standby for the current primary database.

    If you issue the dgmgrl command from the old primary cluster, include the new primary's database service name in the connection string.

    $ dgmgrl
    DGMGRL> connect sys/password[@new_primary_service_name]
    DGMGRL> reinstate database old_primary_database_name
    ...
    DGMGRL> exit

    Note - If the database cannot be reinstated, you might need to re-create it or otherwise recover the database by using an appropriate method. For instructions, refer to Using Flashback Database After a Failover in Oracle Data Guard Concepts and Administration.


  4. To perform a failback takeover instead of a failback switchover, flash back your primary database to the point at which the original takeover occurred.
  5. For HA for Oracle, update and re-enable the HA for Oracle resource on the original primary cluster.

    If HA for Oracle is not running on the original primary cluster, omit this step.

    # clresource set -p dataguard_role=STANDBY oracle_server-rs
    # clresource enable oracle_server-rs
  6. If the original primary cluster was down at the point of failure, update the original primary cluster to be the secondary.
    1. From a node of the original primary cluster, stop the protection group.

      If the original primary cluster was down at the time of takeover, the protection group should already be stopped.

      phys-paris-1# geopg stop -e local protectiongroupname
      -e local

      Specifies the scope of the command. By specifying a local scope, the command operates on the local cluster only.

      protectiongroupname

      Specifies the name of the protection group.

    2. Verify that the protection group is stopped.
      phys-paris-1# geoadm status
    3. Update the protection group.
      phys-paris-1# geopg update protectiongroupname

      The roles are now correct, but both clusters are marked as deactivated.

      For more information about synchronizing protection groups, see Resynchronizing an Oracle Data Guard Protection Group.

  7. From one node in each cluster, locally validate the configuration for each protection group.

    Note - Ensure that the protection group is not in an Error state. You cannot start a protection group when it is in an Error state.


    phys-paris-1# geopg validate protectiongroupname
    phys-newyork-1# geopg validate protectiongroupname

    For more information, see How to Validate an Oracle Data Guard Protection Group.

  8. From one node in either cluster, globally activate the protection group on both clusters.
    phys-node-n# geopg start -e global protectiongroupname
  9. From one node in either cluster, switch over the protection group to the original primary.
    phys-node-n# geopg switchover -f -m cluster-paris protectiongroupname

    For more information, see How to Switch Over an Oracle Data Guard Protection Group From the Primary to the Standby Cluster.

    The cluster-paris cluster resumes its original role as primary cluster for the protection group.

  10. Ensure that the switchover was performed successfully.
    phys-node-n# geoadm status

    Verify that the protection group is now primary on cluster-paris and secondary on cluster-newyork and that the states that are shown for the Data replication and the Resource groups properties are OK on both clusters.
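
Both procedures in this section choose how to disable HA for Oracle based on the value of the dataguard_role property. That choice can be sketched as a shell function that prints the matching commands for review. This is an illustration under stated assumptions: print_disable_commands is a hypothetical helper, the role value is passed in by the caller rather than read from the cluster, and the resource and resource group names are whatever your configuration uses.

```shell
#!/bin/sh
# Sketch only. print_disable_commands is a hypothetical helper that prints,
# but does not run, the disable commands for a given dataguard_role value.

# Usage: print_disable_commands ROLE resource_name resource_group_name
print_disable_commands() {
    role=$1
    rs=$2
    rg=$3
    case "$role" in
        STANDBY)
            # Old primary was running at takeover: disable the resource only.
            printf 'clresource disable %s\n' "$rs"
            ;;
        PRIMARY)
            # Old primary was down at takeover: quiesce the resource group
            # first, then disable the resource and cycle the group.
            printf 'clresourcegroup quiesce -k %s\n' "$rg"
            printf 'clresource disable %s\n' "$rs"
            printf 'clresourcegroup offline %s\n' "$rg"
            printf 'clresourcegroup online %s\n' "$rg"
            ;;
        *)
            printf 'unexpected dataguard_role: %s\n' "$role" >&2
            return 1
            ;;
    esac
}
```

For example, `print_disable_commands PRIMARY oracle_server-rs oracle_server-rg` prints the four commands shown for the PRIMARY case. Printing rather than executing keeps the decision visible to the administrator, who still needs to clear the start_failed flag separately if the RGM has already tried and failed to start the resource.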