Oracle Solaris Cluster Geographic Edition Remote Replication Guide for Sun ZFS Storage Appliance     Oracle Solaris Cluster 4.1

Recovering From a Sun ZFS Storage Appliance Remote Replication Error

When an error occurs at the replication level, the error is reflected in the status of the resource in the replication resource group of the relevant replication component. This changed status appears in the Data replication field of the geoadm status command output for that protection group.

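This section contains the following procedures:

    • How to Detect Remote Replication Errors

    • How to Recover From a Sun ZFS Storage Appliance Remote Replication Error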

How to Detect Remote Replication Errors

  1. Check the status of the replication resources by using the clresource status command.
    phys-paris-1# clresource status -v zfssa-rep-rs

    zfssa-rep-rs
        Specifies the name of the Sun ZFS Storage Appliance resource.

    For information about how different Resource status values map to actual replication pair states, see Table 1-3.

    Running the clresource status command might return output similar to the following example:

    …
    -- Resources --
    
                Resource Name       Node Name           State     Status Message
                -------------       ---------           -----     --------------
      Resource: zfssa-rep-rs        phys-paris-1        Online    Faulted - The most recent replication update was canceled by an administrator.
      Resource: zfssa-rep-rs        phys-paris-2        Offline   Offline
    …
  2. Display the aggregate resource status for all components in the protection group by using the geoadm status command.

    For example, the clresource status output in the preceding step shows that Sun ZFS Storage Appliance replication for the protection group is in the Faulted state on cluster-paris. In the following geoadm status output, that fault appears as Error in the Data replication field.

    phys-paris-1# geoadm status
    Cluster: cluster-paris
    
    Partnership "paris-newyork-ps"  : OK
       Partner clusters             : cluster-newyork
       Synchronization              : OK      
       ICRM Connection              : OK
    
       Heartbeat "paris-to-newyork" monitoring "cluster-newyork": OK 
          Heartbeat plug-in "ping_plugin"             : Inactive
          Heartbeat plug-in "tcp_udp_plugin"          : OK
    
    Protection group "zfssa-pg"   : Error
          Partnership         : paris-newyork-ps
          Synchronization     : OK
    
          Cluster cluster-paris    : Error
             Role                  : Primary
             PG activation state   : Activated
             Configuration         : OK
             Data replication      : Error
             Resource groups       : OK 
       
          Cluster cluster-newyork  : Error
             Role                  : Secondary
             PG activation state   : Activated
             Configuration         : OK
             Data replication      : Error
             Resource groups       : OK
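
The two checks in this procedure can be scripted. The following is a minimal sketch, assuming that the clresource status and geoadm status output keeps the layout shown in the preceding examples. The function names faulted_resources and replication_errors are hypothetical helpers and are not part of the Oracle Solaris Cluster or Geographic Edition software. Each function reads command output on standard input, so it can be tried against saved output before it is used on a cluster node.

```shell
#!/bin/sh
# Hypothetical helpers; they only parse text and never change cluster state.

# faulted_resources: reads `clresource status -v` output on stdin and prints
# "<resource> <node> <state>" for each row whose status message starts with
# the word Faulted.
# Assumes rows look like: Resource: <name> <node> <state> <status message...>
faulted_resources() {
    awk '$1 == "Resource:" && $5 == "Faulted" { print $2, $3, $4 }'
}

# replication_errors: reads `geoadm status` output on stdin and prints the
# name of each cluster whose "Data replication" field is not OK.
# Assumes per-cluster blocks start with: Cluster <name> : <status>
replication_errors() {
    awk '
        # Remember the cluster that the following fields belong to.
        $1 == "Cluster" && $3 == ":" { cluster = $2 }
        # Report the current cluster when its replication field is not OK.
        $1 == "Data" && $2 == "replication" && $NF != "OK" { print cluster }
    '
}

# Possible usage on a cluster node (not executed here):
#   clresource status -v zfssa-rep-rs | faulted_resources
#   geoadm status | replication_errors
```

Keeping the parsing separate from the commands means that a change in output layout in a later release only requires adjusting the awk patterns.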
     

How to Recover From a Sun ZFS Storage Appliance Remote Replication Error

To recover from an error state, you might perform some or all of the steps in the following procedure.

  1. Use the procedures in the Sun ZFS Storage Appliance documentation to determine the causes of the Faulted state.
  2. Recover from the Faulted state by using the Sun ZFS Storage Appliance procedures.

    If the recovery procedures change the state of the component, the resource automatically detects the new state and reports it as a new protection group state.

  3. Revalidate the protection group configuration.
    phys-paris-1# geopg validate pg-name

    pg-name
        Specifies the name of the Sun ZFS Storage Appliance protection group.

    • If the geopg validate command determines that the configuration is valid, the state of the protection group changes to reflect that fact.

    • If the configuration is not valid, the geopg validate command returns a failure message.

  4. Review the status of the protection group configuration.
    phys-paris-1# geopg list pg-name
  5. Review the runtime status of the protection group.
    phys-paris-1# geoadm status
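
Steps 3 through 5 can be chained so that the status commands run only after validation succeeds. The following is a minimal sketch. The function name revalidate_pg is hypothetical, and the sketch assumes that geopg validate exits with a nonzero status when the configuration is not valid. Confirm that behavior against the geopg man page for your release before relying on it.

```shell
#!/bin/sh
# revalidate_pg: hypothetical wrapper around Steps 3 through 5.
# Usage: revalidate_pg pg-name
# Assumption: geopg validate returns a nonzero exit status on failure.
revalidate_pg() {
    pg=$1
    if [ -z "$pg" ]; then
        echo "usage: revalidate_pg pg-name" >&2
        return 2
    fi

    # Step 3: revalidate the protection group configuration.
    if ! geopg validate "$pg"; then
        echo "validation of $pg failed; correct the configuration and retry" >&2
        return 1
    fi

    # Step 4: review the status of the protection group configuration.
    geopg list "$pg"

    # Step 5: review the runtime status of the protection group.
    geoadm status
}

# Possible usage on a cluster node (not executed here):
#   revalidate_pg zfssa-pg
```

Stopping after a failed validation keeps the failure message from geopg validate as the last output on the screen instead of burying it under status listings.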