Oracle Solaris Cluster Geographic Edition System Administration Guide     Oracle Solaris Cluster 3.3 3/13

Troubleshooting Migration Problems

This section provides information about the following problems that you might encounter when services are migrated by using Geographic Edition software:

Resource Groups Not Brought Online on Expected Node

Geographic Edition operations that bring resource groups online do so by passing the request to the Oracle Solaris Cluster framework, and they report whether that request succeeded. When the cluster framework attempts to bring a resource group online on a cluster node, it might encounter an error that causes it to retry the operation on another node of the cluster. In that case, the following message might be written to the log file:

resource group failed to start on chosen node; it may end up failing over to other node(s)

Solution or Workaround

If the Geographic Edition operation seems to have completed successfully, but the resource group did not come online as expected, review the log file and correct any underlying cluster problems.
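When reviewing the log file, it can help to search for the message quoted above. The following is a minimal sketch in Python, not part of the Geographic Edition software. The function names are hypothetical, and the log file location is left to the caller because this section does not state it.

```python
# Message text quoted from this section; only the fixed prefix is matched.
FAILOVER_MESSAGE = "resource group failed to start on chosen node"


def find_failover_warnings(lines):
    """Return (line_number, text) pairs for lines that contain the message."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if FAILOVER_MESSAGE in line:
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_log_file(path):
    """Scan a log file chosen by the caller (the path is an assumption)."""
    with open(path, errors="replace") as log:
        return find_failover_warnings(log)
```

Each hit gives a line number to start from when looking for the underlying cluster error that caused the framework to retry on another node.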

Resolving Problems With Application Resource Group Failover When Communication Lost With the Storage Device

When communication is lost between the storage device and a node on which the application is online, some application resource groups might not fail over gracefully to the nodes from which the storage is accessible. The application resource group might end up in an ERROR_STOP_FAILED state.

Solution or Workaround

The Oracle Solaris Cluster infrastructure does not initiate a switchover when I/O errors occur in a volume or its underlying devices. Because no switchover or failover occurs, the device service remains online on this node despite the fact that storage has been rendered inaccessible.

If this problem occurs, restart the application resource group on the correct nodes by using the standard Oracle Solaris Cluster procedures. For information about recovering from the ERROR_STOP_FAILED state and restarting the application, see Clearing the STOP_FAILED Error Flag on Resources in Oracle Solaris Cluster Data Services Planning and Administration Guide.
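As a rough illustration of that recovery, the following Python sketch only assembles the Oracle Solaris Cluster commands that an administrator might run; it does not execute them. The resource, resource group, and node names are hypothetical, and the exact sequence is an assumption: the referenced procedure remains the authoritative source, and it also requires stopping the failed resource manually before the flag is cleared.

```python
import shlex


def build_recovery_commands(resource, resource_group, failed_node, target_node):
    """Assemble (not run) a possible recovery sequence; all names are supplied by the caller."""
    return [
        # Clear the STOP_FAILED error flag on the node where the stop failed.
        ["clresource", "clear", "-f", "STOP_FAILED", "-n", failed_node, resource],
        # Check that the resource group has left the ERROR_STOP_FAILED state.
        ["clresourcegroup", "status", resource_group],
        # Bring the group online on a node that can reach the storage (assumed step).
        ["clresourcegroup", "online", "-n", target_node, resource_group],
    ]


def format_commands(commands):
    """Render each command as a shell-quoted string for review before running."""
    return [shlex.join(command) for command in commands]
```

Printing the result of format_commands lets the sequence be reviewed against the referenced procedure before anything is run on the cluster.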

The Geographic Edition software detects state changes in the application resource group and displays the states in the output of the geoadm status command. For more information about using this command, see Monitoring the Runtime Status of the Geographic Edition Software.