High-Availability Framework

The Oracle Solaris Cluster software makes all components on the “path” between users and data highly available, including network interfaces, the applications themselves, the file system, and the multihost devices. In general, a cluster component is highly available if it survives any single (software or hardware) failure in the system. Failures that are caused by bugs or data corruption within the application itself are excluded.

The following table shows the kinds of Oracle Solaris Cluster component failures (both hardware and software) and the kinds of recovery that are built into the high-availability framework.

Table 3-1 Levels of Oracle Solaris Cluster Failure Detection and Recovery

Failed Cluster Component    Software Recovery                           Hardware Recovery
------------------------    -----------------                           -----------------
Data service                HA API, HA framework                        Not applicable
Public network adapter      IP network multipathing                     Multiple public network adapter cards
Cluster file system         Primary and secondary replicas              Multihost devices
Mirrored multihost device   Volume management (Solaris Volume Manager   Hardware RAID-5 (for example, Oracle's
                            and Veritas Volume Manager)                 Sun StorEdge A3x00)
Global device               Primary and secondary replicas              Multiple paths to the device, cluster
                                                                        transport junctions
Private network             HA transport software                       Multiple private hardware-independent
                                                                        networks
Host                        CMM, failfast driver                        Multiple hosts
Zone                        HA API, HA framework                        Not applicable

Oracle Solaris Cluster software's high-availability framework detects a node failure quickly and creates a new equivalent server for the framework resources on a remaining node in the cluster. At no time are all framework resources unavailable. Framework resources that are unaffected by a failed node are fully available during recovery. Furthermore, framework resources of the failed node become available as soon as they are recovered. A recovered framework resource does not have to wait for all other framework resources to complete their recovery.

Most highly available framework resources are recovered transparently to the applications (data services) that are using the resource. The semantics of framework resource access are fully preserved across node failure. The applications cannot detect that the framework resource server has been moved to another node. Failure of a single node is completely transparent to programs on the remaining nodes that use the files, devices, and disk volumes attached to the failed node, provided that an alternative hardware path to the disks exists from another host. An example is the use of multihost devices that have ports to multiple hosts.

Zone Membership

Oracle Solaris Cluster software also tracks zone membership by detecting when a zone boots up or halts. These changes also trigger a reconfiguration. A reconfiguration can redistribute cluster resources among the nodes in the cluster.

Cluster Membership Monitor

To ensure that data is kept safe from corruption, all nodes must reach a consistent agreement on the cluster membership. When necessary, the Cluster Membership Monitor (CMM) coordinates a cluster reconfiguration of cluster services (applications) in response to a failure.

The CMM receives information about connectivity to other nodes from the cluster transport layer. The CMM uses the cluster interconnect to exchange state information during a reconfiguration.

After detecting a change in cluster membership, the CMM performs a synchronized configuration of the cluster. In a synchronized configuration, cluster resources might be redistributed, based on the new membership of the cluster.
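
For example, you can display the node membership that the CMM currently maintains by running the clnode status command as superuser. The node names in this sketch are placeholders, and your output is similar to the following. See the clnode(1CL) man page.

# clnode status

=== Cluster Nodes ===

--- Node Status ---

Node Name          Status
---------          ------
phys-schost-1      Online
phys-schost-2      Online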

Failfast Mechanism

The failfast mechanism detects a critical problem on either a global-cluster voting node or global-cluster non-voting node. The action that Oracle Solaris Cluster takes when failfast detects a problem depends on whether the problem occurs in a voting node or a non-voting node.

If the critical problem is located in a voting node, Oracle Solaris Cluster forcibly shuts down the node. Oracle Solaris Cluster then removes the node from cluster membership.

If the critical problem is located in a non-voting node, Oracle Solaris Cluster reboots that non-voting node.

If a node loses connectivity with other nodes, the node attempts to form a cluster with the nodes with which communication is possible. If that set of nodes does not form a quorum, Oracle Solaris Cluster software halts the node and “fences” the node from the shared disks, that is, prevents the node from accessing the shared disks.

You can turn off fencing for selected disks or for all disks.



Caution - If you turn off fencing under the wrong circumstances, your data can be vulnerable to corruption during application failover. Examine this data corruption possibility carefully when you are considering turning off fencing. If your shared storage device does not support the SCSI protocol (for example, a Serial Advanced Technology Attachment (SATA) disk), or if you want to allow access to the cluster's storage from hosts outside the cluster, turn off fencing.
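
The following commands sketch how fencing might be turned off, first for all shared disks in the cluster and then for a single disk. The DID device name d3 is only a placeholder; verify the supported property values in the cluster(1CL) and cldevice(1CL) man pages for your release.

# cluster set -p global_fencing=nofencing
# cldevice set -p default_fencing=nofencing d3

The first command sets the cluster-wide global_fencing property. The second command overrides the global setting for the single DID device d3.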


If one or more cluster-specific daemons die, Oracle Solaris Cluster software declares that a critical problem has occurred. Oracle Solaris Cluster software runs cluster-specific daemons on both voting nodes and non-voting nodes. If a daemon dies on a voting node, Oracle Solaris Cluster shuts down that node and removes it from cluster membership. If a daemon dies on a non-voting node, Oracle Solaris Cluster reboots that non-voting node.

When a cluster-specific daemon that runs on a non-voting node fails, a message similar to the following is displayed on the console.

cl_runtime: NOTICE: Failfast: Aborting because "pmfd" died in zone "zone4" (zone id 3)
35 seconds ago.

When a cluster-specific daemon that runs on a voting node fails and the node panics, a message similar to the following is displayed on the console.

panic[cpu1]/thread=2a10007fcc0: Failfast: Aborting because "pmfd" died in zone "global" (zone id 0)
35 seconds ago.
409b8 cl_runtime:__0FZsc_syslog_msg_log_no_argsPviTCPCcTB+48 (70f900, 30, 70df54, 407acc, 0)
%l0-7: 1006c80 000000a 000000a 10093bc 406d3c80 7110340 0000000 4001 fbf0

After the panic, the Oracle Solaris host might reboot and the node might attempt to rejoin the cluster. Alternatively, if the cluster is composed of SPARC based systems, the host might remain at the OpenBoot PROM (OBP) prompt. The next action of the host is determined by the setting of the auto-boot? parameter. You can set auto-boot? with the eeprom command from the Oracle Solaris OS, or with the setenv command at the OpenBoot PROM ok prompt. See the eeprom(1M) man page.
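
For example, assuming a SPARC based host, you might set auto-boot? in either of the following ways. Verify the syntax in the eeprom(1M) man page before you use it.

From the Oracle Solaris OS, as superuser:

# eeprom "auto-boot?"=true

At the OpenBoot PROM ok prompt:

ok setenv auto-boot? true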

Cluster Configuration Repository (CCR)

The CCR uses a two-phase commit algorithm for updates: An update must be successfully completed on all cluster members or the update is rolled back. The CCR uses the cluster interconnect to apply the distributed updates.



Caution - Although the CCR consists of text files, never edit the CCR files yourself. Each file contains a checksum record to ensure consistency between nodes. Updating CCR files yourself can cause a node or the entire cluster to stop working.
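
Rather than reading the CCR files directly, you can display the configuration data that the CCR stores by using the cluster administration commands. For example, the following command, run as superuser, displays the current cluster configuration. See the cluster(1CL) man page.

# cluster show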


The CCR relies on the CMM to guarantee that a cluster is running only when quorum is established. The CCR is responsible for verifying data consistency across the cluster, performing recovery as necessary, and facilitating updates to the data.