Oracle® Solaris Cluster Concepts Guide


Updated: July 2014, E39575-01
 
 

High-Availability Framework

The Oracle Solaris Cluster software makes all components on the “path” between users and data highly available, including network interfaces, the applications themselves, the file system, and the multihost devices. In general, a cluster component is highly available if it survives any single (software or hardware) failure in the system. Failures that are caused by data corruption within the application itself are excluded.

The following table shows the kinds of Oracle Solaris Cluster component failures (both hardware and software) and the kinds of recovery that are built into the high-availability framework.

Table 3-1  Levels of Oracle Solaris Cluster Failure Detection and Recovery

Failed Cluster Component | Software Recovery | Hardware Recovery
Data service | HA API, HA framework | Not applicable
Public network adapter | IPMP | Multiple public network adapter cards
Cluster file system | Primary and secondary replicas | Multihost devices
Mirrored multihost device | Volume management (Solaris Volume Manager) | Hardware RAID-5
Global device | Primary and secondary replicas | Multiple paths to the device, cluster transport junctions
Private network | HA transport software | Multiple private hardware-independent networks
Node | CMM, failfast driver | Multiple nodes
Zone | HA API, HA framework | Not applicable

The Oracle Solaris Cluster software's high-availability framework detects a node failure quickly and migrates the framework resources to a remaining node in the cluster. At no time are all framework resources unavailable. Framework resources that are unaffected by the failed node remain fully available during recovery. Furthermore, framework resources of the failed node become available again as soon as they are recovered. A recovered framework resource does not have to wait for all other framework resources to complete their recovery.
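The independent-recovery behavior described above — each framework resource becomes available as soon as its own recovery completes, rather than when the slowest resource finishes — can be sketched as a small simulation. This is illustrative only: the resource names and recovery times below are invented for the example and are not actual Oracle Solaris Cluster internals.

```python
# Hypothetical framework resources with invented per-resource recovery
# times (in seconds); these are example values, not Solaris Cluster data.
recovery_time = {"name-server": 2, "device-group": 5, "cluster-fs": 9}

def recovery_timeline(resources):
    """Return (time, resource) events in the order resources come back online.

    Each resource recovers independently: it is available the moment its
    own recovery completes, without waiting for the other resources.
    """
    return sorted((t, name) for name, t in resources.items())

timeline = recovery_timeline(recovery_time)
# The name-server is usable after 2 seconds even though the cluster file
# system in this example needs 9 seconds to recover.
```

The point of the sketch is the ordering: clients of the first recovered resource are unblocked immediately, which is why no single slow recovery can hold the whole framework offline.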

Highly available framework resources are recovered transparently to most of the applications (data services) that use them. The semantics of framework resource access are fully preserved across node failure: the applications cannot detect that the framework resource server has been moved to another node. Failure of a single node is completely transparent to programs on the remaining nodes that use the files, devices, and disk volumes attached to the failed node. This transparency requires that an alternative hardware path to the disks exists from another node, for example, multihost devices that have ports to multiple nodes.
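The access transparency described above can be illustrated with a minimal client-side sketch. All class and method names here are invented for the example: Oracle Solaris Cluster provides this behavior inside the framework itself, not in application code. The idea is that a request routed through a failover layer succeeds even when the primary server has failed, because the request is transparently retried on the secondary replica.

```python
class NodeDown(Exception):
    """Raised when a request reaches a failed node."""

class Node:
    """Toy stand-in for a cluster node holding some data."""
    def __init__(self, data, alive=True):
        self.data, self.alive = data, alive

    def read(self, key):
        if not self.alive:
            raise NodeDown("node has failed")
        return self.data[key]

class FailoverProxy:
    """Illustrative stand-in for the framework's primary/secondary model.

    Callers invoke read(); if the current primary has failed, the call is
    transparently retried on the next replica, so the caller never
    observes the node failure.
    """
    def __init__(self, replicas):
        self.replicas = list(replicas)  # ordered: primary first

    def read(self, key):
        for node in self.replicas:
            try:
                return node.read(key)
            except NodeDown:
                continue  # fail over to the next replica
        raise NodeDown("no replica available")

primary = Node({"vol0": "payload"}, alive=False)  # simulated node failure
secondary = Node({"vol0": "payload"})
proxy = FailoverProxy([primary, secondary])
value = proxy.read("vol0")  # succeeds despite the failed primary
```

The design choice mirrored here is that failover is hidden behind the access path: the caller's `read` keeps its normal semantics, which is what "the semantics of framework resource access are fully preserved" means in practice.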