Sun Cluster 3.1 Concepts Guide

Introduction to the SunPlex System

The SunPlex system extends the Solaris operating environment into a cluster operating system. A cluster, or plex, is a collection of loosely coupled computing nodes that provides a single client view of network services or applications, including databases, web services, and file services.

Each cluster node is a standalone server that runs its own processes. These processes communicate with one another to form what looks, to a network client, like a single system that cooperatively provides applications, system resources, and data to users.

A cluster offers several advantages over traditional single-server systems. These advantages include support for failover and scalable services, capacity for modular growth, and a low entry price compared to traditional hardware fault-tolerant systems.

The goals of the SunPlex system are:

High Availability Versus Fault Tolerance

The SunPlex system is designed as a highly available (HA) system, that is, a system that provides near-continuous access to data and applications.

By contrast, fault-tolerant hardware systems provide constant access to data and applications, but at a higher cost because of specialized hardware. Additionally, fault-tolerant systems usually do not account for software failures.

The SunPlex system achieves high availability through a combination of hardware and software. Redundant cluster interconnects, storage, and public networks protect against single points of failure. The cluster software continuously monitors the health of member nodes and prevents failing nodes from participating in the cluster to protect against data corruption. Also, the cluster monitors services and their dependent system resources, and fails over or restarts services in case of failures.
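
The monitoring behavior described above can be pictured with a small model. The following Python sketch is only an illustration and does not show how Sun Cluster software is implemented: the Membership class, the heartbeat timestamps, and the 5-second timeout are all assumptions made for the example. It shows the general idea of excluding a node from cluster membership after the node stops reporting in.

```python
from dataclasses import dataclass, field


@dataclass
class Membership:
    """Toy cluster membership driven by heartbeat timestamps (hypothetical model)."""
    timeout: float                                   # seconds of silence before exclusion
    last_seen: dict = field(default_factory=dict)    # node name -> last heartbeat time

    def heartbeat(self, node: str, now: float) -> None:
        # Record the most recent time this node reported in.
        self.last_seen[node] = now

    def members(self, now: float) -> list:
        # Only nodes heard from within the timeout window count as members.
        return sorted(n for n, t in self.last_seen.items() if now - t <= self.timeout)


m = Membership(timeout=5.0)
m.heartbeat("node-a", now=0.0)
m.heartbeat("node-b", now=0.0)
m.heartbeat("node-a", now=4.0)    # node-b stays silent from here on
print(m.members(now=6.0))         # node-b has been silent for 6 s and is excluded
```

In this model, a node that resumes sending heartbeats would rejoin the membership list automatically. A real cluster typically applies stricter rules before readmitting a node.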

Refer to High Availability FAQs for questions and answers on high availability.

Failover and Scalability in the SunPlex System

The SunPlex system enables you to implement either failover or scalable services. In general, a failover service provides only high availability (redundancy), whereas a scalable service provides high availability along with increased performance. A single cluster can support both failover and scalable services.

Failover Services

Failover is the process by which the cluster automatically relocates a service from a failed primary node to a designated secondary node. With failover, Sun Cluster software provides high availability.

When a failover occurs, clients might see a brief interruption in service and might need to reconnect after the failover has finished. However, clients are not aware of the physical server that provides the service.
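
The following Python sketch illustrates the failover behavior described above under simplified assumptions. The FailoverService class, the node names, and the ordered node list are hypothetical and are not part of any Sun Cluster interface. The sketch shows a service that moves from a failed primary node to a secondary node while clients continue to address the service by name only.

```python
class FailoverService:
    """Toy failover service: one active node chosen from an ordered list (hypothetical)."""

    def __init__(self, name, nodes):
        self.name = name
        self.nodes = list(nodes)    # nodes[0] is the primary; the rest are secondaries
        self.failed = set()

    @property
    def active_node(self):
        # The first node in the preference list that has not failed hosts the service.
        for node in self.nodes:
            if node not in self.failed:
                return node
        return None                 # no surviving node: the service is down

    def node_failed(self, node):
        # Mark a node as failed; the next lookup of active_node skips it.
        self.failed.add(node)

    def handle(self, request):
        if self.active_node is None:
            raise RuntimeError(f"{self.name}: no node available")
        # The reply omits the node name: clients see the service, not the server.
        return f"{self.name} handled {request}"


svc = FailoverService("file-service", ["node-a", "node-b"])
before = svc.active_node
reply_before = svc.handle("read")
svc.node_failed("node-a")           # the primary node fails
after = svc.active_node
reply_after = svc.handle("read")
print(before, "->", after)
print(reply_before == reply_after)  # the client-visible reply is unchanged
```

The sketch leaves out the brief interruption and possible client reconnection that accompany a real failover. It models only the relocation itself.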

Scalable Services

While failover is concerned with redundancy, scalability provides constant response time or throughput regardless of load. A scalable service runs an application concurrently on multiple nodes in a cluster, thus providing increased performance. In a scalable configuration, each node in the cluster can provide data and process client requests.
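
As a rough illustration of a scalable configuration, the following Python sketch spreads client requests across every node in a cluster. The distribute function, the node names, and the round-robin policy are assumptions chosen for simplicity. The text above does not specify how requests are balanced, so the sketch shows only one possible policy.

```python
from collections import Counter
from itertools import cycle


def distribute(requests, nodes):
    """Assign each request to a node in round-robin order (an assumed policy)."""
    rotation = cycle(nodes)
    return [(req, next(rotation)) for req in requests]


nodes = ["node-a", "node-b", "node-c"]
assignments = distribute(range(9), nodes)

# Count how many requests each node received.
load = Counter(node for _, node in assignments)
print(dict(load))
```

Because every node processes a share of the requests, adding a node to the list lowers the number of requests that each node handles. This is the sense in which a scalable service increases performance.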

Refer to Data Services for more specific information on failover and scalable services.