Sun Cluster Overview for Solaris OS

Making Applications Highly Available With Sun Cluster

A cluster is a collection of loosely coupled computing nodes that provides a single client view of network services or applications, including databases, web services, and file services.

In a clustered environment, the nodes are connected by an interconnect and work together as a single entity to provide increased availability and performance.

Highly available clusters provide nearly continuous access to data and applications by keeping the cluster running through failures that would normally bring down a single server system. No single failure, whether of hardware, software, or the network, can cause a cluster to fail. By contrast, fault-tolerant hardware systems provide constant access to data and applications, but at a higher cost because they require specialized hardware. Fault-tolerant systems also usually have no provision for software failures.

Each Sun Cluster system is a collection of tightly coupled nodes that provide a single administration view of network services and applications. The Sun Cluster system achieves high availability through a combination of the following hardware and software:

Availability Management

An application is highly available if it survives any single software or hardware failure in the system. Failures that are caused by bugs or data corruption within the application itself are excluded. The following apply to highly available applications:

Failover and Scalable Services and Parallel Applications

Failover services, scalable services, and parallel applications enable you to make your applications highly available and to improve their performance on a cluster.

A failover service provides high availability through redundancy. When a failure occurs, you can configure a running application to either restart on the same node or move to another node in the cluster, without user intervention.
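As an illustration, a failover configuration might be defined as follows, assuming the Sun Cluster 3.2 command-line interface. The node names (phys-node1, phys-node2) and resource names (app-rg, app-lh) are hypothetical examples:

```shell
# Create a failover resource group that can run on either of two nodes
clresourcegroup create -n phys-node1,phys-node2 app-rg

# Add a logical hostname resource so clients follow the service after a failover
clreslogicalhostname create -g app-rg app-lh

# Bring the group online under cluster management
clresourcegroup online -M app-rg

# An administrator can also move the group manually, for example for maintenance
clresourcegroup switch -n phys-node2 app-rg
```

When the node that hosts app-rg fails, the cluster framework brings the group online on the other node automatically; the manual switch shown last is only for planned moves.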

To increase performance, a scalable service leverages the multiple nodes in a cluster to concurrently run an application. In a scalable configuration, each node in the cluster can provide data and process client requests.
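A scalable configuration might be sketched as follows, again assuming the Sun Cluster 3.2 CLI. The shared address advertises one network identity for the whole service while the cluster load-balances client requests across nodes; all group and resource names here are examples:

```shell
# Shared-address group: one client-visible IP for the scalable service
clresourcegroup create sa-rg
clressharedaddress create -g sa-rg web-addr
clresourcegroup online -M sa-rg

# Scalable group that runs the service on up to four nodes concurrently
clresourcegroup create -p Maximum_primaries=4 -p Desired_primaries=4 web-rg
```

The Maximum_primaries and Desired_primaries properties control how many nodes host the service at the same time.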

Parallel databases enable multiple instances of the database server to do the following:

For more information about failover and scalable services and parallel applications, see Data Service Types.

IP Network Multipathing

Clients make data requests to the cluster through the public network. Each Solaris host is connected to at least one public network through one or more public network adapters.

IP network multipathing enables a server to have multiple network ports connected to the same subnet. First, IP network multipathing software provides resilience from network adapter failure by detecting the failure or repair of a network adapter. The software then transparently switches the network address from the failed adapter to an alternative adapter, and back again after repair. When more than one network adapter is functional, IP network multipathing increases data throughput by spreading outbound packets across adapters.
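For illustration, a probe-based IPMP group on Solaris 10 might be configured through the per-interface hostname files. The interface names (bge0, bge1), group name, and addresses below are examples only:

```shell
# /etc/hostname.bge0 -- data address plus a non-failover test address
192.0.2.10 netmask + broadcast + group ipmp0 up \
    addif 192.0.2.11 deprecated -failover netmask + broadcast + up

# /etc/hostname.bge1 -- second adapter in the same group with its own test address
192.0.2.12 deprecated -failover netmask + broadcast + group ipmp0 up
```

The "deprecated -failover" test addresses are used only for failure-detection probes; the data address on bge0 is what moves to bge1 if bge0 fails.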

Storage Management

Multihost storage makes disks highly available by connecting the disks to multiple Solaris hosts. Multiple hosts enable multiple paths to access the data. If one path fails, another one is available to take its place.
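As a brief illustration, Sun Cluster assigns each shared disk a cluster-wide device ID (DID) that is identical from every node, so the same disk is addressable regardless of which physical path is used. The paths behind each DID can be inspected from any node:

```shell
# List cluster device IDs and the per-node physical paths behind them
# (Sun Cluster 3.2 CLI; in Sun Cluster 3.1 the equivalent is scdidadm -L)
cldevice list -v
```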

Multihost disks enable the following cluster processes:

Volume Management Support

A volume manager enables you to manage large numbers of disks and the data on those disks. Volume managers can increase storage capacity and data availability by offering the following features:

Sun Cluster systems support the following volume managers:
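As one concrete example of volume management, a two-way mirror built with Solaris Volume Manager might look like the following sketch. The slice names are examples, and the commands assume root privileges on a Solaris host:

```shell
# Create state database replicas (required before any SVM volumes exist)
metadb -a -f -c 3 c0t0d0s7

# Build two submirrors, ideally on different controllers for redundancy
metainit d11 1 1 c0t0d0s0     # first submirror: one stripe of one slice
metainit d12 1 1 c1t0d0s0     # second submirror on a different controller

# Create the mirror with one submirror, then attach the second
metainit d10 -m d11
metattach d10 d12             # resynchronization to d12 begins automatically
```

Data written to d10 is kept on both submirrors, so the loss of either disk (or its controller) does not interrupt access.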

Solaris I/O Multipathing (MPxIO)

Solaris I/O multipathing (MPxIO), which was formerly named Sun StorEdge Traffic Manager, is fully integrated in the Solaris Operating System I/O framework. Solaris I/O multipathing enables you to represent and manage devices that are accessible through multiple I/O controller interfaces within a single instance of the Solaris Operating System.

The Solaris I/O multipathing architecture provides the following features:
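As a sketch of how an administrator might work with MPxIO on Solaris 10 (both commands exist on that release; a reboot is required after enabling):

```shell
# Enable MPxIO on supported fibre-channel controllers, then reboot
stmsboot -e

# After reboot, list the multipathed logical units and their path counts
mpathadm list lu
```

With MPxIO enabled, the multiple physical paths to a LUN collapse into a single device name, and path failover is handled below the device layer.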

Hardware Redundant Array of Independent Disks Support

Sun Cluster systems support the use of hardware Redundant Array of Independent Disks (RAID) and host-based software RAID. Hardware RAID uses the storage array's or storage system's hardware redundancy to ensure that independent hardware failures do not impact data availability. If you mirror across separate storage arrays, host-based software RAID ensures that independent hardware failures do not impact data availability when an entire storage array is offline. Although you can use hardware RAID and host-based software RAID concurrently, you need only one RAID solution to maintain a high degree of data availability.

Cluster File System Support

Because one of the inherent properties of clustered systems is shared resources, a cluster requires a file system that addresses the need for files to be shared coherently. In a Sun Cluster system, a cluster file system enables users or applications to access any file on any node of the cluster by using standard UNIX APIs, whether the file is local or remote.

Sun Cluster systems support the following cluster file systems:

Sun Cluster software supports the following as highly available failover local file systems:

If an application is moved from one node to another node, no change is required for the application to access the same files. No changes need to be made to existing applications to fully utilize the cluster file system.
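For illustration, a cluster file system is typically mounted under /global with the "global" mount option in /etc/vfstab on every node. The device and mount-point names below are examples on a Solaris Volume Manager device:

```shell
# Example /etc/vfstab entry for a globally mounted UFS cluster file system
# device to mount         device to fsck            mount point     FS  pass at-boot options
/dev/md/oraset/dsk/d100   /dev/md/oraset/rdsk/d100  /global/oracle  ufs 2    yes     global,logging
```

Because the entry is identical on every node and carries the global option, the file system is visible at the same path cluster-wide, which is why applications need no changes when they move between nodes.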

Campus Clusters

Standard Sun Cluster systems provide high availability and reliability from a single location. If your application must remain available after unpredictable disasters such as an earthquake, flood, or power outage, you can configure your cluster as a campus cluster.

Campus clusters enable you to locate cluster components, such as Solaris hosts and shared storage, in separate rooms that are several kilometers apart. You can separate your hosts and shared storage and locate them in different facilities around your corporate campus or elsewhere within several kilometers. When a disaster strikes one location, the surviving hosts can take over service for the failed host. This enables applications and data to remain available for your users. For additional information about campus cluster configurations, see the Sun Cluster 3.1 - 3.2 Hardware Administration Manual for Solaris OS.