Sun Cluster Overview for Solaris OS

Chapter 1 Introduction to Sun Cluster

The SunPlex system is an integrated hardware and Sun Cluster software solution that is used to create highly available and scalable services. This chapter provides a high-level overview of Sun Cluster features.

This chapter contains the following sections:

Making Applications Highly Available With Sun Cluster

A cluster is two or more systems, or nodes, that work together as a single, continuously available system to provide applications, system resources, and data to users. Each node on a cluster is a fully functional standalone system. However, in a clustered environment, the nodes are connected by an interconnect and work together as a single entity to provide increased availability and performance.

Highly available clusters provide nearly continuous access to data and applications by keeping the cluster running through failures that would normally bring down a single server system. No single failure—hardware, software, or network—can cause a cluster to fail. By contrast, fault-tolerant hardware systems provide constant access to data and applications, but at a higher cost because of specialized hardware. Fault-tolerant systems usually have no provision for software failures.

Each Sun Cluster system is a collection of tightly coupled nodes that provide a single administration view of network services and applications. The Sun Cluster system achieves high availability through a combination of the following hardware and software:

Availability Management

An application is highly available if it survives any single software or hardware failure in the system. Failures that are caused by bugs or data corruption within the application itself are excluded. The following apply to highly available applications:

Failover and Scalable Services and Parallel Applications

Failover and scalable services and parallel applications enable you to make your applications highly available and to improve an application's performance on a cluster.

A failover service provides high availability through redundancy. When a failure occurs, you can configure an application that is running to either restart on the same node, or be moved to another node in the cluster, without user intervention.
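
The restart-or-move decision described above can be sketched in a few lines of Python. This is only an illustrative model, not Sun Cluster code: the class name `FailoverService`, the `max_restarts` parameter, and the node names are all hypothetical, and the policy of trying the next node in list order is an assumption made for the example.

```python
class FailoverService:
    """Toy model of a failover service (hypothetical, not a Sun Cluster API)."""

    def __init__(self, nodes, max_restarts=2):
        if not nodes:
            raise ValueError("at least one node is required")
        self.nodes = list(nodes)          # candidate nodes, in preference order
        self.max_restarts = max_restarts  # local restarts allowed before moving
        self.current = self.nodes[0]      # node that currently hosts the service
        self.restarts = 0                 # restarts already used on this node

    def on_failure(self, node_is_healthy):
        """React to an application failure and return the action taken."""
        if node_is_healthy and self.restarts < self.max_restarts:
            # The node itself is fine, so restart the application in place.
            self.restarts += 1
            return f"restart on {self.current}"
        # The node is down or local restarts are exhausted: move the service.
        idx = self.nodes.index(self.current)
        self.current = self.nodes[(idx + 1) % len(self.nodes)]
        self.restarts = 0
        return f"fail over to {self.current}"


svc = FailoverService(["node1", "node2"], max_restarts=1)
print(svc.on_failure(node_is_healthy=True))
print(svc.on_failure(node_is_healthy=True))
```

With `max_restarts=1`, the first failure is handled by a local restart and the second moves the service to the other node, in both cases without user intervention.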

To increase performance, a scalable service leverages the multiple nodes in a cluster to concurrently run an application. In a scalable configuration, each node in the cluster can provide data and process client requests.
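
As a rough illustration of every node serving client requests, the following Python sketch distributes requests across nodes. The source does not specify a load-balancing policy, so the round-robin choice here is an assumption, and `ScalableService` and the node names are hypothetical.

```python
from itertools import cycle


class ScalableService:
    """Toy dispatcher: every node runs the application and can serve requests."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        if not self.nodes:
            raise ValueError("at least one node is required")
        # Round-robin is an assumed policy, chosen only for illustration.
        self._next_node = cycle(self.nodes)

    def handle(self, request):
        """Pick the next node and return (node, response)."""
        node = next(self._next_node)
        return node, f"{node} served {request}"


service = ScalableService(["node1", "node2", "node3"])
for request in ["GET /a", "GET /b", "GET /c", "GET /d"]:
    print(service.handle(request)[1])
```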

Parallel databases enable multiple instances of the database server to do the following:

For more information about failover and scalable services and parallel applications, see Data Service Types.

IP Network Multipathing

Clients make data requests to the cluster through the public network. Each cluster node is connected to at least one public network through one or multiple public network adapters.

IP network multipathing enables a server to have multiple network ports connected to the same subnet. IP network multipathing software provides resilience against network adapter failure by detecting the failure or repair of a network adapter and then switching the network address to or from the alternative adapter. When more than one network adapter is functional, IP network multipathing increases data throughput by spreading outbound packets across adapters.
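
The behavior described above (move the address on failure, move it back on repair, spread outbound traffic over working adapters) can be modeled roughly as follows. This is a simplified sketch, not the multipathing software itself: the class `MultipathGroup`, the adapter names, and the per-packet round-robin spreading are assumptions made for illustration.

```python
class MultipathGroup:
    """Toy model of adapters on one subnet (hypothetical names throughout)."""

    def __init__(self, adapters):
        self.up = {a: True for a in adapters}  # adapter -> currently working?
        self.home = {}           # address -> adapter it was configured on
        self.address_owner = {}  # address -> adapter hosting it right now
        self._counter = 0        # used to rotate outbound traffic

    def assign(self, address, adapter):
        self.home[address] = adapter
        self.address_owner[address] = adapter

    def working(self):
        return [a for a, ok in self.up.items() if ok]

    def adapter_failed(self, adapter):
        """Failure detected: move the adapter's addresses to an alternative."""
        self.up[adapter] = False
        alive = self.working()
        if not alive:
            return  # no alternative adapter is available
        for address, owner in self.address_owner.items():
            if owner == adapter:
                self.address_owner[address] = alive[0]

    def adapter_repaired(self, adapter):
        """Repair detected: move addresses back to their original adapter."""
        self.up[adapter] = True
        for address, home in self.home.items():
            if home == adapter:
                self.address_owner[address] = adapter

    def pick_outbound(self):
        """Spread outbound packets across working adapters (assumed round-robin)."""
        alive = self.working()
        if not alive:
            raise RuntimeError("no working adapter in the group")
        adapter = alive[self._counter % len(alive)]
        self._counter += 1
        return adapter


group = MultipathGroup(["qfe0", "qfe1"])
group.assign("192.0.2.10", "qfe0")
group.adapter_failed("qfe0")
print(group.address_owner["192.0.2.10"])
```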

Storage Management

Multihost storage makes disks highly available by connecting the disks to multiple nodes. Multiple nodes provide multiple paths to the data, so if one path fails, another is available to take its place.

Multihost disks enable the following cluster processes:

Volume Management Support

A volume manager enables you to manage large numbers of disks and the data on those disks. Volume managers can increase storage capacity and data availability by offering the following features:

Sun Cluster systems support the following volume managers:

Sun StorEdge Traffic Manager

Starting with the Solaris 8 Operating System, Sun StorEdge Traffic Manager software is fully integrated into the core I/O framework. Sun StorEdge Traffic Manager software enables you to represent and manage more effectively the devices that are accessible through multiple I/O controller interfaces within a single instance of the Solaris operating environment. The Sun StorEdge Traffic Manager architecture enables the following:

Hardware Redundant Array of Independent Disks Support

Sun Cluster systems support the use of hardware Redundant Array of Independent Disks (RAID) and host-based software RAID. Hardware RAID uses the storage array's or storage system's hardware redundancy to ensure that independent hardware failures do not impact data availability. If you mirror across separate storage arrays, host-based software RAID ensures that independent hardware failures do not impact data availability when an entire storage array is offline. Although you can use hardware RAID and host-based software RAID concurrently, you need only one RAID solution to maintain a high degree of data availability.
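
To illustrate why mirroring across separate storage arrays keeps data available when an entire array is offline, here is a minimal Python sketch of a host-based mirror. It is a conceptual model only: `MirroredVolume` and the array names are hypothetical, and the sketch deliberately omits the resynchronization that a real volume manager performs when an array returns.

```python
class MirroredVolume:
    """Toy host-based mirror: each write goes to every online array."""

    def __init__(self, array_names):
        self.arrays = {name: {} for name in array_names}    # array -> blocks
        self.online = {name: True for name in array_names}  # array -> online?

    def write(self, block, data):
        targets = [name for name in self.arrays if self.online[name]]
        if not targets:
            raise OSError("no storage array is online")
        for name in targets:
            self.arrays[name][block] = data

    def read(self, block):
        # Any online copy will do; an offline array is simply skipped.
        for name, blocks in self.arrays.items():
            if self.online[name] and block in blocks:
                return blocks[block]
        raise OSError(f"block {block} is not available on any online array")


volume = MirroredVolume(["arrayA", "arrayB"])
volume.write(0, b"payroll records")
volume.online["arrayA"] = False  # an entire array goes offline
print(volume.read(0))            # the data is still served from arrayB
```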

File System Support

Because one of the inherent properties of clustered systems is shared resources, a cluster requires a file system that addresses the need for files to be shared coherently. The Sun Cluster file system enables users or applications to access any file on any node of the cluster by using remote or local standard UNIX APIs. Sun Cluster systems support the following file systems:

If an application is moved from one node to another node, no change is required for the application to access the same files. No changes need to be made to existing applications to fully utilize the cluster file system.

Campus Clusters

Standard Sun Cluster systems provide high availability and reliability from a single location. If your application must remain available after unpredictable disasters such as an earthquake, flood, or power outage, you can configure your cluster as a campus cluster.

Campus clusters enable you to locate cluster components, such as nodes and shared storage, in separate rooms several kilometers apart. You can separate your nodes and shared storage and locate them in different facilities around your corporate campus or elsewhere within several kilometers. When a disaster strikes one location, the surviving nodes can take over service for the failed node. This enables applications and data to remain available for your users.

Monitoring Failure

The Sun Cluster system makes the path between users and data highly available by using multihost disks, multipathing, and a global file system. The Sun Cluster system monitors failures for the following:

Administration and Configuration Tools

You can install, configure, and administer the Sun Cluster system either through the SunPlex Manager GUI or through the command-line interface (CLI).

The Sun Cluster system also has a module that runs as part of Sun Management Center software that provides a GUI to certain cluster tasks.

SunPlex Manager

SunPlex Manager is a browser-based tool for administering Sun Cluster systems. The SunPlex Manager software enables administrators to perform system management and monitoring, software installation, and system configuration.

The SunPlex Manager software includes the following features:

Command-Line Interface

The Sun Cluster command-line interface is a set of utilities that you can use to install and administer Sun Cluster systems and to administer the volume manager portion of Sun Cluster software.

You can perform the following SunPlex administration tasks through the Sun Cluster CLI:

Sun Management Center

The Sun Cluster system also has a module that runs as part of Sun Management Center software. Sun Management Center software serves as the cluster's base for administrative and monitoring operations and enables system administrators to perform the following tasks through a GUI or CLI:

Sun Management Center software can also be used as the interface to manage dynamic reconfiguration within Sun Cluster servers. Dynamic reconfiguration includes domain creation, dynamic board attach, and dynamic detach.

Role-Based Access Control

In conventional UNIX systems, the root user, also referred to as superuser, is omnipotent, with the ability to read and write to any file, run all programs, and send kill signals to any process. Solaris role-based access control (RBAC) is an alternative to the all-or-nothing superuser model. RBAC uses the security principle of least privilege, which holds that no user should be given more privilege than is necessary to perform his or her job.

RBAC enables an organization to separate superuser capabilities and package them into special user accounts or roles for assignment to specific individuals. This separation and packaging enables a variety of security policies. Accounts can be set up for special-purpose administrators in such areas as security, networking, firewall, backups, and system operation.
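
The idea of packaging superuser capabilities into roles can be sketched as a simple lookup. The role names, authorization strings, and user names below are invented for illustration and are not actual Solaris RBAC database entries; the sketch shows only the least-privilege check itself.

```python
# Hypothetical roles, each bundling a narrow set of authorizations.
ROLES = {
    "backup_admin": {"backup.run", "backup.restore"},
    "network_admin": {"network.configure"},
}

# Hypothetical assignment of roles to individual users.
USER_ROLES = {
    "alice": {"backup_admin"},
    "bob": {"network_admin"},
}


def is_authorized(user, authorization):
    """Return True only if one of the user's roles grants the authorization."""
    return any(
        authorization in ROLES.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


print(is_authorized("alice", "backup.run"))         # granted by backup_admin
print(is_authorized("alice", "network.configure"))  # not part of alice's role
```

Unlike the all-or-nothing superuser model, a user in this sketch can do only what an assigned role explicitly grants.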