This chapter describes the key concepts that are related to the hardware components of a Sun Cluster configuration.
This chapter covers the following topics:
This information is directed primarily to hardware service providers. These concepts can help service providers understand the relationships between the hardware components before they install, configure, or service cluster hardware. Cluster system administrators might also find this information useful as background to installing, configuring, and administering cluster software.
A cluster is composed of several hardware components, including the following:
Solaris hosts with local disks (unshared)
Multihost storage (disks are shared between Solaris hosts)
Removable media (tapes and CD-ROMs)
Cluster interconnect
Public network interfaces
Client systems
Administrative console
Console access devices
The Sun Cluster software enables you to combine these components into a variety of configurations. The following sections describe these configurations.
For an illustration of a sample two-host cluster configuration, see Sun Cluster Hardware Environment in Sun Cluster Overview for Solaris OS.
In a cluster that runs on any version of the Solaris OS that was released before the Solaris 10 OS, a node is a physical machine that contributes to cluster membership and is not a quorum device. In a cluster that runs on the Solaris 10 OS, the concept of a node changes: a node is a Solaris zone that is associated with a cluster. In this environment, a Solaris host, or simply host, is one of the following hardware or software configurations that runs the Solaris OS and its own processes:
A “bare metal” physical machine that is not configured with a virtual machine or as a hardware domain
A Sun Logical Domains (LDoms) guest domain
A Sun Logical Domains (LDoms) I/O domain
A hardware domain
Depending on your platform, Sun Cluster software supports the following configurations:
SPARC: Sun Cluster software supports from one to sixteen Solaris hosts in a cluster. Different hardware configurations impose additional limits on the maximum number of hosts that you can configure in a cluster composed of SPARC based systems. See SPARC: Sun Cluster Topologies for the supported configurations.
x86: Sun Cluster software supports from one to eight Solaris hosts in a cluster. Different hardware configurations impose additional limits on the maximum number of hosts that you can configure in a cluster composed of x86 based systems. See x86: Sun Cluster Topologies for the supported configurations.
Solaris hosts are generally attached to one or more multihost devices. Hosts that are not attached to multihost devices use the cluster file system to access the multihost devices. For example, one scalable services configuration enables hosts to service requests without being directly attached to multihost devices.
In addition, hosts in parallel database configurations share concurrent access to all the disks.
See Multihost Devices for information about concurrent access to disks.
See SPARC: Clustered Pair Topology and x86: Clustered Pair Topology for more information about parallel database configurations.
All nodes in the cluster are grouped under a common name (the cluster name), which is used for accessing and managing the cluster.
Public network adapters attach hosts to the public networks, providing client access to the cluster.
Cluster members communicate with the other hosts in the cluster through one or more physically independent networks. This set of physically independent networks is referred to as the cluster interconnect.
Every node in the cluster is aware when another node joins or leaves the cluster. Additionally, every node in the cluster is aware of the resources that are running locally as well as the resources that are running on the other cluster nodes.
Hosts in the same cluster should have similar processing, memory, and I/O capability to enable failover to occur without significant degradation in performance. Because of the possibility of failover, every host must have enough excess capacity to support the workload of all hosts for which it is a backup or secondary.
Each host boots its own individual root (/) file system.
To function as a cluster member, a Solaris host must have the following software installed:
Solaris Operating System
Sun Cluster software
Data service application
Volume management (Solaris Volume Manager™ or Veritas Volume Manager)
An exception is a configuration that uses hardware redundant array of independent disks (RAID). This configuration might not require a software volume manager such as Solaris Volume Manager or Veritas Volume Manager.
See the Sun Cluster Software Installation Guide for Solaris OS for information about how to install the Solaris Operating System, Sun Cluster, and volume management software.
See the Sun Cluster Data Services Planning and Administration Guide for Solaris OS for information about how to install and configure data services.
See Chapter 3, Key Concepts for System Administrators and Application Developers for conceptual information about the preceding software components.
The following figure provides a high-level view of the software components that work together to create the Sun Cluster environment.
See Chapter 4, Frequently Asked Questions for questions and answers about cluster members.
Disks that can be connected to more than one Solaris host at a time are multihost devices. In the Sun Cluster environment, multihost storage makes disks highly available. Sun Cluster software requires multihost storage for two-host clusters to establish quorum. Clusters with more than two hosts do not require quorum devices. For more information about quorum, see Quorum and Quorum Devices.
Multihost devices have the following characteristics:
Tolerance of single-host failures.
Ability to store application data, application binaries, and configuration files.
Protection against host failures. If clients request the data through one host and the host fails, the requests are switched over to use another host with a direct connection to the same disks.
Global access through a primary host that “masters” the disks, or direct concurrent access through local paths. The only application that uses direct concurrent access currently is Oracle Real Application Clusters Guard.
A volume manager provides for mirrored or RAID-5 configurations for data redundancy of the multihost devices. Currently, Sun Cluster supports Solaris Volume Manager and Veritas Volume Manager as volume managers, and the RDAC RAID-5 hardware controller on several hardware RAID platforms.
Combining multihost devices with disk mirroring and disk striping protects against both host failure and individual disk failure.
See Chapter 4, Frequently Asked Questions for questions and answers about multihost storage.
This section applies only to SCSI storage devices and not to Fibre Channel storage that is used for the multihost devices.
In a standalone (that is, non-clustered) host, the host controls the SCSI bus activities by way of the SCSI host adapter circuit that connects this host to a particular SCSI bus. This SCSI host adapter circuit is referred to as the SCSI initiator. This circuit initiates all bus activities for this SCSI bus. The default SCSI address of SCSI host adapters in Sun systems is 7.
Cluster configurations share storage between multiple hosts, using multihost devices. When the cluster storage consists of single-ended or differential SCSI devices, the configuration is referred to as multi-initiator SCSI. As this terminology implies, more than one SCSI initiator exists on the SCSI bus.
The SCSI specification requires each device on a SCSI bus to have a unique SCSI address. (The host adapter is also a device on the SCSI bus.) The default hardware configuration in a multi-initiator environment results in a conflict because all SCSI host adapters default to 7.
To resolve this conflict, on each SCSI bus, leave one of the SCSI host adapters with the SCSI address of 7, and set the other host adapters to unused SCSI addresses. Proper planning dictates that these “unused” SCSI addresses include both currently and eventually unused addresses. An example of an address that becomes used in the future is the address of a new drive that is installed into a currently empty drive slot when storage is added.
In most configurations, the available SCSI address for a second host adapter is 6.
You can change the selected SCSI addresses for these host adapters by using one of the following tools to set the scsi-initiator-id property:
The OpenBoot™ PROM on a SPARC based system
The SCSI utility that you optionally run after the BIOS boots on an x86 based system
You can set this property globally for a host or on a per-host-adapter basis. Instructions for setting a unique scsi-initiator-id for each SCSI host adapter are included in Sun Cluster 3.1 - 3.2 With SCSI JBOD Storage Device Manual for Solaris OS.
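As a sketch of the global approach on a SPARC based system, the scsi-initiator-id property can be inspected and set at the OpenBoot PROM ok prompt. The address 6 is the typical choice for a second host adapter, as noted above; the commands are illustrative and are not a substitute for the procedures in the manual cited above:

```
ok printenv scsi-initiator-id    \ display the current value, typically the default of 7
ok setenv scsi-initiator-id 6    \ set the global initiator address for this host to 6
ok reset-all                     \ reset the system so that the new value takes effect
```

Note that setting the property globally also changes the initiator address of adapters on non-shared buses. The JBOD storage manual cited above describes how to instead set scsi-initiator-id for an individual host adapter through the nvramrc script.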
Local disks are the disks that are only connected to a single Solaris host. Local disks are, therefore, not protected against host failure (they are not highly available). However, all disks, including local disks, are included in the global namespace and are configured as global devices. Therefore, the disks themselves are visible from all cluster hosts.
You can make the file systems on local disks available to other hosts by placing them under a global mount point. If the host that currently has one of these global file systems mounted fails, all hosts lose access to that file system. Using a volume manager lets you mirror these disks so that a failure cannot cause these file systems to become inaccessible, but volume managers do not protect against host failure.
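As an illustrative sketch (the device names and mount point are hypothetical), a file system on a local disk is made globally available through an /etc/vfstab entry that carries the global mount option:

```
# Hypothetical /etc/vfstab entry (one line): mount the file system on
# global device d7 at the global mount point /global/data so that all
# cluster hosts can access it
/dev/global/dsk/d7s0  /dev/global/rdsk/d7s0  /global/data  ufs  2  yes  global,logging
```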
See the section Global Devices for more information about global devices.
Removable media such as tape drives and CD-ROM drives are supported in a cluster. In general, you install, configure, and service these devices in the same way as in a nonclustered environment. These devices are configured as global devices in Sun Cluster, so each device can be accessed from any node in the cluster. Refer to Sun Cluster 3.1 - 3.2 Hardware Administration Manual for Solaris OS for information about installing and configuring removable media.
See the section Global Devices for more information about global devices.
The cluster interconnect is the physical configuration of devices that is used to transfer cluster-private communications and data service communications between Solaris hosts in the cluster. Because the interconnect is used extensively for cluster-private communications, it can limit performance.
Only hosts in the cluster can be connected to the cluster interconnect. The Sun Cluster security model assumes that only cluster hosts have physical access to the cluster interconnect.
You can set up from one to six cluster interconnects in a cluster. While a single cluster interconnect reduces the number of adapter ports that are used for the private interconnect, it provides no redundancy and less availability. Moreover, if a single interconnect fails, the cluster is at a higher risk of having to perform automatic recovery. Whenever possible, install two or more cluster interconnects to provide redundancy and scalability, and therefore higher availability, by avoiding a single point of failure.
The cluster interconnect consists of three hardware components: adapters, junctions, and cables. The following list describes each of these hardware components.
Adapters – The network interface cards that are located in each cluster host. Their names are constructed from a device name immediately followed by a physical-unit number, for example, qfe2. Some adapters have only one physical network connection, but others, like the qfe card, have multiple physical connections. Some adapters also contain both network interfaces and storage interfaces.
A network adapter with multiple interfaces could become a single point of failure if the entire adapter fails. For maximum availability, plan your cluster so that the only path between two hosts does not depend on a single network adapter.
Junctions – The switches that are located outside of the cluster hosts. Junctions perform pass-through and switching functions to enable you to connect more than two hosts. In a two-host cluster, you do not need junctions because the hosts can be directly connected to each other through redundant physical cables connected to redundant adapters on each host. Greater than two-host configurations generally require junctions.
Cables – The physical connections that you install either between two network adapters or between an adapter and a junction.
See Chapter 4, Frequently Asked Questions for questions and answers about the cluster interconnect.
Clients connect to the cluster through the public network interfaces. Each network adapter card can connect to one or more public networks, depending on whether the card has multiple hardware interfaces.
You can set up Solaris hosts in the cluster to include multiple public network interface cards that perform the following functions:
Are configured so that multiple cards are active.
Serve as failover backups for one another.
If one of the adapters fails, IP network multipathing software is called to fail over the defective interface to another adapter in the group.
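As a hedged sketch of such a group, two public network adapters might be placed in a single IPMP group through /etc/hostname.* files on a host; the adapter names (qfe0, qfe1), group name, and host names below are illustrative assumptions, not values from this document:

```
# Hypothetical /etc/hostname.qfe0 -- first active adapter, placed in
# IPMP group sc_ipmp0 with a non-failover test address for probing
host1 netmask + broadcast + group sc_ipmp0 up \
    addif host1-test-qfe0 deprecated -failover netmask + broadcast + up

# Hypothetical /etc/hostname.qfe1 -- second active adapter in the same
# group; addresses on a failed adapter move to the surviving member
host1-qfe1 netmask + broadcast + group sc_ipmp0 up \
    addif host1-test-qfe1 deprecated -failover netmask + broadcast + up
```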
No special hardware considerations relate to clustering for the public network interfaces.
See Chapter 4, Frequently Asked Questions for questions and answers about public networks.
Client systems include machines or other hosts that access the cluster over the public network. Client-side programs use data or other services that are provided by server-side applications running on the cluster.
Client systems are not highly available. Data and applications on the cluster are highly available.
See Chapter 4, Frequently Asked Questions for questions and answers about client systems.
You must have console access to all Solaris hosts in the cluster.
To gain console access, use one of the following devices:
The terminal concentrator that you purchased with your cluster hardware
The System Service Processor (SSP) on Sun Enterprise E10000 servers (for SPARC based clusters)
The system controller on Sun Fire™ servers (also for SPARC based clusters)
Another device that can access ttya on each host
Only one supported terminal concentrator is available from Sun, and its use is optional. The terminal concentrator enables access to /dev/console on each host over a TCP/IP network. The result is console-level access for each host from a remote machine anywhere on the network.
The System Service Processor (SSP) provides console access for Sun Enterprise E10000 servers. The SSP is a processor card in a machine on an Ethernet network that is configured to support the Sun Enterprise E10000 server. The SSP is the administrative console for the Sun Enterprise E10000 server. Using the Sun Enterprise E10000 Network Console feature, any machine in the network can open a host console session.
Other console access methods include other terminal concentrators, tip serial port access from another host, and dumb terminals.
You can attach a keyboard or monitor to a cluster host provided that the keyboard or monitor is supported by the base server platform. However, you cannot use that keyboard or monitor as a console device. You must redirect the console to a serial port, or, depending on your machine, to the System Service Processor (SSP) or Remote System Control (RSC) by setting the appropriate OpenBoot PROM parameter.
You can use a dedicated machine, known as the administrative console, to administer the active cluster. Usually, you install and run administrative tool software, such as the Cluster Control Panel (CCP) and the Sun Cluster module for the Sun Management Center product (for use with SPARC based clusters only), on the administrative console. Using cconsole under the CCP enables you to connect to more than one host console at a time. For more information about how to use the CCP, see Chapter 1, Introduction to Administering Sun Cluster, in Sun Cluster System Administration Guide for Solaris OS.
The administrative console is not a cluster host. You use the administrative console for remote access to the cluster hosts, either over the public network, or optionally through a network-based terminal concentrator.
If your cluster consists of the Sun Enterprise E10000 platform, you must do the following:
Log in from the administrative console to the SSP.
Connect by using the netcon command.
Typically, you configure hosts without monitors. You then access a host's console through a telnet session from the administrative console. The administrative console is connected to a terminal concentrator, and the terminal concentrator is connected to the host's serial port. In the case of a Sun Enterprise E10000 server, you connect from the System Service Processor. See Console Access Devices for more information.
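As a sketch of the two access paths described above (the cluster name, concentrator host name, and port number are illustrative assumptions), from the administrative console you might run:

```
# Open a console window to every host in cluster "mycluster" at once,
# using the cconsole tool from the Cluster Control Panel
cconsole mycluster

# Or reach a single host's console directly, assuming the terminal
# concentrator "tc-name" maps TCP port 5002 to that host's serial port
telnet tc-name 5002
```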
Sun Cluster does not require a dedicated administrative console, but using one provides these benefits:
Enables centralized cluster management by grouping console and management tools on the same machine
Provides potentially quicker problem resolution by your hardware service provider
See Chapter 4, Frequently Asked Questions for questions and answers about the administrative console.
A topology is the connection scheme that connects the Solaris hosts in the cluster to the storage platforms that are used in a Sun Cluster environment. Sun Cluster software supports any topology that adheres to the following guidelines.
A Sun Cluster environment that is composed of SPARC based systems supports from one to sixteen Solaris hosts in a cluster. Different hardware configurations impose additional limits on the maximum number of hosts that you can configure in a cluster composed of SPARC based systems.
A shared storage device can connect to as many hosts as the storage device supports.
Shared storage devices do not need to connect to all hosts of the cluster. However, these storage devices must connect to at least two hosts.
You can configure logical domains (LDoms) guest domains and LDoms I/O domains as virtual Solaris hosts. In other words, you can create a clustered pair, pair+N, N+1, and N*N cluster that consists of any combination of physical machines, LDoms I/O domains, and LDoms guest domains. You can also create clusters that consist of only LDoms guest domains, LDoms I/O domains, or any combination of the two.
Sun Cluster software does not require you to configure a cluster by using specific topologies. The following topologies are described to provide the vocabulary to discuss a cluster's connection scheme. These topologies are typical connection schemes.
Clustered pair
Pair+N
N+1 (star)
N*N (scalable)
LDoms Guest Domains: Cluster in a Box
LDoms Guest Domains: Single Cluster Spans Two Different Hosts
LDoms Guest Domains: Clusters Span Two Different Hosts
LDoms Guest Domains: Redundant I/O Domains
The following sections include sample diagrams of each topology.
A clustered pair topology is two or more pairs of Solaris hosts that operate under a single cluster administrative framework. In this configuration, failover occurs only between a pair. However, all hosts are connected by the cluster interconnect and operate under Sun Cluster software control. You might use this topology to run a parallel database application on one pair and a failover or scalable application on another pair.
Using the cluster file system, you could also have a two-pair configuration. More than two hosts can run a scalable service or parallel database, even though not all of the hosts are directly connected to the disks that store the application data.
The following figure illustrates a clustered pair configuration.
The pair+N topology includes a pair of Solaris hosts that are directly connected to the following:
Shared storage.
An additional set of hosts that use the cluster interconnect to access shared storage (they have no direct connection themselves).
The following figure illustrates a pair+N topology where two of the four hosts (Host 3 and Host 4) use the cluster interconnect to access the storage. This configuration can be expanded to include additional hosts that do not have direct access to the shared storage.
An N+1 topology includes some number of primary Solaris hosts and one secondary host. You do not have to configure the primary hosts and secondary host identically. The primary hosts actively provide application services. The secondary host need not be idle while waiting for a primary host to fail.
The secondary host is the only host in the configuration that is physically connected to all the multihost storage.
If a failure occurs on a primary host, Sun Cluster fails over the resources to the secondary host. The secondary host is where the resources function until they are switched back (either automatically or manually) to the primary host.
The secondary host must always have enough excess CPU capacity to handle the load if one of the primary hosts fails.
The following figure illustrates an N+1 configuration.
An N*N topology enables every shared storage device in the cluster to connect to every Solaris host in the cluster. This topology enables highly available applications to fail over from one host to another without service degradation. When failover occurs, the new host can access the storage device by using a local path instead of the private interconnect.
The following figure illustrates an N*N configuration.
In this logical domains (LDoms) guest domain topology, a cluster and every node within that cluster are located on the same Solaris host. Each LDoms guest domain node acts the same as a Solaris host in a cluster. This configuration includes three nodes rather than only two, which removes the need to configure a quorum device.
In this topology, you do not need to connect each virtual switch (vsw) for the private network to a physical network because they need only communicate with each other. In this topology, cluster nodes can also share the same storage device, as all cluster nodes are located on the same host. To learn more about guidelines for using and installing LDoms guest domains or LDoms I/O domains in a cluster, see How to Install Sun Logical Domains Software and Create Domains in Sun Cluster Software Installation Guide for Solaris OS.
This topology does not provide high availability, as all nodes in the cluster are located on the same host. However, developers and administrators might find this topology useful for testing and other non-production tasks. This topology is also called a “cluster in a box”.
The following figure illustrates a cluster in a box configuration.
In this logical domains (LDoms) guest domain topology, a single cluster spans two different Solaris hosts and each cluster comprises one node on each host. Each LDoms guest domain node acts the same as a Solaris host in a cluster. To learn more about guidelines for using and installing LDoms guest domains or LDoms I/O domains in a cluster, see How to Install Sun Logical Domains Software and Create Domains in Sun Cluster Software Installation Guide for Solaris OS.
The following figure illustrates a configuration in which a single cluster spans two different hosts.
In this logical domains (LDoms) guest domain topology, each cluster spans two different Solaris hosts and each cluster comprises one node on each host. Each LDoms guest domain node acts the same as a Solaris host in a cluster. In this configuration, because both clusters share the same interconnect switch, you must specify a different private network address on each cluster. Otherwise, if you specify the same private network address on clusters that share an interconnect switch, the configuration fails.
To learn more about guidelines for using and installing LDoms guest domains or LDoms I/O domains in a cluster, see How to Install Sun Logical Domains Software and Create Domains in Sun Cluster Software Installation Guide for Solaris OS.
The following figure illustrates a configuration in which more than a single cluster spans two different hosts.
In this logical domains (LDoms) guest domain topology, multiple I/O domains ensure that guest domains, or nodes within the cluster, continue to operate if an I/O domain fails. Each LDoms guest domain node acts the same as a Solaris host in a cluster.
In this topology, the guest domain runs IP network multipathing (IPMP) across two public networks, one through each I/O domain. Guest domains also mirror storage devices across different I/O domains. To learn more about guidelines for using and installing LDoms guest domains or LDoms I/O domains in a cluster, see How to Install Sun Logical Domains Software and Create Domains in Sun Cluster Software Installation Guide for Solaris OS.
The following figure illustrates a configuration in which redundant I/O domains ensure that nodes within the cluster continue to operate if an I/O domain fails.
A topology is the connection scheme that connects the cluster nodes to the storage platforms that are used in the cluster. Sun Cluster supports any topology that adheres to the following guidelines.
Sun Cluster software supports from one to eight Solaris hosts in a cluster. Different hardware configurations impose additional limits on the maximum number of hosts that you can configure in a cluster composed of x86 based systems. See x86: Sun Cluster Topologies for the supported host configurations.
Shared storage devices must connect to hosts.
Sun Cluster does not require you to configure a cluster by using specific topologies. The following clustered pair topology, which is a topology for clusters that are composed of x86 based hosts, is described to provide the vocabulary to discuss a cluster's connection scheme. This topology is a typical connection scheme.
The following section includes a sample diagram of the topology.
A clustered pair topology is two Solaris hosts that operate under a single cluster administrative framework. In this configuration, failover occurs only between a pair. However, all hosts are connected by the cluster interconnect and operate under Sun Cluster software control. You might use this topology to run a parallel database or a failover or scalable application on the pair.
The following figure illustrates a clustered pair configuration.
An N+1 topology includes some number of primary Solaris hosts and one secondary host. You do not have to configure the primary hosts and secondary host identically. The primary hosts actively provide application services. The secondary host need not be idle while waiting for a primary host to fail.
The secondary host is the only host in the configuration that is physically connected to all the multihost storage.
If a failure occurs on a primary host, Sun Cluster fails over the resources to the secondary host. The secondary host is where the resources function until they are switched back (either automatically or manually) to the primary host.
The secondary host must always have enough excess CPU capacity to handle the load if one of the primary hosts fails.
The following figure illustrates an N+1 configuration.