Sun Cluster 2.2 Software Installation Guide

Configuration Rules for Improved Reliability

The rules discussed in this section help ensure that your Sun Cluster configuration is highly available. These rules also help determine the appropriate hardware for your configuration.

Configuring redundant hardware is not always possible--some configurations might contain only one system board--but some single points of failure can still be eliminated easily with hardware options. For example, in an Ultra Enterprise(TM) 2 Cluster with two SPARCstorage Arrays, one private network can be connected to a Sun Quad FastEthernet(TM) Controller card (SQEC), while the other private network can be connected to the on-board interface.

Mirroring Guidelines

Unless you are using a RAID5 configuration, all multihost disks must be mirrored in Sun Cluster configurations. This enables the configuration to tolerate single-disk failures.
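As an illustration of such mirroring under Solstice DiskSuite, a multihost disk can be mirrored by building two single-slice concatenations and attaching them as submirrors. This is a sketch only; the metadevice names (d10, d20, d0) and slice names (c1t0d0s0, c2t0d0s0) are placeholders, not values from this guide.

```shell
# Create two single-slice concatenations, placing each submirror
# on a disk in a different multihost expansion unit (slice names
# here are examples only).
metainit d10 1 1 c1t0d0s0
metainit d20 1 1 c2t0d0s0

# Create a one-way mirror using the first submirror.
metainit d0 -m d10

# Attach the second submirror; DiskSuite resyncs it online.
metattach d0 d20
```

Placing the two submirrors on disks reached through different controllers keeps a single controller failure from taking down both halves of the mirror.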

Consider these points when mirroring multihost disks:

Mirroring Root (/)

For maximum availability, you should mirror root (/), /usr, /var, /opt, and swap on the local disks. Under VERITAS Volume Manager, this means encapsulating the root disk and mirroring the generated subdisks. However, mirroring the root disk is not a requirement of Sun Cluster.
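The encapsulate-and-mirror procedure under VERITAS Volume Manager can be sketched as follows. This is an outline under assumptions, not a complete procedure; the disk media name "disk01" is a placeholder, and your volume manager documentation remains the authoritative reference.

```shell
# Encapsulate the boot disk using the vxdiskadm menu
# ("Encapsulate one or more disks"), then reboot when prompted
# so the root file systems are converted to volumes.
vxdiskadm

# After the encapsulation reboot, mirror the root volumes onto
# a second local disk ("disk01" is a placeholder media name).
/etc/vx/bin/vxrootmir disk01
```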

You should consider the risks, complexity, cost, and service time for the various alternatives concerning the root disk. There is not one answer for all configurations. You might want to consider your local Enterprise Services representative's preferred solution when deciding whether to mirror root.

Refer to your volume manager documentation for instructions on mirroring root.

Consider the following issues when deciding whether to mirror the root file system.

Note, however, that nothing in the Sun Cluster software guarantees an immediate takeover when the root disk fails. In fact, the takeover might not occur at all. For example, suppose some sectors of a disk are bad, and all of them lie in the user data portion of a file that is crucial to some data service. The data service will start getting I/O errors, but the Sun Cluster node will stay up.

At a later point, the primary root disk might return to service (perhaps after a power cycle, or after transient I/O errors clear), and subsequent boots are performed using the primary root disk specified in the OpenBoot(TM) PROM boot-device field. Note that no Solstice DiskSuite resync has occurred--a resync requires a manual step when the drive is returned to service.
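The manual step mentioned above can be sketched with Solstice DiskSuite commands; the metadevice name d0 and slice c0t0d0s0 are illustrative placeholders.

```shell
# Check which submirror components were marked in error
# while the primary root disk was out of service.
metastat d0

# Re-enable the errored component in place; DiskSuite then
# resyncs it from the surviving submirror.
metareplace -e d0 c0t0d0s0
```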

In this situation, no manual repair task was performed--the drive simply started working "well enough" to boot.

Any changes made to files on the secondary (mirror) root device while the primary was out of service would not be reflected on the primary root device at boot time, leaving the submirrors inconsistent (a stale submirror). For example, changes to /etc/system would be lost; some Solstice DiskSuite administrative commands might have changed /etc/system while the primary root device was out of service.

The boot program does not know whether it is booting from a mirror or an underlying physical device, and the mirroring becomes active part way through the boot process (after the metadevices are loaded). Before this point the system is vulnerable to stale submirror problems.
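For reference, when root (/) is placed under a Solstice DiskSuite metadevice, the change is recorded in /etc/system and /etc/vfstab so the kernel can find the root metadevice early in boot. The fragment below is a representative sketch, not copied from any particular system; the metadevice name d0 is a placeholder.

```
* /etc/system entry added when root (/) is a metadevice:
rootdev:/pseudo/md@0:0,0,blk

* /etc/vfstab entry for a mirrored root (d0 is illustrative):
/dev/md/dsk/d0  /dev/md/rdsk/d0  /  ufs  1  no  -
```

Until the md driver is loaded and these entries take effect, the system is reading only the underlying physical device, which is why stale submirror problems can occur early in boot.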

Solstice DiskSuite Mirroring Alternatives

Consider the following alternatives when deciding whether to mirror root (/) file systems under Solstice DiskSuite. The issues mentioned in this section are not applicable to VERITAS Volume Manager configurations.