This chapter provides planning information and guidelines for installing a Sun Cluster configuration.
The following overview information is in this chapter:
The following table shows where to find instructions for Sun Cluster software installation tasks and the order in which you should perform the tasks.
Table 1–1 Sun Cluster Software Installation Task Information
| Task | Instructions |
|---|---|
| Set up cluster hardware. | Sun Cluster 3.0-3.1 Hardware Administration Manual for Solaris OS; documentation that shipped with your server and storage devices |
| Plan cluster software installation. | |
| Install software packages. Optionally, install and configure Sun StorEdge QFS software. | Sun StorEdge QFS and Sun StorEdge SAM-FS Software Installation and Configuration Guide |
| Establish a new cluster or a new cluster node. | |
| Install and configure Solstice DiskSuite™ or Solaris Volume Manager software. | Installing and Configuring Solstice DiskSuite or Solaris Volume Manager Software; Solstice DiskSuite or Solaris Volume Manager documentation |
| SPARC: Install and configure VERITAS Volume Manager (VxVM) software. | SPARC: Installing and Configuring VxVM Software; VxVM documentation |
| Configure cluster file systems and other cluster components. | |
| (Optional) SPARC: Install and configure the Sun Cluster module to Sun Management Center. | SPARC: Installing the Sun Cluster Module for Sun Management Center; Sun Management Center documentation |
| Plan, install, and configure resource groups and data services. | Sun Cluster Data Services Planning and Administration Guide for Solaris OS |
| Develop custom data services. | |
| Upgrade to Sun Cluster 3.1 8/05 software. | Chapter 5, Upgrading Sun Cluster Software; Installing and Configuring Solstice DiskSuite or Solaris Volume Manager Software or SPARC: Installing and Configuring VxVM Software; volume manager documentation |
This section provides guidelines for planning Solaris software installation in a cluster configuration. For more information about Solaris software, see your Solaris installation documentation.
You can install Solaris software from a local CD-ROM or from a network installation server by using the JumpStart™ installation method. In addition, Sun Cluster software provides a custom method for installing both the Solaris OS and Sun Cluster software by using the JumpStart installation method. If you are installing several cluster nodes, consider a network installation.
See How to Install Solaris and Sun Cluster Software (JumpStart) for details about the scinstall JumpStart installation method. See your Solaris installation documentation for details about standard Solaris installation methods.
The following Solaris OS features are not supported in a Sun Cluster configuration:
Sun Cluster 3.1 8/05 software does not support non-global zones. All Sun Cluster software and software that is managed by the cluster must be installed only on the global zone of the node. Do not install cluster-related software on a non-global zone. In addition, all cluster-related software must be installed in a way that prevents propagation to a non-global zone that is later created on a cluster node. For more information, see Adding a Package to the Global Zone Only in System Administration Guide: Solaris Containers-Resource Management and Solaris Zones.
Solaris interface groups are not supported in a Sun Cluster configuration. The Solaris interface groups feature is disabled by default during Solaris software installation. Do not re-enable Solaris interface groups. See the ifconfig(1M) man page for more information about Solaris interface groups.
Automatic power-saving shutdown is not supported in Sun Cluster configurations and should not be enabled. See the pmconfig(1M) and power.conf(4) man pages for more information.
Sun Cluster software does not support Extensible Firmware Interface (EFI) disk labels.
Sun Cluster software does not support filtering with Solaris IP Filter. The use of the STREAMS autopush(1M) mechanism by Solaris IP Filter conflicts with Sun Cluster software's use of the mechanism.
Sun Cluster 3.1 8/05 software requires at least the End User Solaris Software Group. However, other components of your cluster configuration might have their own Solaris software requirements as well. Consider the following information when you decide which Solaris software group you are installing.
Check your server documentation for any Solaris software requirements. For example, Sun Enterprise™ 10000 servers require the Entire Solaris Software Group Plus OEM Support.
If you intend to use SCI-PCI adapters, which are available for use in SPARC based clusters only, or the Remote Shared Memory Application Programming Interface (RSMAPI), ensure that you install the RSMAPI software packages (SUNWrsm and SUNWrsmo, and also SUNWrsmx and SUNWrsmox for the Solaris 8 or Solaris 9 OS). The RSMAPI software packages are included only in some Solaris software groups. For example, the Developer Solaris Software Group includes the RSMAPI software packages but the End User Solaris Software Group does not.
If the software group that you install does not include the RSMAPI software packages, install the RSMAPI software packages manually before you install Sun Cluster software. Use the pkgadd(1M) command to manually install the software packages; an example command follows this list. See the Solaris 8 Section (3RSM) man pages for information about using the RSMAPI.
You might need to install other Solaris software packages that are not part of the End User Solaris Software Group. The Apache HTTP server packages are one example. Third-party software, such as ORACLE®, might also require additional Solaris software packages. See your third-party documentation for any Solaris software requirements.
To avoid the need to manually install Solaris software packages, install the Entire Solaris Software Group Plus OEM Support.
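The following is a minimal sketch of a manual RSMAPI package installation, assuming the Solaris media is mounted at /cdrom/cdrom0 and that you need the Solaris 9 packages; adjust the path and package list for your Solaris release and software group:

```
# Hypothetical media path; adjust for your Solaris release and installation source.
# Install the RSMAPI packages before you install Sun Cluster software.
pkgadd -d /cdrom/cdrom0/Solaris_9/Product SUNWrsm SUNWrsmx SUNWrsmo SUNWrsmox
```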
Add this information to the appropriate Local File System Layout Worksheet.
When you install the Solaris OS, ensure that you create the required Sun Cluster partitions and that all partitions meet minimum space requirements.
swap – The combined amount of swap space that is allocated for Solaris and Sun Cluster software must be no less than 750 Mbytes. For best results, add at least 512 Mbytes for Sun Cluster software to the amount that is required by the Solaris OS. In addition, allocate any additional swap amount that is required by applications that are to run on the cluster node.
If you intend to create an additional swap file, do not create the swap file on a global device. Only use a local disk as a swap device for the node.
/globaldevices – Create a 512-Mbyte file system that is to be used by the scinstall(1M) utility for global devices.
Volume manager – Create a 20-Mbyte partition on a slice at the end of the disk (slice 7) for volume manager use. If your cluster uses VERITAS Volume Manager (VxVM) and you intend to encapsulate the root disk, you need to have two unused slices available for use by VxVM.
To meet these requirements, you must customize the partitioning if you are performing interactive installation of the Solaris OS.
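If you instead use the JumpStart installation method, a profile fragment similar to the following sketch could express these partition requirements. The disk name c0t0d0 and the slice sizes are illustrative assumptions, not requirements:

```
# Example JumpStart profile fragment (hypothetical boot disk c0t0d0).
# Sizes follow the minimums described above; adjust for your configuration.
# 512 Mbytes for the Solaris OS plus 512 Mbytes for Sun Cluster software:
filesys c0t0d0s1 1024 swap
# File system that scinstall later uses for global devices:
filesys c0t0d0s3 512 /globaldevices
# Volume-manager slice at the end of the disk:
filesys c0t0d0s7 20 unnamed
# Remaining space for the root (/) file system:
filesys c0t0d0s0 free /
```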
See the following guidelines for additional partition planning information:
As with any other system running the Solaris OS, you can configure the root (/), /var, /usr, and /opt directories as separate file systems. Or, you can include all the directories in the root (/) file system. The following describes the software contents of the root (/), /var, /usr, and /opt directories in a Sun Cluster configuration. Consider this information when you plan your partitioning scheme.
root (/) – The Sun Cluster software itself occupies less than 40 Mbytes of space in the root (/) file system. Solstice DiskSuite or Solaris Volume Manager software requires less than 5 Mbytes, and VxVM software requires less than 15 Mbytes. To configure ample additional space and inode capacity, add at least 100 Mbytes to the amount of space you would normally allocate for your root (/) file system. This space is used for the creation of both block special devices and character special devices used by the volume management software. You especially need to allocate this extra space if a large number of shared disks are in the cluster.
/var – The Sun Cluster software occupies a negligible amount of space in the /var file system at installation time. However, you need to set aside ample space for log files. Also, more messages might be logged on a clustered node than would be found on a typical standalone server. Therefore, allow at least 100 Mbytes for the /var file system.
/usr – Sun Cluster software occupies less than 25 Mbytes of space in the /usr file system. Solstice DiskSuite or Solaris Volume Manager and VxVM software each require less than 15 Mbytes.
/opt – Sun Cluster framework software uses less than 2 Mbytes in the /opt file system. However, each Sun Cluster data service might use between 1 Mbyte and 5 Mbytes. Solstice DiskSuite or Solaris Volume Manager software does not use any space in the /opt file system. VxVM software can use over 40 Mbytes if all of its packages and tools are installed.
In addition, most database and applications software is installed in the /opt file system.
SPARC: If you use Sun Management Center software to monitor the cluster, you need an additional 25 Mbytes of space on each node to support the Sun Management Center agent and Sun Cluster module packages.
Sun Cluster software requires you to set aside a special file system on one of the local disks for use in managing global devices. This file system is later mounted as a cluster file system. Name this file system /globaldevices, which is the default name that is recognized by the scinstall(1M) command.
The scinstall command later renames the file system /global/.devices/node@nodeid, where nodeid represents the number that is assigned to a node when it becomes a cluster member. The original /globaldevices mount point is removed.
The /globaldevices file system must have ample space and ample inode capacity for creating both block special devices and character special devices. This guideline is especially important if a large number of disks are in the cluster. A file system size of 512 Mbytes should suffice for most cluster configurations.
If you use Solstice DiskSuite or Solaris Volume Manager software, you must set aside a slice on the root disk for use in creating the state database replica. Specifically, set aside a slice for this purpose on each local disk. But, if you only have one local disk on a node, you might need to create three state database replicas in the same slice for Solstice DiskSuite or Solaris Volume Manager software to function properly. See your Solstice DiskSuite or Solaris Volume Manager documentation for more information.
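For example, to place three state database replicas on slice 7 of a single local disk, you might run a command similar to the following; the device name c0t0d0s7 is a hypothetical example:

```
# Create three state database replicas on slice 7 of the local disk
# (hypothetical device name c0t0d0s7)
metadb -a -f -c 3 c0t0d0s7
```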
SPARC: If you use VERITAS Volume Manager (VxVM) and you intend to encapsulate the root disk, you need to have two unused slices that are available for use by VxVM. Additionally, you need to have some additional unassigned free space at either the beginning or the end of the disk. See your VxVM documentation for more information about root disk encapsulation.
Table 1–2 shows a partitioning scheme for a cluster node that has less than 750 Mbytes of physical memory. This scheme is to be installed with the End User Solaris Software Group, Sun Cluster software, and the Sun Cluster HA for NFS data service. The last slice on the disk, slice 7, is allocated with a small amount of space for volume-manager use.
This layout allows for the use of either Solstice DiskSuite or Solaris Volume Manager software or VxVM software. If you use Solstice DiskSuite or Solaris Volume Manager software, you use slice 7 for the state database replica. If you use VxVM, you later free slice 7 by assigning the slice a zero length. This layout provides the necessary two free slices, 4 and 7, as well as provides for unused space at the end of the disk.
Table 1–2 Example File-System Allocation
| Slice | Contents | Size Allocation | Description |
|---|---|---|---|
| 0 | / | 6.75GB | Remaining free space on the disk after allocating space to slices 1 through 7. Used for the Solaris OS, Sun Cluster software, data-services software, volume-manager software, Sun Management Center agent and Sun Cluster module agent packages, root file systems, and database and application software. |
| 1 | swap | 1GB | 512 Mbytes for the Solaris OS. 512 Mbytes for Sun Cluster software. |
| 2 | overlap | 8.43GB | The entire disk. |
| 3 | /globaldevices | 512MB | The Sun Cluster software later assigns this slice a different mount point and mounts the slice as a cluster file system. |
| 4 | unused | - | Available as a free slice for encapsulating the root disk under VxVM. |
| 5 | unused | - | - |
| 6 | unused | - | - |
| 7 | volume manager | 20MB | Used by Solstice DiskSuite or Solaris Volume Manager software for the state database replica, or used by VxVM for installation after you free the slice. |
This section provides guidelines for planning and preparing the following components for Sun Cluster software installation and configuration:
For detailed information about Sun Cluster components, see the Sun Cluster Overview for Solaris OS and the Sun Cluster Concepts Guide for Solaris OS.
Ensure that you have available all necessary license certificates before you begin software installation. Sun Cluster software does not require a license certificate, but each node installed with Sun Cluster software must be covered under your Sun Cluster software license agreement.
For licensing requirements for volume-manager software and applications software, see the installation documentation for those products.
After installing each software product, you must also install any required patches.
For information about current required patches, see Patches and Required Firmware Levels in Sun Cluster 3.1 8/05 Release Notes for Solaris OS or consult your Sun service provider.
For general guidelines and procedures for applying patches, see Chapter 8, Patching Sun Cluster Software and Firmware, in Sun Cluster System Administration Guide for Solaris OS.
You must set up a number of IP addresses for various Sun Cluster components, depending on your cluster configuration. Each node in the cluster configuration must have at least one public-network connection to the same set of public subnets.
The following table lists the components that need IP addresses assigned. Add these IP addresses to the following locations:
Any naming services that are used
The local /etc/inet/hosts file on each cluster node, after you install Solaris software
For Solaris 10, the local /etc/inet/ipnodes file on each cluster node, after you install Solaris software
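As an illustration, entries in the local /etc/inet/hosts file might resemble the following sketch; all of the hostnames and addresses shown are hypothetical:

```
# Hypothetical /etc/inet/hosts entries for a two-node cluster
192.168.10.11   phys-schost-1    # cluster node 1, public network
192.168.10.12   phys-schost-2    # cluster node 2, public network
192.168.10.20   schost-lh        # logical hostname used by a data service
192.168.10.30   admincon         # administrative console
```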
For more information about planning IP addresses, see System Administration Guide, Volume 3 (Solaris 8) or System Administration Guide: IP Services (Solaris 9 or Solaris 10).
For more information about test IP addresses to support IP Network Multipathing, see IP Network Multipathing Administration Guide.
You must have console access to all cluster nodes. If you install Cluster Control Panel software on your administrative console, you must provide the hostname of the console-access device that is used to communicate with the cluster nodes.
A terminal concentrator is used to communicate between the administrative console and the cluster node consoles.
A Sun Enterprise 10000 server uses a System Service Processor (SSP) instead of a terminal concentrator.
A Sun Fire™ server uses a system controller instead of a terminal concentrator.
For more information about console access, see the Sun Cluster Concepts Guide for Solaris OS.
Consider the following points when you plan your logical addresses:
Each data-service resource group that uses a logical address must have a hostname specified for each public network from which the logical address can be accessed.
The IP address must be on the same subnet as the test IP address that is used by the IP Network Multipathing group that hosts the logical address.
For more information, see the Sun Cluster Data Services Planning and Administration Guide for Solaris OS. For additional information about data services and resources, also see the Sun Cluster Overview for Solaris OS and the Sun Cluster Concepts Guide for Solaris OS.
Public networks communicate outside the cluster. Consider the following points when you plan your public-network configuration:
Public networks and the private network (cluster interconnect) must use separate adapters. Alternatively, to use the same adapter for both the private interconnect and the public network, you must configure tagged VLANs on tagged-VLAN-capable adapters and VLAN-capable switches.
You must have at least one public network that is connected to all cluster nodes.
You can have as many additional public-network connections as your hardware configuration allows.
Sun Cluster software supports IPv4 addresses on the public network.
Sun Cluster software supports IPv6 addresses on the public network under the following conditions or restrictions:
Sun Cluster software does not support IPv6 addresses on the public network if the private interconnect uses SCI adapters.
On Solaris 9 OS and Solaris 10 OS, Sun Cluster software supports IPv6 addresses for both failover and scalable data services.
On Solaris 8 OS, Sun Cluster software supports IPv6 addresses for failover data services only.
Each public network adapter must belong to an Internet Protocol (IP) Network Multipathing group. See IP Network Multipathing Groups for additional guidelines.
All public network adapters must use network interface cards (NICs) that support local MAC address assignment. Local MAC address assignment is a requirement of IP Network Multipathing.
The local-mac-address? variable must use the default value true for Ethernet adapters. Sun Cluster software does not support a local-mac-address? value of false for Ethernet adapters. This requirement is a change from Sun Cluster 3.0, which did require a local-mac-address? value of false. A verification example follows this list.
During Sun Cluster installation on the Solaris 9 or Solaris 10 OS, the scinstall utility automatically configures a single-adapter IP Network Multipathing group for each public-network adapter. To modify these backup groups after installation, follow the procedures in Administering IPMP (Tasks) in System Administration Guide: IP Services (Solaris 9 or Solaris 10).
Sun Cluster configurations do not support filtering with Solaris IP Filter.
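To verify the local-mac-address? requirement that is described earlier in this list, you can check the OpenBoot PROM variable on each node, for example:

```
# Verify that local MAC address assignment is enabled
eeprom "local-mac-address?"
# Expected output: local-mac-address?=true
# If the value is false, change it and then reboot the node:
eeprom "local-mac-address?=true"
```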
See IP Network Multipathing Groups for guidelines on planning public-network-adapter backup groups. For more information about public-network interfaces, see Sun Cluster Concepts Guide for Solaris OS.
Add this planning information to the Public Networks Worksheet.
Internet Protocol (IP) Network Multipathing groups, which replace Network Adapter Failover (NAFO) groups, provide public-network adapter monitoring and failover, and are the foundation for a network-address resource. A multipathing group provides high availability when the multipathing group is configured with two or more adapters. If one adapter fails, all of the addresses on the failed adapter fail over to another adapter in the multipathing group. In this way, the multipathing-group adapters maintain public-network connectivity to the subnet to which the adapters in the multipathing group connect.
The following describes the circumstances when you must manually configure IP Network Multipathing groups during a Sun Cluster software installation:
For Sun Cluster software installations on the Solaris 8 OS, you must manually configure all public network adapters in IP Network Multipathing groups, with test IP addresses.
If you use SunPlex Installer to install Sun Cluster software on the Solaris 9 or Solaris 10 OS, some but not all public network adapters might need to be manually configured in IP Network Multipathing groups.
For Sun Cluster software installations on the Solaris 9 or Solaris 10 OS, except when using SunPlex Installer, the scinstall utility automatically configures all public network adapters as single-adapter IP Network Multipathing groups.
Consider the following points when you plan your multipathing groups.
Each public network adapter must belong to a multipathing group.
In the following kinds of multipathing groups, you must configure a test IP address for each adapter in the group:
On the Solaris 8 OS, all multipathing groups require a test IP address for each adapter.
On the Solaris 9 or Solaris 10 OS, multipathing groups that contain two or more adapters require test IP addresses. If a multipathing group contains only one adapter, you do not need to configure a test IP address.
Test IP addresses for all adapters in the same multipathing group must belong to a single IP subnet.
Test IP addresses must not be used by normal applications because the test IP addresses are not highly available.
In the /etc/default/mpathd file, the value of TRACK_INTERFACES_ONLY_WITH_GROUPS must be yes.
The name of a multipathing group has no requirements or restrictions.
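As an illustration of these test-address requirements, a two-adapter multipathing group on the Solaris 9 OS might be configured through /etc/hostname.adapter files similar to the following sketch. The adapter names (qfe0, qfe1), the group name sc_ipmp0, and all hostnames are hypothetical:

```
# Hypothetical adapter names (qfe0, qfe1), group name (sc_ipmp0), and hostnames.
# Each file contains a single line of ifconfig options.
cat > /etc/hostname.qfe0 <<'EOF'
phys-schost-1 netmask + broadcast + group sc_ipmp0 up addif phys-schost-1-test1 deprecated -failover netmask + broadcast + up
EOF

cat > /etc/hostname.qfe1 <<'EOF'
phys-schost-1-test2 deprecated -failover netmask + broadcast + group sc_ipmp0 up
EOF
```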
Most procedures, guidelines, and restrictions that are identified in the Solaris documentation for IP Network Multipathing are the same for both cluster and noncluster environments. Therefore, see the appropriate Solaris document for additional information about IP Network Multipathing:
For the Solaris 8 OS, see Deploying Network Multipathing in IP Network Multipathing Administration Guide.
For the Solaris 9 OS, see Chapter 28, Administering Network Multipathing (Task), in System Administration Guide: IP Services.
For the Solaris 10 OS, see Chapter 30, Administering IPMP (Tasks), in System Administration Guide: IP Services.
Also see IP Network Multipathing Groups in Sun Cluster Overview for Solaris OS and Sun Cluster Concepts Guide for Solaris OS.
Consider the following points when you plan the use of Network File System (NFS) in a Sun Cluster configuration.
No Sun Cluster node can be an NFS client of a Sun Cluster HA for NFS-exported file system being mastered on a node in the same cluster. Such cross-mounting of Sun Cluster HA for NFS is prohibited. Use the cluster file system to share files among cluster nodes.
Applications that run locally on the cluster must not lock files on a file system that is exported through NFS. Otherwise, local locking (for example, flock(3UCB) or fcntl(2)) might interfere with the ability to restart the lock manager (lockd(1M)). During restart, a blocked local process might be granted a lock that is intended to be reclaimed by a remote client. This situation would cause unpredictable behavior.
Sun Cluster software does not support the following options of the share_nfs(1M) command:
secure
sec=dh
However, Sun Cluster software does support the following security features for NFS:
The use of secure ports for NFS. You enable secure ports for NFS by adding the entry set nfssrv:nfs_portmon=1 to the /etc/system file on cluster nodes.
The use of Kerberos with NFS. For more information, see Securing Sun Cluster HA for NFS With Kerberos V5 in Sun Cluster Data Service for NFS Guide for Solaris OS.
Observe the following service restrictions for Sun Cluster configurations:
Do not configure cluster nodes as routers (gateways). If the system goes down, the clients cannot find an alternate router and cannot recover.
Do not configure cluster nodes as NIS or NIS+ servers. There is no data service available for NIS or NIS+. However, cluster nodes can be NIS or NIS+ clients.
Do not use a Sun Cluster configuration to provide a highly available boot or installation service on client systems.
Do not use a Sun Cluster configuration to provide an rarpd service.
If you install an RPC service on the cluster, the service must not use any of the following program numbers:
100141
100142
100248
These numbers are reserved for the Sun Cluster daemons rgmd_receptionist, fed, and pmfd, respectively.
If the RPC service that you install also uses one of these program numbers, you must change that RPC service to use a different program number.
Sun Cluster software does not support the running of high-priority process scheduling classes on cluster nodes. Do not run either of the following types of processes on cluster nodes:
Processes that run in the time-sharing scheduling class with a high priority
Processes that run in the real-time scheduling class
Sun Cluster software relies on kernel threads that do not run in the real-time scheduling class. Other time-sharing processes that run at higher-than-normal priority or real-time processes can prevent the Sun Cluster kernel threads from acquiring needed CPU cycles.
This section provides guidelines for the following Sun Cluster components that you configure:
Add this information to the appropriate configuration planning worksheet.
Specify a name for the cluster during Sun Cluster configuration. The cluster name should be unique throughout the enterprise.
The node name is the name that you assign to a machine when you install the Solaris OS. During Sun Cluster configuration, you specify the names of all nodes that you are installing as a cluster. In single-node cluster installations, the default cluster name is the node name.
You do not need to configure a private network for a single-node cluster.
Sun Cluster software uses the private network for internal communication between nodes. A Sun Cluster configuration requires at least two connections to the cluster interconnect on the private network. You specify the private-network address and netmask when you configure Sun Cluster software on the first node of the cluster. You can either accept the default private-network address (172.16.0.0) and netmask (255.255.0.0) or type different choices.
After the installation utility (scinstall, SunPlex Installer, or JumpStart) has finished processing and the cluster is established, you cannot change the private-network address and netmask. You must uninstall and reinstall the cluster software to use a different private-network address or netmask.
If you specify a private-network address other than the default, the address must meet the following requirements:
The address must use zeroes for the last two octets of the address, as in the default address 172.16.0.0. Sun Cluster software requires the last 16 bits of the address space for its own use.
The address must be included in the block of addresses that RFC 1918 reserves for use in private networks. You can contact the InterNIC to obtain copies of RFCs or view RFCs online at http://www.rfcs.org.
You can use the same private network address in more than one cluster. Private IP network addresses are not accessible from outside the cluster.
Sun Cluster software does not support IPv6 addresses for the private interconnect. The system does configure IPv6 addresses on the private network adapters to support scalable services that use IPv6 addresses. But internode communication on the private network does not use these IPv6 addresses.
Although the scinstall utility lets you specify an alternate netmask, best practice is to accept the default netmask, 255.255.0.0. There is no benefit if you specify a netmask that represents a larger network. And the scinstall utility does not accept a netmask that represents a smaller network.
See Planning Your TCP/IP Network in System Administration Guide, Volume 3 (Solaris 8) or Planning Your TCP/IP Network (Tasks), in System Administration Guide: IP Services (Solaris 9 or Solaris 10) for more information about private networks.
The private hostname is the name that is used for internode communication over the private-network interface. Private hostnames are automatically created during Sun Cluster configuration. These private hostnames follow the naming convention clusternodenodeid-priv, where nodeid is the numeral of the internal node ID. During Sun Cluster configuration, the node ID number is automatically assigned to each node when the node becomes a cluster member. After the cluster is configured, you can rename private hostnames by using the scsetup(1M) utility.
You do not need to configure a cluster interconnect for a single-node cluster. However, if you anticipate eventually adding nodes to a single-node cluster configuration, you might want to configure the cluster interconnect for future use.
The cluster interconnects provide the hardware pathways for private-network communication between cluster nodes. Each interconnect consists of a cable that is connected in one of the following ways:
Between two transport adapters
Between a transport adapter and a transport junction
Between two transport junctions
During Sun Cluster configuration, you specify configuration information for two cluster interconnects. You can configure additional private-network connections after the cluster is established by using the scsetup(1M) utility.
For guidelines about cluster interconnect hardware, see Interconnect Requirements and Restrictions in Sun Cluster 3.0-3.1 Hardware Administration Manual for Solaris OS. For general information about the cluster interconnect, see Cluster Interconnect in Sun Cluster Overview for Solaris OS and Sun Cluster Concepts Guide for Solaris OS.
For the transport adapters, such as ports on network interfaces, specify the transport adapter names and transport type. If your configuration is a two-node cluster, you also specify whether your interconnect is direct connected (adapter to adapter) or uses a transport junction.
Consider the following guidelines and restrictions:
IPv6 - Sun Cluster software does not support IPv6 communications over the private interconnects.
Local MAC address assignment - All private network adapters must use network interface cards (NICs) that support local MAC address assignment. Link-local IPv6 addresses, which are required on private network adapters to support IPv6 public network addresses, are derived from the local MAC addresses.
Tagged VLAN adapters – Sun Cluster software supports tagged Virtual Local Area Networks (VLANs) to share an adapter between the private interconnect and the public network. To configure a tagged VLAN adapter for the private interconnect, specify the adapter name and its VLAN ID (VID) in one of the following ways:
Specify the usual adapter name, which is the device name plus the instance number or physical point of attachment (PPA). For example, the name of instance 2 of a Cassini Gigabit Ethernet adapter would be ce2. If the scinstall utility asks whether the adapter is part of a shared virtual LAN, answer yes and specify the adapter's VID number.
Specify the adapter by its VLAN virtual device name. This name is composed of the adapter name plus the VLAN instance number. The VLAN instance number is derived from the formula (1000*V)+N, where V is the VID number and N is the PPA.
As an example, for VID 73 on adapter ce2, the VLAN instance number would be calculated as (1000*73)+2. You would therefore specify the adapter name as ce73002 to indicate that it is part of a shared virtual LAN.
For more information about VLAN, see Configuring VLANs in Solaris 9 9/04 Sun Hardware Platform Guide.
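The following shell arithmetic illustrates this calculation for the example above (VID 73 on adapter ce2):

```
# Compute the VLAN virtual device name from the VID and the PPA
VID=73
PPA=2
echo "ce$(( 1000 * VID + PPA ))"    # prints ce73002
```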
SBus SCI adapters – The SBus Scalable Coherent Interface (SCI) is not supported as a cluster interconnect. However, the SCI–PCI interface is supported.
Logical network interfaces - Logical network interfaces are reserved for use by Sun Cluster software.
See the scconf_trans_adap_*(1M) family of man pages for information about a specific transport adapter.
If you use transport junctions, such as a network switch, specify a transport junction name for each interconnect. You can use the default name switchN, where N is a number that is automatically assigned during configuration, or create another name. The exception is the Sun Fire Link adapter, which requires the junction name sw-rsmN. The scinstall utility automatically uses this junction name after you specify a Sun Fire Link adapter (wrsmN).
Also specify the junction port name or accept the default name. The default port name is the same as the internal node ID number of the node that hosts the adapter end of the cable. However, you cannot use the default port name for certain adapter types, such as SCI-PCI.
Clusters with three or more nodes must use transport junctions. Direct connection between cluster nodes is supported only for two-node clusters.
If your two-node cluster is direct connected, you can still specify a transport junction for the interconnect.
If you specify a transport junction, you can more easily add another node to the cluster in the future.
Sun Cluster configurations use quorum devices to maintain data and resource integrity. If the cluster temporarily loses connection to a node, the quorum device prevents amnesia or split-brain problems when the cluster node attempts to rejoin the cluster. During Sun Cluster installation of a two-node cluster, the scinstall utility automatically configures a quorum device. The quorum device is chosen from the available shared storage disks. The scinstall utility assumes that all available shared storage disks are supported to be quorum devices. After installation, you can also configure additional quorum devices by using the scsetup(1M) utility.
You do not need to configure quorum devices for a single-node cluster.
If your cluster configuration includes third-party shared storage devices that are not supported for use as quorum devices, you must use the scsetup utility to configure quorum manually.
Consider the following points when you plan quorum devices.
Minimum – A two-node cluster must have at least one quorum device, which can be a shared disk or a Network Appliance NAS device. For other topologies, quorum devices are optional.
Odd-number rule – If more than one quorum device is configured in a two-node cluster, or in a pair of nodes directly connected to the quorum device, configure an odd number of quorum devices. This configuration ensures that the quorum devices have completely independent failure pathways.
Connection – You must connect a quorum device to at least two nodes.
For more information about quorum devices, see Quorum and Quorum Devices in Sun Cluster Concepts Guide for Solaris OS and Quorum Devices in Sun Cluster Overview for Solaris OS.
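For example, to add a shared disk as a quorum device after installation, you might run a command such as the following from one node, where d20 is a hypothetical device-ID (DID) name for a shared disk; you can also use the interactive scsetup utility instead:

```
# Add shared disk d20 (hypothetical DID device) as a quorum device
scconf -a -q globaldev=d20
```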
This section provides the following guidelines for planning global devices and for planning cluster file systems:
For more information about global devices and about cluster files systems, see Sun Cluster Overview for Solaris OS and Sun Cluster Concepts Guide for Solaris OS.
Sun Cluster software does not require any specific disk layout or file system size. Consider the following points when you plan your layout for global devices and for cluster file systems.
Mirroring – You must mirror all global devices for the global device to be considered highly available. You do not need to use software mirroring if the storage device provides hardware RAID as well as redundant paths to disks.
Disks – When you mirror, lay out file systems so that the file systems are mirrored across disk arrays.
Availability – You must physically connect a global device to more than one node in the cluster for the global device to be considered highly available. A global device with multiple physical connections can tolerate a single-node failure. A global device with only one physical connection is supported, but the global device becomes inaccessible from other nodes if the node with the connection is down.
Swap devices - Do not create a swap file on a global device.
Consider the following points when you plan cluster file systems.
Loopback file system (LOFS) - Do not use the loopback file system (LOFS) if both conditions in the following list are met:
Sun Cluster HA for NFS is configured on a highly available local file system.
The automountd daemon is running.
If both of these conditions are met, LOFS must be disabled to avoid switchover problems or other failures. If only one of these conditions is met, it is safe to enable LOFS.
If you require both LOFS and the automountd daemon to be enabled, exclude from the automounter map all files that are part of the highly available local file system that is exported by Sun Cluster HA for NFS.
Process accounting log files - Do not locate process accounting log files on a cluster file system or on a highly available local file system. A switchover would be blocked by writes to the log file, which would cause the node to hang. Use only a local file system to contain process accounting log files.
Communication endpoints - The cluster file system does not support any of the file-system features of Solaris software by which one would put a communication endpoint in the file-system namespace.
Although you can create a UNIX domain socket whose name is a path name into the cluster file system, the socket would not survive a node failover.
Any FIFOs or named pipes that you create on a cluster file system would not be globally accessible.
Therefore, do not attempt to use the fattach command from any node other than the local node.
Add this planning information to the Disk Device Group Configurations Worksheet.
You must configure all volume-manager disk groups as Sun Cluster disk device groups. This configuration enables a secondary node to host multihost disks if the primary node fails. Consider the following points when you plan disk device groups.
Failover – You can configure multihost disks and properly configured volume-manager devices as failover devices. Proper configuration of a volume-manager device includes multihost disks and correct setup of the volume manager itself. This configuration ensures that multiple nodes can host the exported device. You cannot configure tape drives, CD-ROMs, or single-ported devices as failover devices.
Mirroring – You must mirror the disks to protect the data from disk failure. See Mirroring Guidelines for additional guidelines. See Installing and Configuring Solstice DiskSuite or Solaris Volume Manager Software or SPARC: Installing and Configuring VxVM Software and your volume-manager documentation for instructions about mirroring.
For more information about disk device groups, see Devices in Sun Cluster Overview for Solaris OS and Sun Cluster Concepts Guide for Solaris OS.
Consider the following points when you plan mount points for cluster file systems.
Mount-point location – Create mount points for cluster file systems in the /global directory, unless you are prohibited by other software products. By using the /global directory, you can more easily distinguish cluster file systems, which are globally available, from local file systems.
SPARC: VxFS mount requirement – If you use VERITAS File System (VxFS), globally mount and unmount a VxFS file system from the primary node. The primary node is the node that masters the disk on which the VxFS file system resides. This method ensures that the mount or unmount operation succeeds. A VxFS file-system mount or unmount operation that is performed from a secondary node might fail.
The following VxFS features are not supported in a Sun Cluster 3.1 cluster file system. They are, however, supported in a local file system.
Quick I/O
Snapshots
Storage checkpoints
VxFS-specific mount options:
convosync (Convert O_SYNC)
mincache
qlog, delaylog, tmplog
VERITAS cluster file system (requires VxVM cluster feature & VERITAS Cluster Server)
Cache advisories can be used, but the effect is observed on the given node only.
All other VxFS features and options that are supported in a cluster file system are supported by Sun Cluster 3.1 software. See VxFS documentation for details about VxFS options that are supported in a cluster configuration.
Nesting mount points – Normally, you should not nest the mount points for cluster file systems. For example, do not set up one file system that is mounted on /global/a and another file system that is mounted on /global/a/b. To ignore this rule can cause availability and node boot-order problems. These problems would occur if the parent mount point is not present when the system attempts to mount a child of that file system. The only exception to this rule is if the devices for the two file systems have the same physical node connectivity. An example is different slices on the same disk.
forcedirectio - Sun Cluster software does not support the execution of binaries off cluster file systems that are mounted by using the forcedirectio mount option.
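For example, a /etc/vfstab entry for a globally mounted UFS cluster file system might resemble the following sketch; the metadevice paths and the /global/nfs mount point are hypothetical:

```
# Hypothetical /etc/vfstab entry for a logging UFS cluster file system
# device to mount        device to fsck            mount point  FS type  pass  mount at boot  options
/dev/md/nfsset/dsk/d30   /dev/md/nfsset/rdsk/d30   /global/nfs  ufs      2     yes            global,logging
```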
Add this planning information to the Disk Device Group Configurations Worksheet and the Volume-Manager Configurations Worksheet. For Solstice DiskSuite or Solaris Volume Manager, also add this planning information to the Metadevices Worksheet (Solstice DiskSuite or Solaris Volume Manager).
This section provides the following guidelines for planning volume management of your cluster configuration:
Sun Cluster software uses volume-manager software to group disks into disk device groups which can then be administered as one unit. Sun Cluster software supports Solstice DiskSuite or Solaris Volume Manager software and VERITAS Volume Manager (VxVM) software that you install or use in the following ways.
Table 1–4 Supported Use of Volume Managers With Sun Cluster Software
| Volume-Manager Software | Requirements |
|---|---|
| Solstice DiskSuite or Solaris Volume Manager | You must install Solstice DiskSuite or Solaris Volume Manager software on all nodes of the cluster, regardless of whether you use VxVM on some nodes to manage disks. |
| SPARC: VxVM with the cluster feature | You must install and license VxVM with the cluster feature on all nodes of the cluster. |
| SPARC: VxVM without the cluster feature | You are only required to install and license VxVM on those nodes that are attached to storage devices that VxVM manages. |
| SPARC: Both Solstice DiskSuite or Solaris Volume Manager and VxVM | If you install both volume managers on the same node, you must use Solstice DiskSuite or Solaris Volume Manager software to manage disks that are local to each node. Local disks include the root disk. Use VxVM to manage all shared disks. |
See your volume-manager documentation and Installing and Configuring Solstice DiskSuite or Solaris Volume Manager Software or SPARC: Installing and Configuring VxVM Software for instructions about how to install and configure the volume-manager software. For more information about volume management in a cluster configuration, see the Sun Cluster Concepts Guide for Solaris OS.
Consider the following general guidelines when you configure your disks with volume-manager software:
Software RAID – Sun Cluster software does not support software RAID 5.
Mirrored multihost disks – You must mirror all multihost disks across disk expansion units. See Guidelines for Mirroring Multihost Disks for guidelines on mirroring multihost disks. You do not need to use software mirroring if the storage device provides hardware RAID as well as redundant paths to devices.
Mirrored root – Mirroring the root disk ensures high availability, but such mirroring is not required. See Mirroring Guidelines for guidelines about deciding whether to mirror the root disk.
Unique naming – You might have local Solstice DiskSuite metadevices, local Solaris Volume Manager volumes, or VxVM volumes that are used as devices on which the /global/.devices/node@nodeid file systems are mounted. If so, the name of each local metadevice or local volume on which a /global/.devices/node@nodeid file system is to be mounted must be unique throughout the cluster.
Node lists – To ensure high availability of a disk device group, make its node lists of potential masters and its failback policy identical to any associated resource group. Or, if a scalable resource group uses more nodes than its associated disk device group, make the scalable resource group's node list a superset of the disk device group's node list. See the resource group planning information in the Sun Cluster Data Services Planning and Administration Guide for Solaris OS for information about node lists.
Multihost disks – You must connect, or port, all devices that are used to construct a device group to all of the nodes that are configured in the node list for that device group. Solstice DiskSuite or Solaris Volume Manager software can automatically check for this connection at the time that devices are added to a disk set. However, configured VxVM disk groups do not have an association to any particular set of nodes.
Hot spare disks – You can use hot spare disks to increase availability, but hot spare disks are not required.
See your volume-manager documentation for disk layout recommendations and any additional restrictions.
Consider the following points when you plan Solstice DiskSuite or Solaris Volume Manager configurations:
Local metadevice names or volume names – The name of each local Solstice DiskSuite metadevice or Solaris Volume Manager volume on which a global–devices file system, /global/.devices/node@nodeid, is mounted must be unique throughout the cluster. Also, the name cannot be the same as any device-ID name.
Dual-string mediators – Each disk set configured with exactly two disk strings and mastered by exactly two nodes must have Solstice DiskSuite or Solaris Volume Manager mediators configured for the disk set. A disk string consists of a disk enclosure, its physical disks, cables from the enclosure to the node(s), and the interface adapter cards. Observe the following rules to configure dual-string mediators:
You must configure each disk set with exactly two nodes that act as mediator hosts.
You must use the same two nodes for all disk sets that require mediators. Those two nodes must master those disk sets.
Mediators cannot be configured for disk sets that do not meet the two-string and two-host requirements.
See the mediator(7D) man page for details.
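For example, to configure the two nodes that master a disk set as its mediator hosts, you might run a command such as the following; the disk set name nfsset and the node names are hypothetical:

```
# Add the two mediator hosts to the disk set (hypothetical names)
metaset -s nfsset -a -m phys-schost-1 phys-schost-2
```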
/kernel/drv/md.conf settings – All Solstice DiskSuite metadevices or Solaris 9 Solaris Volume Manager volumes used by each disk set are created in advance, at reconfiguration boot time. This reconfiguration is based on the configuration parameters that exist in the /kernel/drv/md.conf file.
With the Solaris 10 release, Solaris Volume Manager has been enhanced to configure volumes dynamically. You no longer need to edit the nmd and the md_nsets parameters in the /kernel/drv/md.conf file. New volumes are dynamically created, as needed.
You must modify the nmd and md_nsets fields as follows to support a Sun Cluster configuration on the Solaris 8 or Solaris 9 OS:
All cluster nodes must have identical /kernel/drv/md.conf files, regardless of the number of disk sets that are served by each node. Failure to follow this guideline can result in serious Solstice DiskSuite or Solaris Volume Manager errors and possible loss of data.
md_nsets – The md_nsets field defines the total number of disk sets that can be created for a system to meet the needs of the entire cluster. Set the value of md_nsets to the expected number of disk sets in the cluster plus one additional disk set. Solstice DiskSuite or Solaris Volume Manager software uses the additional disk set to manage the private disks on the local host.
The maximum number of disk sets that are allowed per cluster is 32. This number allows for 31 disk sets for general use plus one disk set for private disk management. The default value of md_nsets is 4.
nmd – The nmd field defines the highest predicted value of any metadevice or volume name that will exist in the cluster. For example, if the highest value of the metadevice or volume names that are used in the first 15 disk sets of a cluster is 10, but the highest value of the metadevice or volume in the 16th disk set is 1000, set the value of nmd to at least 1000. Also, the value of nmd must be large enough to ensure that enough numbers exist for each device–ID name. The number must also be large enough to ensure that each local metadevice name or local volume name can be unique throughout the cluster.
The highest allowed value of a metadevice or volume name per disk set is 8192. The default value of nmd is 128.
Set these fields at installation time to allow for all predicted future expansion of the cluster. To increase the value of these fields after the cluster is in production is time consuming. The value change requires a reconfiguration reboot for each node. To raise these values later also increases the possibility of inadequate space allocation in the root (/) file system to create all of the requested devices.
At the same time, keep the value of the nmd field and the md_nsets field as low as possible. Memory structures exist for all possible devices as determined by nmd and md_nsets, even if you have not created those devices. For optimal performance, keep the value of nmd and md_nsets only slightly higher than the number of metadevices or volumes you plan to use.
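For example, a cluster that expects to use as many as 10 disk sets and metadevice or volume names up to d1024 might use an entry similar to the following sketch in /kernel/drv/md.conf on every node; the values shown are illustrative assumptions:

```
# Excerpt from /kernel/drv/md.conf (must be identical on all cluster nodes)
# md_nsets = 10 expected disk sets + 1 for private (local) disk management
# nmd      = highest metadevice or volume name value that will be used
name="md" parent="pseudo" nmd=1024 md_nsets=11;
```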
See System and Startup Files in Solstice DiskSuite 4.2.1 Reference Guide (Solaris 8) or System Files and Startup Files in Solaris Volume Manager Administration Guide (Solaris 9 or Solaris 10) for more information about the md.conf file.
Consider the following points when you plan VERITAS Volume Manager (VxVM) configurations.
Enclosure-Based Naming – If you use Enclosure-Based Naming of devices, ensure that you use consistent device names on all cluster nodes that share the same storage. VxVM does not coordinate these names, so the administrator must ensure that VxVM assigns the same names to the same devices from different nodes. Failure to assign consistent names does not interfere with correct cluster behavior. However, inconsistent names greatly complicate cluster administration and greatly increase the possibility of configuration errors, potentially leading to loss of data.
Root disk group – As of VxVM 4.0, the creation of a root disk group is optional.
A root disk group can be created on the following disks:
The root disk, which must be encapsulated
One or more local nonroot disks, which you can encapsulate or initialize
A combination of root and local nonroot disks
The root disk group must be local to the node.
Simple root disk groups – Simple root disk groups (rootdg created on a single slice of the root disk) are not supported as disk types with VxVM on Sun Cluster software. This is a general VxVM software restriction.
Encapsulation – Disks to be encapsulated must have two disk-slice table entries free.
Number of volumes – Estimate the maximum number of volumes any given disk device group can use at the time the disk device group is created.
If the number of volumes is less than 1000, you can use default minor numbering.
If the number of volumes is 1000 or greater, you must carefully plan the way in which minor numbers are assigned to disk device group volumes. No two disk device groups can have overlapping minor number assignments.
Dirty Region Logging – The use of Dirty Region Logging (DRL) decreases volume recovery time after a node failure. Using DRL might decrease I/O throughput.
Dynamic Multipathing (DMP) – The use of DMP alone to manage multiple I/O paths per node to the shared storage is not supported. The use of DMP is supported only in the following configurations:
A single I/O path per node to the cluster's shared storage.
A supported multipathing solution, such as Sun Traffic Manager, EMC PowerPath, or Hitachi HDLM, that manages multiple I/O paths per node to the shared cluster storage.
See your VxVM installation documentation for additional information.
Logging is required for UFS and VxFS cluster file systems. This requirement does not apply to QFS shared file systems. Sun Cluster software supports the following choices of file-system logging:
Solaris UFS logging – See the mount_ufs(1M) man page for more information.
Solstice DiskSuite trans-metadevice logging or Solaris Volume Manager transactional-volume logging – See Chapter 2, Creating DiskSuite Objects, in Solstice DiskSuite 4.2.1 User’s Guide or Transactional Volumes (Overview) in Solaris Volume Manager Administration Guide for more information. Transactional volumes are no longer valid as of the Solaris 10 release of Solaris Volume Manager.
SPARC: VERITAS File System (VxFS) logging – See the mount_vxfs man page provided with VxFS software for more information.
The following table lists the file-system logging supported by each volume manager.
Table 1–5 Supported File System Logging Matrix
Consider the following points when you choose between Solaris UFS logging and Solstice DiskSuite trans-metadevice logging or Solaris Volume Manager transactional-volume logging for UFS cluster file systems:
Solaris Volume Manager transactional-volume logging (formerly Solstice DiskSuite trans-metadevice logging) is scheduled to be removed from the Solaris OS in an upcoming Solaris release. Solaris UFS logging provides the same capabilities but superior performance, as well as lower system administration requirements and overhead.
Solaris UFS log size – Solaris UFS logging always allocates the log by using free space on the UFS file system. The log size depends on the size of the file system.
On file systems less than 1 Gbyte, the log occupies 1 Mbyte.
On file systems 1 Gbyte or greater, the log occupies 1 Mbyte per Gbyte on the file system, to a maximum of 64 Mbytes.
Log metadevice/transactional volume – A Solstice DiskSuite trans metadevice or Solaris Volume Manager transactional volume manages UFS logging. The logging device component of a trans metadevice or transactional volume is a metadevice or volume that you can mirror and stripe. You can create a maximum 1-Gbyte log size, although 64 Mbytes is sufficient for most file systems. The minimum log size is 1 Mbyte.
This section provides the following guidelines for planning the mirroring of your cluster configuration:
To mirror all multihost disks in a Sun Cluster configuration enables the configuration to tolerate single-device failures. Sun Cluster software requires that you mirror all multihost disks across expansion units. You do not need to use software mirroring if the storage device provides hardware RAID as well as redundant paths to devices.
Consider the following points when you mirror multihost disks:
Separate disk expansion units – Each submirror of a given mirror or plex should reside in a different multihost expansion unit.
Disk space – Mirroring doubles the amount of necessary disk space.
Three-way mirroring – Solstice DiskSuite or Solaris Volume Manager software and VERITAS Volume Manager (VxVM) software support three-way mirroring. However, Sun Cluster software requires only two-way mirroring.
Differing device sizes – If you mirror to a device of a different size, your mirror capacity is limited to the size of the smallest submirror or plex.
For more information about multihost disks, see Multihost Disk Storage in Sun Cluster Overview for Solaris OS and Sun Cluster Concepts Guide for Solaris OS.
Add this planning information to the Local File System Layout Worksheet.
For maximum availability, mirror root (/), /usr, /var, /opt, and swap on the local disks. Under VxVM, you encapsulate the root disk and mirror the generated subdisks. However, Sun Cluster software does not require that you mirror the root disk.
Before you decide whether to mirror the root disk, consider the risks, complexity, cost, and service time for the various alternatives that concern the root disk. No single mirroring strategy works for all configurations. You might want to consider your local Sun service representative's preferred solution when you decide whether to mirror root.
See your volume-manager documentation and Installing and Configuring Solstice DiskSuite or Solaris Volume Manager Software or SPARC: Installing and Configuring VxVM Software for instructions about how to mirror the root disk.
Consider the following points when you decide whether to mirror the root disk.
Boot disk – You can set up the mirror to be a bootable root disk. You can then boot from the mirror if the primary boot disk fails.
Complexity – To mirror the root disk adds complexity to system administration. To mirror the root disk also complicates booting in single-user mode.
Backups – Regardless of whether you mirror the root disk, you also should perform regular backups of root. Mirroring alone does not protect against administrative errors. Only a backup plan enables you to restore files that have been accidentally altered or deleted.
Quorum devices – Do not use a disk that was configured as a quorum device to mirror a root disk.
Quorum – Under Solstice DiskSuite or Solaris Volume Manager software, in failure scenarios in which state database quorum is lost, you cannot reboot the system until maintenance is performed. See your Solstice DiskSuite or Solaris Volume Manager documentation for information about the state database and state database replicas.
Separate controllers – Highest availability includes mirroring the root disk on a separate controller.
Secondary root disk – With a mirrored root disk, the primary root disk can fail but work can continue on the secondary (mirror) root disk. Later, the primary root disk might return to service, for example, after a power cycle or transient I/O errors. Subsequent boots are then performed by using the primary root disk that is specified for the eeprom(1M) boot-device parameter. In this situation, no manual repair task occurs, but the drive starts working well enough to boot. With Solstice DiskSuite or Solaris Volume Manager software, a resync does occur. A resync requires a manual step when the drive is returned to service.
If changes were made to any files on the secondary (mirror) root disk, they would not be reflected on the primary root disk during boot time. This condition would cause a stale submirror. For example, changes to the /etc/system file would be lost. With Solstice DiskSuite or Solaris Volume Manager software, some administrative commands might have changed the /etc/system file while the primary root disk was out of service.
The boot program does not check whether the system is booting from a mirror or from an underlying physical device. The mirroring becomes active partway through the boot process, after the metadevices or volumes are loaded. Before this point, the system is therefore vulnerable to stale submirror problems.