Open HA Cluster Installation Guide

Procedure: How to Configure Open HA Cluster Software on All Nodes (scinstall)

Perform this procedure from one node of the cluster to configure Open HA Cluster software on both nodes of the cluster.


Note –

This procedure uses the interactive form of the scinstall command. To use the noninteractive forms of the scinstall command, such as when developing installation scripts, see the scinstall(1M) man page.


Before You Begin

Perform the following tasks:


Note –

For the global-devices file system, use only a lofi device. Do not attempt to configure a dedicated /globaldevices partition. Respond “No” to all prompts that ask whether to use or create a file system. After you decline to configure a file system, the scinstall utility prompts you to create a lofi device.


Follow these guidelines to use the interactive scinstall utility in this procedure:

  1. On each node to configure in a cluster, become superuser.

    Alternatively, if your user account is assigned the Primary Administrator profile, execute commands as non-root through a profile shell, or prefix the command with the pfexec command.
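
    For example, if your account is assigned the Primary Administrator profile, you can run a privileged command as non-root by prefixing it with pfexec. The following usage is only an illustration, shown with a command that appears later in this procedure:

    phys-schost$ pfexec svcadm disable svc:/network/physical:nwam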

  2. On each node, disable Network Auto-Magic (NWAM).

    NWAM activates a single network interface and disables all others. For this reason, NWAM cannot coexist with Open HA Cluster 2009.06 software and you must disable it before you configure or run your cluster.

    1. On each cluster node, determine whether NWAM is enabled or disabled.


      phys-schost# svcs -a | grep /network/physical
      
      • If NWAM is enabled, output is similar to the following:


         online           Mar_13   svc:/network/physical:nwam
         disabled         Mar_13   svc:/network/physical:default
      • If NWAM is disabled, output is similar to the following:


         disabled          Mar_13  svc:/network/physical:nwam
         online            Mar_13  svc:/network/physical:default
    2. If NWAM is enabled on a node, disable it.


      phys-schost# svcadm disable svc:/network/physical:nwam
      phys-schost# svcadm enable svc:/network/physical:default
      
  3. On each node, configure each public-network adapter.

    1. Determine which adapters are on the system.


      phys-schost# dladm show-link
      
    2. Plumb an adapter.


      phys-schost# ifconfig adapter plumb up
      
    3. Assign an IP address and netmask to the adapter.


      phys-schost# ifconfig adapter IPaddress netmask netmask
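
      For example, assuming an adapter named e1000g0 and illustrative address values (substitute the values from your configuration planning worksheet):

      phys-schost# ifconfig e1000g0 192.168.1.10 netmask 255.255.255.0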
      
    4. Verify that the adapter is up.

      Ensure that the command output contains the UP flag.


      phys-schost# ifconfig -a
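
      For example, output for a configured adapter might resemble the following, where the UP flag appears in the flags list. The adapter name and addresses shown are illustrative:

      e1000g0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
              inet 192.168.1.10 netmask ffffff00 broadcast 192.168.1.255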
      
    5. Create a configuration file for the adapter.

      This file ensures that the configuration of the adapter persists across reboots.


      phys-schost# vi /etc/hostname.adapter
      IPaddress
      
    6. Repeat Step b through Step e for each public-network adapter on both nodes.

    7. On both nodes, add an entry to the /etc/inet/hosts file for each public-network adapter that you configured on each node.


      phys-schost# vi /etc/inet/hosts
      hostname IPaddress
      
    8. If you use a naming service, add the hostname and IP address of each public-network adapter that you configured.
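
      After you update the naming service, you can optionally confirm that each hostname resolves as expected. This check is only an illustration and assumes that the hosts entry in the /etc/nsswitch.conf file includes your naming service:

      phys-schost# getent hosts hostname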

    9. Reboot each node.


      phys-schost# /usr/sbin/shutdown -y -g0 -i6
      
    10. Verify that all adapters are configured and up.


      phys-schost# ifconfig -a
      
  4. On each node, enable the minimal RPC services that are necessary to enable the interactive scinstall utility.

    When OpenSolaris software is installed, a restricted network profile is automatically configured. This profile is too restrictive for the cluster private network to function. To enable private-network functionality, run the following commands:


    phys-schost# svccfg
    svc:> select network/rpc/bind
    svc:/network/rpc/bind> setprop config/local_only=false
    svc:/network/rpc/bind> quit
     
    phys-schost# svcadm refresh network/rpc/bind:default
    phys-schost# svcprop network/rpc/bind:default | grep local_only
    

    The output of the last command should show that the local_only property is now set to false.
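
    For example, the output of the svcprop command might resemble the following line:

    config/local_only boolean false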

    For more information about re-enabling network services, see Planning Network Security in Solaris 10 5/08 Installation Guide: Planning for Installation and Upgrade.

  5. From one cluster node, start the scinstall utility.


    phys-schost# /usr/cluster/bin/scinstall
    
  6. Type the option number for Create a New Cluster or Add a Cluster Node and press the Return key.


     *** Main Menu ***
    
        Please select from one of the following (*) options:
    
          * 1) Create a new cluster or add a cluster node
          * 2) Print release information for this cluster node
     
          * ?) Help with menu options
          * q) Quit
    
        Option:  1
    

    The New Cluster and Cluster Node Menu is displayed.

  7. Type the option number for Create a New Cluster and press the Return key.

    The Typical or Custom Mode menu is displayed.

  8. Type the option number for either Typical or Custom and press the Return key.

    The Create a New Cluster screen is displayed. Read the requirements, then press Control-D to continue.

  9. Follow the menu prompts to supply your answers from the configuration planning worksheet.

    The scinstall utility installs and configures all cluster nodes and reboots the cluster. The cluster is established when all nodes have successfully booted into the cluster. Open HA Cluster installation output is logged in a /var/cluster/logs/install/scinstall.log.N file.
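
    If you want to review the configuration output afterward, you can list the log directory and view the log file for this run, as in the following illustrative commands (the numeric suffix N varies):

    phys-schost# ls /var/cluster/logs/install/
    phys-schost# more /var/cluster/logs/install/scinstall.log.N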

  10. Verify on each node that multiuser services for the Service Management Facility (SMF) are online.

    If services are not yet online for a node, wait until the state becomes online before you proceed to the next step.


    phys-schost# svcs multi-user-server
    STATE          STIME    FMRI
    online         17:52:55 svc:/milestone/multi-user-server:default
  11. From one node, verify that all nodes have joined the cluster.


    phys-schost# /usr/cluster/bin/clnode status
    

    Output resembles the following.


    === Cluster Nodes ===
    
    --- Node Status ---
    
    Node Name                                       Status
    ---------                                       ------
    phys-schost-1                                   Online
    phys-schost-2                                   Online

    For more information, see the clnode(1CL) man page.

  12. (Optional) Enable the automatic node reboot feature.

    This feature automatically reboots a node if all monitored disk paths fail, provided that at least one of the disks is accessible from a different node in the cluster.

    1. Enable automatic reboot.


      phys-schost# /usr/cluster/bin/clnode set -p reboot_on_path_failure=enabled
      
      -p

      Specifies the property to set

      reboot_on_path_failure=enabled

      Enables automatic node reboot if failure of all monitored disk paths occurs.

    2. Verify that automatic reboot on disk-path failure is enabled.


      phys-schost# /usr/cluster/bin/clnode show
      === Cluster Nodes ===                          
      
      Node Name:                                      node
      …
        reboot_on_path_failure:                          enabled
      …
  13. If you intend to use the HA for NFS data service on a highly available local file system, ensure that the loopback file system (LOFS) is disabled.

    To disable LOFS, add the following entry to the /etc/system file on each node of the cluster.


    exclude:lofs

    The change to the /etc/system file becomes effective after the next system reboot.


    Note –

    You cannot have LOFS enabled if you use the HA for NFS data service on a highly available local file system and have automountd running. LOFS can cause switchover problems for the HA for NFS data service. If you choose to add the HA for NFS data service on a highly available local file system, you must make one of the following configuration changes.

    • Disable LOFS.

    • Disable the automountd daemon (see the example after this note).

    • Exclude from the automounter map all files that are part of the highly available local file system that is exported by the HA for NFS data service. This choice enables you to keep both LOFS and the automountd daemon enabled.


    See The Loopback File System in System Administration Guide: Devices and File Systems for more information about loopback file systems.
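
    As an illustration of the second option in the preceding note, disabling the automountd daemon, you can disable the autofs service on each node. The service name shown here assumes the standard SMF FMRI for the automounter:

    phys-schost# svcadm disable svc:/system/filesystem/autofs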


Example 3–1 Configuring Open HA Cluster Software on All Nodes

The following example shows the scinstall progress messages that are logged as scinstall completes configuration tasks on the two-node cluster, schost. The cluster is installed from phys-schost-1 by using the scinstall utility in Typical Mode. The other cluster node is phys-schost-2. The adapter name is e1000g0. No /globaldevices partition exists, so the global-devices namespace is created on a lofi device. Automatic quorum-device selection is not used.


*** Create a New Cluster ***
Tue Apr 14 10:36:19 PDT 2009

    Attempting to contact "phys-schost-1" ... 

    Searching for a remote configuration method ... 

scrcmd -N phys-schost-1 test isfullyinstalled
The Sun Cluster framework software is installed.
scrcmd to "phys-schost-1" - return status 1.

rsh phys-schost-1 -n "/bin/sh -c '/bin/true; /bin/echo SC_COMMAND_STATUS=\$?'"
phys-schost-1: Connection refused
rsh to "phys-schost-1" failed.

ssh root@phys-schost-1 -o "BatchMode yes" -o "StrictHostKeyChecking yes" 
-n "/bin/sh -c '/bin/true; /bin/echo SC_COMMAND_STATUS=\$?'"
No RSA host key is known for phys-schost-1 and you have requested strict checking.
Host key verification failed.
ssh to "phys-schost-1" failed.

    The Sun Cluster framework is able to complete the configuration 
    process without remote shell access.


    Checking the status of service network/physical:nwam ... 


/usr/cluster/lib/scadmin/lib/cmd_test isnwamenabled

scrcmd -N phys-schost-1 test isnwamenabled
    Plumbing network address 172.16.0.0 on adapter e1000g0 >> NOT DUPLICATE ... done
    Plumbing network address 172.16.0.0 on adapter e1000g0 >> NOT DUPLICATE ... done
    Testing for "/globaldevices" on "phys-schost-2" ... 

/globaldevices is not a directory or file system mount point.
Cannot use "/globaldevices" on "phys-schost-2".


    Testing for "/globaldevices" on "phys-schost-1" ... 

scrcmd -N phys-schost-1 chk_globaldev fs /globaldevices
/globaldevices is not a directory or file system mount point.


/globaldevices is not a directory or file system mount point.
Cannot use "/globaldevices" on "phys-schost-1".


scrcmd -N phys-schost-1 chk_globaldev lofi /.globaldevices 100m

----------------------------------
- Cluster Creation -
----------------------------------

    Started cluster check on "phys-schost-2".
    Started cluster check on "phys-schost-1".

    cluster check completed with no errors or warnings for "phys-schost-2".
    cluster check completed with no errors or warnings for "phys-schost-1".

Cluster check report is displayed
…

scrcmd -N phys-schost-1 test isinstalling
"" is not running.

scrcmd -N phys-schost-1 test isconfigured
Sun Cluster is not configured.

    Configuring "phys-schost-1" ... 

scrcmd -N phys-schost-1 install -logfile /var/cluster/logs/install/scinstall.log.2895 
-k -C schost -F -G lofi -T node=phys-schost-2,node=phys-schost-1,authtype=sys 
-w netaddr=172.16.0.0,netmask=255.255.240.0,maxnodes=64,maxprivatenets=10,
numvirtualclusters=12 -A trtype=dlpi,name=e1000g0 -B type=direct
ips_package_processing: ips_postinstall...
ips_package_processing: ips_postinstall done

Initializing cluster name to "schost" ... done
Initializing authentication options ... done
Initializing configuration for adapter "e1000g0" ... done
Initializing private network address options ... done

Plumbing network address 172.16.0.0 on adapter e1000g0 >> NOT DUPLICATE ... done

Setting the node ID for "phys-schost-1" ... done (id=1)

Verifying that NTP is configured ... done
Initializing NTP configuration ... done

Updating nsswitch.conf ... done

Adding cluster node entries to /etc/inet/hosts ... done


Configuring IP multipathing groups ...done


Verifying that power management is NOT configured ... done
Unconfiguring power management ... done
/etc/power.conf has been renamed to /etc/power.conf.041409104821
Power management is incompatible with the HA goals of the cluster.
Please do not attempt to re-configure power management.

Ensure network routing is disabled ... done
Network routing has been disabled on this node by creating /etc/notrouter.
Having a cluster node act as a router is not supported by Sun Cluster.
Please do not re-enable network routing.

Please reboot this machine.

Log file - /var/cluster/logs/install/scinstall.log.2895

scrcmd -N phys-schost-1 test hasbooted
This node has not yet been booted as a cluster node.
    Rebooting "phys-schost-1" ... 

Troubleshooting

Unsuccessful configuration – If one or more nodes cannot join the cluster, or if the wrong configuration information was specified, first attempt to rerun this procedure. If that does not correct the problem, perform the procedure How to Uninstall Open HA Cluster Software on each misconfigured node to remove it from the cluster configuration. Then rerun this procedure.

Next Steps

If you did not yet configure a quorum device in your cluster, go to How to Configure Quorum Devices.

Otherwise, go to How to Verify the Quorum Configuration and Installation Mode.