Open HA Cluster Installation Guide

Procedure: How to Configure Open HA Cluster Software on All Nodes (scinstall)

Perform this procedure from one node of the cluster to configure Open HA Cluster software on both nodes of the cluster.


Note –

This procedure uses the interactive form of the scinstall command. To use the noninteractive forms of the scinstall command, such as when developing installation scripts, see the scinstall(1M) man page.


Before You Begin

Perform the following tasks:


Note –

For the global-devices file system, use only a lofi device. Do not attempt to configure a dedicated /globaldevices partition. Respond “No” to all prompts that ask whether to use or create a file system. After you decline to configure a file system, the scinstall utility prompts you to create a lofi device.


Follow these guidelines to use the interactive scinstall utility in this procedure:

  1. On each node to configure in a cluster, become superuser.

    Alternatively, if your user account is assigned the Primary Administrator profile, execute commands as non-root through a profile shell, or prefix the command with the pfexec command.
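
    For example, if your account is assigned the Primary Administrator profile, you can run a privileged command as non-root by prefixing it with pfexec. The following usage is only an illustration, shown with a command that appears later in this procedure:

    phys-schost$ pfexec svcadm disable svc:/network/physical:nwam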

  2. On each node, disable Network Auto-Magic (NWAM).

    NWAM activates a single network interface and disables all others. For this reason, NWAM cannot coexist with Open HA Cluster 2009.06 software and you must disable it before you configure or run your cluster.

    1. On each cluster node, determine whether NWAM is enabled or disabled.


      phys-schost# svcs -a | grep /network/physical
      
      • If NWAM is enabled, output is similar to the following:


         online           Mar_13   svc:/network/physical:nwam
         disabled         Mar_13   svc:/network/physical:default
      • If NWAM is disabled, output is similar to the following:


         disabled          Mar_13  svc:/network/physical:nwam
         online            Mar_13  svc:/network/physical:default
    2. If NWAM is enabled on a node, disable it.


      phys-schost# svcadm disable svc:/network/physical:nwam
      phys-schost# svcadm enable svc:/network/physical:default
      
  3. On each node, configure each public-network adapter.

    1. Determine which adapters are on the system.


      phys-schost# dladm show-link
      
    2. Plumb an adapter.


      phys-schost# ifconfig adapter plumb up
      
    3. Assign an IP address and netmask to the adapter.


      phys-schost# ifconfig adapter IPaddress netmask netmask
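
      For example, assuming an adapter named e1000g0 and illustrative address values (substitute the values from your configuration planning worksheet):

      phys-schost# ifconfig e1000g0 192.168.1.10 netmask 255.255.255.0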
      
    4. Verify that the adapter is up.

      Ensure that the command output contains the UP flag.


      phys-schost# ifconfig -a
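
      For example, output for a configured adapter might resemble the following, where the UP flag appears in the flags list. The adapter name and addresses shown are illustrative:

      e1000g0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
              inet 192.168.1.10 netmask ffffff00 broadcast 192.168.1.255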
      
    5. Create a configuration file for the adapter.

      This file ensures that the configuration of the adapter persists across reboots.


      phys-schost# vi /etc/hostname.adapter
      IPaddress
      
    6. Repeat Step b through Step e for each public-network adapter on both nodes.

    7. On both nodes, add an entry to the /etc/inet/hosts file for each public-network adapter that you configured on each node.


      phys-schost# vi /etc/inet/hosts
      hostname IPaddress
      
    8. If you use a naming service, add the hostname and IP address of each public-network adapter that you configured.
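
      After you update the naming service, you can optionally confirm that each hostname resolves as expected. This check is only an illustration and assumes that the hosts entry in the /etc/nsswitch.conf file includes your naming service:

      phys-schost# getent hosts hostname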

    9. Reboot each node.


      phys-schost# /usr/sbin/shutdown -y -g0 -i6
      
    10. Verify that all adapters are configured and up.


      phys-schost# ifconfig -a
      
  4. On each node, enable the minimal RPC services that are necessary to enable the interactive scinstall utility.

    When OpenSolaris software is installed, a restricted network profile is automatically configured. This profile is too restrictive for the cluster private network to function. To enable private-network functionality, run the following commands:


    phys-schost# svccfg
    svc:> select network/rpc/bind
    svc:/network/rpc/bind> setprop config/local_only=false
    svc:/network/rpc/bind> quit
     
    phys-schost# svcadm refresh network/rpc/bind:default
    phys-schost# svcprop network/rpc/bind:default | grep local_only
    

    The output of the last command should show that the local_only property is now set to false.
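
    For example, the output of the svcprop command might resemble the following line:

    config/local_only boolean false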

    For more information about re-enabling network services, see Planning Network Security in Solaris 10 5/08 Installation Guide: Planning for Installation and Upgrade.

  5. From one cluster node, start the scinstall utility.


    phys-schost# /usr/cluster/bin/scinstall
    
  6. Type the option number for Create a New Cluster or Add a Cluster Node and press the Return key.


     *** Main Menu ***
    
        Please select from one of the following (*) options:
    
          * 1) Create a new cluster or add a cluster node
          * 2) Print release information for this cluster node
     
          * ?) Help with menu options
          * q) Quit
    
        Option:  1
    

    The New Cluster and Cluster Node Menu is displayed.

  7. Type the option number for Create a New Cluster and press the Return key.

    The Typical or Custom Mode menu is displayed.

  8. Type the option number for either Typical or Custom and press the Return key.

    The Create a New Cluster screen is displayed. Read the requirements, then press Control-D to continue.

  9. Follow the menu prompts to supply your answers from the configuration planning worksheet.

    The scinstall utility installs and configures all cluster nodes and reboots the cluster. The cluster is established when all nodes have successfully booted into the cluster. Open HA Cluster installation output is logged in a /var/cluster/logs/install/scinstall.log.N file.
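
    If you want to review the configuration output afterward, you can list the log directory and view the log file for this run, as in the following illustrative commands (the numeric suffix N varies):

    phys-schost# ls /var/cluster/logs/install/
    phys-schost# more /var/cluster/logs/install/scinstall.log.N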

  10. Verify on each node that multiuser services for the Service Management Facility (SMF) are online.

    If services are not yet online for a node, wait until the state becomes online before you proceed to the next step.


    phys-schost# svcs multi-user-server
    STATE          STIME    FMRI
    online         17:52:55 svc:/milestone/multi-user-server:default
  11. From one node, verify that all nodes have joined the cluster.


    phys-schost# /usr/cluster/bin/clnode status
    

    Output resembles the following.


    === Cluster Nodes ===
    
    --- Node Status ---
    
    Node Name                                       Status
    ---------                                       ------
    phys-schost-1                                   Online
    phys-schost-2                                   Online

    For more information, see the clnode(1CL) man page.

  12. (Optional) Enable the automatic node reboot feature.

    This feature automatically reboots a node if all monitored disk paths fail, provided that at least one of the disks is accessible from a different node in the cluster.

    1. Enable automatic reboot.


      phys-schost# /usr/cluster/bin/clnode set -p reboot_on_path_failure=enabled
      
      -p

      Specifies the property to set

      reboot_on_path_failure=enabled

      Enables automatic node reboot if failure of all monitored disk paths occurs.

    2. Verify that automatic reboot on disk-path failure is enabled.


      phys-schost# /usr/cluster/bin/clnode show
      === Cluster Nodes ===                          
      
      Node Name:                                      node
      …
        reboot_on_path_failure:                          enabled
      …
  13. If you intend to use the HA for NFS data service on a highly available local file system, ensure that the loopback file system (LOFS) is disabled.

    To disable LOFS, add the following entry to the /etc/system file on each node of the cluster.


    exclude:lofs

    The change to the /etc/system file becomes effective after the next system reboot.


    Note –

    You cannot have LOFS enabled if you use the HA for NFS data service on a highly available local file system and have automountd running. LOFS can cause switchover problems for the HA for NFS data service. If you choose to add the HA for NFS data service on a highly available local file system, you must make one of the following configuration changes.

    • Disable LOFS.

    • Disable the automountd daemon (see the example after this note).

    • Exclude from the automounter map all files that are part of the highly available local file system that is exported by the HA for NFS data service. This choice enables you to keep both LOFS and the automountd daemon enabled.


    See The Loopback File System in System Administration Guide: Devices and File Systems for more information about loopback file systems.
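
    As an illustration of the second option in the preceding note, disabling the automountd daemon, you can disable the autofs service on each node. The service name shown here assumes the standard SMF FMRI for the automounter:

    phys-schost# svcadm disable svc:/system/filesystem/autofs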


Example 3–1 Configuring Open HA Cluster Software on All Nodes

The following example shows the scinstall progress messages that are logged as scinstall completes configuration tasks on the two-node cluster, schost. The cluster is installed from phys-schost-1 by using the scinstall utility in Typical Mode. The other cluster node is phys-schost-2. The adapter name is e1000g0. No /globaldevices partition exists, so the global-devices namespace is created on a lofi device. Automatic quorum-device selection is not used.


*** Create a New Cluster ***
Tue Apr 14 10:36:19 PDT 2009

    Attempting to contact "phys-schost-1" ... 

    Searching for a remote configuration method ... 

scrcmd -N phys-schost-1 test isfullyinstalled
The Sun Cluster framework software is installed.
scrcmd to "phys-schost-1" - return status 1.

rsh phys-schost-1 -n "/bin/sh -c '/bin/true; /bin/echo SC_COMMAND_STATUS=\$?'"
phys-schost-1: Connection refused
rsh to "phys-schost-1" failed.

ssh root@phys-schost-1 -o "BatchMode yes" -o "StrictHostKeyChecking yes" 
-n "/bin/sh -c '/bin/true; /bin/echo SC_COMMAND_STATUS=\$?'"
No RSA host key is known for phys-schost-1 and you have requested strict checking.
Host key verification failed.
ssh to "phys-schost-1" failed.

    The Sun Cluster framework is able to complete the configuration 
    process without remote shell access.


    Checking the status of service network/physical:nwam ... 


/usr/cluster/lib/scadmin/lib/cmd_test isnwamenabled

scrcmd -N phys-schost-1 test isnwamenabled
    Plumbing network address 172.16.0.0 on adapter e1000g0 >> NOT DUPLICATE ... done
    Plumbing network address 172.16.0.0 on adapter e1000g0 >> NOT DUPLICATE ... done
    Testing for "/globaldevices" on "phys-schost-2" ... 

/globaldevices is not a directory or file system mount point.
Cannot use "/globaldevices" on "phys-schost-2".


    Testing for "/globaldevices" on "phys-schost-1" ... 

scrcmd -N phys-schost-1 chk_globaldev fs /globaldevices
/globaldevices is not a directory or file system mount point.


/globaldevices is not a directory or file system mount point.
Cannot use "/globaldevices" on "phys-schost-1".


scrcmd -N phys-schost-1 chk_globaldev lofi /.globaldevices 100m

----------------------------------
- Cluster Creation -
----------------------------------

    Started cluster check on "phys-schost-2".
    Started cluster check on "phys-schost-1".

    cluster check completed with no errors or warnings for "phys-schost-2".
    cluster check completed with no errors or warnings for "phys-schost-1".

Cluster check report is displayed
…

scrcmd -N phys-schost-1 test isinstalling
"" is not running.

scrcmd -N phys-schost-1 test isconfigured
Sun Cluster is not configured.

    Configuring "phys-schost-1" ... 

scrcmd -N phys-schost-1 install -logfile /var/cluster/logs/install/scinstall.log.2895 
-k -C schost -F -G lofi -T node=phys-schost-2,node=phys-schost-1,authtype=sys 
-w netaddr=172.16.0.0,netmask=255.255.240.0,maxnodes=64,maxprivatenets=10,
numvirtualclusters=12 -A trtype=dlpi,name=e1000g0 -B type=direct
ips_package_processing: ips_postinstall...
ips_package_processing: ips_postinstall done

Initializing cluster name to "schost" ... done
Initializing authentication options ... done
Initializing configuration for adapter "e1000g0" ... done
Initializing private network address options ... done

Plumbing network address 172.16.0.0 on adapter e1000g0 >> NOT DUPLICATE ... done

Setting the node ID for "phys-schost-1" ... done (id=1)

Verifying that NTP is configured ... done
Initializing NTP configuration ... done

Updating nsswitch.conf ... done

Adding cluster node entries to /etc/inet/hosts ... done


Configuring IP multipathing groups ...done


Verifying that power management is NOT configured ... done
Unconfiguring power management ... done
/etc/power.conf has been renamed to /etc/power.conf.041409104821
Power management is incompatible with the HA goals of the cluster.
Please do not attempt to re-configure power management.

Ensure network routing is disabled ... done
Network routing has been disabled on this node by creating /etc/notrouter.
Having a cluster node act as a router is not supported by Sun Cluster.
Please do not re-enable network routing.

Please reboot this machine.

Log file - /var/cluster/logs/install/scinstall.log.2895

scrcmd -N phys-schost-1 test hasbooted
This node has not yet been booted as a cluster node.
    Rebooting "phys-schost-1" ... 

Troubleshooting

Unsuccessful configuration – If one or more nodes cannot join the cluster, or if the wrong configuration information was specified, first attempt to rerun this procedure. If that does not correct the problem, perform the procedure How to Uninstall Open HA Cluster Software on each misconfigured node to remove it from the cluster configuration. Then rerun this procedure.

Next Steps

If you did not yet configure a quorum device in your cluster, go to How to Configure Quorum Devices.

Otherwise, go to How to Verify the Quorum Configuration and Installation Mode.