CHAPTER 2

Installing the Software on Nodes Running the Solaris OS

This chapter explains how to install Sun HPC ClusterTools software on systems running the Solaris OS using the installation utilities supplied with the HPC ClusterTools software. For information about how to install Sun HPC ClusterTools software on a Linux-based system, see Chapter 3.

This chapter contains the following topics:


Recommendations for Installing Sun HPC ClusterTools 8.2.1c Software on Large Cluster Installations

The following are tips for installing Sun HPC ClusterTools 8.2.1c software on clusters containing hundreds of nodes using the centralized method:


Downloading and Extracting the Software

Before you can install and configure the software, you must download the correct software archive for your hardware platform and extract it to the correct directory. If you have installed a previous version of the software, you must take additional steps to prepare for installation. The following procedure explains these steps.

Download and Extract the Software

The following procedure downloads Sun HPC ClusterTools 8.2.1c software to a standard location and prepares it for installation by the ctinstall utility.

  1. Boot the cluster nodes.

  2. Download and extract the archive file containing the Sun HPC ClusterTools software to a location that is visible to all the nodes in the cluster.

    If you download the file to a shared file system, ensure that the following conditions are met:

    1. All compute and administrative nodes have access to the shared file system.

    2. The file system is readable by superuser and accessible through a common path from all nodes.

    For centralized installations, these conditions must be met on the central host as well.

    You can download the correct HPC ClusterTools archive file for your platform from the following location:

    http://www.sun.com/clustertools

  3. Log in as superuser on the system from which you will execute the ClusterTools installation utilities.

  4. Change directory to one of the following locations:

    1. If you are installing the software on a SPARC-based system, change directory to:

      sun-hpc-ct8.2.1c-sparc/Product/Install_Utilities/bin

    2. If you are installing on an x64-based system, change directory to:

      sun-hpc-ct8.2.1c-i386/Product/Install_Utilities/bin
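The directory choice in step 4 can be scripted. The following is a minimal sketch, assuming that `uname -p` reports `sparc` or `i386` on the installing system and that the archive was extracted into the current directory. The function name `ct_bin_dir` is hypothetical and is not part of the ClusterTools utilities.

```sh
# Hypothetical helper (not part of Sun HPC ClusterTools): map a processor
# type to the install-utilities directory named in step 4.
ct_bin_dir() {
    case "$1" in
        sparc) echo "sun-hpc-ct8.2.1c-sparc/Product/Install_Utilities/bin" ;;
        i386)  echo "sun-hpc-ct8.2.1c-i386/Product/Install_Utilities/bin" ;;
        *)     echo "unsupported processor type: $1" >&2; return 1 ;;
    esac
}

# Example use (assumes the archive was extracted in the current directory):
#   cd "$(ct_bin_dir "$(uname -p)")"
```

The helper returns a nonzero status for any other processor type, so a calling script can stop before it changes directory.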


Installing the Software

Use the ctinstall command to install Sun HPC ClusterTools software. See TABLE 2-1 for a summary of the ctinstall options.


TABLE 2-1   ctinstall Options

Option   Description

General
-h       Command help.
-l       Execute the command on the local node only.
-R       Specify the full path to be used as the root path.
-x       Turn on command debug at the specified nodes.

Specific to Command
-c       Specify the server and mount path for the software.
-d       Specify a non-default install from location. The default is distribution/Product, relative to the directory where ctinstall is invoked.
-p       List of packages to be installed. Separate names by comma.
-t       Specify a nondefault install to location. The default is /opt.

Centralized Operations Only
-g       Generate node lists of successful and unsuccessful installations.
-k       Specify a central location for storing log files of all specified nodes.
-n       List of nodes targeted for installation. Separate names by comma.
-N       File containing list of nodes targeted for installation. One node per line.
-r       Remote connection method: rsh, ssh, or telnet.
-S       Specify full path to an alternate ssh executable.



Note - The options -g, -k, -n, -N, -r, and -S are incompatible with local (non-centralized) installations. If the -l option is used with any of these options, an error message is displayed.
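The option rule in the note can be checked before a command is issued. The following is a sketch only: `check_local_opts` is a hypothetical wrapper-side check, not part of ctinstall, and it inspects only the option letters named in the note.

```sh
# Hypothetical pre-flight check (not part of ctinstall): reject -l when it
# is combined with any centralized-only option.
check_local_opts() {
    local_mode=no
    central_opt=""
    for arg in "$@"; do
        case "$arg" in
            -l) local_mode=yes ;;
            -g|-k|-n|-N|-r|-S) central_opt="$arg" ;;
        esac
    done
    if [ "$local_mode" = yes ] && [ -n "$central_opt" ]; then
        echo "error: $central_opt cannot be combined with -l" >&2
        return 1
    fi
    return 0
}

# Example: check_local_opts -l -p SUNWompi && ./ctinstall -l -p SUNWompi
```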



Installation Location Notes

By default, ctinstall installs the software into /opt/SUNWhpc/HPC8.2.1c/compiler. You can use the -t switch on the ctinstall command line to install the software into another location. The path you specify with the -t switch will replace the /opt portion of the default path.

For example, the following command line installs the software on the local node in a location whose pathname begins with /usr/mpi:


# ./ctinstall -l -t /usr/mpi

The full pathname of the nonstandard installation location is /usr/mpi/SUNWhpc/HPC8.2.1c/compiler.

To use this path, you must set both the PATH and OPAL_PREFIX variables and specify the appropriate compiler name in place of compiler. In the following example, a Sun Studio compiled version of the software is being installed.


setenv OPAL_PREFIX /usr/mpi/SUNWhpc/HPC8.2.1c/sun
setenv PATH ${OPAL_PREFIX}/bin:${PATH}
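For users of Bourne-compatible shells (sh, ksh, bash) rather than csh, the same two settings can be sketched as follows. The values assume the /usr/mpi install location and the sun compiler directory used in the example. Adjust both to match your installation.

```sh
# Assumed values taken from the example: install root /usr/mpi, compiler "sun".
OPAL_PREFIX=/usr/mpi/SUNWhpc/HPC8.2.1c/sun
# Put the ClusterTools bin directory ahead of the existing search path.
PATH="$OPAL_PREFIX/bin:$PATH"
export OPAL_PREFIX PATH
```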

Local Versus Centralized Command Initiation

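You can choose between two methods of initiating operations on the cluster nodes: locally, by running the command on each node, or centrally, by running the command once from a central host for a set of specified nodes.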



Note - Centralized operations are performed on the specified nodes in parallel. That is, when a command is specified on the central host, the operation is initiated on all the specified nodes at the same time.



Support for centralized command initiation is built into the Sun HPC ClusterTools software installation utilities. Issuing these commands from a central host has the same effect as invoking the commands locally using one of the Cluster Console tools: cconsole, ctelnet, or crlogin.

The Sun HPC ClusterTools software CLI utilities provide several options that are specific to the centralized command initiation mode and are intended to simplify management of parallel installation of the software from a central host. These options support:

The initiating system can be one of the cluster nodes or it can be external to the cluster. It must be a Sun system running the Solaris 9 or Solaris 10 Operating System (Solaris OS). Compute nodes must run the Solaris 10 OS.

Central Host Command Examples

This section shows examples of HPC ClusterTools software being installed from a central host.

To Install From a Central Host Using rsh


./ctinstall -n node1,node2 -r rsh

This command installs the full Sun HPC ClusterTools software suite on node1 and node2 from a central host. The node list is specified on the command line. The remote connection method is rsh. This requires a trusted hosts setup.

The software will be ready for use when the installation process completes.

To Install From a Central Host Using ssh


./ctinstall -n node1,node2 -r ssh

This example is the same as that in the previous section, except that the remote connection method is ssh. This method requires that the initiating node be able to log in as superuser to the target nodes without being prompted for any interaction, such as a password.

To Install From a Central Host Using telnet


./ctinstall -N /tmp/nodelist -r telnet

This command installs the full Sun HPC ClusterTools software suite on the set of nodes listed in the file /tmp/nodelist from a central host. A node list file is particularly useful when you have a large set of nodes or you want to run operations on the same set of nodes repeatedly.

The node list file has the following contents:


# Node list for the above example
 
node1
node2

The remote connection method is telnet. All cluster nodes must share the same password. If some nodes do not use the same password as others, install the software in groups, each group consisting of nodes that use a common password.

The software will be ready for use when the installation process completes.
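A node list file in the format shown above can also be turned into the comma-separated form that the -n option takes. The following is a minimal sketch: `nodes_csv` is a hypothetical helper, and it assumes that lines beginning with # are comments, as in the example file.

```sh
# Hypothetical helper: print the node names in a node list file as one
# comma-separated string, skipping comment lines and blank lines.
nodes_csv() {
    grep -v '^[[:space:]]*#' "$1" | grep -v '^[[:space:]]*$' | paste -s -d, -
}

# Example: ./ctinstall -n "$(nodes_csv /tmp/nodelist)" -r ssh
```

Printing the list this way is also a quick check of which nodes a file names before you start an installation.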

To Install the Software and Save the Log Files


./ctinstall -N /tmp/nodelist -r telnet -k /tmp/cluster-logs -g

The command in this section is the same as that shown in the previous section, except that it includes the -k and -g options.

In this example, the -k option causes the local log files of all specified nodes to be saved in /tmp/cluster-logs on the central host.

The -g option causes a pair of node list files to be created on the central host in /var/sadm/system/logs/hpc/nodelists. One file, ctinstall.pass$$, contains a list of the nodes on which the installation was successful. The other file, ctinstall.fail$$, lists the nodes on which the installation was unsuccessful. The $$ symbol is replaced by the process number associated with the installation.

These generated node list files can then be used for command retries or in subsequent operations using the -N switch.
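The retry workflow described above can be sketched as follows. This is an illustration only: `retry_cmd` is a hypothetical helper, not a ClusterTools utility, and it prints the command instead of running it so that you can review it first.

```sh
# Hypothetical helper: print a retry command for a failed-node list, or
# print nothing if the list is missing or empty. The command is not run.
retry_cmd() {
    fail_list=$1
    method=$2
    if [ -s "$fail_list" ]; then
        echo "./ctinstall -N $fail_list -r $method"
    fi
}

# Example (12345 stands in for the process number of the earlier run):
#   retry_cmd /var/sadm/system/logs/hpc/nodelists/ctinstall.fail12345 telnet
```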



Note - Specify a directory that is local to the central host (for example, /tmp). This will avoid unnecessary network traffic in the transfer of log files and will result in faster execution of the operation.



Local Installation Command Examples

This section shows examples of HPC ClusterTools software being installed on the local node.



Note - The options -g, -k, -n, -N, -r, and -S are incompatible with local (non-centralized) installations. If the -l option is used with any of these options, an error message is displayed.



To Install the Complete Software Suite Locally


./ctinstall -l

This command installs the full Sun HPC ClusterTools software suite on the local node only.

To Install Specified Software Packages Locally


./ctinstall -l -p SUNWompi,SUNWompimn

The command in this section installs the packages SUNWompi and SUNWompimn on the local node.

TABLE 2-2 in the Solaris OS Packages section lists the packages in the Sun HPC ClusterTools 8.2.1c installation.

Installing Specified Software Packages

The following command installs only the specified software packages.


./ctinstall -N /tmp/nodelist -r telnet -p SUNWompi,SUNWompimn

This command installs the packages SUNWompi and SUNWompimn on the set of nodes listed in the file /tmp/nodelist. No other packages are installed. The remote connection method is telnet.

TABLE 2-2 in the Solaris OS Packages section lists the packages in the Sun HPC ClusterTools 8.2.1c installation.

The -p option can be useful if individual packages were not installed on the nodes by ctinstall.


./ctinstall -N /tmp/nodelist -r rsh

This command installs and activates the full Sun HPC ClusterTools software suite on the nodes listed in the file /tmp/nodelist. The remote connection method is rsh.

Solaris OS Packages

The following is the Solaris OS package breakdown for the Sun HPC ClusterTools 8.2.1c (Open MPI) release.


TABLE 2-2   Solaris OS Packages in the Sun HPC ClusterTools 8.2.1c Installation

Package Name   Contents

SUNWompi       Open MPI Message Passing Interface files
SUNWompiat     Open MPI installer utilities
SUNWompimn     Open MPI Message Passing Interface man pages
SUNWomsc       Extra package to include miscellaneous files
SUNWompir      Open MPI Root Filesystems files


Verifying the Software Installation

You can verify that the software is installed properly by launching a simple non-MPI parallel job using mpirun. In the following example, hostname is the name of the system on which the software packages were installed:


% /opt/SUNWhpc/HPC8.2.1c/bin/mpirun hostname


Sun HPC ClusterTools 8.2.1c Installation Log Files

The Sun HPC ClusterTools 8.2.1c installation tools log information about installation-related tasks locally on the nodes where installation tasks are performed. The default location for the log files is /var/sadm/system/logs/hpc. If installation tasks are initiated from a central host, a summary log file is also created on the central host.

Local, Node-Specific Log Files

Two types of log files are created locally on each cluster node where installation operations take place.

These node-specific installation log files are created regardless of the installation method used, local or centralized.

Central Node Summary Log

When installation tasks are initiated from a central host, a summary log file named ct_summary.log is created on the central host. This log file records the final summary report that is generated by the CLI. The ct_summary.log is not overwritten when a new task is performed. As with the ct_history.log file, new entries are appended to the summary log file.


Removing Previously Installed Sun HPC ClusterTools Software

This section describes how to remove Sun HPC ClusterTools software using the ctremove utility. See TABLE 2-3 for a summary of the ctremove options.


TABLE 2-3   ctremove Options

Option   Description

General
-h       Command help.
-l       Execute the command on the local node only.
-R       Specify the full path to be used as the root path.
-x       Turn on command debug at the specified nodes.

Specific to Command
-p       List of packages to be selectively removed. Separate names by comma.

Centralized Operations Only
-g       Generate node lists of successful and unsuccessful removals.
-k       Specify a central location for storing copies of local log files.
-n       List of nodes targeted for removal. Separate names by comma.
-N       File containing list of nodes targeted for removal. One node per line.
-r       Remote connection method: rsh, ssh, or telnet.
-S       Specify full path to an alternate ssh executable.

General Example of ctremove Command

This section shows the basic steps involved in removing Sun HPC ClusterTools software from one or more platforms.


# cd $INSTALL_LOC/SUNWhpc/HPC8.2.1c/bin/Install_Utilities/bin
# ./ctremove options

$INSTALL_LOC is the location of the software that will be removed.



Note - If any nodes are active at the time ctremove is initiated, they will be deactivated automatically before the removal process begins.



Removing Nodes From a Central Host

This section shows examples of software removal in which the ctremove command is initiated from a central host.



Note - If you use rsh connections to install or remove software packages on hundreds of nodes at a time, system resource limitations may prevent some node connections from being established. For clusters with hundreds of nodes, it is best to perform these operations on subsets of nodes, one subset at a time, with no more than 200 nodes in a subset.
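The batching advice in the note can be sketched with the standard split utility. The helper name `split_nodelist` and the /tmp/nodelist.part. prefix are hypothetical. Note that split counts every line, so comment or blank lines in the node list file count toward the batch size.

```sh
# Hypothetical helper: split a node list file ($1) into batch files of at
# most $2 lines each, named <prefix>aa, <prefix>ab, ... where $3 is the prefix.
split_nodelist() {
    split -l "$2" "$1" "$3"
}

# Example: create batches of 200 nodes, then print one command per batch.
#   split_nodelist /tmp/nodelist 200 /tmp/nodelist.part.
#   for f in /tmp/nodelist.part.*; do
#       echo "./ctremove -N $f -r rsh"
#   done
```

The loop prints the commands rather than running them, so that each batch can be started only after the previous one has finished.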



Remove the Software From Specified Nodes

  •   Use the -N option to specify a file containing a list of the nodes from which the software is to be removed, and the -r option to specify the remote connection method.


    ./ctremove -N /tmp/nodelist -r rsh
    

    This command removes the software from the nodes listed in /tmp/nodelist. The remote connection method is rsh.

Remove the Software From Specified Nodes and Generate Log Files

  •   Use the -k option to direct log files to a central location and the -g option to generate lists of successful and unsuccessful node removals.


    ./ctremove -N /tmp/nodelist -r rsh -k /tmp/cluster-logs -g
    

    This command is the same as the one in the previous procedure, except that it specifies the -k and -g options in addition to -N and -r.

Remove Specified Software Packages

  •   Use the -p option to specify a list of packages to be removed.


    ./ctremove -N /tmp/nodelist -r rsh -p SUNWompi,SUNWompimn
    

    This command removes the packages SUNWompi and SUNWompimn from the nodes listed in /tmp/nodelist. The remote connection method is rsh.

Removing Software From the Local Node

This section shows examples of software removal from a local node.

Remove Software Locally

  •   Use the -l option to remove the software from the node on which the command is run.


    ./ctremove -l
    

Remove Specified Software Packages From the Local Node

  •   Use the -p option to specify a list of packages to be removed. When used with the -l option, the packages are removed only from the local node.


    ./ctremove -l -p SUNWompi,SUNWompimn