This appendix includes instructions for:
Installing the software at the command line - "Installing at the Command Line"
Removing the software - "Removing the Software"
Removing and installing individual packages - "Removing and Reinstalling Individual Packages"
The easiest way to configure and install Sun HPC ClusterTools 3.0 software is to use the configuration tool, install_gui, as described in the Sun HPC ClusterTools 3.0 Installation Guide. If you prefer, however, you may install the software from the command line as described in this appendix, with a few references to the installation guide.
Figure A-1 summarizes the steps involved. The solid lines identify tasks that are always performed. The dashed lines indicate special-case tasks.
Before installing Sun HPC ClusterTools 3.0 software, you need to ensure that the hardware and software that make up your cluster meet certain requirements. You must have already installed LSF 3.2.3. Further requirements are outlined in the Sun HPC ClusterTools 3.0 Installation Guide. Review them before proceeding with the instructions in this appendix.
If you are installing the software on a cluster of more than 16 nodes, you will probably want to use the CCM tools to make installation easier. You can use these tools to install on up to 16 nodes at a time: install the software first on a group of up to 16 nodes, then repeat the installation process on additional groups of up to 16 nodes until you have installed the software on all the nodes in the cluster. Create a separate configuration file for each group of nodes, each with a unique file name, such as hpc_config1, hpc_config2, and so on.
Many aspects of the Sun HPC ClusterTools 3.0 installation process are controlled by a configuration file called hpc_config, which is similar to the lsf_config file used to install LSF 3.2.3.
Instructions for accessing and editing hpc_config are provided in "Accessing hpc_config" and "Editing hpc_config".
Use a text editor to edit the hpc_config file directly. This file must be located in a directory within a file system that is mounted with read/write/execute access on all the other nodes in the cluster. A template for hpc_config is provided on the Sun HPC ClusterTools 3.0 distribution CD-ROM to simplify creation of this file.
Choose a node to function as the installation platform and a home directory on that node for hpc_config. Before starting the installation process, copy the template to that directory and edit it so that it satisfies your site-specific installation requirements.
The directory containing hpc_config must be read/write/execute accessible (777 permissions) by all the nodes in the cluster.
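For example (a sketch only; the directory name here is hypothetical, and your site may already have a suitable shared file system), you could create such a directory on the installation platform and share it over NFS as follows:

# mkdir /export/config_dir_install
# chmod 777 /export/config_dir_install
# share -F nfs -o rw /export/config_dir_install

Each of the other cluster nodes would then NFS-mount this directory, ideally at the same path name, so that every node sees the same hpc_config file.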
The hpc_config template is located in
/cdrom/hpc_3_0_ct/Product/Install_Utilities/config_dir/hpc_config
To access hpc_config on the distribution CD-ROM, perform the following steps on the node chosen to be the installation platform:
Mount the CD-ROM path on all the nodes in the cluster.
Load the Sun HPC ClusterTools distribution CD-ROM in the CD-ROM drawer.
Copy the configuration template onto the node.
# cd config_dir_install
# cp /cdrom/hpc_3_0_ct/Product/Install_Utilities/config_dir/hpc_config .
config_dir_install is a variable representing the directory where the configuration files will reside; all cluster nodes must be able to read from and write to this directory.
Edit the hpc_config file according to the instructions provided in the next section.
If you have already installed the software, you can find a copy of the hpc_config template in the directory /opt/SUNWhpc/bin/Install_Utilities/config_dir.
If you are editing an existing hpc_config file after installing the software using the graphical installation tool, the hpc_config file created by the tool will not contain the comment lines included in the template.
Example A-1 shows the basic hpc_config template, but without most of the comment lines provided in the online template. The template is simplified here to make it easier to read and because each section is discussed in detail following Example A-1. Two examples of edited hpc_config files follow the general description of the template.
The template comprises five sections:
Supported Software Information - All installations must complete this first section. Since you will be using LSF, complete only Part A of this section.
General installation information - All installations must complete this section. If you are installing the software locally on a single-node cluster, you can stop after completing this section.
Information for NFS and cluster-local installations - If you are installing the software either on an NFS server for remote mounting or locally on each node of a multinode cluster, you need to complete this section, too.
Information for NFS installations - You need to complete this section only if you are installing the software on an NFS server.
For the purposes of initial installation, ignore the fifth section.

# Section I - Supported Software Information
LSF_SUPPORT="<choice>"

# PART A: Running HPC 3.0 ClusterTools software with LSF software.
# Do you want to modify LSF parameters to optimize HPC job launches?
MODIFY_LSF_PARAM="<choice>"

# Name of the LSF Cluster
LSF_CLUSTER_NAME="<clustername>"

# Section II - General Installation Information
# Type of Installation Configuration
INSTALL_CONFIG="<choice>"

# Installation Location
INSTALL_LOC="/opt"

# CD-ROM Mount Point
CD_MOUNT_PT="/cdrom/hpc_3_0_ct"

# Section III - For Cluster-Local and NFS Installation
# Installation Method
INSTALL_METHOD="<method>"

# Hardware Information
NODES="<hostname1> <hostname2> <hostname3>"

# SCI Support
INSTALL_SCI="<choice>"

# Section IV - For NFS Installation
# NFS Server
NFS_SERVER=""

# Location of the Software Installed on the NFS Server
INSTALL_LOC_SERVER=""
You will be using the software with LSF, so enter yes here.
LSF_SUPPORT="yes"
Since you will be using LSF, complete only Part A of this section.
Allowing the Sun HPC installation script to modify LSF parameters optimizes HPC job launches. Your choice for this variable must be yes or no.
MODIFY_LSF_PARAM="choice"
Before installing Sun HPC ClusterTools software, you must have installed LSF 3.2.3. When you installed the LSF software, you selected a name for the LSF cluster. Enter this name in the LSF_CLUSTER_NAME field.
LSF_CLUSTER_NAME="clustername"
All installations must complete this section. If you are installing the software locally on a single-node cluster, you can stop after completing this section.
Three types of installation are possible for Sun HPC ClusterTools 3.0 software:
nfs - Install the software on an NFS server for remote mounting.
smp-local - For single-node clusters only: Install the software locally on the node.
cluster-local - For multinode clusters only: Install the software locally on every node in the cluster.
Specify one of the installation types: nfs, smp-local, or cluster-local. There is no default type of installation.
INSTALL_CONFIG="config_choice"
The way the INSTALL_LOC path is used varies, depending on which type of installation you have chosen.
Local Installations - For local installations (smp-local or cluster-local), INSTALL_LOC is the path where the packages will actually be installed.
NFS Installations - For NFS installations (nfs), INSTALL_LOC is the mount point for the software on the NFS clients.
You must enter a full path name. The default location is /opt. The location must be set up with (or mounted with, if this is an NFS installation) read/write (755) permissions on all the nodes in the cluster.
INSTALL_LOC="/opt"
If you choose an installation directory other than the default /opt, a symbolic link is created from /opt/SUNWhpc to the chosen installation point.
Specify a mount point for the CD-ROM. This mount point must be mounted on (that is, NFS-accessible to) all the nodes in the cluster. The default mount point is /cdrom/hpc_3_0_ct. For example:
CD_MOUNT_PT="/cdrom/hpc_3_0_ct"
If you are installing the software either on an NFS server for remote mounting or locally on each node of a multinode cluster, you need to complete this section.
Specify either rsh or cluster-tool as the method for propagating the installation to all the nodes in the cluster.
cluster-tool - If you choose the cluster-tool option, you can use one of the Cluster Console Manager (CCM) applications, cconsole, ctelnet, or crlogin, to install Sun HPC ClusterTools 3.0 software on all of the nodes in your cluster in parallel. See Appendix B, Cluster Management Tools, for information about using the CCM tools.
You can use the CCM tools to install on up to 16 nodes at a time. For clusters with more than 16 nodes, you will have to repeat the installation process on groups of up to 16 nodes at a time until you have installed the software on the entire cluster.
rsh - If you choose the rsh method, the software will be installed serially on the cluster nodes in the order in which they are listed in hpc_config. The CCM applications cannot be used to install the software in rsh mode.
Also note that this method requires that all nodes are trusted hosts--at least during the installation process. A minimal sketch of one way to set this up follows the INSTALL_METHOD example below.
INSTALL_METHOD="method"
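The procedure for establishing the trust relationship is not covered here. As a minimal sketch (assuming the installation is launched from a hypothetical node named napoli), you could add that node to root's /.rhosts file on each of the other cluster nodes, and remove the entry once the installation is finished:

# echo "napoli root" >> /.rhosts

Consult your site's security policy before loosening rsh access in this way.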
There are two ways to enter information in this section:
If the cluster nodes are connected to a terminal concentrator, list each node in the following triplet format.
NODES="hostname1/termcon_name/port_id hostname2/termcon_name/port_id ..." |
In each triplet, specify the host name of a node, followed by the host name of the terminal concentrator and the port ID on the terminal concentrator to which that node is connected. Separate the triplet fields with virgules (/). Use spaces between node triplets.
If the cluster nodes are not connected to a terminal concentrator, simply list the node host names, separated by spaces, as follows.
NODES="hostname1 hostname2 hostname3 ..." |
Every node in your Sun HPC cluster must also be in the corresponding LSF cluster. See the discussion of the lsf.cluster.clustername configuration file in the LSF Batch Administrator's Guide for information on LSF clusters.
If you will not be using the CCM tools, you can allow the installation script to derive the node list from the LSF configuration file lsf.cluster.clustername. To do this, either set the NODES variable to NULL or leave the line commented out. You must be installing from one of the nodes in the LSF cluster.
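For example, either of the following forms in hpc_config (a sketch of the two options just described) causes the installation script to derive the node list from the LSF configuration file:

NODES=""

or

#NODES="<hostname1> <hostname2> <hostname3>"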
This section tells the script whether to install the SCI-related packages. If your cluster includes SCI, replace choice with yes; otherwise, replace it with no.
INSTALL_SCI="yes"
A yes entry causes the three SCI packages and two RSM packages to be installed in the /opt directory. A no entry causes the installation script to skip the SCI and RSM packages.
The SCI and RSM packages are installed locally on every node, not on an NFS server.
You need to complete this section only if you are installing the software on an NFS server.
The format for setting the NFS server host name is the same as for setting the host names for the nodes in the cluster. There are two ways to define the host name of the NFS server:
If you have a terminal concentrator, describe the NFS server in the following triplet format.
NFS_SERVER="hostname/termcon_name/port_id" |
If you do not have a terminal concentrator, simply specify the host name of the NFS server.
NFS_SERVER="hostname" |
The NFS server can be one of the cluster nodes or it can be external (but connected) to the cluster. If the server will be part of the cluster--that is, will also be an execution host for the Sun HPC ClusterTools software--it must be included in the NODES field described in "Hardware Information". If the NFS server will not be part of the cluster, it must be available from all the hosts listed in NODES, but it should not be included in the NODES field.
If you want to install the software on the NFS server in the same directory as the one specified in INSTALL_LOC, leave INSTALL_LOC_SERVER empty (""). If you prefer, you can override INSTALL_LOC by specifying an alternative directory in INSTALL_LOC_SERVER.
INSTALL_LOC_SERVER="directory"
Recall that the directory specified in INSTALL_LOC defines the mount point for INSTALL_LOC_SERVER on each NFS client.
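To illustrate the relationship between the two variables (the host name and directories here are taken from Example A-3, shown later in this appendix), the effect on each NFS client is roughly equivalent to mounting the server directory at the INSTALL_LOC mount point:

# mount -F nfs mars:/export/home/opt2 /opt

This is shown only to clarify how INSTALL_LOC and INSTALL_LOC_SERVER relate; it is not a command you need to enter as part of editing hpc_config.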
Example A-2 and Example A-3 illustrate the general descriptions in the preceding sections with edited hpc_config files representing two different types of installations.
Local Install - Example A-2 shows how the file would be edited for a local installation on every node in a cluster:

# Section I - Supported Software Information
LSF_SUPPORT="yes"

# PART A: Running the software with LSF
# Do you want to modify LSF parameters to optimize HPC job launches?
MODIFY_LSF_PARAM="yes"

# Name of the LSF Cluster
LSF_CLUSTER_NAME="italy"

# Section II - General Installation Information
# Type of Installation Configuration
INSTALL_CONFIG="cluster-local"

# Installation Location
INSTALL_LOC="/export/home/opt2"

# CD-ROM Mount Point
CD_MOUNT_PT="/cdrom/hpc_3_0_ct"

# Section III - For Cluster-Local and NFS Installation
# Installation Method
INSTALL_METHOD="rsh"

# Hardware Information
NODES="napoli pisa milano"

# SCI Support
INSTALL_SCI="no"

# Section IV - For NFS Installation
# NFS Server
NFS_SERVER=""

# Location of the Software Installed on the NFS Server
INSTALL_LOC_SERVER=""

The main characteristics of the installation illustrated by Example A-2 are summarized below:
Section 1:
The software will be used with LSF.
The installation script will modify LSF parameters to optimize HPC launches.
The set of clustered nodes that will be running LSF jobs is named italy.
Section 2:
The packages will be installed locally. This means every node in the system will contain a copy of the packages that make up Sun HPC ClusterTools 3.0 software.
The base directory where the software will be installed is /export/home/opt2/.
The mount point for the software CD-ROM is /cdrom/hpc_3_0_ct.
Section 3:
The software will be installed using the UNIX utility rsh.
The nodes are not connected to a terminal concentrator, so only the node host names are listed in the Hardware Information section.
The cluster in this example does not include SCI hardware.
Section 4 is not completed, as the software is being installed locally on every node.
For the purposes of initial installation, ignore the fifth section.
NFS Install - Example A-3 shows an hpc_config file for an NFS installation. The main features of this installation example are summarized below:
Section 1:
The software will be used with LSF.
The installation script will modify LSF parameters to optimize HPC launches.
The set of clustered nodes that will be running LSF jobs is named italy.
Section 2:
The packages are being installed on an NFS server.
The mount point for the software on the NFS clients is /opt/.
The mount point for the software CD-ROM is /cdrom/hpc_3_0_ct.
Section 3:
The software will be installed using Cluster Console Manager tools.
The nodes are connected to a terminal concentrator, and the cluster console facility will be used. Consequently, each host name must be part of a triplet entry that also includes the name of the terminal concentrator and the ID of the terminal concentrator port to which the node is connected.
This example shows the nodes venice, napoli, and pisa all connected to the terminal concentrator rome via ports 5002, 5003, and 5004.
The cluster in this example does not include SCI hardware.
Section 4:
The host name of the NFS server must be supplied (in this example, mars). Because the NFS server is connected to a terminal concentrator, its host name must also be part of a triplet entry analogous to the entries in NODES.
In this case, the NFS server is not one of the nodes in the Sun HPC cluster. All the nodes in the cluster must be able to communicate with it over a network.
The software will be installed on mars in the directory /export/home/opt2.
For the purposes of initial installation, ignore the fifth section.

# Section I - Supported Software Information
LSF_SUPPORT="yes"

# PART A: Running the software with LSF
# Do you want to modify LSF parameters to optimize HPC job launches?
MODIFY_LSF_PARAM="yes"

# Name of the LSF Cluster
LSF_CLUSTER_NAME="italy"

# Section II - General Installation Information
# Type of Installation Configuration
INSTALL_CONFIG="nfs"

# Installation Location
INSTALL_LOC="/opt"

# CD-ROM Mount Point
CD_MOUNT_PT="/cdrom/hpc_3_0_ct"

# Section III - For Cluster-Local and NFS Installation
# Installation Method
INSTALL_METHOD="cluster-tool"

# Hardware Information
NODES="venice/rome/5002 napoli/rome/5003 pisa/rome/5004"

# SCI Support
INSTALL_SCI="no"

# Section IV - For NFS Installation
# NFS Server
NFS_SERVER="mars/rome/5005"

# Location of the Software Installed on the NFS Server
INSTALL_LOC_SERVER="/export/home/opt2"
You can use the CCM tools to install on up to 16 nodes at a time. For clusters with more than 16 nodes, you will have to repeat the installation process on groups of up to 16 nodes at a time until you have installed the software on the entire cluster.
This step is optional. If you have chosen the cluster-tool method of installation and plan to use the CCM tools, you need to run the cluster_tool_setup script first. This loads the CCM administration tools onto a machine and creates a cluster configuration file that is used by the CCM applications. See Appendix B, Cluster Management Tools, for a description of the three CCM applications, cconsole, ctelnet, and crlogin.
cconsole requires the nodes to be connected to a terminal concentrator. The other two, ctelnet and crlogin, do not.
If you want to use cconsole to monitor messages generated while rebooting the cluster nodes, you will need to launch it from a machine outside the cluster. If you launch it from a cluster node, it will be disabled when the node from which it is launched reboots.
Perform the following steps, as root, to run cluster_tool_setup.
Go to the Product directory on the Sun HPC ClusterTools 3.0 distribution CD-ROM.
Note that this directory must be mounted on (accessible by) all nodes in the cluster.
# cd /cdrom/hpc_3_0_ct/Product/Install_Utilities
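If the CD-ROM drive is attached only to the installation node, one way to make the mount point visible cluster-wide (a sketch only; it assumes the installation node is named napoli, that Solaris volume management has mounted the CD at /cdrom/hpc_3_0_ct, and that your site's NFS setup permits sharing it) is to share the CD from that node:

# share -F nfs -o ro /cdrom/hpc_3_0_ct

and then mount it on each of the other nodes:

# mkdir -p /cdrom/hpc_3_0_ct
# mount -F nfs napoli:/cdrom/hpc_3_0_ct /cdrom/hpc_3_0_ct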
If you are running on a node within the cluster, perform Step a. If you are running on a machine outside the cluster, perform Step b.
Within the cluster, run cluster_tool_setup -c.
Run cluster_tool_setup; use the -c tag to specify the directory containing the hpc_config file.
# ./cluster_tool_setup -c /config_dir_install
Outside the cluster, run cluster_tool_setup -c -f.
Run cluster_tool_setup; use the -c tag to specify the directory containing the hpc_config file, plus a trailing -f tag.
# ./cluster_tool_setup -c /config_dir_install -f
Set the DISPLAY environment variable to the machine on which you will be running the CCM tools.
# setenv DISPLAY hostname:0
(This example uses C-shell syntax.)
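If you are working in a Bourne or Korn shell rather than the C shell, the equivalent setting (a sketch) is:

# DISPLAY=hostname:0; export DISPLAY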
Invoke the CCM tool of your choice: cconsole (if the nodes are connected to a terminal concentrator), ctelnet, or crlogin.
All three tools reside in /opt/SUNWcluster/bin. For example,
# /opt/SUNWcluster/bin/ctelnet clustername
where clustername is the name of the LSF cluster. All three CCM tools require the name of the cluster as an argument.
The CCM tool then creates a Common Window and separate Term Windows for all the nodes in the cluster.
Position the cursor in the Common Window and press Return.
This activates a prompt in each Term Window. Note that the Common Window does not echo keyboard entries. These appear only in the Term Windows.
You can now use CCM to remove previous release packages, as described in "Removing and Reinstalling Individual Packages", or to install the software packages, as described in "Installing Software Packages".
This section describes the procedure for installing the Sun HPC ClusterTools packages. Note that the exact procedure for each step will depend on which installation mode you are in, cluster-tool or rsh.
In cluster-tool mode, perform each step in the Common Window. Each entry will be echoed in every Term Window.
In rsh mode, perform each step at the shell prompt of one of the nodes.
See Appendix B, Cluster Management Tools, for more information about the CCM tools that are available to you in cluster-tool mode.
The hpc_install command writes various SYNC files in the directory containing its configuration file as part of the package installation. If the installation process stops prematurely--if, for example, you press Ctrl-c--some SYNC files may be left. You must remove these files before executing hpc_install again so they don't interfere with the next software installation session.
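For example, assuming the configuration file resides in /config_dir_install and the leftover files share the SYNC prefix (an assumption; list the directory first to confirm what was left behind), a cleanup sketch would be:

# ls /config_dir_install
# rm /config_dir_install/SYNC*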
Log in to each node as root.
Go to the Product directory on the Sun HPC ClusterTools 3.0 distribution CD-ROM.
Note, this directory must be mounted on (accessible by) all nodes in the cluster.
# cd /cdrom/hpc_3_0_ct/Product/Install_Utilities
Run hpc_install.
# ./hpc_install -c /config_dir_install
where config_dir_install represents the directory containing the hpc_config file.
The -c tag causes hpc_install to look for a file named hpc_config in the specified directory. If you want to install the software using a configuration file with a different name, you must specify a full path including the new file name after the -c tag.
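For example, to install a second group of nodes using a configuration file named hpc_config2 (the group-by-group naming suggested earlier in this appendix), you might run:

# ./hpc_install -c /config_dir_install/hpc_config2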
If the hpc_config file contains an INSTALL_SCI="yes" entry, hpc_install will install the three SCI software packages along with the other Sun HPC ClusterTools packages. When the SCI packages are installed, the installation script will display a message telling you to reboot the nodes. Ignore this message. You must reboot the nodes only after any SCI interface cards are configured. If the system does not include SCI hardware, the nodes do not need to be rebooted.
To remove LSF, see the documentation that came with the software.
The easiest way to remove Sun HPC ClusterTools 3.0 software is by using the configuration tool, install_gui. See the next section for details. If you prefer to remove the software at the command line, you may do so using the provided removal scripts. See "Removing the Software: Command Line" for instructions.
Locate a configuration file or files for the cluster.
To remove the software from your cluster, you will need a configuration file that describes the cluster. Ideally you should use the configuration file you created when installing the software. If you cannot locate that file, you will have to create one. You can use the configuration tool to create the file. (See Chapter 3 of the Sun HPC ClusterTools 3.0 Installation Guide.)
The configuration tool will remove the software from up to 16 nodes at once. If you need to remove software from a cluster of more than 16 nodes, remove it first from a group of up to 16 nodes, then repeat the removal process on additional groups of nodes until you have removed the software from all the nodes in the cluster. The procedure is similar to installing the software on a cluster of more than 16 nodes. See Section 3.1.2 of the installation guide for more information.
Load the Sun HPC ClusterTools 3.0 CD-ROM in the CD-ROM drawer.
The CD-ROM mount point must be mounted on all the nodes in the cluster.
Enable root login access.
By default, most systems allow logins by root only on their console devices. To enable root login access during software removal, you must edit the /etc/default/login file on each node in the cluster. In this file on each node, find this line:
CONSOLE=/dev/console
and make it into a comment by adding a # before it:
#CONSOLE=/dev/console
After removing the software, you should disable root login access again if your site's security guidelines require it.
As root, launch the install_gui tool with the configuration file.
You can load the configuration file either from the command line or from within the tool after it has been launched.
At the command line, launch the configuration tool using the name of the configuration file as an argument:
# /cdrom/hpc_3_0_ct/Product/Install_Utilities/install_gui hpc_config
Alternatively, you can load the configuration file after launching the tool by choosing Load from the File menu.
Select the Remove task and click on the Go button.
For help using the configuration tool, choose Help with Configuration Tool from the Help menu.
Locate a configuration file or files for the cluster.
To remove the software from your cluster, you will need a configuration file that describes the cluster. Ideally you should use the configuration file you created when installing the software. If you cannot locate that file, you will have to create one.
You can use the CCM tools to remove the software from up to 16 nodes at a time. For clusters with more than 16 nodes, you will have to repeat the removal process on groups of up to 16 nodes at a time until you have removed the software from the entire cluster.
Place the Sun HPC ClusterTools 3.0 distribution CD-ROM in the CD-ROM drive.
Go to the directory on the CD-ROM containing the release packages.
This directory must be mounted with read/execute permissions (755) on all the nodes in the cluster:
# cd /cdrom/hpc_3_0_ct/Product/Install_Utilities/
Run hpc_remove; use the -c option to specify the directory containing the hpc_config file.
# ./hpc_remove -c /config_dir_install
The -c tag causes hpc_remove to look for a file named hpc_config in the specified directory. If you want to remove the software using a configuration file with a different name, you must specify a full path including the new file name after the -c tag.
To remove a single package and install (or reinstall) another package in its place, perform the following steps:
# ./hpc_remove -c hpc_config_file_path -d PACKAGE_NAME
# ./hpc_install -c config_dir -d location_of_package/PACKAGE_NAME
For example:
# cd /cdrom/hpc_3_0_ct/Product
# ./hpc_remove -c /home/hpc_admin -d SUNWhpmsc
# ./hpc_install -c /home/hpc_admin -d /cdrom/hpc_3_0_ct/Product/SUNWhpmsc