Sun HPC ClusterTools 3.0 Administrator's Guide: With CRE

Verify Basic Functionality

Use the following procedure to test the cluster's ability to perform basic operations.

Run mpinfo

Run mpinfo -N to display information about the cluster nodes. This step requires /opt/SUNWhpc/bin to be in your path.
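
If /opt/SUNWhpc/bin is not yet in your path, a Bourne-shell fragment along the following lines could add it for the current session. This is a minimal sketch only; the variable name HPC_BIN is an assumption made for illustration and is not part of ClusterTools.

```shell
# Sketch: append the ClusterTools bin directory to PATH unless it is
# already listed. HPC_BIN is a hypothetical name used only in this example.
HPC_BIN=/opt/SUNWhpc/bin

case ":$PATH:" in
  *":$HPC_BIN:"*) ;;                        # already present, nothing to do
  *) PATH="$PATH:$HPC_BIN"; export PATH ;;  # append and export
esac
```

Wrapping PATH in colons before matching avoids a false match on a longer directory name that merely contains /opt/SUNWhpc/bin as a prefix.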


Example 2-1 Sample mpinfo -N Output for a Two-Node System


# mpinfo -N
NAME   UP  PARTITION  OS     OSREL  NCPU  FMEM   FSWP   LOAD1   LOAD5   LOAD15
node1  y   -          SunOS  5.6    1     7.17   74.76  0.03    0.04    0.05
node2  y   -          SunOS  5.6    1     34.70  38.09  0.06    0.02    0.02
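
On a larger cluster, scanning the UP column by eye becomes tedious. The following sketch shows one way the check might be scripted. It assumes that mpinfo -N output has the column layout shown in Example 2-1 (NAME in the first field, UP in the second) and that a POSIX awk is available. The function names nodes_not_up and nodes_missing are hypothetical and are not ClusterTools commands.

```shell
# nodes_not_up: read `mpinfo -N` output on stdin and print the NAME of
# every node whose UP column is anything other than "y".
nodes_not_up() {
  awk '$1 != "NAME" && NF >= 2 && $2 != "y" { print $1 }'
}

# nodes_missing: read `mpinfo -N` output on stdin and print each expected
# node name (passed as arguments) that does not appear in the NAME column.
nodes_missing() {
  awk -v want="$*" '
    $1 != "NAME" { seen[$1] = 1 }          # remember every listed node
    END {
      n = split(want, w, " ")
      for (i = 1; i <= n; i++)
        if (!(w[i] in seen)) print w[i]    # expected but not listed
    }'
}
```

For example, mpinfo -N | nodes_not_up would list the nodes that are not up, and mpinfo -N | nodes_missing node1 node2 would list any of the two expected nodes that mpinfo does not report.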

If any nodes are missing from the list or do not have a y entry in the UP column:

Create a Default Partition

You can create a cluster-wide default partition by running the initialization script part_initialize on any node in the cluster. The script creates a single partition named all, whose members are all the nodes in the cluster.

Then run mpinfo -N again to verify that all was created successfully. Example 2-2 shows mpinfo -N output when the all partition is present.


Example 2-2 mpinfo -N Output for the Sample Partition all


# /opt/SUNWhpc/bin/part_initialize
# mpinfo -N
NAME     UP  PARTITION  OS     OSREL  NCPU  FMEM   FSWP   LOAD1  LOAD5  LOAD15
node1    y   all        SunOS  5.6    1     8.26   74.68  0.00   0.01   0.03
node2    y   all        SunOS  5.6    1     34.69  38.08  0.00   0.00   0.01
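
The partition check can be scripted in the same spirit. The sketch below assumes the column layout shown in Example 2-2, with PARTITION as the third whitespace-separated field, and a POSIX awk. The function name nodes_outside_partition is hypothetical. How mpinfo prints a node that belongs to more than one partition is not covered here, so treat the result as a first-pass check only.

```shell
# nodes_outside_partition: read `mpinfo -N` output on stdin and print the
# NAME of every node whose PARTITION column differs from the partition
# named in the first argument (default: all).
nodes_outside_partition() {
  awk -v part="${1:-all}" \
    '$1 != "NAME" && NF >= 3 && $3 != part { print $1 }'
}
```

Running mpinfo -N | nodes_outside_partition should print nothing once every node reports all in the PARTITION column.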

Verify That CRE Executes Jobs

Verify that the CRE can launch jobs on the cluster. For example, use the mprun command to execute hostname on all the nodes in the cluster, as shown below:

# mprun -Ns -np 0 hostname
node1
node2

mprun is the CRE command that launches message-passing jobs. The combination of -Ns and -np 0 ensures that the CRE will start one hostname process on each node. See the mprun man page for descriptions of -Ns, -np, and the other mprun options. In this example, the cluster contains two nodes, node1 and node2, each of which returns its host name.


Note -

The CRE does not sort or rank the output of mprun by default, so the order of the host names may vary from one run to another.
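
Because the order of mprun output can vary, a scripted check should sort the host names before comparing them with the expected list. The following is a minimal sketch under that assumption; the function name hosts_match is hypothetical and is not a CRE command.

```shell
# hosts_match: read host names on stdin (one per line) and compare them,
# after sorting, with the expected names passed as arguments.
# Returns 0 when both lists contain exactly the same names.
hosts_match() {
  expected=$(printf '%s\n' "$@" | sort)   # expected names, sorted
  actual=$(sort)                          # names from stdin, sorted
  [ "$expected" = "$actual" ]
}
```

For the two-node cluster in this example, mprun -Ns -np 0 hostname | hosts_match node1 node2 would return a zero exit status regardless of which node answers first.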