Sun MPI 4.0 User's Guide: With LSF

Chapter 2 Starting Sun MPI Programs

This chapter explains the basic steps for starting message-passing programs on a Sun HPC cluster using LSF Batch services.

For information about developing, compiling, and linking Sun MPI programs, see the Sun MPI 4.0 Programming and Reference Manual.


Note -

Running parallel jobs with LSF Suite 3.2.3 is supported on up to 1024 processors and up to 64 nodes.


Using Parallel Job Queues

Distributed MPI jobs must be submitted via batch queues that have been configured to handle parallel jobs. This parallel capability is just one of the many characteristics that a system administrator can assign when setting up a batch queue.

You can use the command bqueues -l to find out which job queues support parallel jobs, as shown in Figure 2-1.

The bqueues -l output contains status information about all the queues currently defined. Look for a queue that includes the line:

JOB_STARTER: pam

which means it is able to handle parallel (distributed MPI) jobs. In the example shown in Figure 2-1, the queue hpc is defined in this way.


Note -

The pam entry may be followed by -t or -v. The -t option suppresses printing of process status upon completion, and -v specifies that the job is to run in verbose mode.


Figure 2-1 Finding a Parallel Queue With bqueues -l

[Graphic: bqueues -l listing in which the hpc queue shows JOB_STARTER: pam]
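A quick way to spot parallel-capable queues is to search the bqueues -l listing for the pam job starter. The following sketch runs the search against a captured sample (the queue names and listing layout are illustrative, not real bqueues output); on a live cluster you would pipe bqueues -l into grep directly.

```shell
# Illustrative sample of a saved bqueues -l listing (hypothetical queues);
# on a real cluster, replace the heredoc with:  bqueues -l | grep ...
cat > /tmp/bqueues.out <<'EOF'
QUEUE: normal
JOB_STARTER:
QUEUE: hpc
JOB_STARTER: pam
EOF
# Print each pam job-starter line along with the line before it
grep -B 1 '^JOB_STARTER: pam' /tmp/bqueues.out
```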

If no queues are currently configured for parallel job support, ask your system administrator to set up one or more in this way.

Once you know the name of a queue that supports parallel jobs, submit your Sun MPI jobs explicitly to that queue. For example, the following command submits the job hpc-job to the queue named hpc, requesting four processes.

hpc-demo% bsub -q hpc -n 4 hpc-job

Additional examples are provided in "Submitting Jobs in Batch Mode" and "Submitting Interactive Batch Jobs".


Note -

To use LSF Batch commands, your PATH variable must include the directory where the LSF Base, Batch, and Parallel components were installed. The default installation directory is /opt/SUNWlsf/bin. Likewise, your PATH variable must include the ClusterTools software installation directory; the default location for ClusterTools components is /opt/SUNWhpc/bin.
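As a sketch, the note above amounts to the following in an sh-style shell. The directories are the defaults quoted in the note; adjust them if your site installed the software elsewhere.

```shell
# Default install locations from the note above; adjust if your site
# installed the LSF or ClusterTools components elsewhere.
LSF_BIN=/opt/SUNWlsf/bin
HPC_BIN=/opt/SUNWhpc/bin
PATH="$PATH:$LSF_BIN:$HPC_BIN"
export PATH
echo "$PATH"
```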


bsub Overview

The command for submitting Sun MPI jobs to the LSF Batch system is bsub, just as it is for submitting nonparallel batch jobs. The command syntax is essentially the same as well, except for an additional option, -sunhpc, which applies specifically to Sun MPI jobs. The bsub syntax for parallel jobs is

bsub [basic_options] [-sunhpc sunhpc_args] job

The basic_options entry refers to the set of standard bsub options described in the LSF Batch User's Guide. The -sunhpc option, which must be the last option on the bsub command line, allows Sun HPC-specific arguments to be passed to the MPI job job.

"Submitting Jobs in Batch Mode" and "Submitting Interactive Batch Jobs" describe how to use bsub to submit jobs in batch and interactive batch modes, respectively. The -sunhpc option is discussed in "Using the -sunhpc Option".

Refer to the LSF Batch User's Guide for a full discussion of bsub and associated job-submission topics.

Submitting Jobs in Batch Mode

The simplest way to submit a Sun MPI job to the LSF Batch system is in batch mode. For example, the following command submits hpc-job to the queue named hpc in batch mode and requests that the job be distributed across four processors.

hpc-demo% bsub -q hpc -n 4 hpc-job

Batch mode is enabled by default, but it can be disabled by the system administrator via the INTERACTIVE parameter.

You can check whether a queue accepts batch-mode jobs by running bqueues -l queue_name and examining the SCHEDULING POLICIES: section of the output.

The example queue shown in Figure 2-1 has a SCHEDULING POLICIES: setting of NO_INTERACTIVE, which allows batch-mode jobs but not interactive batch jobs.

As soon as hpc-job is submitted in batch mode, LSF Batch detaches it from the terminal session that submitted it.


Note -

If you request more processors than are available, you must use process wrapping to allow multiple processes to be mapped to each processor. Otherwise, LSF Batch will wait indefinitely for the requested number of processors to become available, and the job will never be launched. Process wrapping is discussed in "Specify the Number of Processes".


Submitting Interactive Batch Jobs

The interactive batch mode makes full use of the LSF Batch system's job scheduling policies and host selection facilities, but keeps the job attached to the terminal session that submitted it. This mode is well suited to Sun MPI jobs and other resource-intensive applications.

The following example submits hpc-job to the queue named hpc in interactive batch mode. As before, this example is based on the assumption that hpc is configured to support parallel jobs.

hpc-demo% bsub -I -q hpc -n 4 hpc-job 

The -I option specifies interactive batch mode.

The queue must not have interactive mode disabled. To check this, run

hpc-demo% bqueues -l hpc 

and check the SCHEDULING POLICIES: section of the resulting output. If it contains either

SCHEDULING POLICIES:  ONLY_INTERACTIVE

or

SCHEDULING POLICIES:

(that is, no entry), interactive batch mode is enabled.

When the queue accepts the job, it returns a job ID. You can use the job ID later as an argument to various commands that inquire about job status or control certain aspects of job state. For example, you can suspend a job or remove it from a queue with the bstop jobid and bkill jobid commands. These commands are described in Chapter 7 of the LSF Batch User's Guide.
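LSF's bsub acknowledges a submission with a line such as `Job <4622> is submitted to queue <hpc>.` The following sketch parses the job ID out of a captured sample of that acknowledgement; the bstop and bkill lines are shown commented out because they need a live LSF cluster.

```shell
# Parse the job ID out of a sample bsub acknowledgement line
# (captured text; a real submission would come from running bsub).
ack='Job <4622> is submitted to queue <hpc>.'
jobid=$(echo "$ack" | sed -n 's/^Job <\([0-9]*\)>.*/\1/p')
echo "$jobid"          # prints 4622
# On a live cluster you could then run, for example:
#   bstop "$jobid"     # suspend the job
#   bkill "$jobid"     # remove it from the queue
```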

Using the -sunhpc Option

LSF Suite version 3.2.3 supports the bsub command-line option -sunhpc, which gives users special control over Sun MPI jobs. As mentioned earlier, the -sunhpc option and its arguments must be the last option on the bsub command line:

bsub [basic_options] [-sunhpc sunhpc_args] job

"Redirect stderr" through "Spawn a Job in the Stopped State" describe the -sunhpc arguments.

Redirect stderr

Use the -e argument to redirect stderr to a file named file.Rn, where file is the user-supplied name of the output file. The Rn extension is supplied automatically and indicates the rank of the process producing the stderr output.

For example, to redirect stderr to files named boston.R0, boston.R1, and so forth, enter

hpc-demo% bsub -I -n 4 -q hpc -sunhpc -e boston hpc-job

Redirect stdout

Use the -o argument to redirect stdout to a file named file.Rn, where file is the user-supplied name of the output file. The Rn extension is supplied automatically and indicates the rank of the process producing the stdout output.

For example, to redirect stdout to files named boston.R0, boston.R1, and so forth, enter

hpc-demo% bsub -I -n 4 -q hpc -sunhpc -o boston hpc-job
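The per-rank naming that the -e and -o arguments produce can be sketched as follows. No LSF is involved here: the files are empty stand-ins created only to show the pattern a four-process run submitted with `-sunhpc -o boston` would leave behind.

```shell
# Simulate the output files of a 4-process run redirected with
# `-sunhpc -o boston`: one file per MPI rank, suffixed .R<rank>.
dir=$(mktemp -d)
cd "$dir"
for rank in 0 1 2 3; do
  : > "boston.R$rank"
done
ls boston.R*     # boston.R0 boston.R1 boston.R2 boston.R3
```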

Collocate Jobs by Specifying Job ID

Use the -j argument to specify the job ID of another job with which the new job should collocate.

For example, to cause job hpc-job to be collocated with a job whose job ID is 4622, enter

hpc-demo% bsub -I -n 4 -q hpc -sunhpc -j 4622 hpc-job

Use bjobs to find out the job ID of a job. See the LSF Batch User's Guide for details.

Collocate Jobs by Specifying Job Name

Use the -J argument to specify the name of another job with which the new job should collocate.

For example, to cause job hpc-job1 to be collocated with a job named hpc-job2, enter

hpc-demo% bsub -I -n 4 -q hpc -sunhpc -J hpc-job2 hpc-job1

Specify the Number of Processes

Use the -n argument to specify the number of processes to run. This argument can be used in concert with the bsub -n argument to cause process wrapping to occur. Process wrapping is a technique for distributing processes across fewer processors than there are processes; each processor then hosts multiple processes, which are spawned in a cyclical, wrap-around fashion.

For example, the following will distribute 48 processes across 16 processors, resulting in a 3-process wrap per processor.

hpc-demo% bsub -I -n 16 -q hpc -sunhpc -n 48 hpc-job

If you specify a range of processors rather than a single quantity, together with a larger number of processes, the process-wrapping ratio (the number of processes per processor) will depend on the number of processors actually allocated.

For example, the following will distribute 48 processes across at least 8 processors and possibly as many as 16.

hpc-demo% bsub -I -n 8,16 -q hpc -sunhpc -n 48 hpc-job

Consequently, the process-to-processor wrapping ratio may be as high as 6:1 (48 processes across 8 processors) or as low as 3:1 (48 processes across 16 processors).
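The arithmetic behind these ratios is a ceiling division. The helper below is hypothetical (not an LSF command), included only to make the calculation explicit:

```shell
# Ceiling division: processes over processors, rounded up.
wrap_ratio() {
  nproc=$1
  ncpu=$2
  echo $(( (nproc + ncpu - 1) / ncpu ))
}
wrap_ratio 48 16   # prints 3 (the 16-processor case above)
wrap_ratio 48 8    # prints 6 (the low end of the 8,16 range)
```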

Spawn a Job in the Stopped State

Use the -s argument to cause a job to be spawned in the STOPPED state. It does this by setting the stop-on-exec flag for the spawned process. This feature can be of value to a program-monitoring or debugging tool as a way of gaining control over a parallel program. See the proc(4) man page for details.


Note -

Do not use the -s argument with the Prism debugger. It would add nothing to Prism's capabilities and would be likely to interfere with Prism's control over the debugging session.


The following example shows the -s argument being used to spawn an interactive batch job in the STOPPED state.

hpc-demo% bsub -I -n 1 -q hpc -sunhpc -s hpc-job

To identify processes in the STOPPED state, issue the ps command with the -el argument:

hpc-demo% ps -el
F  S  UID  PID  PPID C PRI NI ADDR     SZ WCHAN TTY  TIME  CMD
19 T  0    0    0    0 0   SY f0274e38 0  ?          0:00  sched

Here, the sched command is in the STOPPED state, as indicated by the T entry in the S (State) column.

Note that when a process is spawned in the STOPPED state, the program's name does not appear in the ps output; instead, the stopped process is identified as a RES daemon.
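The state check itself is not LSF-specific. The following generic sketch stops an ordinary process with SIGSTOP and confirms that ps reports state T for it; the `state` output keyword is widely supported by ps -o, though it is not strictly POSIX.

```shell
# Stop a throwaway background process and confirm ps reports state T.
sleep 30 &
pid=$!
kill -STOP "$pid"
sleep 1                                  # give the kernel a moment
state=$(ps -o state= -p "$pid" | tr -d ' ')
echo "$state"                            # T (stopped)
kill -KILL "$pid"                        # clean up
```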

Generate Rank-Tagged Output

Use the -t argument to cause all output to be tagged with its MPI rank.


Note -

The -t argument cannot be used when output is redirected by the -e or -o arguments to -sunhpc.


For example, the following adds a rank-indicator prefix to each line of output.

[Graphic: program output with each line prefixed by its MPI rank]