CHAPTER 5

Running Programs With mprun in Distributed Resource Management Systems

This chapter describes the options to the mprun command that are used for distributed resource management, and provides instructions for each resource manager. It has four sections:

mprun Options for DRM Integration

Running Parallel Jobs in the PBS Environment

Running Parallel Jobs in the LSF Environment

Running Parallel Jobs in the SGE Environment


mprun Options for DRM Integration

Call mprun from within the resource manager, as explained in Integration With Distributed Resource Management Systems. Use the -x flag to specify the resource manager, and the -np and -nr flags to specify the resources you need. In addition, the -Is flag selects the default CRE I/O environment, the -v flag produces verbose output, and the -J flag displays a fuller identification of each process.
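For example, a hypothetical invocation that combines these flags might look like the following (the program name a.out and the process count are illustrative):

% mprun -x pbs -np 4 -v -J a.out

This requests four processes under PBS, prints each interaction with the resource manager, and displays a fuller identification of each process.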

Some mprun flags do not make sense for a batch job, and will cause the mprun request to be rejected if used with the -x flag. See Improper Flag Combinations for Batch Jobs.

These are the DRM integration options for mprun:

-Is

When launching mprun from a resource manager, the -Is option selects CRE's default I/O behavior, overriding the I/O behavior of the resource manager. You do not need this option when using any of these mprun flags: -I, -D, -N, -B, -n, -i, or -o.

Each of those flags already invokes CRE's default I/O behavior. You also do not need this option if you prefer to keep the resource manager's default I/O behavior.
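For example, assuming a program named a.out, this hypothetical command runs it under PBS with CRE's default I/O behavior:

% mprun -x pbs -Is a.out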

-np numprocs [x threads]

Request the specified number of processes in a job. The default is 1 if you are using CRE; for other resource managers, the default is 0. Use the argument 0 to specify that you want to start one process for each available CPU, based on your resource requirements.

When launching a multithreaded program, use the x threads syntax to specify the number of threads per process. Although the job requires a number of resources equal to numprocs multiplied by threads, only numprocs processes are started. The ranks are numbered from 0 to numprocs minus 1. The processes are allocated across nodes so that each node provides a number of CPUs equal to or greater than threads. If threading requirements cannot be met, the job fails and provides diagnostic messages.

A threads setting of 0 allocates the processes among all available resources. It is equivalent to the -Ns option.

The syntax -np numprocs is equivalent to the syntax -np numprocsx1. The default is -np 1x1.
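For example, this hypothetical command starts four processes with two threads each, so the job requires eight CPUs in all (the program name a.out is illustrative):

% mprun -x pbs -np 4x2 a.out

Each of the four ranks is placed on a node that can provide at least two CPUs.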

If your batch job calls MPI_Comm_spawn(3SunMPI) or MPI_Comm_spawn_multiple(3SunMPI), be sure to use the -nr option to reserve the additional resources.
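For example, if a.out calls MPI_Comm_spawn to create four additional processes, a hypothetical submission might reserve those resources up front:

% mprun -x pbs -np 1 -nr 4 a.out

This starts one process and reserves four more slots for the processes that a.out spawns.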



Note - The -np switch is not supported under LSF. The LSF resource manager determines where the processes run and how many; you cannot change this by using the -np switch with LSF. If you do specify the -np switch under LSF, you get an error message.



-x resource_manager

The -x option specifies the resource_manager. The supported resource managers are:

pbs - Portable Batch System (PBS)

lsf - Load Sharing Facility (LSF)

sge - Sun Grid Engine (SGE)

If you set a default resource manager in the hpc.conf(4) file, mprun automatically uses that resource manager and you do not need to specify the -x option. To override the default, use the -x resource_manager flag.
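For example, if hpc.conf names PBS as the default, this hypothetical command overrides the default and runs the job under SGE instead:

% mprun -x sge a.out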

-v

Verbose output. Each interaction of the CRE environment with the resource manager is displayed in the output.

Improper Flag Combinations for Batch Jobs

Do not use the following flags with the -x resource_manager flag; if you do, the mprun request will be rejected:


Running Parallel Jobs in the PBS Environment

First, reserve resources by invoking the qsub command with the -l option, which specifies the number of nodes and the number of processes per node. For example, this command reserves four nodes with four processes per node for the job myjob.sh:


% qsub -l nodes=4:ppn=4 myjob.sh

Once you enter the PBS environment, you can launch an individual job or a series of jobs with mprun. Use the -x pbs option to the mprun command. The mprun command launches the job using the rankmap file produced by PBS and stored in the environment variable PBS_NODEFILE. The job ranks are children of PBS, not CRE.
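If you want to inspect the rankmap yourself, you can display the file from within the PBS session (PBS lists each allocated node once per reserved process):

pbs% cat $PBS_NODEFILE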

You can run a CRE job within the PBS environment in two different ways:


To Run an Interactive Job in PBS

To Run a Script Job in PBS



To Run an Interactive Job in PBS

1. Enter the PBS environment interactively with the -I option to qsub, and use the -l option to reserve resources for the job.

Here is an example.


hpc-u2-6% qsub -l nodes=1:ppn=2 -I 

The command sequence shown above enters the PBS environment and reserves one node with two processes for the job. Here is the output:


qsub: waiting for job 20.hpc-u2-6 to start

qsub: job 20.hpc-u2-6 ready

Sun Microsystems Inc. SunOS 5.10 Generic January 2005

pbs%


2. Launch the mprun command with the -x pbs option.

Here is an example that launches the hostname command with verbose output:


pbs% mprun -x pbs -v hostname

The hostname program uses the rankmap specified by the PBS environment variable PBS_NODEFILE. The output shows the hostname program being run on ranks r0 and r1:


[mprun:/opt/SUNWhpc/lib/pbsrun -v -- /opt/SUNWhpc/lib/mpexec -x pbs -v -- /usr/bin/hostname]

[pbsrun:r0-r1:/opt/SUNWhpc/lib/mpexec -x pbs -v -- /usr/bin/hostname]

[mpexec:r0:/usr/bin/hostname]

[mpexec:r1:/usr/bin/hostname]
 


To Run a Script Job in PBS

1. Write a script that calls mprun with the -x pbs option.

As described in -x resource_manager, the -x flag identifies the resource manager that will be used for the job launched by mprun. In the following examples, the script is called myjob.csh. Here is an example of the script.


mprun -x pbs -v hostname

The line above launches the hostname program in verbose mode, using PBS as the resource manager.
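A slightly more complete version of the hypothetical myjob.csh adds an interpreter line so that the script can be invoked directly:

#!/bin/csh
# Launch hostname under PBS in verbose mode
mprun -x pbs -v hostname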

2. Enter the PBS environment and use the -l option to qsub to reserve resources for the job.

Here is an example of how to use the -l option with the qsub command.


hpc-u2% qsub -l nodes=1:ppn=2 myjob.csh

The command sequence shown above enters the PBS environment and reserves one node with two processes for the job that will be launched by the script named myjob.csh.

3. View the job's output.

Here is the output from the script myjob.csh.


[mprun:/opt/SUNWhpc/lib/pbsrun -v -- /opt/SUNWhpc/lib/mpexec -x pbs -v -- /usr/bin/hostname]

[pbsrun:r0-r1:/opt/SUNWhpc/lib/mpexec -x pbs -v -- /usr/bin/hostname]

[mpexec:r0:/usr/bin/hostname]

[mpexec:r1:/usr/bin/hostname]

As you can see, because the mprun command was invoked with the -x pbs option, it calls the pbsrun command, which calls mpexec, which forks into two invocations of the hostname program, one for each rank.


Running Parallel Jobs in the LSF Environment


To Run an Interactive Job in LSF

1. Enter the LSF environment with the bsub command.

a. Use the -Is option to select interactive mode.

b. Use the -n option to reserve resources for the job.

c. Use the -q option to select the queue.

Here is an example that uses all three options.


burl-ct-v4% bsub -n 4 -q short -Is csh

The command sequence shown above enters the LSF environment in interactive mode, reserves four processors, and selects the short queue. Here is the output:


Job <24559> is submitted to queue <short>
<<Waiting for dispatch...>>
<<Starting on burl-ct-v4>>
burl-ct-v4

2. Use pam and mprun as shown below to start the parallel job.

pam requires the -g switch, which specifies the generic job launcher framework. mprun requires the -x lsf switch to specify that it is running under the LSF resource manager.
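Stripped of the surrounding transcript shown below, the basic form of the command line is as follows, where program stands for your executable:

% pam -g mprun -x lsf program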

 


 burl-ct-v4-4 137 =>bsub -n 4 -q short -Is
bsub> No command is specified. Job not submitted.
 burl-ct-v4-4 138 =>bsub -n 4 -q short -Is tcsh
Job <1151> is submitted to queue <short>.
<<Waiting for dispatch ...>>
<<Starting on burl-ct-v4-5>>
 burl-ct-v4-5 41 =>pam -g mprun -x lsf -v hostname
[lsf hostlist: burl-ct-v4-5 4]
[aout_exec:  /usr/bin/newtask -p default /opt/SUNWhpc/lib/mpexec -k 4 -x lsf -v -- /hpc/rte/LSF/cluster1/6.2/sparc-sol10-64/bin/TaskStarter -p burl-ct-v4-5:36519 -c /hpc/rte/LSF/cluster1/conf -a SOL64 /usr/bin/hostname]
[mpexec: r0: /hpc/rte/LSF/cluster1/6.2/sparc-sol10-64/bin/TaskStarter -p burl-ct-v4-5:36519 -c /hpc/rte/LSF/cluster1/conf -a SOL64 /usr/bin/hostname]
[mpexec: r1: /hpc/rte/LSF/cluster1/6.2/sparc-sol10-64/bin/TaskStarter -p burl-ct-v4-5:36519 -c /hpc/rte/LSF/cluster1/conf -a SOL64 /usr/bin/hostname]
[mpexec: r2: /hpc/rte/LSF/cluster1/6.2/sparc-sol10-64/bin/TaskStarter -p burl-ct-v4-5:36519 -c /hpc/rte/LSF/cluster1/conf -a SOL64 /usr/bin/hostname]
[mpexec: r3: /hpc/rte/LSF/cluster1/6.2/sparc-sol10-64/bin/TaskStarter -p burl-ct-v4-5:36519 -c /hpc/rte/LSF/cluster1/conf -a SOL64 /usr/bin/hostname]
burl-ct-v4-5
burl-ct-v4-5
burl-ct-v4-5
burl-ct-v4-5
[Job lsf.1151 on burl-ct-v4-5: r0: exit status 0]
[Job lsf.1151 on burl-ct-v4-5: r1: exit status 0]
[Job lsf.1151 on burl-ct-v4-5: r2: exit status 0]
[Job lsf.1151 on burl-ct-v4-5: r3: exit status 0]
Job  mprun -x lsf -v hostname
 
TID   HOST_NAME   COMMAND_LINE            STATUS            TERMINATION_TIME
===== ========== ================  =======================  ===================
00000 burl-ct-v4 /usr/bin/hostnam  Done                     02/28/2006 13:16:35
00001 burl-ct-v4 /usr/bin/hostnam  Done                     02/28/2006 13:16:35
00002 burl-ct-v4 /usr/bin/hostnam  Done                     02/28/2006 13:16:35
00003 burl-ct-v4 /usr/bin/hostnam  Done                     02/28/2006 13:16:35
 burl-ct-v4-5 42 =>
 


To Run a Script Job in LSF

1. Write a script that calls mprun with the -x lsf option.

As described in -x resource_manager, the -x flag identifies the resource manager that will be used for the job launched by mprun. Here is an example of the script.


mprun -x lsf -v hostname

The line above launches the hostname program in verbose mode, using LSF as the resource manager.

2. Enter the LSF environment with the bsub command.

a. Use the -n option to reserve resources for the job.

b. Use the -q option to select the queue.

c. Invoke the script. Make sure you precede it with pam -g.


 burl-ct-v4-4 139 =>bsub -n 4 -q short pam -g myjob.csh
Job <1152> is submitted to queue <short>.
 burl-ct-v4-4 140 =>

The command sequence shown above enters the LSF environment, reserves four processors, selects the short queue, and invokes the script myjob.csh, which calls mprun.


Running Parallel Jobs in the SGE Environment

You can launch MPI programs from within SGE in two different ways:

To Run an Interactive Job in SGE

To Run a Script Job in SGE

Note - Sun N1 Grid Engine 6 (N1GE6) is the supported version of Sun Grid Engine in Sun HPC ClusterTools 6. For the purposes of this manual, N1GE6 is referred to as SGE.



Before you can use SGE with HPC ClusterTools, you need to set up the queue and parallel environment (PE) in SGE/N1GE. For information about how to set up the queue and the PE to work with HPC ClusterTools, refer to the Sun HPC ClusterTools 6 Software Administrator's Guide.
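Once the PE has been set up, you can confirm that SGE knows about it; the qconf -spl command lists the configured parallel environments. The cre entry shown here assumes the setup described in the Administrator's Guide:

% qconf -spl
cre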


To Run an Interactive Job in SGE

1. Enter the SGE environment with the qsh command.

a. Use the -pe option to reserve resources for the job.

b. Use the cre argument to the -pe option to specify CRE as the parallel processing environment.

Here is an example.


hpc-u2-6% qsh -pe cre 2 

The command sequence shown above enters the SGE environment in interactive mode, reserves two slots, and specifies CRE as the parallel processing environment. Here is the output from the command sequence:


waiting for interactive job to be scheduled ...
 
Your interactive job 24 has been successfully scheduled.

2. Enter the mprun command with the -x sge option.


hpc-u2-6% mprun -x sge -v hostname

The output shows the hostname program being run on ranks r0 and r1:


 [r0: aout: qrsh, args:
          qrsh -inherit -V hpc-u2-7 /opt/SUNWhpc/lib/mpexec
          -x sge -- hostname]
 
[r1: aout: qrsh, args:
          qrsh -inherit -V hpc-u2-6 /opt/SUNWhpc/lib/mpexec
          -x sge -- hostname]


To Run a Script Job in SGE

1. Write a script that calls mprun with the -x sge option.

As described in -x resource_manager, the -x flag identifies the resource manager that will be used for the job launched by mprun. Here is an example of a script.


set echo
mprun -x sge -v hostname

The mprun line launches the hostname program in verbose mode, using SGE as the resource manager. The set echo line makes the C shell echo each command as it runs.

2. Enter the SGE environment with the qsub command.

a. Use the -pe option to reserve resources for the job.

b. Use the cre argument to the -pe option to specify CRE as the parallel processing environment.

c. In the qsub syntax, use the script name instead of the program name.

Here is an example of how to use qsub with the script name.


hpc-u2-6% qsub -pe cre 2 myjob.csh

The command sequence shown above enters the SGE environment, reserves two slots, and invokes the script myjob.csh, which calls mprun. Here is the output:


 your job 33 ("myjob.csh") has been submitted

This is all you need to do to run the job.
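If you want to check on the job while it waits in the queue or runs, the standard SGE qstat command displays its state; the job number matches the one reported by qsub:

hpc-u2-6% qstat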

3. To display the output from the job, find the output file and display its contents.

a. Use the ls command to list the files into which the job's output was written.

This example uses the job number to identify the output files:


hpc-u2-6% ls *33
myjob.csh.e33 myjob.csh.o33 myjob.csh.pe33 myjob.csh.po33

The file that contains the job's errors is named myjob.csh.e33, and the file that contains the job's output is named myjob.csh.o33. The .pe33 and .po33 files hold the error and output streams of the parallel environment itself.

b. To view the job's output, display the contents of the job's output file.

Continuing with the example above:


hpc-u2-6% cat myjob.csh.o33
 
[r0: aout: qrsh, args:
qrsh -inherit -V hpc-u2-6
/opt/SUNWhpc/lib/mpexec -x sge -- hostname]

[r1: aout: qrsh, args:
qrsh -inherit -V hpc-u2-7
/opt/SUNWhpc/lib/mpexec -x sge -- hostname]