Overview
The MPIOptions section of the hpc.conf file provides a set of options that control MPI communication behavior in ways that are likely to affect message-passing performance. It contains two templates with predefined option settings, which are shown in Example 7-4 and discussed below.
General-Purpose, Multiuser Template - The first template in the MPIOptions section is designed for general-purpose use at times when multiple message-passing jobs will be running concurrently.
Performance Template - The second template is designed to maximize the performance of message-passing jobs when only one job is allowed to run at a time.
Note - The first line of each template contains the phrase "Queue=xxxx" because the queue-based LSF workload management runtime environment uses the same hpc.conf file as the CRE. The phrase makes the template an LSF-specific entry that applies only to the named queue; for the CRE to recognize a template, the phrase must be removed.
The options in the general-purpose template are the same as the default settings for the Sun MPI library. In other words, you do not have to uncomment the general-purpose template for its option values to take effect. This template is provided in the MPIOptions section so you can see which options are most beneficial when operating in multiuser mode.
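If you want to use the performance template, do the following:
1. Delete the "Queue=performance" phrase from the Begin MPIOptions line.
2. Delete the comment character (#) from the beginning of each line of the performance template, including the Begin MPIOptions and End MPIOptions lines.
The resulting template should appear as follows: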
Begin MPIOptions
coscheduling off
spin on
End MPIOptions
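The change from the commented performance template in Example 7-4 to the listing above can be pictured as a small text transformation. The following is a minimal sketch, not a supported tool: the function name `activate_template` is hypothetical, and it assumes each template line starts with a single `#` and that the queue qualifier appears only on the Begin line.

```python
import re

# Commented performance template, copied from Example 7-4.
COMMENTED_TEMPLATE = """\
# Begin MPIOptions Queue=performance
# coscheduling off
# spin on
# End MPIOptions
"""


def activate_template(text: str) -> str:
    """Strip the leading '#' from each line and drop the Queue=... phrase.

    Hypothetical helper for illustration only.
    """
    out = []
    for line in text.splitlines():
        # Remove one leading comment character and the space that follows it.
        line = re.sub(r"^\s*#\s?", "", line)
        # Remove the LSF-only queue qualifier from the Begin line.
        if line.lower().startswith("begin mpioptions"):
            line = re.sub(r"\s+queue=\S+", "", line, flags=re.IGNORECASE)
        out.append(line)
    return "\n".join(out)


print(activate_template(COMMENTED_TEMPLATE))
```

In practice the same edit is usually made by hand in hpc.conf; the sketch only makes explicit which characters are removed.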
Table 7-1 provides brief descriptions of the MPI runtime options that can be set in hpc.conf. Each description identifies the default value and describes the effect of each legal value.
Some MPI options not only control a parameter directly but can also be set to a value that passes control of the parameter to an environment variable. Where an MPI option has an associated environment variable, Table 7-1 names the environment variable.
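As an illustration of how an option value and its environment variable interact, the sketch below models the coscheduling and pbind rows of Table 7-1. It is a model of the documented behavior, not the Sun MPI library's code: the function name `resolve_tristate` is hypothetical, and treating any environment value other than "1" as disabled is an assumption, since the table describes only the values 0, 1, and unset.

```python
def resolve_tristate(option_value, env_name, env):
    """Return True if the feature is enabled, following Table 7-1.

    option_value: the hpc.conf setting ("avail", "on", or "off").
    env_name:     the associated variable, e.g. "MPI_COSCHED" or "MPI_PROCBIND".
    env:          a mapping of environment variables (a plain dict here).
    """
    if option_value == "on":
        return True   # overrides <env_name>=0
    if option_value == "off":
        return False  # overrides <env_name>=1
    if option_value == "avail":
        # Control passes to the environment variable; unset or "0" means disabled.
        # Assumption: any value other than "1" is also treated as disabled.
        return env.get(env_name) == "1"
    raise ValueError(f"unknown option value: {option_value!r}")


# coscheduling avail, MPI_COSCHED unset: spind is not used.
print(resolve_tristate("avail", "MPI_COSCHED", {}))
# coscheduling avail, MPI_COSCHED=1: spind is used.
print(resolve_tristate("avail", "MPI_COSCHED", {"MPI_COSCHED": "1"}))
# pbind off overrides MPI_PROCBIND=1: no processes are bound.
print(resolve_tristate("off", "MPI_PROCBIND", {"MPI_PROCBIND": "1"}))
```

The point of the three-valued setting is that "avail" leaves the decision to each user, while "on" and "off" let the administrator fix it for every job.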
Example 7-4 MPIOptions Section
# Following is an example of the options that affect the runtime
# environment of the MPI library. The listings below are identical
# to the default settings of the library. The "queue=hpc" phrase
# makes it an LSF-specific entry, and only for the queue named hpc.
# These options are a good choice for a multiuser queue. To be
# recognized by CRE, the "queue=hpc" phrase needs to be removed.
#
# Begin MPIOptions queue=hpc
# coscheduling avail
# pbind avail
# spindtimeout 1000
# progressadjust on
# spin off
#
# shm_numpostbox 16
# shm_shortmsgsize 256
# rsm_numpostbox 15
# rsm_shortmsgsize 401
# rsm_maxstripe 2
# End MPIOptions
# The listing below is a good choice when trying to get maximum
# performance out of MPI jobs that are running in a queue that
# allows only one job to run at a time.
#
# Begin MPIOptions Queue=performance
# coscheduling off
# spin on
# End MPIOptions
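To make the layout of the section concrete, here is a rough sketch of how a reader program might pull the active (uncommented) MPIOptions blocks out of hpc.conf text. Everything about it is an assumption for illustration: the function name `parse_mpi_options`, the returned dictionary shape, and the case-insensitive matching of "Begin", "End", and "queue=" are not taken from the CRE or LSF software, whose real parsing rules may differ.

```python
def parse_mpi_options(text):
    """Return {queue_name_or_None: {option: value}} for uncommented sections.

    Hypothetical reader for illustration; lines starting with '#' are skipped.
    """
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        words = line.split()
        head = [w.lower() for w in words[:2]]
        if head == ["begin", "mpioptions"]:
            queue = None
            for word in words[2:]:
                key, _, value = word.partition("=")
                if key.lower() == "queue":
                    queue = value  # LSF-specific entry for this queue
            current = sections.setdefault(queue, {})
        elif head == ["end", "mpioptions"]:
            current = None
        elif current is not None and len(words) == 2:
            # An "option value" pair inside a section.
            current[words[0]] = words[1]
    return sections


SAMPLE = """\
# Begin MPIOptions queue=hpc
# coscheduling avail
# End MPIOptions

Begin MPIOptions
coscheduling off
spin on
End MPIOptions
"""

print(parse_mpi_options(SAMPLE))
```

With the sample above, the commented general-purpose template is ignored and only the uncommented performance settings are returned, keyed by `None` because the Begin line carries no queue qualifier.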
Table 7-1 MPI Runtime Options

| Option | Default Value | Other Values | Description |
|---|---|---|---|
| coscheduling | avail | | Allows spind use to be controlled by the environment variable MPI_COSCHED. If MPI_COSCHED=0 or is not set, spind is not used. If MPI_COSCHED=1, spind must be used. |
| | | on | Enables coscheduling; spind is used. This value overrides MPI_COSCHED=0. |
| | | off | Disables coscheduling; spind is not to be used. This value overrides MPI_COSCHED=1. |
| pbind | avail | | Allows processor binding state to be controlled by the environment variable MPI_PROCBIND. If MPI_PROCBIND=0 or is not set, no processes will be bound to a processor. This is the default. If MPI_PROCBIND=1, all processes on a node will be bound to a processor. |
| | | on | All processes will be bound to processors. This value overrides MPI_PROCBIND=0. |
| | | off | No processes on a node are bound to a processor. This value overrides MPI_PROCBIND=1. |
| spindtimeout | 1000 | | When polling for messages, a process waits 1000 milliseconds for spind to return. This equals the value to which the environment variable MPI_SPINDTIMEOUT is set. |
| | | integer | To change the default timeout, enter an integer value specifying the number of milliseconds the timeout should be. |
| progressadjust | on | | Allows user to set the environment variable MPI_SPIN. |
| | | off | Disables user's ability to set the environment variable MPI_SPIN. |
| shm_numpostbox | 16 | | Sets to 16 the number of postbox entries that are dedicated to a connection endpoint. This equals the value to which the environment variable MPI_SHM_NUMPOSTBOX is set. |
| | | integer | To change the number of dedicated postbox entries, enter an integer value specifying the desired number. |
| shm_shortmsgsize | 256 | | Sets to 256 the maximum number of bytes a short message can contain. This equals the default value to which the environment variable MPI_SHM_SHORTMSGSIZE is set. |
| | | integer | To change the maximum-size definition of a short message, enter an integer specifying the maximum number of bytes it can contain. |
| rsm_numpostbox | 15 | | Sets to 15 the number of postbox entries that are dedicated to a connection endpoint. This equals the value to which the environment variable MPI_RSM_NUMPOSTBOX is set. |
| | | integer | To change the number of dedicated postbox entries, enter an integer value specifying the desired number. |
| rsm_shortmsgsize | 401 | | Sets to 401 the maximum number of bytes a short message can contain. This equals the value to which the environment variable MPI_RSM_SHORTMSGSIZE is set. |
| | | integer | To change the maximum-size definition of a short message, enter an integer specifying the maximum number of bytes it can contain. |
| rsm_maxstripe | 2 | | Sets to 2 the maximum number of stripes that can be used. This equals the value to which the environment variable MPI_RSM_MAXSTRIPE is set. |
| | | integer | To change the maximum number of stripes that can be used, enter an integer specifying the desired limit. This value cannot be greater than 2. |
| spin | off | | Sets the MPI library to avoid spinning while waiting for status. This equals the value to which the environment variable MPI_SPIN is set. |
| | | on | Sets the MPI library to spin. |
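The legal values in Table 7-1 lend themselves to a simple pre-check before a modified hpc.conf is put into service. The sketch below is one possible way to do that and is not part of the Sun HPC software; the names `KEYWORD_OPTIONS`, `INTEGER_OPTIONS`, and `check_option` are hypothetical. It enforces only what the table states: the keyword sets, that numeric options are integers, and that rsm_maxstripe cannot exceed 2. Lower bounds are not documented in the table, so none are checked.

```python
TRISTATE = {"avail", "on", "off"}
ON_OFF = {"on", "off"}

# Options that take a keyword, with their legal values from Table 7-1.
KEYWORD_OPTIONS = {
    "coscheduling": TRISTATE,
    "pbind": TRISTATE,
    "progressadjust": ON_OFF,
    "spin": ON_OFF,
}

# Options that take an integer; the mapped value is the documented
# upper limit, or None where Table 7-1 states no limit.
INTEGER_OPTIONS = {
    "spindtimeout": None,
    "shm_numpostbox": None,
    "shm_shortmsgsize": None,
    "rsm_numpostbox": None,
    "rsm_shortmsgsize": None,
    "rsm_maxstripe": 2,
}


def check_option(name, value):
    """Return None if the setting looks legal, else a short error message."""
    if name in KEYWORD_OPTIONS:
        legal = KEYWORD_OPTIONS[name]
        if value not in legal:
            return f"{name}: expected one of {sorted(legal)}, got {value!r}"
        return None
    if name in INTEGER_OPTIONS:
        try:
            number = int(value)
        except ValueError:
            return f"{name}: expected an integer, got {value!r}"
        upper = INTEGER_OPTIONS[name]
        if upper is not None and number > upper:
            return f"{name}: {number} exceeds the maximum of {upper}"
        return None
    return f"{name}: not an option listed in Table 7-1"


for name, value in [("spin", "on"), ("rsm_maxstripe", "3"), ("pbind", "maybe")]:
    print(name, value, "->", check_option(name, value) or "ok")
```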