Sun MPI 4.0 User's Guide: With CRE

Specifying the Behavior of I/O Streams

Introducing `mprun` I/O

By default, all standard output (stdout) and standard error (stderr) from an mprun-launched job will be merged and sent to mprun's standard output. This is ordinarily the user's terminal. Likewise, mprun's standard input (stdin) is sent to the standard input of all the processes.

You can redirect mprun's standard input, output, and error using the standard shell syntax. For example,

% mprun -np 4 echo hello > hellos

You can also change what happens to the standard input, output, and error of each process in the job, which shell redirection alone does not do. For example,

% mprun echo hello > message

sends hello across the network from the echo process to the mprun process, which writes it to a file called message.

The mprun command's own options allow you to control I/O in other ways. Rather than making remote processes communicate with mprun when it is not necessary, you can make each process read from or write to a file on the node on which it is running. For example, you can make each process send its standard output or standard error to a file on its own node. In the following example, each process writes hello to a local file called message on its own node:

% mprun -I "1w=message" echo hello

mprun also provides options that you can use to control the standard I/O streams. For example, you can:

Use the -D option to make the standard error from each process go to the standard error of mprun, instead of its standard output. For example,

% mprun -D a.out

sends standard output from a.out to the standard output of mprun and sends the standard error of a.out to the standard error of mprun.

Use the -B option to merge the standard output and standard error streams from each process and direct them to files named out.jid.rank, where jid is the job ID of the job and rank is the rank of this process within the job. The files are located in the job's working directory. There is no standard input stream.

Use the -N option to shut off all standard I/O to all the processes. That is, with this option, you specify that there are to be no stdin, stdout, or stderr connections. Use the -N option when standard I/O is not necessary; it avoids the overhead of establishing standard I/O connections for each remote process and then closing those connections as each process ends.

Use the -n option to cause stdin to be read from /dev/null. This can be useful when running mprun in the background, either directly or through a script. Without -n, mprun will block in this situation, even if no reads are posted by the remote job. When -n is specified, the user process encounters an EOF if it attempts to read from stdin. This is comparable to the behavior of the -n option to rsh.
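The EOF behavior described for -n can be observed in an ordinary shell, without mprun. The following is only a sketch: hits_eof is a hypothetical helper name, and the example shows what a process sees when its standard input is /dev/null, not how mprun sets up that connection.

```shell
# Hypothetical helper: succeed if the first read from file "$1"
# reaches EOF before a complete line is available.
hits_eof() {
    if read -r _line <"$1"; then
        return 1    # a line was read, so no immediate EOF
    fi
    return 0        # read failed at once: EOF
}

if hits_eof /dev/null; then
    echo "read from /dev/null: immediate EOF"
fi

# cat also returns at once and copies zero bytes.
cat </dev/null | wc -c
```

A process started this way never waits for keyboard input, which is why the option suits background jobs.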

Note -
The set of mprun options that control stdio handling cannot be combined. These options override one another. If more than one is given on a command line, the last one overrides all of the rest. The relevant options are: -D, -N, -B, -n, -i, -o, and -I.

Creating a Custom Configuration

Use the -I option to specify a custom configuration for the I/O streams associated with a job, including standard input, output, and error. The -I option takes as an argument a comma-separated series of file descriptor strings. These strings specify what is to happen with each of the job's I/O streams.

In Solaris, each process has a numbered set of file descriptors associated with it. The standard I/O streams are assigned the first three file descriptors:

0 - standard input (stdin)
1 - standard output (stdout)
2 - standard error (stderr)
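The same numbering is visible from an ordinary shell. The following sketch does not use mprun; show_fds is a hypothetical stand-in for one process of a job, and file descriptor 5 is chosen only to show that descriptors beyond the first three exist and must be connected by whoever starts the process.

```shell
# Hypothetical stand-in for one process of a job.
show_fds() {
    read -r line                    # fd 0: standard input
    echo "stdout got: $line"        # fd 1: standard output
    echo "stderr got: $line" >&2    # fd 2: standard error
    echo "fd 5 got: $line" >&5      # fd 5: extra descriptor opened by the caller
}

dir=$(mktemp -d)

# Connect each descriptor to its own file, then feed "hello" to fd 0.
echo hello | show_fds 1>"$dir/out" 2>"$dir/err" 5>"$dir/five"

cat "$dir/out" "$dir/err" "$dir/five"
```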

The argument list to -I can include a string for each file descriptor associated with a job; if any file descriptor is omitted, its stream won't be connected to any device.

Restriction: If you include strings to redirect both standard output and standard error, you must also redirect standard input. If the job has no standard input, you can redirect file descriptor 0 to /dev/null.

The file descriptor strings in the -I argument list can be in any order. Quotation marks around the strings are optional.

File Descriptor Attributes

The file descriptor string assigns one or more of the following attributes to a file descriptor:

r - File descriptor is to be read from.
w - File descriptor is to be written to.
p - File descriptor is to be attached to a pseudo-terminal (pty).

You must specify either r or w for each file descriptor--that is, whether the file descriptor is to be written to or read from.

Thus, the string

5w

means that the stream associated with file descriptor 5 is to be written. And

0rp

means that the standard input is to be read from the pseudo-terminal.

If you use the p (pty) attribute, you must have one rp and one wp in the complete series of file descriptor strings. In other words, you must specify both reading from and writing to the pty. No other attributes can be associated with rp and wp.

The following attributes are output-related and thus can only be used in conjunction with w:

l - Line-buffered output.
t - Tag the line-buffered output with process rank information.
a - Stream is to be appended to the specified file.

Note -
NFS does not support append operations.

Use the l attribute in combination with the w attribute to line-buffer the output of multiple processes. This takes care of the situation in which output from one process arrives in the middle of output from another process. For example,

% mprun -np 2 echo "Hello"
HelHello
lo

With the l attribute, you ensure that processes don't intrude on each other's output. The following example shows how using the l attribute could prevent the problem illustrated in the previous example:

% mprun -np 2 -I "0r, 1wl" echo "Hello"
Hello
Hello

Use the t attribute in place of l to force line-buffering and, additionally, to prefix each line with the rank of the process producing the output. For example,

% mprun -np 2 -I "0r, 1wt" echo "Hello"
r0:Hello
r1:Hello
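The shape of the tagged stream can be imitated with a small shell filter. This is an illustration only: tag_lines is a hypothetical name, the r<rank>: prefix is copied from the sample output above, and nothing here reflects how the CRE implements tagging internally.

```shell
# Hypothetical filter: prefix every input line with r<rank>:
# "$1" is the rank to use in the tag.
tag_lines() {
    sed "s/^/r$1:/"
}

# Imitate two ranks that each print one complete line.
for rank in 0 1; do
    echo "Hello" | tag_lines "$rank"
done
```

Because the tag is applied to whole lines, it is meaningful only when output is line-buffered, which is why t implies the behavior of l.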

The b attribute is input-related and thus can be used only in combination with r. In multiprocess jobs, the b attribute specifies that input is to go only to the first process, rather than to all processes, which is the default behavior.

The m attribute pertains to reading from a pseudo-terminal and thus can be used only with rp. The m attribute in combination with rp causes keystrokes to be echoed multiple times when multiple processes are running. The default is to echo each keystroke only once.
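The attribute rules in this section can be summarized as a small checker. The sketch below is not part of mprun: fd_attrs_ok is a hypothetical helper that accepts a file descriptor number followed by attribute letters, and it assumes that r and w are mutually exclusive, which the text implies but does not state outright. It does not check the pairing rule for rp and wp, which spans the whole argument list.

```shell
# Hypothetical helper: check one "<fd><attributes>" string.
fd_attrs_ok() {
    attrs=$(printf '%s' "$1" | sed 's/^[0-9][0-9]*//')   # strip the fd number
    [ "$attrs" != "$1" ] || return 1          # no leading fd number
    case $attrs in
        *[!rwpltabm]*) return 1 ;;            # unknown attribute letter
    esac
    case $attrs in
        *r*w*|*w*r*) return 1 ;;              # assumption: r and w exclude each other
        *r*) direction=r ;;
        *w*) direction=w ;;
        *)   return 1 ;;                      # neither r nor w given
    esac
    case $attrs in *[lta]*) [ "$direction" = w ] || return 1 ;; esac   # output-only
    case $attrs in *b*)     [ "$direction" = r ] || return 1 ;; esac   # input-only
    case $attrs in
        *m*) case $attrs in *r*p*|*p*r*) ;; *) return 1 ;; esac ;;     # m needs rp
    esac
    return 0
}

for s in 5w 0rp 1wt 1rl 0wb 0rm; do
    if fd_attrs_ok "$s"; then echo "$s ok"; else echo "$s rejected"; fi
done
```

The first three strings are accepted; 1rl, 0wb, and 0rm are rejected because l requires w, b requires r, and m requires rp.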

File Descriptor String Syntax

You can direct one file descriptor's output to the same location as that specified by another file descriptor by using the syntax

fdattr=@other_fd

For example,

2w=@1

means that the standard error is to be sent wherever the standard output is going. You cannot do this for a file descriptor string that uses the p attribute.

If the behavior of the second file descriptor in this syntax is changed later in the -I argument list, the change does not affect the earlier reference to the file descriptor. That is, the -I argument list is parsed from left to right.
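Ordinary shell redirection follows a comparable left-to-right rule, which makes the effect easy to observe without mprun. The sketch below is an analogy only: emit is a hypothetical stand-in for a job, and the shell operator 2>&1 plays the role that 2w=@1 plays in an -I argument list.

```shell
# Hypothetical stand-in for a job: one line on each output stream.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

dir=$(mktemp -d)

# fd 1 is pointed at the file first, so 2>&1 follows it there.
emit >"$dir/both" 2>&1

# 2>&1 is resolved first, while fd 1 is still the pipe; moving fd 1
# to a file afterwards does not drag fd 2 along.
emit 2>&1 >"$dir/stdout_only" | cat >"$dir/stderr_only"

cat "$dir/both"          # both lines
cat "$dir/stdout_only"   # only "to stdout"
cat "$dir/stderr_only"   # only "to stderr"
```

The practical consequence is the same in both settings: place the reference to another file descriptor after the string that gives that descriptor its final destination.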

You can tie a file descriptor's output to a file by using the syntax

fdattr=filename

For example,

10w=output

says that the stream associated with file descriptor 10 is to be written to the file output. Once again, however, you cannot use this feature for a file descriptor defined with the p attribute.

In the following example, the standard input is read from the pty, the standard output is written to the pty, and the standard error is sent to the file named errors:

% mprun -I "0rp,1wp,2w=errors" a.out

If you use the w attribute without specifying a file, the file descriptor's output is written to the corresponding output stream of the parent process; the parent process is typically a shell, so the output is typically written to the user's terminal.

For multiprocess jobs, each process creates its own file; the file is opened on the node on which the process runs.

Note -
If output is redirected such that multiple processes open the same file over NFS, the processes will overwrite each other's output.

In specifying the individual file names for processes, you can use the following symbols:

&J - The job ID of the job
&R - The rank of the process within the job

The symbols will be replaced by the actual values. For example, assuming the job ID is 15, this file descriptor string

1w=myfile.&J.&R

redirects standard output from a multiprocess job to a series of files named myfile.15.0, myfile.15.1, myfile.15.2, and so on, one file for each rank of the job.
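The substitution can be previewed with a short shell function. This is a sketch for checking which file names to expect, not the CRE's own expansion code; expand_name is a hypothetical helper, and the job ID 15 is the value assumed in the example above.

```shell
# Hypothetical helper: expand &J and &R in a file name template.
# "$1" = template, "$2" = job ID, "$3" = rank.
expand_name() {
    printf '%s\n' "$1" | sed -e "s/&J/$2/g" -e "s/&R/$3/g"
}

# File names for the first three ranks of job 15.
for rank in 0 1 2; do
    expand_name 'myfile.&J.&R' 15 "$rank"
done
```

Single quotes around the template keep the shell from treating & as a background operator, which is also a reason to quote the -I argument on the mprun command line.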

In the following example, there is no standard input (it comes from /dev/null), and the standard output and standard error are written to the files out.job.rank:

% mprun -I "0r=/dev/null,1w=out.&J.&R,2w=@1" a.out

This is the behavior of the -B option. See "Introducing mprun I/O". Note the inclusion in this example of a file descriptor string for standard input even though the job has none. This is required because both standard output and standard error are redirected.

`mprun` Options versus Shell Syntax

The default I/O behavior of mprun (merged standard error and standard output) is equivalent to

% mprun -I "0rp,1wp,2w=@1" a.out

The -D option provides separate standard output and standard error streams; it is equivalent to:

% mprun -I "0rp,1wp,2w" a.out

You can use the -o option to force each line of output to be prepended with the rank of the process writing it. This is equivalent to

% mprun -I "0rp,1wt,2w=@1" a.out

If you redirect output to a shared file, you must use standard shell redirection rather than the equivalent -I formulation (-I "1wt=outfile"). The same restriction also applies to the line-buffer formulation (-I "1wl=outfile").

For example, the following command line concatenates the outputs of the individual processes of a job and writes them to outfile.dat:

% mprun -np 4 myprogram > outfile.dat

The following command line concatenates the outputs of the individual processes and appends them to the previous content of the output file:

% mprun -np 4 myprogram >> outfile.dat

The following table describes three mprun command-line options that provide the same control over standard I/O as some -I constructs, but are much simpler to express. Their -I equivalents are also shown.

Table 3-5 mprun Shortcut Summary


Command	Description
mprun -i	Standard input to `mprun` is sent only to rank 0, not to the other ranks. Equivalent to mprun -I "0rpb,1wp,2w=@1" a.out
mprun -B	Standard output and standard error are written to the file `out.job.rank`. Equivalent to mprun -I "0r=/dev/null,1w=out.&J.&R,2w=@1" a.out
mprun -o	Use line buffering on standard output, prefixing each line with the rank of the process that wrote it. Equivalent to mprun -I "0rp,1wt,2w=@1" a.out

Note -
Specifying -o (forcing processes to prepend rank on output lines), or the equivalent -I syntax (such as -I1wt) will not work if redirection is also specified with -I (such as with -I1w=outfile). Use the standard shell redirection operator instead.

These shortcuts are not exact substitutions. The CRE uses ptys correctly, whether the -I option is present or absent. Also, the CRE merges standard error with standard output when it is appropriate. If either stderr or stdout is redirected (but not both), ptys are not used and stderr and stdout are separated. If both stderr and stdout are redirected, ptys are still not used, but stderr and stdout are combined.

Caution Regarding the Use of the `-i` Option

Use the -i option to mprun with caution, because it provides only one stdin connection (to rank 0). If that connection is closed, keyboard signals are no longer forwarded to the remote processes. To signal the job, you must go to another window and issue the mpkill command. For example, if you issue the command mprun -np 2 -i cat and then type Ctrl-d (which causes cat to close its stdin and exit), rank 0 will exit. However, rank 1 is still running and can no longer be signaled from the keyboard.