Because multiprocessing environments can be complex, be aware of several issues that arise when you collect performance data from MPI programs and store the resulting experiments. These issues concern the efficiency of data collection and storage, and the naming of experiments. See Where the Data Is Stored for information on naming experiments, including MPI experiments.
Each MPI process that collects performance data creates its own subexperiment. While an MPI process is creating its subexperiment, it locks the experiment directory, and all other MPI processes must wait until the lock is released before they can use the directory. For this reason, store your experiments on a file system that is accessible to all MPI processes.
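For example, here is a sketch of collecting into a directory on a shared file system. The -d option (output directory) and -M option (MPI version) are collect options; the MPI version OMPT, the process count, the directory path, and the target name are illustrative and depend on your installation:

    $ collect -M OMPT -d /shared/experiments mpirun -np 4 -- ./a.out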
If you do not specify an experiment name, the default experiment name is used. Within the experiment, the Collector will create one subexperiment for each MPI rank. The Collector uses the MPI rank to construct a subexperiment name with the form M_rm.er, where m is the MPI rank.
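For example, assuming the default experiment name is test.1.er and the job runs with four MPI ranks, the experiment would contain one subexperiment per rank:

    test.1.er
        M_r0.er
        M_r1.er
        M_r2.er
        M_r3.er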
If you plan to move the experiment to a different location after it is complete, specify the -A copy option with the collect command. To copy or move the experiment, do not use the UNIX cp or mv command; instead, use the er_cp or er_mv command as described in Chapter 8, Manipulating Experiments.
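For example, a sketch that records with copying enabled and then relocates the finished experiment; the experiment name and destination path are illustrative:

    $ collect -M OMPT -A copy mpirun -np 4 -- ./a.out
    $ er_mv test.1.er /shared/results/test.1.er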
MPI tracing creates temporary files in /tmp/a.*.z on each node. These files are removed during the MPI_Finalize() function call. Make sure that the file systems have enough space for both the temporary files and the experiments. Before collecting data on a long-running MPI application, do a short-duration trial run to verify file sizes. See also Estimating Storage Requirements for information on how to estimate the space needed.
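For example, one illustrative way to check sizes after a trial run; the experiment name is assumed, node1 is a placeholder host name, and the temporary files exist on each node only while the run is still active:

    $ collect -M OMPT mpirun -np 4 -- ./a.out
    $ du -sh test.1.er
    $ ssh node1 ls -l '/tmp/a.*.z'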
MPI profiling is based on the open source VampirTrace 5.5.3 release. It recognizes several supported VampirTrace environment variables and a new one, VT_STACKS, which controls whether call stacks are recorded in the data. For further information on the meaning of these variables, see the VampirTrace 5.5.3 documentation.
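For example, a sketch of enabling call-stack recording for a run. VT_STACKS is named in this section, but the value shown (1 to record stacks, 0 to suppress them) is an assumption to verify against the VampirTrace 5.5.3 documentation, and whether the setting propagates to remote ranks depends on your MPI launcher:

    $ VT_STACKS=1 collect -M OMPT mpirun -np 4 -- ./a.out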
The default value of the environment variable VT_BUFFER_SIZE limits the internal buffer of the MPI API trace collector to 64 Mbytes. After the limit has been reached for a particular MPI process, the buffer is flushed to disk, provided the VT_MAX_FLUSHES limit has not been reached. By default, VT_MAX_FLUSHES is set to 0, which places no limit on the number of flushes: the MPI API trace collector flushes the buffer to disk whenever the buffer is full. If you set VT_MAX_FLUSHES to a positive number, you limit the number of flushes allowed. If the buffer fills up and can no longer be flushed, events are no longer written to the trace file for that process. The result can be an incomplete experiment and, in some cases, an experiment that cannot be read.
To change the size of the buffer, use the environment variable VT_BUFFER_SIZE. The optimal value depends on the application being traced. A small value increases the memory available to the application but triggers frequent buffer flushes by the MPI API trace collector, and these flushes can significantly change the application's behavior. Conversely, a large value such as 2 Gbytes minimizes buffer flushes but reduces the memory available to the application. If there is not enough memory to hold both the buffer and the application data, parts of the application might be swapped to disk, which also significantly changes the application's behavior.
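For example, a sketch that enlarges the trace buffer and caps the number of flushes; the values are illustrative, and the size suffix (M for Mbytes) is assumed to follow the VampirTrace convention:

    $ VT_BUFFER_SIZE=256M VT_MAX_FLUSHES=4 collect -M OMPT mpirun -np 4 -- ./a.out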
Normally, MPI trace output data is post-processed when the mpirun target exits. A processed data file is written to the experiment, and information about the post-processing time is written into the experiment header. MPI post-processing is not done if MPI tracing is explicitly disabled with -m off. If post-processing fails, an error is reported, and no MPI Tabs or MPI tracing metrics are available.
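For example, a sketch of recording an experiment with MPI tracing, and therefore MPI post-processing, explicitly disabled; other data collection, such as the default clock profiling, is unaffected:

    $ collect -m off mpirun -np 4 -- ./a.out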
If the mpirun target does not actually invoke MPI, an experiment is still recorded but no MPI trace data is produced. The experiment reports an MPI post-processing error, and no MPI Tabs or MPI tracing metrics will be available.