If you store the experiments on a common file system and specify an experiment name in the standard format, experiment.n.er, each experiment is given a unique name when the value of n is incremented for each experiment. Experiments are numbered according to the order in which MPI processes obtain the lock on the experiment directory, and cannot be guaranteed to correspond to the MPI rank of the process. If you attach dbx to MPI processes in a running MPI job, experiment numbering is determined by the order of attachment.
If you store each experiment on its own local file system and specify an explicit experiment name, each experiment might receive that same name. For example, suppose you ran an MPI job across a cluster with four single-processor nodes labelled node0, node1, node2 and node3. Each node has a local disk called /scratch, and you store the experiments in directory username on this disk. The experiments created by the MPI job have the following full path names.
    node0:/scratch/username/test.1.er
    node1:/scratch/username/test.1.er
    node2:/scratch/username/test.1.er
    node3:/scratch/username/test.1.er
The full name including the node name is unique, but in each experiment directory there is an experiment named test.1.er. If you move the experiments to a common location after the MPI job is completed, you must make sure that the names remain unique. For example, to move these experiments to your home directory, which is assumed to be accessible from all nodes, and rename the experiments, type the following commands.
    rsh node0 'er_mv /scratch/username/test.1.er test.0.er'
    rsh node1 'er_mv /scratch/username/test.1.er test.1.er'
    rsh node2 'er_mv /scratch/username/test.1.er test.2.er'
    rsh node3 'er_mv /scratch/username/test.1.er test.3.er'
For large MPI jobs, you might want to move the experiments to a common location using a script. Do not use the UNIX® commands cp or mv; use er_cp or er_mv as shown in the example above, and described in Manipulating Experiments.
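A script for this could look something like the following minimal sketch. It generalizes the four rsh commands shown earlier to a list of nodes. The node list, the source path, and the DRY_RUN switch are assumptions made for illustration, not part of the tools. By default the script only prints the commands, so you can inspect them before running anything.

```shell
#!/bin/sh
# Sketch only: NODES, SRC, and DRY_RUN are illustrative assumptions.
NODES="node0 node1 node2 node3"      # hypothetical node names
SRC=/scratch/username/test.1.er      # same local path on every node
DRY_RUN=${DRY_RUN:-1}                # 1 = print the commands, 0 = run them

# Emit one er_mv command per node, numbering the target names from 0
# so that each experiment keeps a unique name in the common directory.
move_experiments() {
    i=0
    for node in $NODES; do
        cmd="er_mv $SRC test.$i.er"
        if [ "$DRY_RUN" = 1 ]; then
            echo "rsh $node '$cmd'"
        else
            rsh "$node" "$cmd"
        fi
        i=$((i + 1))
    done
}

move_experiments
```

With DRY_RUN left at its default, the script prints the same four commands as the manual example. Setting DRY_RUN=0 runs them, which assumes that rsh access to each node is already configured and that the relative target name resolves to your home directory on the remote side.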
If you do not know which local file systems are available to you, use the df -lk command or ask your system administrator. Always make sure that the experiments are stored in a directory that already exists, that is uniquely defined and that is not in use for any other experiment. Also make sure that the file system has enough space for the experiments. See Estimating Storage Requirements for information on how to estimate the space needed.
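The directory and space checks described above can be scripted before a job starts. The following is a minimal sketch, not a supplied utility: the function name check_expt_dir and its arguments are hypothetical, and it assumes a df that accepts the POSIX -P option (on some systems this is a separate XPG4 version of df).

```shell
#!/bin/sh
# check_expt_dir DIR REQUIRED_KB  (hypothetical helper, for illustration)
# Succeeds only if DIR already exists and its file system reports at
# least REQUIRED_KB kilobytes available.
check_expt_dir() {
    dir=$1
    required_kb=$2

    if [ ! -d "$dir" ]; then
        echo "directory does not exist: $dir" >&2
        return 1
    fi

    # df -P prints a header line followed by one line per file system;
    # the fourth field of that line is the available space in KB.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')

    # Refuse to continue if the df output could not be parsed.
    case $avail_kb in
        ''|*[!0-9]*) echo "cannot read free space for $dir" >&2; return 1 ;;
    esac

    if [ "$avail_kb" -lt "$required_kb" ]; then
        echo "only $avail_kb KB free in $dir, need $required_kb KB" >&2
        return 1
    fi
    return 0
}
```

The function does not check whether the directory is already in use for another experiment; that still has to be confirmed separately.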
If you copy or move experiments between computers or nodes, you cannot view the annotated source code or source lines in the annotated disassembly code unless you have access to the load objects and source files that were used to run the experiment, or to a copy with the same path and timestamp.