During the collection phase of Prism's TNF performance analysis, Prism creates as many trace collection data files as there are processes in your Sun MPI program. When your program has completed, Prism merges these files into a single final data file. You can view this merged file in Prism's TNF data browser, tnfview.
However, the scale of data collection can overwhelm disk storage resources. The following sections explain how this can happen and how to control the scale of data collection.
Prism creates one trace collection data file per process in your Sun MPI program. Sun HPC 3.0 ClusterTools supports Sun MPI programs with as many as 1024 processes on LSF, or as many as 256 processes on the Cluster Runtime Environment (CRE).
You can specify the size of the trace data collection files, in Kbytes, with the size argument of the tnffile command. Each trace data collection file is allocated at a fixed size; the size is not an upper limit that the file grows toward. For example, to increase the size from the default value of 128 Kbytes to two megabytes (2048 Kbytes), enter:
(prism all) tnffile myfile.tnf 2048
Trace data collection files operate as circular buffers: as a file fills with trace data records, the oldest records are overwritten. If this has happened, Prism issues a warning after data collection has completed and the data has been merged into the final trace file. For example:
Maximum file size reached - some events have been lost.
It is difficult to predict precisely how many records will fit in a given buffer size, because probe records vary in length; some probes report extra data. However, the average event generates a record roughly 16 bytes in length.
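The 16-byte average gives a quick way to judge whether a buffer size is in the right range. The following Python sketch is only a rough estimate, not part of Prism: the function name is hypothetical, and it assumes that one Kbyte is 1024 bytes and that every record has the average length.

```python
# Rough capacity estimate for one per-process trace buffer.
# AVG_RECORD_BYTES comes from the "roughly 16 bytes" figure in the text;
# real records vary in length, so treat the result as an approximation.
AVG_RECORD_BYTES = 16


def estimated_records(buffer_kbytes, avg_record_bytes=AVG_RECORD_BYTES):
    """Approximate number of records a buffer holds before it wraps.

    Assumes 1 Kbyte = 1024 bytes (an assumption, not stated in the text).
    """
    return (buffer_kbytes * 1024) // avg_record_bytes


if __name__ == "__main__":
    # Compare the default buffer size with the 2048-Kbyte example.
    for kbytes in (128, 2048):
        print(f"{kbytes} Kbytes: about {estimated_records(kbytes)} records")
```

Under these assumptions, the default 128-Kbyte buffer holds about 8192 records per process, and a 2048-Kbyte buffer holds about 131072.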
To reduce the amount of trace data collected, you can:
Change (lessen) the number of probes that you enable.
Change (shorten) the duration of the time during which collection is active.
The file size of the final, merged trace data file is approximately equal to the number of processes times the buffer size. However, the final trace data file will be smaller if the individual trace data buffers are not full.
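To see how quickly the merged file can grow, you can multiply the two quantities yourself. The Python sketch below does only that arithmetic; the function name is hypothetical, one Kbyte is assumed to be 1024 bytes, and the result is an upper bound that is reached only when every buffer is full.

```python
def merged_size_upper_bound_bytes(num_processes, buffer_kbytes):
    """Upper bound on the merged trace file size, in bytes.

    Implements "number of processes times the buffer size" from the text,
    assuming 1 Kbyte = 1024 bytes. The real file is smaller when the
    individual buffers are not full.
    """
    return num_processes * buffer_kbytes * 1024


if __name__ == "__main__":
    # Process limits taken from the text: 1024 on LSF, 256 on CRE.
    for procs, kbytes in ((1024, 2048), (256, 128)):
        size = merged_size_upper_bound_bytes(procs, kbytes)
        print(f"{procs} processes x {kbytes} Kbytes: "
              f"up to {size / (1024 * 1024):.0f} Mbytes")
```

For example, 1024 processes with 2048-Kbyte buffers can produce a merged file of up to 2048 Mbytes (about 2 Gbytes), while 256 processes with the default 128-Kbyte buffers produce at most 32 Mbytes.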
Loading the final, merged trace data file into tnfview takes time in proportion to the size of the data file.
Prism uses /usr/tmp for storing trace data files by default because that directory resides locally on each machine. For that reason, the processes that generate trace records are not slowed by writing their TNF probe records across a network connection.
You can use another directory for trace data collection files. To direct Prism to create trace data files in a directory of your choice, set either the PRISM_TNFDIR or the TMPDIR environment variable to that directory. For example,
% setenv PRISM_TNFDIR directory
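Before a large run, it can help to check that the chosen directory has room for the per-process trace files created on that machine. The Python sketch below is a hypothetical helper, not a Prism feature. It assumes that PRISM_TNFDIR is consulted before TMPDIR (the text does not state the order), that one Kbyte is 1024 bytes, and that you know how many processes run on the local machine.

```python
import os
import shutil

DEFAULT_TRACE_DIR = "/usr/tmp"  # default location named in the text


def trace_directory(env=os.environ):
    """Return the directory where trace files are expected to be created.

    ASSUMPTION: PRISM_TNFDIR takes precedence over TMPDIR; the text names
    both variables but does not say which one wins.
    """
    return env.get("PRISM_TNFDIR") or env.get("TMPDIR") or DEFAULT_TRACE_DIR


def has_room(directory, local_processes, buffer_kbytes):
    """True if the directory's file system can hold one fixed-size trace
    file per local process (1 Kbyte assumed to be 1024 bytes)."""
    needed = local_processes * buffer_kbytes * 1024
    return shutil.disk_usage(directory).free >= needed


if __name__ == "__main__":
    directory = trace_directory()
    if os.path.isdir(directory):
        # Example values: 16 local processes with 2048-Kbyte buffers.
        print(directory, "has room:", has_room(directory, 16, 2048))
    else:
        print(directory, "does not exist on this machine")
```

Run the check on each machine that hosts processes, because the trace files are written locally.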