Highly cyclical code, such as a program that alternates between broadcasts and gathers, is a good candidate for TNF performance analysis. For example, look for evidence of poor load balancing, such as barrier:compute cycles in which the compute phase of one rank is far shorter than that of the others, so that the rank spends more time in the barrier than the other ranks do.
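The load-balancing check described above can be sketched outside Prism once per-rank timings have been extracted from the trace data. The following is a minimal illustration, not a Prism feature: the function name, the threshold ratio, and the sample timings are all hypothetical, and it assumes you have already summed the time each rank spent in the barrier.

```python
from statistics import median

def barrier_heavy_ranks(barrier_time, ratio=2.0):
    """Return the ranks whose total barrier time exceeds `ratio` times the median.

    barrier_time: dict mapping rank -> total seconds spent waiting in the barrier.
    ratio: hypothetical threshold; tune it for your own program.
    """
    typical = median(barrier_time.values())
    if typical <= 0:
        return []
    return sorted(rank for rank, t in barrier_time.items() if t > ratio * typical)

# Hypothetical per-rank totals (seconds), for illustration only.
barrier_time = {0: 0.4, 1: 0.5, 2: 3.1, 3: 0.45}
flagged = barrier_heavy_ranks(barrier_time)
print(flagged)
```

A rank that is flagged here finishes its compute phase early and waits for the others, which suggests that it was given too little work.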
You can create intervals based on library routines to measure the timing of your own code, not just the timing of the library routines themselves. To do so, create an interval that pairs the *_End event preceding the code you want to measure with the *_Start event that follows that code (the reverse of the normal order).
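The reversed pairing can be illustrated with a short sketch that works on a list of events already exported from a single process. This is an assumption-laden example rather than Prism's own interval mechanism: the (timestamp, name) record layout, the function name, and the event names in the sample data are hypothetical.

```python
def user_code_intervals(events):
    """Pair each *_End event with the next *_Start event in the same process.

    events: list of (timestamp, name) tuples sorted by timestamp.
    Returns a list of (end_event, start_event, elapsed) tuples, where
    elapsed is the time spent outside the library routines.
    """
    intervals = []
    last_end = None
    for ts, name in events:
        if name.endswith("_End"):
            # Remember the most recent library exit.
            last_end = (ts, name)
        elif name.endswith("_Start") and last_end is not None:
            # The next library entry closes the user-code interval.
            intervals.append((last_end[1], name, ts - last_end[0]))
            last_end = None
    return intervals

# Hypothetical events from one process, for illustration only.
events = [
    (0.0, "MPI_Bcast_Start"),
    (1.0, "MPI_Bcast_End"),
    (4.5, "MPI_Gather_Start"),
    (5.0, "MPI_Gather_End"),
    (7.0, "MPI_Bcast_Start"),
]
gaps = user_code_intervals(events)
print(gaps)
```

Each resulting interval covers the time between leaving one library routine and entering the next, which is the time spent in your own code.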
You can use Prism's TNF performance analysis features with or without the -g compiler option. For further information about the effects of using the -g option, see "Compiling and Linking Your Program". For information on combining the -g option with optimizations, see "Combining Debug and Optimization Options".
Ragged edges can appear in your data. Because message-passing activity can vary among processes, the earliest time at which a trace file contains interesting data can vary from process to process.
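One possible way to deal with a ragged leading edge when comparing processes is to restrict the comparison to the window in which every process has data. The sketch below assumes the events have been exported into per-rank lists of (timestamp, name) tuples, each with at least one event; the function names and sample data are hypothetical and are not part of Prism.

```python
def common_start(traces):
    """Return the latest first-event timestamp across all processes.

    traces: dict mapping rank -> non-empty list of (timestamp, name) tuples.
    """
    return max(min(ts for ts, _ in events) for events in traces.values())

def trim_to_common_start(traces):
    """Drop events recorded before every process has started producing data."""
    start = common_start(traces)
    return {
        rank: [(ts, name) for ts, name in events if ts >= start]
        for rank, events in traces.items()
    }

# Hypothetical traces: rank 1 starts producing data later than rank 0.
traces = {
    0: [(0.0, "a"), (2.0, "b"), (3.0, "c")],
    1: [(1.5, "a"), (2.5, "b")],
}
trimmed = trim_to_common_start(traces)
print(common_start(traces), trimmed)
```

Trimming in this way discards data, so it is suitable only for comparisons across processes, not for analyzing a single process in isolation.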