Simple Performance Optimization Tool (SPOT) 2.0 User's Guide

Time Based Profile of the Application

The index page of the SPOT report shows a summary of which routines consumed the most runtime. Following the More hyperlink below the summary leads to a page which allows exploration of the application in more depth. Figure 3–11 shows this page for the test application.

Figure 3–11 Profile Providing Data and Links to Specific Routines

Profile of Specific Routines

The hyperlinks at the top of the page allow the data to be reordered according to the various columns. The columns are as follows:

On the right of the page are links to the routines:

Figure 3–12 shows how time is attributed at the source code level. The line that starts with ## and is highlighted in yellow is a line of source with a high count for one of the events. In this case, it has a high count for both user time and dynamic instruction count. The source code also includes compiler commentary about the two loops shown in the code.

Figure 3–12 How Time is Attributed at the Source Code Level

Source view

The disassembly view normally holds much more specific information, as shown in Figure 3–13.

Figure 3–13 Disassembly View

Disassembly View

Again, a hot line of disassembly is highlighted in yellow. The execution counts for the individual assembly language instructions are also shown, so it is apparent that the loop is entered once and iterated nearly 170 million times. The hyperlinks enable rapid navigation to either the line of source that generated the disassembly instruction or the target of a branch instruction.

The final page generated shows the callers and callees of the various functions. Callers are the functions that call a given routine; callees are the functions that the routine calls. An example is shown in Figure 3–14.

Figure 3–14 Page Showing the Callers and Callees of Functions

Caller-Callee Report

The caller-callee information is quite complex to read. The routine of focus is indicated by an asterisk. For example, take the second section, which is for the routine main. The routine main has an asterisk to its left, meaning that it is the selected routine. There are two entries above it: _start and <Total>. <Total> is a synthetic metric representing the runtime of the entire code. This information is interpreted as “the routine main gets called by the routine _start”. Below the routine main there are four other routines; these are the routines that get called by main.

The first column is the attributed user time, which is the amount of time that can be attributed to the selected routine. This is best explained by examining the main routine again. About 120 seconds of user time is attributed to the routine _start; this is the time that _start spends calling the routine of interest, in this case main. The attributed time for the routine main itself is zero, which indicates that no time is actually spent in that routine. The attributed times for the four routines below main sum to the 120 seconds.

The routine fp_routine shows a second example. In this case 27 seconds are spent by the routine main calling fp_routine. However, all those 27 seconds are directly spent in the routine fp_routine.
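The arithmetic behind these two examples can be sketched in a few lines of Python. This is only an illustrative model, not how the tool itself computes the report: the routine names routine_a, routine_b, and routine_c and their times are hypothetical values chosen so that the four callees of main sum to 120 seconds, and the sketch assumes each routine is called from exactly one place.

```python
from typing import Dict, List

# Seconds of user time spent directly in each routine (exclusive time).
# Only the 27 s for fp_routine and the 120 s total come from the example
# above; routine_a, routine_b, and routine_c are hypothetical.
EXCLUSIVE: Dict[str, float] = {
    "_start": 0.0,
    "main": 0.0,
    "fp_routine": 27.0,
    "routine_a": 40.0,
    "routine_b": 35.0,
    "routine_c": 18.0,
}

# Caller -> list of callees. Assumes a simple tree: one caller per routine.
CALLS: Dict[str, List[str]] = {
    "_start": ["main"],
    "main": ["fp_routine", "routine_a", "routine_b", "routine_c"],
}


def inclusive_time(routine: str) -> float:
    """Time spent in a routine plus everything it calls."""
    return EXCLUSIVE[routine] + sum(
        inclusive_time(callee) for callee in CALLS.get(routine, [])
    )


def caller_callee_section(routine: str) -> dict:
    """Build one section of a caller-callee style report for `routine`."""
    callers = [r for r, callees in CALLS.items() if routine in callees]
    return {
        # Time each caller spends calling the selected routine.
        "callers": {c: inclusive_time(routine) for c in callers},
        # Time spent directly in the selected routine (the asterisk row).
        "self": EXCLUSIVE[routine],
        # Time the selected routine spends calling each callee.
        "callees": {c: inclusive_time(c) for c in CALLS.get(routine, [])},
    }


if __name__ == "__main__":
    for name in ("main", "fp_routine"):
        print(name, caller_callee_section(name))
```

In this model the section for main shows the full 120 seconds against _start, zero against main itself, and callee times that add up to 120 seconds, while the section for fp_routine shows 27 seconds both against its caller main and against fp_routine itself.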

The hyperlinks in the caller-callee page allow navigation up and down the call graph, and also to the disassembly code for the actual routines.

The profile data discussed in this section was collected with collect. The collect tool can also be invoked stand-alone, outside of spot. The experiment data that collect records can be examined with analyzer or er_print, or converted to HTML format with er_html, which can likewise be run stand-alone outside of spot. See the man pages for collect, analyzer, and er_print for more details on these tools. For more information on using er_html, type er_html -h and consult the er_html man page.
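As a rough illustration of the stand-alone workflow, the following Python sketch assembles the command lines for recording an experiment with collect and then printing reports from it with er_print. The target ./a.out and the experiment name test.1.er are assumptions made for the example, and the sketch only prints the commands rather than running them. The options used are collect -o to name the experiment and the er_print -functions and -callers-callees reports; check the man pages for the options available in your release. er_html is left out because its arguments are best taken from er_html -h.

```python
import shlex
from typing import List


def collect_cmd(target: str, experiment: str = "test.1.er") -> List[str]:
    """Command line to profile `target` and record it in `experiment`.

    The experiment name is an assumption; collect expects it to end in .er.
    """
    return ["collect", "-o", experiment, target]


def er_print_cmd(experiment: str, report: str = "functions") -> List[str]:
    """Command line to print one report (for example, functions) from an experiment."""
    return ["er_print", f"-{report}", experiment]


if __name__ == "__main__":
    # ./a.out is a placeholder for the application being profiled.
    commands = [
        collect_cmd("./a.out"),
        er_print_cmd("test.1.er", "functions"),
        er_print_cmd("test.1.er", "callers-callees"),
    ]
    for argv in commands:
        print(shlex.join(argv))
```

The printed lines can be pasted into a shell on a system where the tools are installed.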