Sun Studio 12: Performance Analyzer

Chapter 4 The Performance Analyzer Tool

The Performance Analyzer is a graphical data-analysis tool that analyzes performance data collected by the Collector using the collect command, the IDE, or the collector commands in dbx. The Collector gathers performance information to create an experiment during the execution of a process, as described in Chapter 3, Collecting Performance Data. The Performance Analyzer reads in such experiments, analyzes the data, and presents the results in tabular and graphical displays. A command-line version of the Analyzer is available as the er_print utility, which is described in Chapter 6, The er_print Command Line Performance Analysis Tool.

Starting the Performance Analyzer

To start the Performance Analyzer, type the following on the command line:

% analyzer [control-options] [experiment-list]

Alternatively, use the Explorer in the IDE to navigate to an experiment and open it. The experiment-list command argument is a blank-separated list of experiment names, experiment group names, or both.

You can specify multiple experiments or experiment groups on the command line. If you specify an experiment that has descendant experiments inside it, all descendant experiments are automatically loaded, but the display of data for the descendant experiments is disabled. To load individual descendant experiments, you must specify each experiment explicitly or create an experiment group.

To create an experiment group, you can use the -g argument to the collect utility. To manually create an experiment group, create a plain text file whose first line is as follows:

#analyzer experiment group

Then add the names of the experiments on subsequent lines. The file extension must be erg.
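
The manual steps above can also be scripted. The following is a minimal sketch, not part of the Sun Studio tools: the function name write_experiment_group and the experiment names in the usage note are hypothetical, and only the header line and the erg extension come from the description above.

```python
from pathlib import Path

# First line required in an experiment group file, as described above.
HEADER = "#analyzer experiment group"


def write_experiment_group(path, experiments):
    """Write an experiment group file that lists the given experiment names."""
    path = Path(path)
    if path.suffix != ".erg":
        raise ValueError("experiment group file name must end in .erg")
    # Header on the first line, then one experiment name per line.
    lines = [HEADER, *experiments]
    path.write_text("\n".join(lines) + "\n")
    return path
```

For example, write_experiment_group("mygroup.erg", ["test.1.er", "test.2.er"]) would produce a three-line file that can be passed to the analyzer command in place of the individual experiment names.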

You can also use the File menu in the Analyzer window to add experiments or experiment groups. To open experiments recorded on descendant processes, you must type the file name in the Open Experiment dialog box (or Add Experiment dialog box) because the file chooser does not permit you to open an experiment as a directory.

When the Analyzer displays multiple experiments, however they were loaded, data from all the experiments is aggregated.

You can preview an experiment or experiment group for loading by single-clicking on its name in either the Open Experiment dialog or the Add Experiment dialog.

You can also start the Performance Analyzer from the command line to record an experiment as follows:

% analyzer [Java-options] [control-options] target [target-arguments]

The Analyzer starts up with the Performance Tools Collect window showing the named target and its arguments, and settings for collecting an experiment. See Recording Experiments for details.

Analyzer Options

These options control the behavior of the Analyzer and are divided into three groups:

Java options
Control options
Information options

Java Options

`-j | --jdkhome` `jvm-path`

Specify the path to the JVM software for running the Analyzer. When the -j option is not specified, the default path is taken first from the environment variables JDK_HOME and then JAVA_PATH, in that order. If neither environment variable is set, the default path is where the Java™ 2 Software Development Kit was installed by the Sun Studio installer. If the SDK was not installed, the JVM found on the user’s PATH is used. Use the -j option to override all the default paths.
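
The lookup order described above can be pictured as a simple fallback chain. This is an illustrative sketch only, not the Analyzer's actual startup code: the function name resolve_jvm_path and the installed_sdk parameter are hypothetical (the text does not say where the installer places the SDK), and looking for an executable named java on PATH is an assumption.

```python
import os
import shutil


def resolve_jvm_path(j_option=None, env=None, installed_sdk=None):
    """Return (source, path) for the JVM, following the documented order."""
    env = os.environ if env is None else env
    if j_option:
        # The -j option overrides every default path.
        return ("-j option", j_option)
    for var in ("JDK_HOME", "JAVA_PATH"):
        # Environment variables are examined in this order.
        if env.get(var):
            return (var, env[var])
    if installed_sdk:
        # Location used by the Sun Studio installer (hypothetical parameter).
        return ("installed SDK", installed_sdk)
    # Last resort: whatever JVM is found on the user's PATH (may be None).
    return ("PATH", shutil.which("java", path=env.get("PATH")))
```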

`-J` `jvm-options`

Specify the JVM options.

Control Options

`-f | --fontsize` `size`

Specify the font size to be used in the Analyzer GUI.

`-v | --verbose`

Print version information and Java runtime arguments before starting.

Information Options

These options do not invoke the Performance Analyzer GUI, but print information about analyzer to standard output. The individual options below are stand-alone options; they cannot be combined with other analyzer options nor combined with target or experiment-list arguments.

`-V | --version`

Print version information and exit.

`-? | -h | --help`

Print usage information and exit.

Analyzer Default Settings

The Analyzer uses resource files named .er.rc to determine default values for various settings upon startup. The system-wide er.rc defaults file is read first, then an .er.rc file in the user’s home directory, if present, then an .er.rc file in the current directory. Defaults from the .er.rc file in your home directory override the system defaults, and defaults from the .er.rc file in the current directory override both home and system defaults. The .er.rc files are used by the Analyzer and the er_print utility. Any settings in .er.rc that apply to source and disassembly compiler commentary are also used by the er_src utility.

See the section Default Settings for Analyzer for more information about the .er.rc files. See Commands That Set Defaults and Commands That Set Defaults Only For the Performance Analyzer for information about setting defaults with er_print commands.
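
The precedence among the three defaults files amounts to applying them in read order, with later files overriding earlier ones. The sketch below illustrates that layering with plain dictionaries. It does not parse real .er.rc syntax, and the setting names setting_a and setting_b in the usage note are placeholders, not actual directives.

```python
def merge_defaults(system, home=None, current_dir=None):
    """Combine settings in read order; later files override earlier ones."""
    merged = dict(system)                 # system-wide er.rc is read first
    for layer in (home, current_dir):     # then home directory, then current directory
        if layer:
            merged.update(layer)
    return merged
```

With system defaults {"setting_a": 1, "setting_b": 2}, a home file containing {"setting_a": 10}, and a current-directory file containing {"setting_b": 20}, the effective settings are {"setting_a": 10, "setting_b": 20}.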

Performance Analyzer GUI

The Analyzer window has a menu bar, a tool bar, and a split pane that contains tabs for the various data displays.

The Menu Bar

The menu bar contains a File menu, a View menu, a Timeline menu, and a Help menu.

The File menu is for opening, adding, and dropping experiments and experiment groups. The File menu allows you to collect data for an experiment using the Performance Analyzer GUI. For details on using the Performance Analyzer to collect data, refer to Recording Experiments. From the File menu, you can also create a mapfile, which is used to optimize the size of an executable or optimize its effective cache behavior. For more details on mapfiles, refer to Generating Mapfiles and Function Reordering.

The View menu allows you to configure how experiment data is displayed.

The Timeline menu, as its name suggests, helps you to navigate the timeline display, described in Analyzer Data Displays.

The Help menu provides online help for the Performance Analyzer, provides a summary of new features, has quick-reference and shortcut sections, and has a troubleshooting section.

Toolbar

The toolbar provides sets of icons as shortcuts, and includes a Find function to help you locate text or highlighted lines in the tabs. For more details about the Find function, refer to Finding Text and Data.

Analyzer Data Displays

The Performance Analyzer uses a split window to divide the data presentation into two panes. Each pane is tabbed to allow you to select different data displays for the same experiment or experiment group.

Data Display, Left Pane

The left pane displays tabs for the principal Analyzer displays in the order in which they appear:

The Races tab
The Deadlocks tab
The Functions tab
The Callers-Callees tab
The Dual-Source tab
The Source-Disassembly tab
The Source tab
The Lines tab
The Disassembly tab
The PCs tab
The Timeline tab
The LeakList tab
The DataObjects tab
The DataLayout tab
The Inst-Freq tab
The Statistics tab
The Experiments tab
Various MemoryObjects tabs
Various Index tabs

If you invoke the Analyzer without a target, you are prompted for an experiment to open.

By default, the first visible tab is selected. Only tabs applicable to the data in the loaded experiments are displayed.

Whether a tab is displayed in the left pane of the Analyzer window when you open an experiment is determined by a tabs directive in the .er.rc files read when you start the Analyzer and by the applicability of the tab to the data in the experiment. You can use the Tabs tab in the Set Data Presentation dialog box (see Tabs Tab) to select the tabs you want to display for an experiment.

The Races Tab

The Races tab shows a list of all the data races detected in a data-race experiment. For more information, see Sun Studio 12: Thread Analyzer User’s Guide.

The Deadlocks Tab

The Deadlocks tab shows a list of all the deadlocks detected in a deadlock experiment. For more information, see Sun Studio 12: Thread Analyzer User’s Guide.

The Functions Tab

The Functions tab shows a list consisting of functions and their metrics. The metrics are derived from the data collected in the experiment. Metrics can be either exclusive or inclusive. Exclusive metrics represent usage within the function itself. Inclusive metrics represent usage within the function and all the functions it called.
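
The distinction between exclusive and inclusive metrics can be illustrated with a small calculation over recorded call stacks. This is a sketch of the general definition, not the Analyzer's implementation: the sample representation (a list of call stacks ordered from root to leaf, each with a metric value) and the function name are assumptions made for illustration.

```python
from collections import Counter


def exclusive_inclusive(samples):
    """samples: list of (call_stack, value); each call_stack runs from root to leaf."""
    exclusive, inclusive = Counter(), Counter()
    for stack, value in samples:
        # Exclusive: usage within the leaf function itself.
        exclusive[stack[-1]] += value
        # Inclusive: usage within a function and everything it called.
        # set() counts a function once per stack, even if it recurses.
        for func in set(stack):
            inclusive[func] += value
    return exclusive, inclusive
```

For the samples (main, f, g) with 2.0 seconds, (main, f) with 1.0 second, and (main) with 0.5 seconds, f has an exclusive time of 1.0 and an inclusive time of 3.0, while main has an exclusive time of 0.5 and an inclusive time of 3.5.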

The list of available metrics for each kind of data collected is given in the collect(1) man page. Only the functions that have non-zero metrics are listed.

Time metrics are shown as seconds, presented to millisecond precision. Percentages are shown to a precision of 0.01%. If a metric value is precisely zero, its time and percentage are shown as “0.” If the value is not exactly zero, but is smaller than the precision, its value is shown as “0.000” and its percentage as “0.00”. Because of rounding, percentages may not sum to exactly 100%. Count metrics are shown as an integer count.
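
The display rules above can be summarized in a short sketch. The function names are hypothetical, and the use of Python's standard fixed-point formatting for rounding is an assumption; the Analyzer's exact rounding behavior is not specified here.

```python
def format_time(seconds):
    """Format a time metric: exact zero as "0.", otherwise millisecond precision."""
    return "0." if seconds == 0 else f"{seconds:.3f}"


def format_percent(percent):
    """Format a percentage: exact zero as "0.", otherwise a precision of 0.01."""
    return "0." if percent == 0 else f"{percent:.2f}"
```

Under these rules a time of 0.0002 seconds is displayed as 0.000 rather than as the exact-zero marker, and three functions that each account for one third of the total are each shown as 33.33, which is why displayed percentages may not sum to exactly 100.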

The metrics initially shown are based on the data collected and on the default settings read from various .er.rc files. When the Performance Analyzer is initially installed, the defaults are as follows:

For clock-based profiling, the default set consists of inclusive and exclusive User CPU time.
For synchronization delay tracing, the default set consists of inclusive synchronization wait count and inclusive synchronization time.
For hardware counter overflow profiling, the default set consists of inclusive and exclusive times (for counters that count in cycles) or event counts (for other counters).
For heap tracing, the default set consists of heap leaks and bytes leaked.

If more than one type of data has been collected, the default metrics for each type are shown.

The metrics that are shown can be changed or reorganized; see the online help for details.

To search for a function, use the Find tool. For further details about the Find tool, refer to Finding Text and Data.

To select a single function, click on that function.

To select several functions that are displayed contiguously in the tab, select the first function of the group, then Shift-click on the last function of the group.

To select several functions that are not displayed contiguously in the tab, select the first function of the group, then select the additional functions by Ctrl-clicking on each function.

When you click the Compose Filter Clause button on the toolbar, the Filter dialog box opens with the Advanced tab selected and the Filter clause text box loaded with a filter clause that reflects the selection(s) in the Functions tab.

The Callers-Callees Tab

The Callers-Callees tab shows the selected function in a pane in the center, with callers of that function in a pane above, and callees of that function in a pane below.

In addition to showing exclusive and inclusive metric values for each function, the tab also shows attributed metrics. For the selected function, the attributed metric represents the exclusive metric for that function. For the callees, the attributed metric represents the portion of the callee’s inclusive metric that is attributable to calls from the center function. The sum of attributed metrics for the callees and the selected function will add up to the inclusive metric for the selected function.

For the callers, the attributed metrics represent the portion of the selected function’s inclusive metric that is attributable to calls from the callers. The sum of the attributed metrics for all callers should also add up to the inclusive metric for the selected function.
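
The two sums described above can be checked with a small calculation over call stacks. This is a simplified sketch, not the Analyzer's algorithm: it assumes call stacks ordered from root to leaf, assumes the selected function appears at most once per stack (no recursion), and uses a hypothetical function name.

```python
from collections import Counter


def attributed_metrics(samples, selected):
    """Return (callers, exclusive, callees, inclusive) for the selected function."""
    callers, callees = Counter(), Counter()
    exclusive = inclusive = 0.0
    for stack, value in samples:
        if selected not in stack:
            continue
        i = stack.index(selected)
        inclusive += value
        if i > 0:
            # Portion of the selected function's inclusive metric due to this caller.
            callers[stack[i - 1]] += value
        if i == len(stack) - 1:
            # The selected function is the leaf: this is its exclusive metric.
            exclusive += value
        else:
            # Portion of the callee's inclusive metric due to calls from the selected function.
            callees[stack[i + 1]] += value
    return callers, exclusive, callees, inclusive
```

In this model, the exclusive value plus the callee values equals the inclusive value for the selected function, and so does the sum of the caller values, provided the selected function is never at the root of a stack.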

The metrics shown in the Callers-Callees tab can be changed or reorganized; see the online help for details.

Clicking once on a function in the caller or callee pane selects that function, causing the window contents to be redrawn so that the selected function appears in the center pane.

The Dual-Source Tab

The Dual-Source tab shows the two source contexts involved in the selected data race or deadlock. The tab is shown only if data-race-detection or deadlock experiments are loaded.

The Source-Disassembly Tab

The Source-Disassembly tab shows the annotated source in an upper pane, and the annotated disassembly in a lower pane. The tab is not visible by default. Use the Set Data Presentation option on the View menu to add the Source-Disassembly tab.

The Source Tab

If available, the Source tab shows the file containing the source code of the selected function, annotated with performance metrics for each source line. The full names of the source file, the corresponding object file and the load object are given in the column heading for the source code. In the rare case where the same source file is used to compile more than one object file, the Source tab shows the performance data for the object file containing the selected function.

The Analyzer looks for the file containing the selected function under the absolute pathname as recorded in the executable. If the file is not there, the Analyzer tries to find a file of the same basename in the current working directory. If you have moved the sources, or the experiment was recorded in a different file system, you can put a symbolic link from the current directory to the real source location in order to see the annotated source.
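
The two-step lookup described above can be sketched as follows. This is an illustration of the documented order only; the function name find_source is hypothetical, and the sketch ignores the search path and pathmap settings described later in this chapter.

```python
import os


def find_source(recorded_path, cwd=None):
    """Return the path to use for a source file, or None if it is not found."""
    # First choice: the absolute pathname recorded in the executable.
    if os.path.isfile(recorded_path):
        return recorded_path
    # Fallback: a file with the same basename in the current working directory.
    # os.path.isfile follows symbolic links, so a link to the real source works.
    cwd = os.getcwd() if cwd is None else cwd
    candidate = os.path.join(cwd, os.path.basename(recorded_path))
    return candidate if os.path.isfile(candidate) else None
```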

When you select a function in the Functions tab and the Source tab is opened, the source file displayed is the default source context for that function. The default source context of a function is the file containing the function’s first instruction, which for C code is the function’s opening brace. Immediately following the first instruction, the annotated source file adds an index line for the function. The source window displays index lines as text in red italics within angle brackets in the form:

<Function: f_name>

A function might have an alternate source context, which is another file that contains instructions attributed to the function. Such instructions might come from include files or from other functions inlined into the selected function. If there are any alternate source contexts, the beginning of the default source context includes a list of extended index lines that indicate where the alternate source contexts are located.

<Function: f, instructions from source file src.h>

Double clicking on an index line that refers to another source context opens the file containing that source context, at the location associated with the indexed function.

To aid navigation, alternate source contexts also start with a list of index lines that refer back to functions defined in the default source context and other alternate source contexts.

The source code is interleaved with any compiler commentary that has been selected for display. The classes of commentary shown can be set in the Set Data Presentation dialog box. The default classes can be set in a .er.rc defaults file.

The metrics displayed in the Source tab can be changed or reorganized; see the online help for details.

Lines with metrics that are equal to or exceed a threshold percentage of the maximum of that metric for any line in the source file are highlighted to make it easier to find the important lines. The threshold can be set in the Set Data Presentation dialog box. The default threshold can be set in a .er.rc defaults file. Tick marks are shown next to the scrollbar, corresponding to the position of over-threshold lines within the source file. For example, if there were two over-threshold lines near the end of the source file, two ticks would be shown next to the scrollbar near the bottom of the source window. Positioning the scrollbar next to a tick mark will position the source lines displayed in the source window so that the corresponding over-threshold line is displayed.
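
The highlighting rule can be expressed as a comparison against a percentage of the per-file maximum. The sketch below is illustrative: the function name is hypothetical and the threshold of 75 in the usage note is an arbitrary example value, not the Analyzer's default.

```python
def highlighted_lines(line_metrics, threshold_percent):
    """Return line numbers whose metric is at least threshold_percent of the file maximum.

    line_metrics maps a source line number to its metric value.
    """
    peak = max(line_metrics.values(), default=0)
    if peak <= 0:
        return []          # nothing to highlight when no line has a positive metric
    cutoff = peak * threshold_percent / 100.0
    return sorted(n for n, v in line_metrics.items() if v >= cutoff)
```

With metrics {10: 0.1, 42: 4.0, 43: 3.0, 90: 2.9} and a threshold of 75, the cutoff is 3.0, so lines 42 and 43 are highlighted and line 90 is not.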

The Lines Tab

The Lines tab shows a list consisting of source lines and their metrics. Source lines are labeled with the function from which they came and the line number and source file name. If no line-number information is available for a function, or the source file for the function is not known, all of the function’s program counters (PCs) appear aggregated into a single entry for the function in the lines display. PCs from functions that are from load-objects whose functions are hidden appear aggregated as a single entry for the load-object in the lines display. Selecting a line in the Lines tab shows all the metrics for that line in the Summary tab. Selecting the Source or Disassembly tab after selecting a line from the Lines tab positions the display at the appropriate line.

The Disassembly Tab

The Disassembly tab shows a disassembly listing of the object file containing the selected function, annotated with performance metrics for each instruction.

Interleaved within the disassembly listing is the source code, if available, and any compiler commentary chosen for display. The algorithm for finding the source file in the Disassembly tab is the same as the algorithm used in the Source tab.

Just as with the Source tab, index lines are displayed in the Disassembly tab. But unlike with the Source tab, index lines for alternate source contexts cannot be used directly for navigation purposes. Also, index lines for alternate source contexts are displayed at the start of where the #included or inlined code is inserted, rather than just being listed at the beginning of the Disassembly view. Code that is #included or inlined from other files shows as raw disassembly instructions without interleaving the source code. However, placing the cursor on one of these instructions and selecting the Source tab opens the source file containing the #included or inlined code. Selecting the Disassembly tab with this file displayed opens the Disassembly view in the new context, thus displaying the disassembly code with interleaved source code.

The classes of commentary shown can be set in the Set Data Presentation dialog box. The default classes can be set in a .er.rc defaults file.

The Analyzer highlights lines with metrics that are equal to or exceed a metric-specific threshold, to make it easier to find the important lines. You can set the threshold in the Set Data Presentation dialog box. You can set the default threshold in a .er.rc defaults file. As with the Source tab, tick marks are shown next to the scrollbar, corresponding to the position of over-threshold lines within the disassembly code.

The PCs Tab

The PCs tab shows a list consisting of program counters (PCs) and their metrics. PCs are labeled with the function from which they came and the offset within that function. PCs from functions that are from load-objects whose functions are hidden appear aggregated as a single entry for the load-object in the PCs display. Selecting a line in the PCs tab shows all the metrics for that PC in the Summary tab. Selecting the Source tab or Disassembly tab after selecting a line from the PCs tab positions the display at the appropriate line.

See the section Call Stacks and Program Execution for more information about PCs.

The Timeline Tab

The Timeline tab shows a chart of the events and the sample points recorded by the Collector as a function of time. Data is displayed in horizontal bars. For each experiment there is a bar for sample data and a set of bars for each LWP. The set for an LWP consists of one bar for each data type recorded: clock-based profiling, hardware counter overflow profiling, synchronization tracing, heap tracing, and MPI tracing.

The bars that contain sample data show a color-coded representation of the time spent in each microstate for each sample. Samples are displayed as a period of time because the data in a sample point represents time spent between that point and the previous point. Clicking a sample displays the data for that sample in the Event tab.
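
Because each sample point covers the time since the previous point, a list of sample timestamps maps naturally onto a list of periods. The following sketch shows that pairing. The function name is hypothetical, and treating the experiment start time as the beginning of the first period is an assumption.

```python
def sample_intervals(start_time, sample_times):
    """Pair each sample point with the previous point to get (start, end) periods."""
    intervals = []
    previous = start_time
    for t in sample_times:
        # The data in this sample represents the time since the previous point.
        intervals.append((previous, t))
        previous = t
    return intervals
```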

The profiling data or tracing data bars show an event marker for each event recorded. The event markers consist of a color-coded representation of the call stack recorded with the event, as a stack of colored rectangles. Clicking a colored rectangle in an event marker selects the corresponding function and PC and displays the data for that event and that function in the Event tab. The selection is highlighted in both the Event tab and the Legend tab, and selecting the Source tab or Disassembly tab positions the tab display at the line corresponding to that frame in the call stack.

For some kinds of data, events may overlap and not be visible. Whenever two or more events would appear at exactly the same position, only one is drawn; if there are two or more events within one or two pixels, all are drawn, although they may not be visually distinguishable. In either case, a small gray tick mark is displayed below the drawn events indicating the overlap.

The Timeline tab of the Set Data Presentation dialog box allows you to change the types of event-specific data that are shown; to select the display of event-specific data for threads, LWPs, or CPUs; to choose to align the call stack representation at the root or at the leaf; and to choose the number of levels of the call stack that are displayed.

You can change the types of event-specific data shown in the Timeline tab, as well as the colors mapped to selected functions. For details about using the Timeline tab, refer to the online help.

The LeakList Tab

The LeakList tab shows two lines, the upper one representing leaks, and the lower one representing allocations. Each contains a call stack, similar to that shown in the Timeline tab, in the center with a bar above proportional to the bytes leaked or allocated, and a bar below proportional to the number of leaks or allocations.

Selection of a leak or allocation displays the data for the selected leak or allocation in the Leak tab, and selects a frame in the call stack, just as it does in the Timeline tab.

You can display the LeakList tab by selecting it in the Tabs tab of the Set Data Presentation dialog box (see Tabs Tab). You can make the LeakList tab visible only if one or more of the loaded experiments contains heap trace data.

The DataObjects Tab

The DataObjects tab shows the list of data objects with their metrics. The tab is applicable only to hardware counter overflow experiments where the aggressive backtracking option was enabled, and for source files that were compiled with the -xhwcprof option in the C compiler.

You can display the tab by selecting it in the Tabs tab of the Set Data Presentation dialog box (see Tabs Tab). You can make the DataObjects tab visible only if one or more of the loaded experiments contains a dataspace profile.

The tab shows hardware counter memory operation metrics against the various data structures and variables in the program.

To select a single data object, click on that object.

To select several objects that are displayed contiguously in the tab, select the first object, then press Shift while clicking on the last object.

To select several objects that are not displayed contiguously in the tab, select the first object, then select the additional objects by pressing Ctrl while clicking on each object.

The DataLayout Tab

The DataLayout tab shows the annotated data object layouts for all program data objects with data-derived metric data. The layouts appear in the tab sorted by the data sort metrics values for the structure as a whole. The tab shows each aggregate data object with the total metrics attributed to it, followed by all of its elements in offset order. Each element, in turn, has its own metrics and an indicator of its size and location in 32-byte blocks.
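
To see how an element's offset and size relate to 32-byte blocks, consider the following sketch. It does not reproduce the format of the Analyzer's indicator; the function names and the (name, offset, size) representation of an element are assumptions made for illustration.

```python
BLOCK_SIZE = 32  # block size in bytes, as stated above


def blocks_spanned(offset, size):
    """Return the indexes of the 32-byte blocks an element occupies (size must be > 0)."""
    first = offset // BLOCK_SIZE
    last = (offset + size - 1) // BLOCK_SIZE
    return list(range(first, last + 1))


def layout(elements):
    """elements: list of (name, offset, size); returns rows in offset order."""
    return [(name, offset, size, blocks_spanned(offset, size))
            for name, offset, size in sorted(elements, key=lambda e: e[1])]
```

An 8-byte element at offset 28 straddles blocks 0 and 1, which is the kind of placement that the layout view helps you spot.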

The DataLayout tab can be displayed by selecting it in the Tabs tab of the Set Data Presentation dialog box (see Tabs Tab). As with the DataObjects tab, you can make the DataLayout tab visible only if one or more of the loaded experiments contains a dataspace profile.

To select a single data object, click on that object.

To select several objects that are displayed contiguously in the tab, select the first object, then press the Shift key while clicking on the last object.

To select several objects that are not displayed contiguously in the tab, select the first object, then select the additional objects by pressing the Ctrl key while clicking on each object.

The Inst-Freq Tab

The Inst-Freq, or instruction-frequency, tab shows a summary of the frequency with which each type of instruction was executed in a count-data experiment. The tab also shows data about the frequency of execution of load, store, and floating-point instructions. In addition, the tab includes information about annulled instructions and instructions in a branch delay slot.

The Statistics Tab

The Statistics tab shows totals for various system statistics summed over the selected experiments and samples. The totals are followed by the statistics for the selected samples of each experiment. For information on the statistics presented, see the getrusage(3C) and proc(4) man pages.

The Experiments Tab

The Experiments tab is divided into two panels. The top panel contains a tree that includes nodes for the load objects in all the loaded experiments, and for each loaded experiment. When you expand the Load Objects node, a list of all load objects is displayed with various messages about their processing. When you expand the node for an experiment, two areas are displayed: a Notes area and an Info area.

The Notes area displays the contents of any notes file in the experiment. You can edit the notes by typing directly in the Notes area. The Notes area includes its own toolbar with buttons for saving or discarding the notes and for undoing or redoing any edits since the last save.

The Info area contains information about the experiments collected and the load objects accessed by the collection target, including any error messages or warning messages generated during the processing of the experiment or the load objects.

The bottom panel lists error and warning messages from the Analyzer session.

The Index Tabs

Each Index tab shows the metric values from data attributed to various index objects, such as Threads, Cpus, and Seconds. Inclusive and Exclusive metrics are not shown, since Index objects are not hierarchical. Only a single metric of each type is shown.

Several Index tabs are predefined: Threads, Cpus, Samples, and Seconds. You can define a custom index object by clicking on the Add Custom Index Tab button in the Set Data Presentation dialog box to open the Add Index Objects dialog box.

A radio button at the top of each Index tab lets you select either a Text display or a Graphical display. The Text display is similar to the display in the DataObjects tab and uses the same metric settings. The Graphical display shows a graphical representation of the relative values for each index object, with a separate histogram for each metric sorted by the data sort metric.

When you click the Filter Data button on the toolbar, the Filter Data dialog box opens. Click the Advanced tab to see the Filter clause text box loaded with a filter clause that reflects the selections in the IndexObjects tab.

The MemoryObjects Tabs

Each MemoryObjects tab shows the metric values for dataspace metrics attributed to the various memory objects such as pages. If one or more of the loaded experiments contains a dataspace profile, you can select the memory objects for which you want to display tabs in the Tabs tab of the Set Data Presentation dialog box. Any number of MemoryObjects tabs can be displayed.

Various MemoryObject tabs are predefined. You can define a custom memory object by clicking the Add Custom Object button in the Set Data Presentation dialog box to open the Add Memory Objects dialog box.

A radio button on each MemoryObjects tab lets you select either a Text display or a Graphical display. The Text display is similar to the display in the DataObjects tab and uses the same metric settings. The Graphical display shows a graphical representation of the relative values for each memory object, with a separate histogram for each metric sorted by the data sort metric.

Data Display, Right Pane

The right pane contains the Summary tab, the Event tab, the Race Detail tab, the Deadlock Detail tab, and the Leak tab. By default, the Summary tab is displayed.

The Summary Tab

The Summary tab shows all the recorded metrics for the selected function or load object, both as values and percentages, and information on the selected function or load object. The Summary tab is updated whenever a new function or load object is selected in any tab.

The Event Tab

The Event tab shows detailed data for the event that is selected in the Timeline tab, including the event type, leaf function, LWP ID, thread ID, and CPU ID. Below the data panel the call stack is displayed with the color coding for each function in the stack. Clicking a function in the call stack makes it the selected function.

When a sample is selected in the Timeline tab, the Event tab shows the sample number, the start and end time of the sample, and the microstates with the amount of time spent in each microstate and the color coding.

The Leak Tab

The Leak tab shows detailed data for the selected leak or allocation in the LeakList tab. Below the data panel, the Leak tab shows the call stack at the time when the selected leak or allocation was detected. Clicking a function in the call stack makes it the selected function.

The Race Detail Tab

The Race Detail tab shows detailed data for the selected data race in the Races tab. See the Sun Studio 12: Thread Analyzer User’s Guide for more information.

The Deadlock Detail Tab

The Deadlock Detail tab shows detailed data for the selected deadlock in the Deadlocks tab. See the Sun Studio 12: Thread Analyzer User’s Guide for more information.

Setting Data Presentation Options

You can control the presentation of data from the Set Data Presentation dialog box. To open this dialog box, click the Set Data Presentation button in the toolbar or choose View -> Set Data Presentation.

The Set Data Presentation dialog box has a tabbed pane with the following tabs:

Metrics
Sort
Source/Disassembly
Formats
Timeline
Search Path
Pathmaps
Tabs

The dialog box has a Save button with which you can store the current settings, including any custom-defined memory objects.

Note –

Since the defaults for the Analyzer, the er_print utility, and the er_src utility are set by a common .er.rc file, saving changes in the Set Data Presentation dialog box also affects output from the er_print utility and the er_src utility.

Metrics Tab

The Metrics tab shows all of the available metrics. Each metric has check boxes in one or more of the columns labeled Time, Value and %, depending on the type of metric. Alternatively, instead of setting individual metrics, you can set all metrics at once by selecting or deselecting the check boxes in the bottom row of the dialog box and then clicking on the Apply to all metrics button.

Sort Tab

The Sort tab shows the order of the metrics presented, and the choice of metric to sort by.

Source/Disassembly Tab

The Source/Disassembly tab presents a list of check boxes that you can use to select the information presented, as follows:

The compiler commentary that is shown in the source listing and the disassembly listing
The threshold for highlighting important lines in the source listing and the disassembly listing
The interleaving of source code in the disassembly listing
The metrics on the source lines in the disassembly listing
The display of instructions in hexadecimal in the disassembly listing

Formats Tab

The Formats tab presents a choice for the long form, short form, or mangled form of C++ function names and Java method names. If you select the Append SO name to Function name checkbox, the name of the shared object in which the function or method is located is appended to the function name or method name.

The Formats tab also presents a choice for View Mode of User, Expert, or Machine. The View Mode setting controls the processing of Java experiments and OpenMP experiments.

For Java experiments:

User mode shows Java call stacks for Java threads, and does not show housekeeping threads.
Expert mode shows Java call stacks for Java threads when the user’s Java code is being executed, and machine call stacks when JVM code is being executed or when the JVM software does not report a Java call stack. It shows machine call stacks for housekeeping threads.
Machine mode shows machine call stacks for all threads.

For OpenMP experiments:

User mode and expert mode show reconciled master-thread and slave-thread call stacks, and add special functions with names of the form <OMP-*> when the OpenMP runtime is performing certain operations.
Machine mode shows machine call stacks for all threads.

For all other experiments, all three modes show the same data.

Timeline Tab

The Timeline tab presents choices for the types of event-specific data that are shown; the display of event-specific data for threads, LWPs, or CPUs; the alignment of the call stack representation at the root or at the leaf; and the number of levels of the call stack that are displayed.

Search Path Tab

The Search Path tab allows you to manage a list of directories to be used for searching for source and object files. The special name $expts refers to the experiments loaded; all other names should be paths in the file system.
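The lookup described above can be pictured with a short Python sketch. It is an illustration only, not the Analyzer's implementation: the function name find_file, the sample paths, and the assumption that $expts expands to the loaded experiment directories in load order are all hypothetical.

```python
import os
import posixpath


def find_file(name, search_path, experiments, exists=os.path.exists):
    """Return the first existing candidate for `name`, or None.

    search_path: directory names; the special entry "$expts" stands for
                 the loaded experiments (assumed here to be tried in order).
    experiments: directories of the loaded experiments (hypothetical values).
    exists:      predicate used to probe the file system; injectable so the
                 sketch can be exercised without real files.
    """
    for entry in search_path:
        directories = experiments if entry == "$expts" else [entry]
        for directory in directories:
            candidate = posixpath.join(directory, name)
            if exists(candidate):
                return candidate
    return None


# Example with a stand-in for the file system: only one path "exists".
fake_exists = lambda path: path == "/src/app/main.c"
print(find_file("main.c", ["$expts", "/src/app"], ["/data/test.1.er"], fake_exists))
```

Passing the exists predicate as a parameter keeps the sketch testable; a real lookup would simply use the default.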

Pathmaps Tab

The Pathmaps tab enables you to map the leading part of a file path from one location to another. You specify a set of prefix pairs: the original prefix and a new prefix. The path is then mapped from the original prefix to the new prefix for a given path. Multiple pathmaps may be specified, and each will be tried in turn to find a file.
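As a rough illustration of prefix mapping, the following Python sketch produces the candidate paths that a set of prefix pairs would yield for one recorded path. The function name apply_pathmaps and the sample prefixes are hypothetical, and the sketch assumes that a pair applies only when the recorded path begins with its original prefix.

```python
def apply_pathmaps(path, pathmaps):
    """Return remapped candidates for `path`, one per matching prefix pair.

    pathmaps: list of (original_prefix, new_prefix) pairs, tried in order.
    """
    candidates = []
    for original_prefix, new_prefix in pathmaps:
        if path.startswith(original_prefix):
            # Swap the leading part and keep the remainder of the path.
            candidates.append(new_prefix + path[len(original_prefix):])
    return candidates


pathmaps = [
    ("/net/build/src", "/home/me/src"),
    ("/net/build", "/mnt/archive"),
]
print(apply_pathmaps("/net/build/src/lib/foo.c", pathmaps))
# prints ['/home/me/src/lib/foo.c', '/mnt/archive/src/lib/foo.c']
```

Each candidate would then be checked in turn until a file is found.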

Tabs Tab

You can use the Tabs tab of the Set Data Presentation dialog box to select the tabs to be displayed in the Analyzer window.

The Tabs tab lists the applicable tabs for the current experiment. The standard tabs are listed in the left column. The Index tabs are listed in the center column, and the defined Memory tabs are listed in the right column.

In the left column, click the check boxes to select or deselect standard tabs for display.

In the center column, click the check boxes to select or deselect Index tabs for display. The predefined Index tabs are Threads, Cpus, Samples, and Seconds. To add a tab for another index object, click the Add Custom Index Tab button to open the Add Index Object dialog box. In the Object name text box, type the name of the new object. In the Formula text box, type an index expression to be used to map the recorded physical address or virtual address to the object index. For information on the rules for index expressions, see indxobj_define indxobj_type index_exp.

In the right column, click the check boxes to select or deselect Memory Object tabs for display. To add a custom object, click the Add Custom Object button to open the Add Memory Object dialog box. In the Object name text box, type the name of the new custom memory object. In the Formula text box, type an index expression to be used to map the recorded physical address or virtual address to the object index. For information on the rules for index expressions, see mobj_define mobj_type index_exp.

When you have added a custom index object or memory object, a check box for that object is added to the Tabs tab and is selected by default.

Saving Data Presentation Options

The Set Data Presentation dialog box has a Save button to store the current settings.

Note –

Since the defaults for the Analyzer, the er_print utility, and the er_src utility are set by a common .er.rc file, saving changes in the Analyzer’s Set Data Presentation dialog box also affects output from the er_print utility and the er_src utility.

Finding Text and Data

The Analyzer has a Find tool available through the toolbar, with two options for search targets that are given in a combo box. You can search for text in the Name column of the Functions tab or Callers-Callees tab and in the code column of the Source tab and Disassembly tab. You can search for a high-metric item in the Source tab and Disassembly tab. The metric values on the lines containing high-metric items are highlighted in green. Use the arrow buttons next to the Find field to search up or down.

Showing or Hiding Functions

By default, all functions in each load object are shown in the Functions tab and Callers-Callees tab. You can hide all the functions in a load object using the Show/Hide Functions dialog box; see the online help for details.

When the functions in a load object are hidden, the Functions tab and Callers-Callees tab show a single entry representing the aggregate of all functions from the load object. Similarly, the Lines tab and PCs tab show a single entry aggregating all PCs from all functions from the load object.

In contrast to filtering, metrics corresponding to hidden functions are still represented in some form in all displays.

Filtering Data

By default, data is shown in each tab for all experiments, all samples, all threads, all LWPs, and all CPUs. A subset of data can be selected using the Filter Data dialog box.

The Filter Data dialog box has a Simple tab and an Advanced tab. In the Simple tab, you can select the experiments for which you want to filter data. You can then specify the samples, threads, LWPs, and CPUs for which to display metrics. In the Advanced tab, you can specify a filter expression that evaluates as true for any data record you want to include in the display. For information on the grammar to use in a filter expression, see Expression Grammar.

When you have made a selection in the Functions tab, DataLayout tab, DataObjects tab, or a MemoryObject tab in the Analyzer, clicking the Compose Filter Clause button on the toolbar opens the Advanced tab of the Filter Data dialog box and loads the Filter clause text box with a clause that reflects the selection.

For details about using the Filter Data dialog box, refer to the online help.

Experiment Selection

The Analyzer allows filtering by experiment when more than one experiment is loaded. The experiments can be loaded individually, or by naming an experiment group.

Sample Selection

Samples are numbered from 1 to N, and you can select any set of samples. The selection consists of a comma-separated list of sample numbers or ranges such as 1–5.
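To make the selection format concrete, here is a minimal Python sketch that expands such a list into individual numbers. It is not the Analyzer's parser: the function name parse_selection is hypothetical, and the sketch assumes that ranges are inclusive and that a typographic dash is equivalent to a plain hyphen.

```python
def parse_selection(spec):
    """Expand a selection such as "1-3,7" into a sorted list of integers."""
    selected = set()
    # Treat an en dash (U+2013) like a plain hyphen.
    for item in spec.replace("\u2013", "-").split(","):
        item = item.strip()
        if not item:
            continue
        if "-" in item:
            first, last = (int(part) for part in item.split("-", 1))
            if first > last:
                raise ValueError(f"descending range: {item}")
            # Ranges are assumed to include both end points.
            selected.update(range(first, last + 1))
        else:
            selected.add(int(item))
    return sorted(selected)


print(parse_selection("1-5,8,10-11"))
# prints [1, 2, 3, 4, 5, 8, 10, 11]
```

The same list-and-range form is used for the thread, LWP, and CPU selections described in the following sections.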

Thread Selection

Threads are numbered from 1 to N, and you can select any set of threads. The selection consists of a comma-separated list of thread numbers or ranges. Profile data for threads only covers that part of the run where the thread was actually scheduled on an LWP.

LWP Selection

LWPs are numbered from 1 to N, and you can select any set of LWPs. The selection consists of a comma-separated list of LWP numbers or ranges. If synchronization data is recorded, the LWP reported is the LWP at entry to a synchronization event, which might be different from the LWP at exit from the synchronization event.

On Linux systems, threads and LWPs are synonymous.

CPU Selection

Where CPU information is recorded (Solaris OS), any set of CPUs can be selected. The selection consists of a comma-separated list of CPU numbers or ranges.

Recording Experiments

When you invoke the Analyzer with a target name and target arguments, it starts up with the Performance Tools Collect window open, which allows you to record an experiment on the named target. If you invoke the Analyzer with no arguments, or with an experiment list, you can record a new experiment by choosing File -> Collect Experiment to open the Performance Tools Collect window.

The Collect Experiment tab of the Performance Tools Collect window has a panel you use to specify the target, its arguments, and the various parameters to be used to run the experiment. They correspond to the options available in the collect command, as described in Chapter 3, Collecting Performance Data.

Immediately below the panel is a Preview Command button, and a text field. When you click the button, the text field is filled in with the collect command that would be used when you click the Run button.

In the Data to Collect tab, you can select the types of data you want to collect.

The Input/Output tab has two panels: one that receives output from the Collector itself, and a second for output from the process.

A set of buttons allows the following operations:

Run the experiment
Terminate the run
Send Pause, Resume, and Sample signals to the process during the run (enabled if the corresponding signals are specified)
Close the window

If you close the window while an experiment is in progress, the experiment continues. If you reopen the window, it shows the experiment in progress, as if it had been left open during the run. If you attempt to exit the Analyzer while an experiment is in progress, a dialog box is posted asking whether you want the run terminated or allowed to continue.

Generating Mapfiles and Function Reordering

In addition to analyzing the data, the Analyzer also provides a function-reordering capability. Based on the data in an experiment, the Analyzer can generate a mapfile which, when used with the static linker (ld) to relink the application, creates an executable with a smaller working set size, or better I-cache behavior, or both.

The order of the functions that is recorded in the mapfile, and that is used to reorder the functions in the executable, is determined by the metric that is used for sorting the function list. Exclusive User CPU time or Exclusive CPU Cycle time is normally used for producing a mapfile. Sorting by some metrics, such as those from synchronization delay or heap tracing, or sorting by name or address, does not produce a meaningful ordering for a mapfile.
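The following Python sketch shows only the ordering step, under the assumption that functions are placed from highest to lowest value of the sort metric. It does not generate a mapfile or reproduce the Analyzer's logic; the function name mapfile_order, the excl_user_cpu key, and the sample values are hypothetical.

```python
def mapfile_order(functions, metric="excl_user_cpu"):
    """Return function names ordered by the chosen metric, largest first."""
    ranked = sorted(functions, key=lambda entry: entry[metric], reverse=True)
    return [entry["name"] for entry in ranked]


# Made-up exclusive user CPU times, in seconds.
functions = [
    {"name": "parse_input", "excl_user_cpu": 0.8},
    {"name": "compute", "excl_user_cpu": 12.4},
    {"name": "write_output", "excl_user_cpu": 2.1},
]
print(mapfile_order(functions))
# prints ['compute', 'write_output', 'parse_input']
```

Grouping the most heavily used functions together is what can reduce the working set size and improve I-cache behavior.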

Default Settings for Analyzer

The Analyzer processes directives from an .er.rc file in the current directory, if present; from a .er.rc file in your home directory, if present; and from a system-wide .er.rc file. These files can contain default settings for which tabs are visible when you load an experiment into the Analyzer. The tabs are named by the er_print command for the corresponding report except for the Experiments tab and the Timeline tab.

The .er.rc files can also contain default settings for metrics, for sorting, and for specifying compiler commentary options and highlighting thresholds for source and disassembly output. They also specify default settings for the Timeline tab, for name formatting, and for the View mode. The files can also contain directives to control the search path or pathmaps for source files and object files.

The .er.rc files can also contain definitions for custom MemoryObjects and IndexObjects.

The .er.rc files can also contain a setting for en_desc mode to control whether or not descendant experiments are selected and read when the founder experiment is read. The setting for en_desc may be on, off, or =regexp, to specify reading and loading all descendants, no descendants, or only those descendants whose lineage or executable name matches the given regular expression, respectively.
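The three en_desc settings can be summarized with a small Python sketch. It is an illustration under stated assumptions, not the tool's implementation: the function name select_descendants and the sample lineage and executable names are hypothetical, and the sketch assumes that the regular expression may match anywhere in the lineage or executable name.

```python
import re


def select_descendants(mode, descendants):
    """Return the descendants that an en_desc setting would load.

    mode:        "on", "off", or "=regexp".
    descendants: list of (lineage, executable_name) pairs.
    """
    if mode == "on":
        return list(descendants)
    if mode == "off":
        return []
    if mode.startswith("="):
        pattern = re.compile(mode[1:])
        # Assumption: a match anywhere in either string is enough.
        return [
            (lineage, executable)
            for lineage, executable in descendants
            if pattern.search(lineage) or pattern.search(executable)
        ]
    raise ValueError(f"unrecognized en_desc setting: {mode}")


# Made-up descendants for illustration.
descendants = [("_f1", "worker"), ("_f2", "logger")]
print(select_descendants("=worker", descendants))
# prints [('_f1', 'worker')]
```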

In the Analyzer GUI, you can save an .er.rc file by clicking the Save button in the Set Data Presentation dialog, which you can open from the View menu. Saving an .er.rc file from the Set Data Presentation dialog box not only affects subsequent invocations of the Analyzer, but also the er_print utility and er_src utility.