Sun Studio 12 Update 1: Performance Analyzer

Chapter 6 The er_print Command Line Performance Analysis Tool

This chapter explains how to use the er_print utility for performance analysis. The er_print utility prints an ASCII version of the various displays supported by the Performance Analyzer. The information is written to standard output unless you redirect it to a file. You must give the er_print utility the name of one or more experiments or experiment groups generated by the Collector as arguments. You can use the er_print utility to display the performance metrics for functions, for callers and callees; the source code listing and disassembly listing; sampling information; data-space data; and execution statistics.

This chapter covers the following topics.

For a description of the data collected by the Collector, see Chapter 2, Performance Data.

For instructions on how to use the Performance Analyzer to display information in a graphical format, see Chapter 4, The Performance Analyzer Tool and the online help.

er_print Syntax

The command-line syntax for the er_print utility is:


er_print [ -script script | -command | - | -V ] experiment-list

The options for the er_print utility are:

-

Read er_print commands entered from the keyboard.

-script script

Read commands from the file script, which contains a list of er_print commands, one per line. If the -script option is not present, er_print reads commands from the terminal or from the command line.

-command [argument]

Process the given command.

-V

Display version information and exit.

Multiple options can appear on the er_print command line. They are processed in the order they appear. You can mix scripts, hyphens, and explicit commands in any order. The default action if you do not supply any commands or scripts is to enter interactive mode, in which commands are entered from the keyboard. To exit interactive mode, type quit or press Ctrl-D.
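The following Python sketch shows one way to assemble an er_print command line of the form described above, with each command passed as -command followed by its optional argument and the experiment list at the end. The helper name build_er_print_argv and the experiment name test.1.er are illustrative assumptions, not part of er_print.

```python
from typing import List, Optional, Tuple


def build_er_print_argv(commands: List[Tuple[str, Optional[str]]],
                        experiments: List[str]) -> List[str]:
    """Build an argument vector: er_print -command [argument] ... experiment-list.

    build_er_print_argv is a hypothetical helper written for this example.
    """
    argv = ["er_print"]
    for name, argument in commands:
        argv.append("-" + name)       # each command is passed as -command
        if argument is not None:
            argv.append(argument)     # the optional argument follows its command
    argv.extend(experiments)          # experiment names come last
    return argv


# Commands are kept in the order given, because er_print processes them in order.
argv = build_er_print_argv(
    [("metrics", "i.user:e.user"), ("sort", "i.user"), ("functions", None)],
    ["test.1.er"],
)
print(" ".join(argv))
```

The resulting list could be handed to subprocess.run on a system where er_print is installed. Passing a list rather than a single string avoids shell quoting issues.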

After each command is processed, any error messages or warning messages arising from the processing are printed. You can print summary statistics on the processing with the procstats command.

The commands accepted by the er_print utility are listed in the following sections.

You can abbreviate any command with a shorter string as long as the command is unambiguous. You can split a command into multiple lines by terminating a line with a backslash, \. Any line that ends in \ will have the \ character removed, and the content of the next line appended before the line is parsed. There is no limit, other than available memory, on the number of lines you can use for a command.

You must enclose arguments that contain embedded blanks in double quotes. You can split the text inside the quotes across lines.
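The abbreviation and line-continuation rules can be modeled with a short Python sketch. This is an illustration of the rules as stated above, not er_print's actual parser. The function names, the small KNOWN command list, and the choice to let an exact match win over prefix matches are assumptions made for the example.

```python
def join_continuations(lines):
    """Join each line that ends in a backslash with the line that follows it."""
    joined, pending = [], ""
    for line in lines:
        if line.endswith("\\"):
            pending += line[:-1]      # drop the trailing backslash and keep reading
        else:
            joined.append(pending + line)
            pending = ""
    if pending:                       # input ended on a continuation line
        joined.append(pending)
    return joined


def resolve_abbreviation(word, commands):
    """Return the command named by `word`, accepting any unambiguous prefix."""
    if word in commands:              # assumption: an exact match always wins
        return word
    matches = [c for c in commands if c.startswith(word)]
    if len(matches) != 1:
        raise ValueError(f"unknown or ambiguous command: {word!r}")
    return matches[0]


# A subset of the commands described in this chapter, for illustration only.
KNOWN = ["functions", "fsummary", "fsingle", "metrics", "sort"]

script = ["metrics i.user:\\", "e.user", "functions"]
print(join_continuations(script))
print(resolve_abbreviation("fu", KNOWN))
```

With this command list, the prefix fs would be rejected because it matches both fsummary and fsingle.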

Metric Lists

Many of the er_print commands use a list of metric keywords. The syntax of the list is:


metric-keyword-1[:metric-keyword2…]

For dynamic metrics, those based on measured data, a metric keyword consists of three parts: a metric flavor string, a metric visibility string, and a metric name string. These are joined with no spaces, as follows.


flavorvisibilityname

For static metrics, those based on the static properties of the load objects in the experiment (name, address, and size), a metric keyword consists of a metric name, optionally preceded by a metric visibility string, joined with no spaces:


[visibility]name

The metric flavor and metric visibility strings are composed of flavor and visibility characters.

The allowed metric flavor characters are given in Table 6–1. A metric keyword that contains more than one flavor character is expanded into a list of metric keywords. For example, ie.user is expanded into i.user:e.user.

Table 6–1 Metric Flavor Characters

Character    Description
e            Show exclusive metric value
i            Show inclusive metric value
a            Show attributed metric value (for callers-callees metric only)
d            Show data space metric value (for data-derived metrics only)

The allowed metric visibility characters are given in Table 6–2. The order of the visibility characters in the visibility string does not matter: it does not affect the order in which the corresponding metrics are displayed. For example, both i%.user and i.%user are interpreted as i.user:i%user.

Metrics that differ only in the visibility are always displayed together in the standard order. If two metric keywords that differ only in the visibility are separated by some other keywords, the metrics appear in the standard order at the position of the first of the two metrics.

Table 6–2 Metric Visibility Characters

Character    Description
.            Show metric as a time. Applies to timing metrics and hardware counter metrics that measure cycle counts. Interpreted as “+” for other metrics.
%            Show metric as a percentage of the total program metric. For attributed metrics in the callers-callees list, show metric as a percentage of the inclusive metric for the selected function.
+            Show metric as an absolute value. For hardware counters, this value is the event count. Interpreted as a “.” for timing metrics.
!            Do not show any metric value. Cannot be used in combination with other visibility characters.

When both flavor and visibility strings have more than one character, the flavor is expanded first. Thus ie.%user is expanded to i.%user:e.%user, which is then interpreted as i.user:i%user:e.user:e%user.
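The expansion rule can be sketched in Python as follows. This is a model of the rule described above for dynamic metric keywords only, not the er_print implementation. The function name is hypothetical, the pattern accepts only simple alphanumeric metric names, and the position of “+” in the standard display order is an assumption (the text only establishes that “.” comes before “%”).

```python
import re

# "." before "%" follows the examples in the text; the slot for "+" is an assumption.
VISIBILITY_ORDER = ".+%"

# flavor characters, then visibility characters, then the metric name
_KEYWORD = re.compile(r"^([eiad]+)([.%+!]+)([A-Za-z_][A-Za-z0-9_]*)$")


def expand_metric_keyword(keyword):
    """Expand a dynamic metric keyword such as 'ie.%user' into single keywords."""
    match = _KEYWORD.match(keyword)
    if match is None:
        raise ValueError(f"not a dynamic metric keyword: {keyword!r}")
    flavors, visibility, name = match.groups()
    if "!" in visibility and len(visibility) > 1:
        raise ValueError("'!' cannot be combined with other visibility characters")
    # The order of visibility characters as typed does not matter.
    ordered = [v for v in VISIBILITY_ORDER if v in visibility] or ["!"]
    # Flavors are expanded first (outer loop), then visibility (inner loop).
    return [f + v + name for f in dict.fromkeys(flavors) for v in ordered]


print(":".join(expand_metric_keyword("ie.%user")))   # i.user:i%user:e.user:e%user
print(":".join(expand_metric_keyword("i%.user")))    # i.user:i%user
```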

For static metrics, the visibility characters period (.), plus (+), and percent sign (%), are equivalent for the purposes of defining the sort order. Thus sort i%user, sort i.user, and sort i+user all mean that the Analyzer should sort by inclusive user CPU time if it is visible in any form, and sort i!user means the Analyzer should sort by inclusive user CPU time, whether or not it is visible.

You can use the visibility character exclamation point (!) to override the built-in visibility defaults for each flavor of metric.

If the same metric appears multiple times in the metric list, only the first appearance is processed and subsequent appearances are ignored. If the named metric is not on the list, it is appended to the list.

Table 6–3 lists the available er_print metric name strings for timing metrics, synchronization delay metrics, memory allocation metrics, MPI tracing metrics, and the two common hardware counter metrics. For other hardware counter metrics, the metric name string is the same as the counter name. You can get a list of all the available metric name strings for the loaded experiments with the metric_list command. A list of counter names can be obtained by using the collect command with no arguments. See Hardware Counter Overflow Profiling Data for more information on hardware counters.

Table 6–3 Metric Name Strings

Category                            String        Description
Timing metrics                      user          User CPU time
                                    wall          Wall-clock time
                                    total         Total LWP time
                                    system        System CPU time
                                    wait          CPU wait time
                                    ulock         User lock time
                                    text          Text-page fault time
                                    data          Data-page fault time
                                    owait         Other wait time
Clock-based profiling metrics       mpiwork       Time spent inside the MPI runtime doing work, such as processing requests or messages
                                    mpiwait       Time spent inside the MPI runtime, but waiting for an event, buffer, or message
                                    ompwork       Time spent doing work either serially or in parallel
                                    ompwait       Time spent when OpenMP runtime is waiting for synchronization
Synchronization delay metrics       sync          Synchronization wait time
                                    syncn         Synchronization wait count
MPI tracing metrics                 mpitime       Time spent in MPI calls
                                    mpisend       Number of MPI send operations
                                    mpibytessent  Number of bytes sent in MPI send operations
                                    mpireceive    Number of MPI receive operations
                                    mpibytesrecv  Number of bytes received in MPI receive operations
                                    mpiother      Number of calls to other MPI functions
Memory allocation metrics           alloc         Number of allocations
                                    balloc        Bytes allocated
                                    leak          Number of leaks
                                    bleak         Bytes leaked
Hardware counter overflow metrics   cycles        CPU cycles
                                    insts         Instructions issued
Thread Analyzer metrics             raccesses     Data race accesses
                                    deadlocks     Deadlocks

In addition to the name strings listed in Table 6–3, two name strings can only be used in default metrics lists. These are hwc, which matches any hardware counter name, and any, which matches any metric name string. Also note that cycles and insts are common to SPARC® platforms and x86 platforms, but other flavors also exist that are architecture-specific. To list all available counters, use the collect command with no arguments.

To see the metrics available from the experiments you have loaded, use the metric_list command.

Commands That Control the Function List

The following commands control how the function information is displayed.

functions

Write the function list with the currently selected metrics. The function list includes all functions in load objects that are selected for display of functions, and any load objects whose functions are hidden with the object_select command.

You can limit the number of lines written by using the limit command (see Commands That Control Output).

The default metrics printed are exclusive and inclusive user CPU time, in both seconds and percentage of total program metric. You can change the current metrics displayed with the metrics command, which you must issue before you issue the functions command. You can also change the defaults with the dmetrics command in an .er.rc file.

For applications written in the Java programming language, the displayed function information varies depending on whether the View mode is set to user, expert, or machine.

In all three modes, data is reported in the usual way for any C, C++, or Fortran code called by a Java target.

metrics metric_spec

Specify a selection of function-list metrics. The string metric_spec can either be the keyword default, which restores the default metric selection, or a list of metric keywords, separated by colons. The following example illustrates a metric list.


% metrics i.user:i%user:e.user:e%user

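This command instructs the er_print utility to display the following metrics: inclusive user CPU time in seconds, inclusive user CPU time as a percentage, exclusive user CPU time in seconds, and exclusive user CPU time as a percentage.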

By default, the metric setting used is based on the dmetrics command, processed from .er.rc files, as described in Commands That Set Defaults. If a metrics command explicitly sets metric_spec to default, the default settings are restored as appropriate to the data recorded.

When metrics are reset, the default sort metric is set in the new list.

If metric_spec is omitted, the current metrics setting is displayed.

In addition to setting the metrics for the function list, the metrics command sets metrics for caller-callees and metrics for data-derived output to the same settings. See cmetrics metric_spec, data_metrics metric_spec, and indxobj_metrics metric_spec for further information.

When the metrics command is processed, a message is printed showing the current metric selection. For the preceding example the message is as follows.


current: i.user:i%user:e.user:e%user:name

For information on the syntax of metric lists, see Metric Lists. To see a listing of the available metrics, use the metric_list command.

If a metrics command has an error in it, it is ignored with a warning, and the previous settings remain in effect.

sort metric_spec

Sort the function list on metric_spec. The visibility in the metric name does not affect the sort order. If more than one metric is named in the metric_spec, use the first one that is visible. If none of the metrics named are visible, ignore the command. You can precede the metric_spec with a minus sign (-) to specify a reverse sort.

By default, the metric sort setting is based on the dsort command, processed from .er.rc files, as described in Commands That Set Defaults. If a sort command explicitly sets metric_spec to default, the default settings are used.

The string metric_spec is one of the metric keywords described in Metric Lists, as shown in this example.


% sort i.user

This command tells the er_print utility to sort the function list by inclusive user CPU time. If the metric is not in the experiments that have been loaded, a warning is printed and the command is ignored. When the command is finished, the sort metric is printed.

fsummary

Write a summary panel for each function in the function list. You can limit the number of panels written by using the limit command (see Commands That Control Output).

The summary metrics panel includes the name, address and size of the function or load object, and for functions, the name of the source file, object file and load object, and all the recorded metrics for the selected function or load object, both exclusive and inclusive, as values and percentages.

fsingle function_name [N]

Write a summary panel for the specified function. The optional parameter N is needed for those cases where there are several functions with the same name. The summary metrics panel is written for the Nth function with the given function name. When the command is given on the command line, N is required; if it is not needed it is ignored. When the command is given interactively without N but N is required, a list of functions with the corresponding N value is printed.

For a description of the summary metrics for a function, see the fsummary command description.

Commands That Control the Callers-Callees List

The following commands control how the caller and callee information is displayed.

callers-callees

Print the callers-callees panel for each of the functions, in the order specified by the function sort metric (sort).

Within each caller-callee report, the callers and callees are sorted by the caller-callee sort metrics (csort). You can limit the number of panels written by using the limit command (see Commands That Control Output). The selected (center) function is marked with an asterisk, as shown in this example.


Attr.     Excl.     Incl.      Name
User CPU  User CPU  User CPU
 sec.      sec.       sec.
4.440     0.        42.910     commandline
0.        0.         4.440    *gpf
4.080     0.         4.080     gpf_b
0.360     0.         0.360     gpf_a

In this example, gpf is the selected function; it is called by commandline, and it calls gpf_a and gpf_b.
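The numbers in this panel can be cross-checked with a little arithmetic. The following Python sketch assumes the usual relationship between the three flavors: the attributed metrics of the callers add up to the inclusive metric of the selected function, and the exclusive metric of the selected function plus the attributed metrics of its callees add up to the same value. The variable names are illustrative.

```python
import math

# Values copied from the callers-callees panel for the selected function gpf.
selected = {"name": "gpf", "excl": 0.0, "incl": 4.440}
callers = {"commandline": 4.440}              # attributed time, by caller
callees = {"gpf_b": 4.080, "gpf_a": 0.360}    # attributed time, by callee

from_callers = sum(callers.values())
to_callees = sum(callees.values())

# Time attributed to callers accounts for all of gpf's inclusive time.
callers_ok = math.isclose(from_callers, selected["incl"], abs_tol=1e-9)
# gpf's own (exclusive) time plus time attributed to callees does as well.
callees_ok = math.isclose(selected["excl"] + to_callees, selected["incl"], abs_tol=1e-9)

print(callers_ok, callees_ok)
```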

cmetrics metric_spec

Specify a selection of callers-callees metrics. By default, the caller-callee metrics are set to match the function list metrics whenever the function list metrics are changed. If metric_spec is omitted, the current caller-callee metrics setting is displayed.

The string metric_spec is one of the metric keywords described in Metric Lists, as shown in this example.


% cmetrics i.%user:a.%user

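This command instructs er_print to display the following metrics: inclusive user CPU time in seconds, inclusive user CPU time as a percentage, attributed user CPU time in seconds, and attributed user CPU time as a percentage.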

When the cmetrics command is finished, a message is printed showing the current metric selection. For the preceding example the message is as follows.


current: i.%user:a.%user:name

By default, the caller-callee metrics are set to match the function list metrics whenever the function list metrics are changed.

Caller-callee attributed metrics are inserted in front of the corresponding exclusive metric and inclusive metric, with visibility corresponding to the logical OR of the visibility settings for those two. Static metric settings are copied to the caller-callee metrics. If the metric-name is not in the list, it is appended to it.

A list of all the available metric-name values for the experiments loaded can be obtained with the cmetric_list command.

If a cmetrics command has an error in it, it is ignored with a warning, and the previous settings remain in effect.

csingle function_name [N]

Write the callers-callees panel for the named function. The optional parameter N is needed for those cases where there are several functions with the same name. The callers-callees panel is written for the Nth function with the given function name. When the command is given on the command line, N is required; if it is not needed it is ignored. When the command is given interactively without N but N is required, a list of functions with the corresponding N value is printed.

csort metric_spec

Sort the callers-callees display by the specified metric. The string metric_spec is one of the metric keywords described in Metric Lists, as shown in this example.


% csort a.user

If metric_spec is omitted, the current callers-callees sort metric is displayed.

The csort metric must be either an attributed metric, or a static metric. If multiple metrics are specified, sort by the first visible one that matches.

Whenever metrics are set, either explicitly or by default, the caller-callee sort metric is set based on the function metrics as follows:

The csort a.user command shown earlier in this description tells the er_print utility to sort the callers-callees display by attributed user CPU time. When the command finishes, the sort metric is printed.

Commands That Control the Leak and Allocation Lists

This section describes the commands that relate to memory allocations and deallocations.

leaks

Display a list of memory leaks, aggregated by common call stack. Each entry presents the total number of leaks and the total bytes leaked for the given call stack. The list is sorted by the number of bytes leaked.

allocs

Display a list of memory allocations, aggregated by common call stack. Each entry presents the number of allocations and the total bytes allocated for the given call stack. The list is sorted by the number of bytes allocated.
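The aggregation performed by the leaks and allocs commands can be pictured with a small Python sketch. It is a model only: the event tuples, the function name, and the choice of descending order by bytes are assumptions for illustration, and er_print reads its data from the experiment rather than from a list like this.

```python
from collections import defaultdict


def aggregate_by_stack(events):
    """Group (call_stack, nbytes) events by call stack.

    Returns rows of (call_stack, count, total_bytes), largest total first.
    """
    totals = defaultdict(lambda: [0, 0])      # call stack -> [count, bytes]
    for stack, nbytes in events:
        totals[stack][0] += 1
        totals[stack][1] += nbytes
    rows = [(stack, count, total) for stack, (count, total) in totals.items()]
    rows.sort(key=lambda row: row[2], reverse=True)   # assumed: descending by bytes
    return rows


# Hypothetical allocation events: (call stack as a tuple of function names, bytes).
events = [
    (("main", "load", "malloc"), 64),
    (("main", "parse", "malloc"), 512),
    (("main", "load", "malloc"), 64),
]

for stack, count, total in aggregate_by_stack(events):
    print(f"{count} allocation(s), {total} bytes: {' -> '.join(stack)}")
```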

Commands That Control the Source and Disassembly Listings

The following commands control how annotated source and disassembly code is displayed.

pcs

Write a list of program counters (PCs) and their metrics, ordered by the current sort metric. The list includes lines that show aggregated metrics for each load object whose functions are hidden with the object_select command.

psummary

Write the summary metrics panel for each PC in the PC list, in the order specified by the current sort metric.

lines

Write a list of source lines and their metrics, ordered by the current sort metric. The list includes lines that show aggregated metrics for each function that does not have line-number information, or whose source file is unknown, and lines that show aggregated metrics for each load object whose functions are hidden with the object_select command.

lsummary

Write the summary metrics panel for each line in the lines list, in the order specified by the current sort metric.

source { filename | function_name } [N]

Write out annotated source code for either the specified file or the file containing the specified function. The file in either case must be in a directory in your path. If the source was compiled with the GNU Fortran compiler, you must add two underscore characters after the function name as it appears in the source.

Use the optional parameter N (a positive integer) only in those cases where the file or function name is ambiguous; in this case, the Nth possible choice is used. If you give an ambiguous name without the numeric specifier the er_print utility prints a list of possible object-file names; if the name you gave was a function, the name of the function is appended to the object-file name, and the number that represents the value of N for that object file is also printed.

The function name can also be specified as function"file", where file is used to specify an alternate source context for the function. Immediately following the first instruction, an index line is added for the function. Index lines are displayed as text within angle brackets in the following form:

<Function: f_name>

The default source context for any function is defined as the source file to which the first instruction in that function is attributed. It is normally the source file compiled to produce the object module containing the function. Alternate source contexts consist of other files that contain instructions attributed to the function. Such contexts include instructions coming from include files and instructions from functions inlined into the named function. If there are any alternate source contexts, a list of extended index lines is included at the beginning of the default source context to indicate where the alternate source contexts are located, in the following form:

<Function: f, instructions from source file src.h>


Note –

If you use the -source argument when invoking the er_print utility on the command line, a backslash escape character must precede each of the quotes around the file name. In other words, the function name is of the form function\"file\". The backslash is not required, and should not be used, when the er_print utility is in interactive mode.


Normally when the default source context is used, metrics are shown for all functions from that file. Referring to the file explicitly shows metrics only for the named function.

disasm { filename | function_name } [N]

Write out annotated disassembly code for either the specified file, or the file containing the specified function. The file must be in a directory in your path.

The optional parameter N is used in the same way as for the source command.

scc com_spec

Specify the classes of compiler commentary that are shown in the annotated source listing. The class list is a colon-separated list of classes, containing zero or more of the following message classes.

Table 6–4 Compiler Commentary Message Classes

Class        Meaning
b[asic]      Show the basic level messages.
v[ersion]    Show version messages, including source file name and last modified date, versions of the compiler components, compilation date and options.
pa[rallel]   Show messages about parallelization.
q[uery]      Show questions about the code that affect its optimization.
l[oop]       Show messages about loop optimizations and transformations.
pi[pe]       Show messages about pipelining of loops.
i[nline]     Show messages about inlining of functions.
m[emops]     Show messages about memory operations, such as load, store, prefetch.
f[e]         Show front-end messages.
co[degen]    Show code generator messages.
cf           Show compiler flags at the bottom of the source.
all          Show all messages.
none         Do not show any messages.

The classes all and none cannot be used with other classes.

If no scc command is given, the default class shown is basic. If the scc command is given with an empty class-list, compiler commentary is turned off. The scc command is normally used only in an .er.rc file.

sthresh value

Specify the threshold percentage for highlighting metrics in the annotated source code. If the value of any metric is equal to or greater than value % of the maximum value of that metric for any source line in the file, the line on which the metric occurs has ## inserted at the beginning of the line.
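The threshold rule can be expressed as a short Python sketch. It models the rule as stated, using made-up per-line values, and is not the er_print implementation. Skipping metrics whose maximum is zero is an assumption added so that a file with no data is not highlighted everywhere.

```python
def highlight_lines(line_metrics, threshold_percent):
    """Return one boolean per source line: True if the line would get '##'.

    line_metrics is a list of dicts (metric name -> value), one per line.
    """
    # Find the maximum of each metric over all lines in the file.
    maxima = {}
    for metrics in line_metrics:
        for name, value in metrics.items():
            maxima[name] = max(maxima.get(name, 0.0), value)

    flags = []
    for metrics in line_metrics:
        hot = any(
            maxima[name] > 0 and value >= maxima[name] * threshold_percent / 100.0
            for name, value in metrics.items()
        )
        flags.append(hot)
    return flags


# Hypothetical exclusive user CPU times (seconds) for four source lines.
lines = [{"user": 0.1}, {"user": 4.0}, {"user": 3.2}, {"user": 0.0}]
for metrics, hot in zip(lines, highlight_lines(lines, 75)):
    print(("## " if hot else "   ") + f"{metrics['user']:.3f}")
```

With a threshold of 75, the cutoff for this data is 3.0 seconds, so the second and third lines are marked.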

dcc com_spec

Specify the classes of compiler commentary that are shown in the annotated disassembly listing. The class list is a colon-separated list of classes. The list of available classes is the same as the list of classes for annotated source code listing shown in Table 6–4. You can add the following options to the class list.

Table 6–5 Additional Options for the dcc Command

Option     Meaning
h[ex]      Show the hexadecimal value of the instructions.
noh[ex]    Do not show the hexadecimal value of the instructions.
s[rc]      Interleave the source listing in the annotated disassembly listing.
nos[rc]    Do not interleave the source listing in the annotated disassembly listing.
as[rc]     Interleave the annotated source code in the annotated disassembly listing.

dthresh value

Specify the threshold percentage for highlighting metrics in the annotated disassembly code. If the value of any metric is equal to or greater than value % of the maximum value of that metric for any instruction line in the file, the line on which the metric occurs has ## inserted at the beginning of the line.

cc com_spec

Specify the classes of compiler commentary that are shown in the annotated source and disassembly listing. The class list is a colon-separated list of classes. The list of available classes is the same as the list of classes for annotated source code listing shown in Table 6–4.

setpath path_list

Set the path used to find source and object files. path_list is a colon-separated list of directories. If any directory has a colon character in it, escape it with a backslash. The special directory name, $expts, refers to the set of current experiments, in the order in which they were loaded; you can abbreviate it with a single $ character.

The default setting is $expts:. (the loaded experiments, followed by the current working directory). The compiled-in full pathname is used if a file is not found in searching the current path setting.

setpath with no argument prints the current path.
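As an illustration of the path_list format, the following Python sketch splits a colon-separated list while honoring backslash-escaped colons. The function name is hypothetical, the example directories are made up, and the sketch does not attempt to handle an escaped backslash that precedes a colon.

```python
import re


def split_path_list(path_list):
    """Split a colon-separated path list, treating '\\:' as a literal colon."""
    parts = re.split(r"(?<!\\):", path_list)     # split only on unescaped colons
    return [part.replace("\\:", ":") for part in parts]


print(split_path_list("$expts:."))               # the default setting
print(split_path_list("/src/a\\:b:/obj"))        # first directory contains a colon
```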

addpath path_list

Append path_list to the current setpath settings.

pathmap old-prefix new-prefix

If a file cannot be found using the path_list set by addpath or setpath, you can specify one or more path remappings with the pathmap command. In any pathname for a source file, object file, or shared object that begins with the prefix specified with old-prefix, the old prefix is replaced by the prefix specified with new-prefix. The resulting path is then used to find the file. Multiple pathmap commands can be supplied, and each is tried until the file is found.
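The prefix replacement can be sketched in Python as follows. The function name, the example prefixes, and the injectable exists check are assumptions for illustration. The sketch tries each mapping in the order given and stops at the first remapped path that exists.

```python
import os


def apply_pathmaps(path, pathmaps, exists=os.path.exists):
    """Try each (old_prefix, new_prefix) mapping in turn.

    Returns the first remapped path for which exists() is true, or None.
    """
    for old_prefix, new_prefix in pathmaps:
        if path.startswith(old_prefix):
            candidate = new_prefix + path[len(old_prefix):]
            if exists(candidate):
                return candidate
    return None


# Hypothetical mappings and a stand-in for the file system.
pathmaps = [("/net/build", "/home/me/src"), ("/net", "/mnt/net")]
present = {"/mnt/net/build/app/main.c"}

print(apply_pathmaps("/net/build/app/main.c", pathmaps, present.__contains__))
```

Here the first mapping produces /home/me/src/app/main.c, which is not present, so the second mapping is tried and succeeds.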

Commands That Control the Data Space List

Data space commands are applicable only to hardware counter experiments where aggressive backtracking was specified, and for objects in files that were compiled with the -xhwcprof option, which is available on SPARC platforms. See the Sun Studio 12 Update 1: Fortran User’s Guide, Sun Studio 12 Update 1: C User’s Guide, or the Sun Studio 12 Update 1: C++ User’s Guide for further information.

data_objects

Write the list of data objects with their metrics.

data_single name [N]

Write the summary metrics panel for the named data object. The optional parameter N is needed for those cases where the object name is ambiguous. When the directive is on the command line, N is required; if it is not needed, it is ignored.

data_layout

Write the annotated data object layouts for all program data objects with data-derived metric data, sorted by the current data sort metric values for the structures as a whole. Each aggregate data object is shown with the total metrics attributed to it, followed by all of its elements in offset order, each with its own metrics and an indicator of its size and location relative to 32-byte blocks.

data_metrics metric_spec

Set the data-derived metrics. The metric_spec is defined in Metric Lists.

By default, the data-derived metrics are set to match the function list metrics whenever the function list metrics are changed. The data-derived metrics corresponding to any visible exclusive metric or inclusive metric that has a data-derived flavor are set with visibility corresponding to the logical OR of the visibility settings for those two.

Static metric settings are copied to the data-derived metrics. If the metric name is not in the list, the metric name is appended to the list.

If metric_spec is omitted, the current data-derived metrics setting is displayed.

A list of all the available metric-name values for the experiments loaded can be obtained with the data_metric_list command.

If the metric_spec has any errors, it is ignored, and the data-derived metrics are left unchanged.

data_sort

Set the sort metric for data objects. The prefix d is needed for dynamic metrics, but may be omitted for static metrics. The data_sort metric must be either a data-derived metric or a static metric.

If multiple metrics are specified, sort by the first visible one that matches. Whenever metrics are set, either explicitly or by default, set the data-derived sort metric based on the function metrics:

Commands That Control Memory Object Lists

Memory object commands are applicable only to hardware counter experiments where aggressive backtracking was specified, and for objects in files that were compiled with the -xhwcprof option, which is available on SPARC platforms. See the Sun Studio 12 Update 1: Fortran User’s Guide, the Sun Studio 12 Update 1: C User’s Guide, or the Sun Studio 12 Update 1: C++ User’s Guide for further information.

Memory objects are components in the memory subsystem, such as cache lines, pages, and memory banks. The object is determined from an index computed from the virtual or physical address as recorded. Memory objects are predefined for virtual and physical pages, for sizes of 8 KB, 64 KB, 512 KB, and 4 MB. You can define others with the mobj_define command.

The following commands control the memory object lists.

memobj mobj_type

Write the list of the memory objects of the given type with the current metrics. The metrics used and the sort order are the same as for the data space list. You can also use the name mobj_type directly as the command.

mobj_list

Write the list of known types of memory objects, as used for mobj_type in the memobj command.

mobj_define mobj_type index_exp

Define a new type of memory objects with a mapping of VA/PA to the object given by the index_exp. The syntax of the expression is described in Expression Grammar.

The mobj_type must not already be defined. Its name must be entirely composed of alphanumeric characters or the '_' character, and begin with an alphabetic character.

The index_exp must be syntactically correct. If it is not syntactically correct, an error is returned and the definition is ignored.

The <Unknown> memory object has an index of -1, and the expression used to define a new memory object should support recognizing <Unknown>. For example, for VADDR-based objects, the expression should be of the following form:

VADDR>255?expression:-1

and for PADDR-based objects, the expression should be of the following form:

PADDR>0?expression:-1
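To make the guard pattern concrete, the following Python sketch models a VADDR-based index expression of the form shown above. The shift by 13 bits, which maps an address to an 8 KB page number, is a hypothetical stand-in for expression and is not taken from the er_print expression grammar. See Expression Grammar for the operators that er_print actually accepts.

```python
PAGE_SHIFT = 13   # 2**13 = 8192 bytes, an 8 KB page (illustrative choice)


def vaddr_page_index(vaddr):
    """Model of an index expression shaped like VADDR>255?expression:-1.

    Addresses of 255 or below map to -1, the index of the <Unknown> object.
    """
    return vaddr >> PAGE_SHIFT if vaddr > 255 else -1


for address in (0, 255, 256, 8192, 0x4000):
    print(hex(address), vaddr_page_index(address))
```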

Commands That Control Index Object Lists

Index object commands are applicable to all experiments. An index object list is a list of objects for which an index can be computed from the recorded data. Index objects are predefined for Threads, Cpus, Samples, and Seconds. You can define other index objects with the indxobj_define command.

The following commands control the index-object lists.

indxobj indxobj_type

Write the list of the index objects that match the given type, along with their metrics. Metrics and sorting for index objects are the same as those for the function list, but contain exclusive metrics only. The name indxobj_type can also be used directly as the command.

indxobj_list

Write the list of known types of index objects, as used for indxobj_type in the indxobj command.

indxobj_define indxobj_type index_exp

Define a new type of index object with a mapping of packets to the object given by the index_exp. The syntax of the expression is described in Expression Grammar.

The indxobj_type must not already be defined. Its name is case-insensitive, must be entirely composed of alphanumeric characters or the '_' character, and begin with an alphabetic character.

The index_exp must be syntactically correct, or an error is returned and the definition is ignored. If the index_exp contains any blanks, it must be surrounded by double quotes (").

The <Unknown> index object has an index of -1, and the expression used to define a new index object should support recognizing <Unknown>.

For example, for index objects based on virtual or physical PC, the expression should be of the following form:


VIRTPC>0?VIRTPC:-1

indxobj_metrics metric_spec

Specify a selection of metrics for index objects. The metric_spec may only contain exclusive metrics and static metrics, since index objects are not hierarchical.

For information on the syntax of metric lists, see Metric Lists. To see a listing of the available metrics, use the metric_list command.

indxobj_sort metric_spec

Sort the index object lists by the specified metric. The indxobj_sort metric must be either an exclusive metric or a static metric. If multiple metrics are specified, sort is done according to the first visible one that matches.

Commands for the OpenMP Index Objects

The following commands let you print information for OpenMP index objects.

OMP_preg

Print a list of the OpenMP parallel regions executed in the experiment with their metrics. This command is available only for experiments with OpenMP 3.0 performance data.

OMP_task

Print a list of the OpenMP tasks executed in the experiment with their metrics. This command is available only for experiments with OpenMP 3.0 performance data.

Commands That Support the Thread Analyzer

The following commands are in support of the Thread Analyzer. See the Sun Studio 12: Thread Analyzer User’s Guide for more information about the data captured and shown.

races

Writes a list of all data races in the experiments. Data-race reports are available only from experiments with data-race-detection data.

rdetail race_id

Writes the detailed information for the given race_id. If the race_id is set to all, detailed information for all data races is shown. Data-race reports are available only from experiments with data-race-detection data.

deadlocks

Writes a list of all detected real and potential deadlocks in the experiments. Deadlock reports are available only from experiments with deadlock-detection data.

ddetail deadlock_id

Writes the detailed information for the given deadlock_id. If the deadlock_id is set to all, detailed information for all deadlocks is shown. Deadlock reports are available only from experiments with deadlock-detection data.

Commands That List Experiments, Samples, Threads, and LWPs

This section describes the commands that list experiments, samples, threads, and LWPs.

experiment_list

Display the full list of experiments loaded with their ID number. Each experiment is listed with an index, which is used when selecting samples, threads, or LWPs, and a PID, which can be used for advanced filtering.

The following example shows an experiment list.


(er_print) experiment_list
ID Experiment
== ==========
1 test.1.er
2 test.6.er

sample_list

Display the list of samples currently selected for analysis.

The following example shows a sample list.


(er_print) sample_list
Exp Sel     Total
=== ======= =====
  1 1-6        31
  2 7-10,15    31

lwp_list

Display the list of LWPs currently selected for analysis.

thread_list

Display the list of threads currently selected for analysis.

cpu_list

Display the list of CPUs currently selected for analysis.

Commands That Control Filtering of Experiment Data

You can specify filtering of experiment data in two ways: by specifying a filter expression, or by selecting samples, threads, LWPs, and CPUs for filtering.

Specifying a Filter Expression

You can specify a filter expression with the filters command.

filters filter_exp

filter_exp is an expression that evaluates as true for any data record that should be included, and false for records that should not be included. The grammar of the expression is described in Expression Grammar.

Selecting Samples, Threads, LWPs, and CPUs for Filtering

Selection Lists

The syntax of a selection is shown in the following example. This syntax is used in the command descriptions.


[experiment-list:]selection-list[+[experiment-list:]selection-list … ]

Each selection list can be preceded by an experiment list, separated from it by a colon and no spaces. You can make multiple selections by joining selection lists with a + sign.

The experiment list and the selection list have the same syntax, which is either the keyword all or a list of numbers or ranges of numbers (n-m) separated by commas but no spaces, as shown in this example.


2,4,9-11,23-32,38,40

The experiment numbers can be determined by using the experiment_list command.

Some examples of selections are as follows.


1:1-4+2:5,6
all:1,3-6

In the first example, objects 1 through 4 are selected from experiment 1 and objects 5 and 6 are selected from experiment 2. In the second example, objects 1 and 3 through 6 are selected from all experiments. The objects may be LWPs, threads, or samples.
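
The selection syntax can be made concrete with a small parser. The Python sketch below is an illustration of the grammar described above, not er_print's own implementation; the function names are hypothetical. When a selection has no experiment list, the sketch returns None for it, because this section does not state which experiments such a selection applies to.

```python
def parse_number_list(text):
    """Parse the keyword 'all' or a comma-separated list of numbers and n-m ranges."""
    if text == "all":
        return "all"
    numbers = []
    for item in text.split(","):
        if "-" in item:
            low, high = item.split("-")
            numbers.extend(range(int(low), int(high) + 1))  # ranges are inclusive
        else:
            numbers.append(int(item))
    return numbers


def parse_selection(spec):
    """Split a selection into (experiment-list, selection-list) pairs joined by '+'."""
    pairs = []
    for part in spec.split("+"):
        if ":" in part:
            experiments, selection = part.split(":")
            pairs.append((parse_number_list(experiments), parse_number_list(selection)))
        else:
            # No experiment list given; left unspecified in this sketch.
            pairs.append((None, parse_number_list(part)))
    return pairs


print(parse_selection("1:1-4+2:5,6"))
print(parse_selection("all:1,3-6"))
```

For the first selection the parser yields objects 1 through 4 for experiment 1 and objects 5 and 6 for experiment 2, matching the description above.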

Selection Commands

The commands to select LWPs, samples, CPUs, and threads are not independent. If the experiment list for a command is different from that for the previous command, the experiment list from the latest command is applied to all three selection targets, LWPs, samples, and threads, in the following way.

sample_select sample_spec

Select the samples for which you want to display information. The list of samples you selected is displayed when the command finishes.

lwp_select lwp_spec

Select the LWPs about which you want to display information. The list of LWPs you selected is displayed when the command finishes.

thread_select thread_spec

Select the threads about which you want to display information. The list of threads you selected is displayed when the command finishes.

cpu_select cpu_spec

Select the CPUs about which you want to display information. The list of CPUs you selected is displayed when the command finishes.

Commands That Control Load Object Expansion and Collapse

object_list

Display a two-column list showing the status and names of all load objects. The show/hide/api status of each load object is shown in the first column, and the name of the object is shown in the second column. The name of each load object is preceded either by a show that indicates that the functions of that object are shown in the function list (expanded), by a hide that indicates that the functions of that object are not shown in the function list (collapsed), or by API-only if only those functions representing the entry point into the load object are shown. All functions for a collapsed load object map to a single entry in the function list representing the entire load object.

The following is an example of a load object list.


(er_print) object_list
Sel  Load Object
==== ==================
hide <Unknown>
show <Freeway>
show <libCstd_isa.so.1>
show <libnsl.so.1>
show <libmp.so.2>
show <libc.so.1>
show <libICE.so.6>
show <libSM.so.6>
show <libm.so.1>
show <libCstd.so.1>
show <libX11.so.4>
show <libXext.so.0>
show <libCrun.so.1>
show <libXt.so.4>
show <libXm.so.4>
show <libsocket.so.1>
show <libgen.so.1>
show <libcollector.so>
show <libc_psr.so.1>
show <ld.so.1>
show <liblayout.so.1>

object_show object1,object2,...

Set all named load objects to show all their functions. The names of the objects can be either full path names or the basename. If the name contains a comma character, the name must be surrounded by double quotation marks. If the string “all” is used to name the load object, functions are shown for all load objects.

object_hide object1,object2,...

Set all named load objects to hide all their functions. The names of the objects can be either full path names or the basename. If the name contains a comma character, the name must be surrounded by double quotation marks. If the string “all” is used to name the load object, functions are hidden for all load objects.

object_api object1,object2,...

Set all named load objects to show only the functions representing entry points into the library. The names of the objects can be either full path names or the basename. If the name contains a comma character, the name must be surrounded by double quotation marks. If the string “all” is used to name the load object, only the entry-point functions are shown for all load objects.

objects_default

Set all load objects according to the initial defaults from .er.rc file processing.

object_select object1,object2,...

Select the load objects for which you want to display information about the functions in the load object. Functions from all named load objects are shown; functions from all others are hidden. The object list is a list of load objects, separated by commas but no spaces. If functions from a load object are shown, all functions that have non-zero metrics are shown in the function list. If functions from a load object are hidden, they are collapsed, and only a single line with metrics for the entire load object is displayed instead of its individual functions.

The names of the load objects should be either full path names or the basename. If an object name itself contains a comma, you must surround the name with double quotation marks.
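
The quoting rule for object names applies to object_show, object_hide, object_api, and object_select alike. The Python sketch below builds such a command line; the helper object_command and the load object name odd,name.so are hypothetical and serve only to illustrate the rule.

```python
# Commands from this section that take a comma-separated load object list.
OBJECT_COMMANDS = {"object_show", "object_hide", "object_api", "object_select"}


def object_command(command, names):
    """Build a load-object command line (hypothetical helper, not part of er_print)."""
    if command not in OBJECT_COMMANDS:
        raise ValueError(f"not a load-object command: {command}")
    # A name that itself contains a comma must be surrounded by double quotation marks.
    quoted = ['"' + name + '"' if "," in name else name for name in names]
    # Names are separated by commas with no spaces.
    return command + " " + ",".join(quoted)


print(object_command("object_hide", ["libc.so.1", "libm.so.1"]))
print(object_command("object_show", ["libc.so.1", "odd,name.so"]))
```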

Commands That List Metrics

The following commands list the currently selected metrics and all available metric keywords.

metric_list

Display the currently selected metrics in the function list and a list of metric keywords that you can use in other commands (for example, metrics and sort) to reference various types of metrics in the function list.

cmetric_list

Display the currently selected metrics in the callers-callees list and a list of metric keywords that you can use in other commands (for example, cmetrics and csort) to reference various types of metrics in the callers-callees list.


Note –

Attributed metrics can be specified for display only with the cmetrics command, not the metrics command or the data_metrics command, and displayed only with the callers-callees command, not the functions command or data_objects command.


data_metric_list

Display the currently selected data-derived metrics and a list of metrics and keyword names for all data-derived reports. Display the list in the same way as the output for the metric_list command, but include only those metrics that have a data-derived flavor and static metrics.


Note –

Data-derived metrics can be specified for display only with the data_metrics command, not the metrics command or the cmetrics command, and displayed only with the data_objects command, not the functions command or callers-callees command.


indx_metric_list

Display the currently selected index-object metrics. Display the list in the same way as the metric_list command, but include only those metrics that have an exclusive flavor, and static metrics.

Commands That Control Output

The following commands control er_print display output.

outfile { filename | - | -- }

Close any open output file, then open filename for subsequent output. When opening filename, clear any pre-existing content. If you specify a dash (-) instead of filename, output is written to standard output. If you specify two dashes (--) instead of filename, output is written to standard error.

appendfile filename

Close any open output file and open filename, preserving any pre-existing content, so that subsequent output is appended to the end of the file. If filename does not exist, the functionality of the appendfile command is the same as for the outfile command.

limit n

Limit output to the first n entries of the report; n is an unsigned positive integer.

name { long | short } [ :{ shared_object_name | no_shared_object_name } ]

Specify whether to use the long or the short form of function names (C++ and Java only). If shared_object_name is specified, append the shared-object name to the function name.

viewmode { user | expert | machine }

Set the mode to one of the following:

user

For Java experiments, show the Java call stacks for Java threads, and do not show housekeeping threads. The function list includes a function <JVM-System> representing aggregated time from non-Java threads. When the JVM software does not report a Java call stack, time is reported against the function <no Java callstack recorded>.

For OpenMP experiments, show reconstructed call stacks similar to those obtained when the program is compiled without OpenMP. Add special functions, with the names of form <OMP-*>, when the OpenMP runtime is performing certain operations.

expert

For Java experiments, show the Java call stacks for Java threads when the user’s Java code is being executed, and machine call stacks when JVM code is being executed or when the JVM software does not report a Java call stack. Show the machine call stacks for housekeeping threads.

For OpenMP experiments, show compiler generated functions representing parallelized loops, tasks, and such, which are aggregated with user functions in user mode. Add special functions, with the names of form <OMP-*>, when the OpenMP runtime is performing certain operations.

machine

For Java experiments and OpenMP experiments, show the machine call stacks for all threads.

For all experiments other than Java experiments and OpenMP experiments, all three modes show the same data.

Commands That Print Other Information

header exp_id

Display descriptive information about the specified experiment. The exp_id can be obtained from the experiment_list command. If the exp_id is all or is not given, the information is displayed for all experiments loaded.

Following each header, any errors or warnings are printed. Headers for each experiment are separated by a line of dashes.

If the experiment directory contains a file named notes, the contents of that file are prepended to the header information. A notes file may be manually added or edited, or specified with -C "comment" arguments to the collect command.

exp_id is required on the command line, but not in a script or in interactive mode.

ifreq

Write a list of instruction frequency from the measured count data. The instruction frequency report can only be generated from count data. This command applies only on SPARC processors running the Solaris OS.

objects

List the load objects with any error or warning messages that result from the use of the load object for performance analysis. The number of load objects listed can be limited by using the limit command (see Commands That Control Output).

overview exp_id

Write out the sample data of each of the currently selected samples for the specified experiment. The exp_id can be obtained from the experiment_list command. If the exp_id is all or is not given, the sample data is displayed for all experiments. exp_id is required on the command line, but not in a script or in interactive mode.

statistics exp_id

Write out execution statistics, aggregated over the current sample set for the specified experiment. For information on the definitions and meanings of the execution statistics that are presented, see the getrusage(3C) and proc(4) man pages. The execution statistics include statistics from system threads for which the Collector does not collect any data.

The exp_id can be obtained from the experiment_list command. If the exp_id is not given, the sum of data for all experiments is displayed, aggregated over the sample set for each experiment. If exp_id is all, the sum and the individual statistics for each experiment are displayed.

Commands That Set Defaults

You can use the following commands to set the defaults for er_print and for the Performance Analyzer. You can use these commands only for setting defaults: they cannot be used as input for the er_print utility. They can be included in a defaults file named .er.rc. Commands that apply only to defaults for the Performance Analyzer are described in Commands That Set Defaults Only For the Performance Analyzer.

You can include a defaults file in your home directory to set defaults for all experiments, or in any other directory to set defaults locally. When the er_print utility, the er_src utility, or the Performance Analyzer is started, the current directory and your home directory are scanned for defaults files, which are read if they are present, and the system defaults file is also read. Defaults from the .er.rc file in your home directory override the system defaults, and defaults from the .er.rc file in the current directory override both home and system defaults.


Note –

To ensure that you read the defaults file from the directory where your experiment is stored, you must start the Performance Analyzer or the er_print utility from that directory.


The defaults file can also include the scc, sthresh, dcc, dthresh, cc, setpath, addpath, pathmap, name, mobj_define, object_show, object_hide, object_api, indxobj_define, tabs, rtabs, and viewmode commands. You can include multiple dmetrics, dsort, addpath, pathmap, mobj_define, and indxobj_define commands in a defaults file, and the commands from all .er.rc files are concatenated. For all other commands, the first appearance of the command is used and subsequent appearances are ignored.
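
The precedence rules for defaults files can be modeled in a few lines. The Python sketch below is a simplified model, not the tools' actual implementation. It assumes the files are processed in precedence order (current directory, then home directory, then system), which is consistent with the "first appearance is used" rule but is not stated explicitly here. The function name, the directive values, and the paths are hypothetical.

```python
# Commands that may appear multiple times; per the text, their occurrences
# from all .er.rc files are concatenated.
ACCUMULATING = {"dmetrics", "dsort", "addpath", "pathmap", "mobj_define", "indxobj_define"}


def resolve_defaults(files):
    """Resolve directives from several defaults files.

    files: lists of (command, argument) pairs, highest-precedence file first
    (an assumed processing order).
    """
    resolved = {}
    for directives in files:
        for command, argument in directives:
            if command in ACCUMULATING:
                resolved.setdefault(command, []).append(argument)
            elif command not in resolved:
                # First appearance wins; later appearances are ignored.
                resolved[command] = argument
    return resolved


# Hypothetical contents of three defaults files.
current_dir = [("viewmode", "expert"), ("addpath", "/work/src")]
home_dir = [("viewmode", "user"), ("addpath", "/home/me/src"), ("name", "short")]
system = [("viewmode", "user"), ("name", "long")]

settings = resolve_defaults([current_dir, home_dir, system])
print(settings)
```

In this model the current directory's viewmode setting overrides the others, the home directory's name setting overrides the system one, and both addpath entries are kept.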

dmetrics metric_spec

Specify the default metrics to be displayed or printed in the function list. The syntax and use of the metric list is described in the section Metric Lists. The order of the metric keywords in the list determines the order in which the metrics are presented and the order in which they appear in the Metric chooser in the Performance Analyzer.

Default metrics for the Callers-Callees list are derived from the function list default metrics by adding the corresponding attributed metric before the first occurrence of each metric name in the list.

dsort metric_spec

Specify the default metric by which the function list is sorted. The sort metric is the first metric in this list that matches a metric in any loaded experiment, subject to the following conditions:

The syntax and use of the metric list is described in the section Metric Lists.

The default sort metric for the Callers-Callees list is the attributed metric corresponding to the default sort metric for the function list.

en_desc { on | off | =regexp}

Set the mode for reading descendant experiments to on (enable all descendants) or off (disable all descendants). If the =regexp is used, enable data from those experiments whose lineage or executable name matches the regular expression.

Commands That Set Defaults Only For the Performance Analyzer

tabs tab_spec

Set the default set of tabs to be visible in the Analyzer. The tabs are named by the er_print command that generates the corresponding reports, including mobj_type for MemoryObject tabs or indxobj_type for IndexObject tabs. mpi_timeline specifies the MPI Timeline tab, mpi_chart specifies the MPI Chart tab, timeline specifies the Timeline tab, and headers specifies the Experiments tab.

Only those tabs that are supported by the data in the loaded experiments are shown.

rtabs tab_spec

Set the default set of tabs to be visible when the Analyzer is invoked with the tha command, for examining Thread Analyzer experiments. Only those tabs that are supported by the data in the loaded experiments are shown.

tlmode tl_mode

Set the display mode options for the Timeline tab of the Performance Analyzer. The list of options is a colon-separated list. The allowed options are described in the following table.

Table 6–6 Timeline Display Mode Options

Option     Meaning
========== =============================================================
lw[p]      Display events for LWPs
t[hread]   Display events for threads
c[pu]      Display events for CPUs
r[oot]     Align call stack at the root
le[af]     Align call stack at the leaf
d[epth] nn Set the maximum depth of the call stack that can be displayed

The options lwp, thread, and cpu are mutually exclusive, as are root and leaf. If more than one of a set of mutually exclusive options is included in the list, the last one is the only one that is used.
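
The "last one wins" rule for mutually exclusive options can be sketched as follows. This Python model is illustrative only: it assumes that any prefix at least as long as the minimum abbreviation in Table 6–6 is accepted, it omits the depth option, and its function names and group labels are hypothetical.

```python
# (full keyword, minimum abbreviation length, exclusive group), from Table 6-6.
OPTIONS = [
    ("lwp", 2, "entity"),
    ("thread", 1, "entity"),
    ("cpu", 1, "entity"),
    ("root", 1, "align"),
    ("leaf", 2, "align"),
]


def match_option(token):
    """Map an abbreviated option to its full keyword and exclusive group."""
    for keyword, min_len, group in OPTIONS:
        if len(token) >= min_len and keyword.startswith(token):
            return keyword, group
    raise ValueError(f"unknown or ambiguous tlmode option: {token!r}")


def resolve_tlmode(spec):
    """Resolve a colon-separated option list; the last option in each group wins."""
    chosen = {}
    for token in spec.split(":"):
        keyword, group = match_option(token)
        chosen[group] = keyword  # a later option replaces an earlier one in its group
    return chosen


print(resolve_tlmode("lwp:root:thread:leaf"))
```

Here thread replaces lwp and leaf replaces root, because each appears later in the list than the option it conflicts with.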

tldata tl_data

Select the default data types shown in the Timeline tab of the Performance Analyzer. The types in the type list are separated by colons. The allowed types are listed in the following table.

Table 6–7 Timeline Display Data Types

Type        Meaning
=========== ===========================================
sa[mple]    Display sample data
c[lock]     Display clock profiling data
hw[c]       Display hardware counter profiling data
sy[nctrace] Display thread synchronization tracing data
mp[itrace]  Display MPI tracing data
he[aptrace] Display heap tracing data

Miscellaneous Commands

mapfile load-object { mapfilename | - }

Write a mapfile for the specified load object to the file mapfilename. If you specify a dash (-) instead of mapfilename, er_print writes the mapfile to standard output.

procstats

Print the accumulated statistics from processing data.

script file

Process additional commands from the script file file.

version

Print the current release number of the er_print utility.

quit

Terminate processing of the current script, or exit interactive mode.

help

Print a list of er_print commands.

Expression Grammar

A common grammar is used for an expression defining a filter and an expression used to compute a memory object index.

The grammar specifies an expression as a combination of operators and operands. For filters, if the expression evaluates to true, the packet is included; if the expression evaluates to false, the packet is excluded. For memory objects or index objects, the expression is evaluated to an index that defines the particular memory object or index object referenced in the packet.

Operands in an expression are either constants, or fields within a data record, including THRID, LWPID, CPUID, STACK, LEAF, VIRTPC, PHYSPC, VADDR, PADDR, DOBJ, TSTAMP, SAMPLE, EXPID, PID, or the name of a memory object. Operand names are case-insensitive. Operators include the usual logical operators and arithmetic (including shift) operators, in C notation, with C precedence rules, and an operator for determining whether an element is in a set (IN) or whether any or all of a set of elements is contained in a set (SOME IN or IN, respectively). If-then-else constructs are specified as in C, with the ? and : operators. Use parentheses to ensure proper parsing of all expressions. On the er_print command line, the expression cannot be split across lines. In scripts or on the command line, the expression must be inside double quotes if it contains blanks.

Filter expressions evaluate to a boolean value, true if the packet should be included, and false if it should not be included. Thread, LWP, CPU, experiment-id, process-pid, and sample filtering are based on a relational expression between the appropriate keyword and an integer, or using the IN operator and a comma-separated list of integers.

Time-filtering is used by specifying one or more relational expressions between TSTAMP and a time, given in integer nanoseconds from the start of the experiment whose packets are being processed. Times for samples can be obtained using the overview command. Times in the overview command are given in seconds, and must be converted to nanoseconds for time-filtering. Times can also be obtained from the Timeline display in the Analyzer.
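
Because the overview command reports times in seconds while TSTAMP is compared against integer nanoseconds, the conversion is an easy place to slip. The Python sketch below performs the conversion and builds a filters command for a time window. The helper names are hypothetical, and the use of the C operators >=, <, and && follows the statement above that operators use C notation.

```python
from decimal import Decimal

NS_PER_SECOND = 1_000_000_000


def seconds_to_ns(seconds):
    """Convert a time in seconds, given as a decimal string, to integer nanoseconds."""
    # Decimal avoids binary floating-point rounding; sub-nanosecond digits are truncated.
    return int(Decimal(seconds) * NS_PER_SECOND)


def time_window_filter(start_s, end_s):
    """Build a filters command that keeps packets with start <= TSTAMP < end."""
    start_ns = seconds_to_ns(start_s)
    end_ns = seconds_to_ns(end_s)
    # The expression contains no blanks, so it needs no surrounding double quotes.
    return f"filters (TSTAMP>={start_ns})&&(TSTAMP<{end_ns})"


print(time_window_filter("1.5", "3"))
```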

Function filtering can be based either on the leaf function, or on any function in the stack. Filtering by leaf function is specified by a relational expression between the LEAF keyword and an integer function id, or using the IN operator and the construct FNAME("regexp"), where regexp is a regular expression as specified on the regexp(5) man page. The entire name of the function, as given by the current setting of name, must match.

Filtering based on any function in the call stack is specified by determining if any function in the construct FNAME("regexp") is in the array of functions represented by the keyword STACK: (FNAME("myfunc") SOME IN STACK).

Data object filtering is analogous to stack function filtering, using the DOBJ keyword and the construct DNAME("regexp") enclosed in parentheses.

Memory object filtering is specified using the name of the memory object, as shown in the mobj_list command, and the integer index of the object, or the indices of a set of objects. (The <Unknown> memory object has index -1.)

Index object filtering is specified using the name of the index object, as shown in the indxobj_list command, and the integer index of the object, or the indices of a set of objects. (The <Unknown> index object has index -1.)

Data object filtering and memory object filtering are meaningful only for hardware counter packets with dataspace data; all other packets are excluded under such filtering.

Direct filtering of virtual addresses or physical addresses is specified with a relational expression between VADDR or PADDR, and the address.

Memory object definitions (see mobj_define mobj_type index_exp) use an expression that evaluates to an integer index, using either the VADDR keyword or PADDR keyword. The definitions are applicable only to hardware counter packets for memory counters and dataspace data. The expression should return an integer, or -1 for the <Unknown> memory object.

Index object definitions (see indxobj_define indxobj_type index_exp) use an expression that evaluates to an integer index. The expression should return an integer, or -1 for the <Unknown> index object.
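
To illustrate how an index expression maps data records to index objects, the following Python sketch models the expression VIRTPC>0?VIRTPC:-1 and groups records by the resulting index. It is a conceptual model only: the packet dictionaries, the count field, and the function names are hypothetical and do not reflect the real experiment data format.

```python
from collections import defaultdict

UNKNOWN = -1  # index of the <Unknown> index object


def virtpc_index(packet):
    """Python equivalent of the index expression VIRTPC>0?VIRTPC:-1."""
    virtpc = packet.get("VIRTPC", 0)
    return virtpc if virtpc > 0 else UNKNOWN


def aggregate(packets, index_exp):
    """Sum a per-packet count for each index object selected by index_exp."""
    totals = defaultdict(int)
    for packet in packets:
        totals[index_exp(packet)] += packet["count"]
    return dict(totals)


# Hypothetical packets: two share a virtual PC, one has no usable PC.
packets = [
    {"VIRTPC": 0x10A4, "count": 1},
    {"VIRTPC": 0x10A4, "count": 1},
    {"VIRTPC": 0, "count": 1},
]

print(aggregate(packets, virtpc_index))
```

The packet without a usable virtual PC is attributed to index -1, which is why a new index expression should always provide a path that yields -1 for <Unknown>.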

er_print command Examples