Oracle Solaris Studio 12.3: Performance Analyzer     Oracle Solaris Studio 12.3 Information Library

Special Lines in the Source, Disassembly and PCs Tabs

The Performance Analyzer displays some lines in the Source, Disassembly and PCs tabs that do not directly correspond to lines of code, instructions, or program counters. The following sections describe these special lines.

Outline Functions

Outline functions can be created during feedback-optimized compilations. They are displayed as special index lines in the Source tab and Disassembly tab. In the Source tab, an annotation is displayed in the block of code that has been converted into an outline function.

                   Function binsearchmod inlined from source file ptralias2.c into the 
0.       0.          58.         if( binsearchmod( asize, &element ) ) {
0.240    0.240       59.             if( key != (element << 1) ) {
0.       0.          60.                 error |= BINSEARCHMODPOSTESTFAILED;
                        <Function: main -- outline code from line 60 [_$o1B60.main]>
0.040    0.040     [ 61]                 break;
0.       0.          62.             }
0.       0.          63.         }

In the Disassembly tab, the outline functions are typically displayed at the end of the file.

                        <Function: main -- outline code from line 85 [_$o1D85.main]>
0.       0.             [ 85] 100001034:  sethi       %hi(0x100000), %i5
0.       0.             [ 86] 100001038:  bset        4, %i3
0.       0.             [ 85] 10000103c:  or          %i5, 1, %l7
0.       0.             [ 85] 100001040:  sllx        %l7, 12, %l5
0.       0.             [ 85] 100001044:  call        printf ! 0x100101300
0.       0.             [ 85] 100001048:  add         %l5, 336, %o0
0.       0.             [ 90] 10000104c:  cmp         %i3, 0
0.       0.             [ 20] 100001050:  ba,a        0x1000010b4
                        <Function: main -- outline code from line 46 [_$o1A46.main]>
0.       0.             [ 46] 100001054:  mov         1, %i3
0.       0.             [ 47] 100001058:  ba          0x100001090
0.       0.             [ 56] 10000105c:  clr         [%i2]
                        <Function: main -- outline code from line 60 [_$o1B60.main]>
0.       0.             [ 60] 100001060:  bset        2, %i3
0.       0.             [ 61] 100001064:  ba          0x10000109c
0.       0.             [ 74] 100001068:  mov         1, %o3

The name of the outline function is displayed in square brackets, and encodes information about the section of outlined code, including the name of the function from which the code was extracted and the line number of the beginning of the section in the source code. These mangled names can vary from release to release. The Analyzer provides a readable version of the function name. For further details, refer to Outline Functions.
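
As a rough illustration of how index lines partition a disassembly listing, the following Python sketch attributes each instruction address to the outline function whose index line precedes it. The regular expressions and the name `group_by_index_line` are assumptions made for this example, inferred only from the sample listing above; they are not part of the Analyzer, and the listing layout may differ in other releases.

```python
import re

# Both patterns are inferred from the sample listing (an assumption, not a
# documented format).
INDEX_LINE = re.compile(r"<Function:\s+(.+?)\s+\[([^\]]+)\]>")
INSTRUCTION = re.compile(r"\[\s*(\d+|\?)\]\s+([0-9a-f]+):")

def group_by_index_line(listing):
    """Map each mangled name to the instruction addresses listed under its index line."""
    groups = {}
    current = None
    for line in listing.splitlines():
        index = INDEX_LINE.search(line)
        if index:
            current = index.group(2)      # mangled name in square brackets
            groups[current] = []
            continue
        instruction = INSTRUCTION.search(line)
        if instruction and current is not None:
            groups[current].append(instruction.group(2))
    return groups

listing = """\
                        <Function: main -- outline code from line 46 [_$o1A46.main]>
0.       0.             [ 46] 100001054:  mov         1, %i3
0.       0.             [ 47] 100001058:  ba          0x100001090
0.       0.             [ 56] 10000105c:  clr         [%i2]
                        <Function: main -- outline code from line 60 [_$o1B60.main]>
0.       0.             [ 60] 100001060:  bset        2, %i3
0.       0.             [ 61] 100001064:  ba          0x10000109c
0.       0.             [ 74] 100001068:  mov         1, %o3
"""

for mangled, addresses in group_by_index_line(listing).items():
    print(mangled, addresses)
```

The sketch keys the groups on the bracketed name because that is the only part of the index line that is unique to each outlined section.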

If an outline function is called when collecting the performance data for an application, the Analyzer displays a special line in the annotated disassembly to show inclusive metrics for that function. For further details, see Inclusive Metrics.

Compiler-Generated Body Functions

When a compiler parallelizes a loop in a function, or a region that has parallelization directives, it creates new body functions that are not in the original source code. These functions are described in Overview of OpenMP Software Execution.

The compiler assigns mangled names to body functions that encode the type of parallel construct, the name of the function from which the construct was extracted, the line number of the beginning of the construct in the original source, and the sequence number of the parallel construct. These mangled names vary from release to release of the microtasking library, but are shown demangled into more comprehensible names.

The following shows typical compiler-generated body functions as displayed in the functions list in machine mode.

7.415      14.860      psec_ -- OMP sections from line 9 [_$s1A9.psec_]
3.873       3.903      craydo_ -- MP doall from line 10 [_$d1A10.craydo_]

In these examples, each entry shows the name of the function from which the construct was extracted, then the type of parallel construct, then the line number of the parallel construct, and finally the mangled name of the compiler-generated body function in square brackets. Similarly, in the disassembly code, a special index line is generated.
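
The field order of a function-list entry can be sketched as a small Python parser. The pattern `BODY_ROW`, the function `parse_body_row`, and the dictionary keys are hypothetical names chosen for this example. The layout is inferred from the two sample rows only, and the meaning of the two numeric columns depends on which metrics are selected for display, so they are labeled generically here.

```python
import re

# Layout assumed from the sample rows: two numeric metric columns, then
# "<function> -- <construct> from line <n> [<mangled name>]".
BODY_ROW = re.compile(
    r"^\s*(?P<first>\d+\.\d*)\s+(?P<second>\d+\.\d*)\s+"
    r"(?P<function>\S+)\s+--\s+(?P<construct>.+?)\s+from line\s+(?P<line>\d+)\s+"
    r"\[(?P<mangled>[^\]]+)\]\s*$"
)

def parse_body_row(row):
    """Split one function-list row into its fields, or return None if it does not match."""
    m = BODY_ROW.match(row)
    if m is None:
        return None
    return {
        "first_metric": float(m.group("first")),
        "second_metric": float(m.group("second")),
        "function": m.group("function"),
        "construct": m.group("construct"),
        "line": int(m.group("line")),
        "mangled": m.group("mangled"),
    }

rows = [
    "7.415      14.860      psec_ -- OMP sections from line 9 [_$s1A9.psec_]",
    "3.873       3.903      craydo_ -- MP doall from line 10 [_$d1A10.craydo_]",
]
for row in rows:
    print(parse_body_row(row))
```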

0.       0.            <Function: psec_ -- OMP sections from line 9 [_$s1A9.psec_]>
0.       7.445         [24]    1d8cc:  save        %sp, -168, %sp
0.       0.            [24]    1d8d0:  ld          [%i0], %g1
0.       0.            [24]    1d8d4:  tst         %i1
0.       0.            <Function: craydo_ -- MP doall from line 10 [_$d1A10.craydo_]>
0.       0.030         [ ?]    197e8:  save        %sp, -128, %sp
0.       0.            [ ?]    197ec:  ld          [%i0 + 20], %i5
0.       0.            [ ?]    197f0:  st          %i1, [%sp + 112]
0.       0.            [ ?]    197f4:  ld          [%i5], %i3

With Cray directives, the function may not be correlated with source code line numbers. In such cases, a [ ?] is displayed in place of the line number. If the index line is shown in the annotated source code, the index line indicates instructions without line numbers, as shown below.

                     9. c$mic  doall shared(a,b,c,n) private(i,j,k)
                  
                   Loop below fused with loop on line 23
                   Loop below not parallelized because autoparallelization 
                     is not enabled
                   Loop below autoparallelized
                   Loop below interchanged with loop on line 12
                   Loop below interchanged with loop on line 12
3.873     3.903         <Function: craydo_ -- MP doall from line 10 [_$d1A10.craydo_],
                      instructions without line numbers>
0.        3.903     10.            do i = 2, n-1

Note - Index lines and compiler-commentary lines do not wrap in the real displays.


Dynamically Compiled Functions

Dynamically compiled functions are functions that are compiled and linked while the program is executing. The Collector has no information about dynamically compiled functions that are written in C or C++, unless the user supplies the required information using the Collector API function collector_func_load(). The information displayed by the Functions tab, Source tab, and Disassembly tab depends on the information passed to collector_func_load().

For more information about the Collector API functions, see Dynamic Functions and Modules.

For Java programs, most methods are interpreted by the JVM software. The Java HotSpot virtual machine, running in a separate thread, monitors performance during the interpretive execution. During the monitoring process, the virtual machine may decide to take one or more interpreted methods, generate machine code for them, and execute the more-efficient machine-code version, rather than interpret the original.

For Java programs, there is no need to use the Collector API functions; the Analyzer signifies the existence of Java HotSpot-compiled code in the annotated disassembly listing using a special line underneath the index line for the method, as shown in the following example.

                   11.    public int add_int () {
                   12.       int       x = 0;
2.832     2.832      <Function: Routine.add_int: HotSpot-compiled leaf instructions>
0.        0.         [ 12] 00000000: iconst_0
0.        0.         [ 12] 00000001: istore_1

The disassembly listing shows only the interpreted byte code, not the compiled instructions. By default, the metrics for the compiled code are shown next to the special line. The exclusive and inclusive CPU times on that line differ from the sums of the exclusive and inclusive CPU times shown for the individual lines of interpreted byte code. In general, if the method is called on several occasions, the CPU times for the compiled instructions are greater than the sum of the CPU times for the interpreted byte code, because the interpreted code is executed only once, when the method is initially called, whereas the compiled code is executed thereafter.

The annotated source does not show Java HotSpot-compiled functions. Instead, it displays a special index line to indicate instructions without line numbers. For example, the annotated source corresponding to the disassembly extract shown above is as follows:

                     11.    public int add_int () {
2.832     2.832        <Function: Routine.add_int(), instructions without line numbers>
0.        0.         12.       int       x = 0;
                       <Function: Routine.add_int()>

Java Native Functions

Native code is compiled code, originally written in C, C++, or Fortran, that Java code calls through the Java Native Interface (JNI). The following example is taken from the annotated disassembly of file jsynprog.java associated with demo program jsynprog.

                     5. class jsynprog
                        <Function: jsynprog.<init>()>
0.       5.504          jsynprog.JavaCC() <Java native method>
0.       1.431          jsynprog.JavaCJava(int) <Java native method>
0.       5.684          jsynprog.JavaJavaC(int) <Java native method>
0.       0.             [  5] 00000000: aload_0
0.       0.             [  5] 00000001: invokespecial <init>()
0.       0.             [  5] 00000004: return

Because the native methods are not included in the Java source, the beginning of the annotated source for jsynprog.java shows each Java native method using a special index line to indicate instructions without line numbers.

0.       5.504          <Function: jsynprog.JavaCC(), instructions without line 
                           numbers>
0.       1.431          <Function: jsynprog.JavaCJava(int), instructions without line 
                           numbers>
0.       5.684          <Function: jsynprog.JavaJavaC(int), instructions without line 
                           numbers>

Note - The index lines do not wrap in the real annotated source display.


Cloned Functions

The compilers can recognize calls to a function for which extra optimization can be performed. An example is a call to a function where some of the arguments passed are constants. When the compiler identifies particular calls that it can optimize, it creates a copy of the function, which is called a clone, and generates optimized code for it.

In the annotated source, compiler commentary indicates if a cloned function has been created:

0.       0.       Function foo from source file clone.c cloned, 
                   creating cloned function _$c1A.foo; 
                   constant parameters propagated to clone
0.       0.570     27.    foo(100, 50, a, a+50, b);

Note - Compiler commentary lines do not wrap in the real annotated source display.


The clone function name is a mangled name that identifies the particular call. In the above example, the compiler commentary indicates that the name of the cloned function is _$c1A.foo. This function can be seen in the function list as follows:

0.350     0.550     foo
0.340     0.570     _$c1A.foo
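
Because a clone appears in the function list as a separate entry, it can be useful to total the time spent in all versions of a function. The following Python sketch does that for the two rows above. It assumes that clone names take the form `_$c<id>.<name>`, as in this example, and that the first column is an exclusive metric; neither assumption is guaranteed, since mangled names can change between releases and the columns depend on the metrics selected. The names `CLONE_NAME`, `base_name`, and `total_exclusive` are hypothetical.

```python
import re
from collections import defaultdict

# Clone-name shape assumed from the single example "_$c1A.foo".
CLONE_NAME = re.compile(r"^_\$c[0-9A-Za-z]+\.(?P<base>.+)$")

def base_name(name):
    """Return the original function name for a clone, or the name unchanged."""
    m = CLONE_NAME.match(name)
    return m.group("base") if m else name

def total_exclusive(rows):
    """Sum the first (assumed exclusive) metric over a function and its clones."""
    totals = defaultdict(float)
    for exclusive, _inclusive, name in rows:
        totals[base_name(name)] += exclusive
    return dict(totals)

# Values copied from the function list shown above.
rows = [(0.350, 0.550, "foo"), (0.340, 0.570, "_$c1A.foo")]
print(total_exclusive(rows))
```

Only the exclusive column is summed, because inclusive values for different functions can overlap when one calls another.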

Each cloned function has a different set of instructions, so the annotated disassembly listing shows the cloned functions separately. They are not associated with any source file, and therefore the instructions are not associated with any source line numbers. The following shows the first few lines of the annotated disassembly for a cloned function.

0.       0.           <Function: _$c1A.foo>
0.       0.           [?]    10e98:  save        %sp, -120, %sp
0.       0.           [?]    10e9c:  sethi       %hi(0x10c00), %i4
0.       0.           [?]    10ea0:  mov         100, %i3
0.       0.           [?]    10ea4:  st          %i3, [%i0]
0.       0.           [?]    10ea8:  ldd         [%i4 + 640], %f8

Static Functions

Static functions are often used within libraries, so that the name used internally in a library does not conflict with a name that the user might use. When libraries are stripped, the names of static functions are deleted from the symbol table. In such cases, the Analyzer generates an artificial name for each text region in the library containing stripped static functions. The name is of the form <static>@0x12345, where the string following the @ sign is the offset of the text region within the library. The Analyzer cannot distinguish between contiguous stripped static functions and a single such function, so two or more such functions can appear with their metrics coalesced. Examples of static functions can be found in the functions list of the jsynprog demo, reproduced below.

0.       0.       <static>@0x18780
0.       0.       <static>@0x20cc
0.       0.       <static>@0xc9f0
0.       0.       <static>@0xd1d8
0.       0.       <static>@0xe204

In the PCs tab, the above functions are represented with an offset, as follows:

0.       0.       <static>@0x18780 + 0x00000818
0.       0.       <static>@0x20cc + 0x0000032C
0.       0.       <static>@0xc9f0 + 0x00000060
0.       0.       <static>@0xd1d8 + 0x00000040
0.       0.       <static>@0xe204 + 0x00000170
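
The two hexadecimal values in such an entry can be combined to locate the PC within the library. The following Python sketch assumes that the value after the plus sign is an offset from the start of the text region named by the artificial function name, which is how the listing above reads but is not stated explicitly. The names `STATIC_PC` and `static_pc_offset` are hypothetical.

```python
import re

# Matches "<static>@0x<region>" with an optional " + 0x<offset>" suffix.
STATIC_PC = re.compile(
    r"<static>@0x(?P<region>[0-9A-Fa-f]+)(?:\s*\+\s*0x(?P<pc>[0-9A-Fa-f]+))?"
)

def static_pc_offset(entry):
    """Return (region offset, offset within region, offset within library), or None."""
    m = STATIC_PC.search(entry)
    if m is None:
        return None
    region = int(m.group("region"), 16)
    pc = int(m.group("pc"), 16) if m.group("pc") else 0
    return region, pc, region + pc

region, pc, total = static_pc_offset("<static>@0x18780 + 0x00000818")
print(hex(region), hex(pc), hex(total))  # prints: 0x18780 0x818 0x18f98
```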

An alternative representation in the PCs tab of functions called within a stripped library is:

<library.so> -- no functions found + 0x0000F870

Inclusive Metrics

In the annotated disassembly, special lines exist to tag the time taken by outline functions.

The following shows an example of the annotated disassembly displayed when an outline function is called:

0.       0.        43.         else
0.       0.        44.         {
0.       0.        45.                 printf("else reached\n");
0.       2.522         <inclusive metrics for outlined functions>

Branch Target

An artificial line, <branch target>, shown in the annotated disassembly listing, corresponds to the PC of an instruction for which backtracking to find the effective address fails because the backtracking algorithm runs into a branch target.

Annotations for Store and Load Instructions

When you compile with the -xhwcprof option, the compilers generate additional information for store (st) and load (ld) instructions. You can view the annotated st and ld instructions in disassembly listings.