Index

A

accessible documentation, 1

adding experiments to the Performance Analyzer, 1

address spaces, text and data regions, 1

aliased functions, 1

alternate entry points in Fortran functions, 1

analyzer command, 1

API, Collector, 1

arc, call graph, defined, 1

asynchronous I/O library, interaction with data collection, 1

attaching the Collector to a running process, 1

attributed metrics

defined, 1

displayed in the Callers-Callees tab, 1

effect of recursion on, 1

illustrated, 1

use of, 1

B

body functions, compiler-generated

defined, 1

displayed by the Performance Analyzer, 1

names, 1

propagation of inclusive metrics, 1

C

C++ name demangling, setting default library in .er.rc file, 1

call stacks

defined, 1

effect of tail-call optimization on, 1

in the Event tab, 1

incomplete unwind, 1

mapping addresses to program structure, 1

navigating, 1

representation in the Timeline tab, 1

unwinding, 1

callers-callees metrics

attributed, defined, 1

default, 1

displaying list of in er_print, 1

printing for a single function in er_print, 1

printing in er_print, 1

selecting in er_print, 1

sort order in er_print, 1

clock-based profiling

accuracy of metrics, 1

collecting data in dbx, 1

collecting data with collect, 1

comparison with gethrtime and gethrvtime, 1

data in profile packet, 1

defined, 1

distortion due to overheads, 1

high-resolution, 1

metrics, 1, 2

cloned functions, 1

collect command

address space (-a) option (obsolete), 1

clock-based profiling (-p) option, 1

collecting data with, 1

data limit (-L) option, 1

dry run (-n) option, 1

experiment directory (-d) option, 1

experiment group (-g) option, 1

experiment name (-o) option, 1

follow descendant processes (-F) option, 1

hardware-counter overflow profiling (-h) option, 1

heap tracing (-H) option, 1

Java version (-j) option, 1

listing the options of, 1

MPI tracing (-m) option, 1

pause and resume data recording (-y) option, 1

periodic sampling (-S) option, 1

readme display (-R) option, 1

record sample point (-l) option, 1

stop target after exec (-x) option, 1

synchronization wait tracing (-s) option, 1

syntax, 1

verbose (-v) option, 1

version (-V) option, 1

Collector

API, using in your program, 1

attaching to a running process, 1

defined, 1, 2

disabling in dbx, 1

enabling in dbx, 1

running in dbx, 1

running with collect, 1

color coding

for all functions, 1

for functions in event markers, 1

in the Timeline tab, 1

common subexpression elimination, 1

comparing experiments, 1

compiler commentary

classes defined, 1

description of, 1

example, 1

in the Disassembly tab, 1

in the Source tab, 1

selecting for annotated disassembly listing in er_print, 1

selecting for annotated source listing in er_print, 1

selecting for display in the Source and Disassembly tabs, 1

compiler-generated body functions

defined, 1

displayed by the Performance Analyzer, 1

names, 1

propagation of inclusive metrics, 1

compilers, accessing, 1

compiling

for data collection and analysis, 1

for gprof, 1

for prof, 1

for tcov, 1

for tcov Enhanced, 1

copying an experiment, 1

correlation, effect on metrics, 1

D

data collection

controlling from your program, 1

disabling from your program, 1

disabling in dbx, 1

enabling in dbx, 1

from MPI programs, 1

linking for, 1

MPI program, using collect, 1

MPI program, using dbx, 1

pausing for collect, 1

pausing from your program, 1

pausing in dbx, 1

rate of, 1

resuming for collect, 1

resuming from your program, 1

resuming in dbx, 1

using collect, 1

using dbx, 1

dbx

collecting data under MPI, 1

running the Collector in, 1

dbx collector subcommands

address_space (obsolete), 1

close (obsolete), 1

dbxsample, 1

disable, 1

enable, 1

enable_once (obsolete), 1

hwprofile, 1

limit, 1

pause, 1

profile, 1

quit (obsolete), 1

resume, 1

sample, 1

sample record, 1

show, 1

status, 1

store, 1

store filename (obsolete), 1

synctrace, 1

defaults

read by the Performance Analyzer, 1

saving from the Performance Analyzer, 1

setting in a defaults file, 1

descendant processes

collecting data for all followed, 1

collecting data for selected, 1

example, 1

experiment location, 1

experiment names, 1

followed by Collector, 1

limitations on data collection for, 1

directives, parallelization

attribution of metrics to, 1

microtasking library calls from, 1

disassembly code, annotated

description, 1

for cloned functions, 1

for Java compiled methods, 1

hardware counter metric attribution, 1

in the Disassembly tab, 1

instruction issue dependencies, 1

interpreting, 1

location of executable, 1

metric formats, 1

printing in er_print, 1

setting preferences in er_print, 1

setting preferences in the Performance Analyzer, 1

setting the highlighting threshold in er_print, 1

viewing with er_src, 1

disk space, estimating for experiments, 1

documentation index, 1

documentation, accessing, 1, 2

dropping experiments from the Performance Analyzer, 1

dynamically compiled functions

Collector API for, 1

definition, 1

in the Source tab, 1

E

entry points, alternate, in Fortran functions, 1

environment variables

JAVA_PATH, 1

JDK_1_4_HOME, 1

JDK_HOME, 1

LD_LIBRARY_PATH, 1

LD_PRELOAD, 1

PATH, 1

SUN_PROFDATA, 1

SUN_PROFDATA_DIR, 1

TCOVDIR, 1, 2

er_archive utility, 1

er_cp utility, 1

er_export utility, 1

er_mv utility, 1

er_print commands

address_space (obsolete), 1

allocs, 1

callers-callees, 1

cmetric_list, 1

cmetrics, 1

csingle, 1

csort, 1

dcc, 1

disasm, 1

dmetrics, 1

dsort, 1

exp_list, 1

fsingle, 1

fsummary, 1

functions, 1

gdemangle, 1

header, 1

help, 1

leaks, 1

limit, 1

lwp_list, 1

lwp_select, 1

mapfile, 1

metric_list, 1

metrics, 1

name, 1

object_list, 1

object_select, 1

objects, 1

osummary (obsolete), 1

outfile, 1

overview, 1

quit, 1

sample_list, 1

sample_select, 1

scc, 1

script, 1

sort, 1

source, 1

src, 1

statistics, 1

sthresh, 1

thread_list, 1

thread_select, 1

Version, 1

version, 1

er_print utility

command-line options, 1

metric keywords, 1

metric lists, 1

purpose, 1

syntax, 1

er_rm utility, 1

er_src utility, 1

error messages, from Performance Analyzer session, 1

errors reported by tcov, 1

event markers

color coding, 1

description, 1

exclusive metrics

defined, 1

for PLT instructions, 1

how computed, 1

illustrated, 1

use of, 1

execution statistics

comparison of times with the <Total> function, 1

in the Statistics tab, 1

printing in er_print, 1

experiment directory

default, 1

specifying in dbx, 1

specifying with collect, 1

experiment groups

default name, 1

defined, 1

name restrictions, 1

removing, 1

specifying name in dbx, 1

specifying name with collect, 1

experiment names

default, 1

MPI default, 1, 2

MPI, using MPI_Comm_rank and a script, 1

restrictions, 1

specifying in dbx, 1

specifying with collect, 1

experiments

See also experiment directory; experiment groups; experiment names

adding to the Performance Analyzer, 1

comparing, 1

copying, 1

default name, 1

defined, 1

dropping from the Performance Analyzer, 1

groups, 1

header information in er_print, 1

header information in the Experiments tab, 1

limiting the size of, 1, 2

listing in er_print, 1

location, 1

moving, 1, 2

moving MPI, 1

MPI storage issues, 1

naming, 1

removing, 1

storage requirements, estimating, 1

terminating from your program, 1

where stored, 1, 2

explicit multithreading, 1

F

fast traps, 1

Fortran

alternate entry points, 1

Collector API, 1

subroutines, 1

function calls

between shared objects, 1

imputed, in OpenMP programs, 1

in single-threaded programs, 1

recursive, example, 1

recursive, metric assignment to, 1

function list

printing in er_print, 1

sort order, specifying in er_print, 1

function names, C++

choosing long or short form in er_print, 1

setting default demangling library in .er.rc file, 1

function-list metrics

displaying list of in er_print, 1

selecting default in .er.rc file, 1

selecting in er_print, 1

setting default sort order in .er.rc file, 1

functions

@plt, 1

address within a load object, 1

aliased, 1

alternate entry points (Fortran), 1

cloned, 1

Collector API, 1, 2

color coding for Timeline tab, 1

definition of, 1

dynamically compiled, 1, 2

global, 1

inlined, 1

Java methods displayed, 1

MPI, traced, 1

non-unique, names of, 1

outline, 1

searching for in the Functions and Callers-Callees tabs, 1

selected, 1

static, in stripped shared libraries, 1

static, with duplicate names, 1

system library, interposition by Collector, 1

<Total>, 1

<Unknown>, 1

variation in addresses of, 1

wrapper, 1

G

gprof

fallacy, 1

limitations, 1

output from, interpreting, 1

summary, 1

using, 1

H

hardware counter library, libcpc.so, 1

hardware counter list

description of fields, 1

obtaining with collect, 1

obtaining with dbx collector, 1

hardware counters

choosing with collect, 1

choosing with dbx collector, 1

list described, 1

names, 1

obtaining a list of, 1, 2

overflow value, 1

hardware-counter overflow profiling

collecting data with collect, 1

collecting data with dbx, 1

data in profile packet, 1

defined, 1

example, 1

limitations, 1

hardware-counter overflow value

consequences of too small or too large, 1

defined, 1

experiment size, effect on, 1

setting in dbx, 1

setting with collect, 1

heap tracing

collecting data in dbx, 1

collecting data with collect, 1

limitations, 1

metrics, 1

preloading the Collector library, 1

high metric values

in annotated disassembly code, 1, 2

in annotated source code, 1, 2

searching for in the Source and Disassembly tabs, 1

high-resolution profiling, 1

I

inclusive metrics

defined, 1

effect of recursion on, 1

for PLT instructions, 1

how computed, 1

illustrated, 1

use of, 1

inlined functions, 1

input file

terminating in er_print, 1

to er_print, 1

instruction issue

delay, 1

grouping, effect on annotated disassembly, 1

intermediate files, use for annotated source listings, 1

interposition by Collector on system library functions, 1

J

Java memory allocations, 1

Java methods

annotated disassembly code for, 1

annotated source code for, 1

dynamically compiled, 1, 2

in the Functions tab, 1

Java monitors, 1

Java profiling, limitations, 1

JAVA_PATH environment variable, 1

JDK_1_4_HOME environment variable, 1

JDK_HOME environment variable, 1

K

keywords, metric, er_print utility, 1

L

LD_LIBRARY_PATH environment variable, 1

LD_PRELOAD environment variable, 1

leaf PC, defined, 1

leaks, memory, definition, 1

libaio.so, interaction with data collection, 1

libcollector.so shared library

preloading, 1

using in your program, 1

libcpc.so, use of, 1

libraries

interposition on, 1

libaio.so, 1

libcollector.so, 1, 2, 3

libcpc.so, 1, 2

libthread.so, 1, 2, 3, 4

MPI, 1, 2

static linking, 1

stripped shared, and static functions, 1

system, 1

limitations

descendant process data collection, 1

experiment group names, 1

experiment name, 1

hardware-counter overflow profiling, 1

Java profiling, 1

profiling interval value, 1

tcov, 1

tracing data, 1

limiting output in er_print, 1

limiting the experiment size, 1, 2

load objects

addresses of functions, 1

contents of, 1

defined, 1

information on in Experiments tab, 1

listing selected, in er_print, 1

printing list in er_print, 1

searching for in the Functions and Callers-Callees tabs, 1

selecting in er_print, 1

symbol tables, 1

lock file management

tcov, 1

tcov Enhanced, 1

LWPs

creation by threads library, 1

data display in Timeline tab, 1

listing selected, in er_print, 1

selecting in er_print, 1

selecting in the Performance Analyzer, 1

M

man pages, accessing, 1

MANPATH environment variable, setting, 1

mapfiles

generating with er_print, 1

generating with the Performance Analyzer, 1

reordering a program with, 1

memory allocations, 1

memory leaks, definition, 1

metrics

clock-based profiling, 1, 2

default, 1

defined, 1

effect of correlation, 1

hardware counter, attributing to instructions, 1

heap tracing, 1

interpreting for instructions, 1

interpreting for source lines, 1

memory allocation, 1

MPI tracing, 1

synchronization wait tracing, 1

timing, 1

microstates

contribution to metrics, 1

switching, 1

microtasking library routines, 1

moving an experiment, 1, 2

MPI experiments

default name, 1

loading into the Performance Analyzer, 1

moving, 1

storage issues, 1

MPI programs

attaching to, 1

collecting data from, 1

collecting data with collect, 1

collecting data with dbx, 1

experiment names, 1, 2, 3

experiment storage issues, 1

MPI tracing

collecting data in dbx, 1

collecting data with collect, 1

data in profile packet, 1

functions traced, 1

interpretation of metrics, 1

limitations, 1

metrics, 1

preloading the Collector library, 1

multithreaded applications

attaching the Collector to, 1

execution sequence, 1

multithreading

explicit, 1

parallelization directives, 1

N

naming an experiment, 1

navigating program structure, 1

non-unique function names, 1

O

OpenMP parallelization, 1

optimizations

common subexpression elimination, 1

tail-call, 1

options, command-line, er_print utility, 1

outline functions, 1

output file, in er_print, 1

overview data, printing in er_print, 1

P

parallel execution

call sequence, 1

directives, 1

PATH environment variable, 1, 2

pausing data collection

for collect, 1

from your program, 1

in dbx, 1

PC (program counter), defined, 1

Performance Analyzer

adding experiments to, 1

callers-callees metrics, default, 1

configuring the display, 1

defined, 1, 2

display defaults, 1

dropping experiments from, 1

main window, 1

mapfiles, generating, 1

saving settings, 1

searching for functions and load objects, 1

starting, 1

performance data, conversion into metrics, 1

PLT (Program Linkage Table), 1, 2

@plt function, 1

preloading libcollector.so, 1

process address-space text and data regions, 1

prof

limitations, 1

output from, 1

summary, 1

using, 1

profile bucket, tcov Enhanced, 1, 2

profile packet

clock-based data, 1

hardware-counter overflow data, 1

MPI tracing data, 1

size of, 1

synchronization wait tracing data, 1

profiled shared libraries, creating

for tcov, 1

for tcov Enhanced, 1

profiling interval

defined, 1

experiment size, effect on, 1

limitations on value, 1

setting with dbx collector, 1

setting with the collect command, 1

profiling, defined, 1

program counter (PC), defined, 1

program execution

call stacks described, 1

explicit multithreading, 1

OpenMP parallel, 1

shared objects and function calls, 1

signal handling, 1

single-threaded, 1

tail-call optimization, 1

traps, 1

Program Linkage Table (PLT), 1, 2

program structure, mapping call-stack addresses to, 1

program, reordering with a mapfile, 1

R

recursive function calls

apparent, in OpenMP programs, 1

example, 1

metric assignment to, 1

removing an experiment or experiment group, 1

reordering a program with a mapfile, 1

resuming data collection

for collect, 1

from your program, 1

in dbx, 1

S

samples

circumstances of recording, 1

defined, 1

information contained in packet, 1

listing selected, in er_print, 1

manual recording in dbx, 1

manual recording with collect, 1

periodic recording in dbx, 1

periodic recording with collect, 1

recording from your program, 1

recording when dbx stops a process, 1

representation in the Timeline tab, 1

selecting in er_print, 1

selecting in the Performance Analyzer, 1

sampling interval

defined, 1

setting in dbx, 1

setting with the collect command, 1

searching for functions and load objects in the Performance Analyzer, 1

setuid, use of, 1

shared objects, function calls between, 1

shell prompts, 1

signal handlers

installed by Collector, 1, 2

user program, 1

signals

calls to handlers, 1

profiling, 1

profiling, passing from dbx to collect, 1

use for manual sampling with collect, 1

use for pause and resume with collect, 1

single-threaded program execution, 1

sort order

callers-callees metrics, in er_print, 1

function list, specifying in er_print, 1

source code, annotated

compiler commentary, 1

description, 1

for cloned functions, 1

from tcov, 1

in the Disassembly tab, 1

interpreting, 1

location of source files, 1

metric formats, 1

parallelization directives in, 1

printing in er_print, 1

required compiler options, 1

setting compiler commentary classes in er_print, 1

setting preferences in the Performance Analyzer, 1

setting the highlighting threshold in er_print, 1

<Unknown> line, 1

use of intermediate files, 1

viewing with er_src, 1

stack frames

defined, 1

from trap handler, 1

reuse of in tail-call optimization, 1

starting the Performance Analyzer, 1

static functions

duplicate names, 1

in stripped shared libraries, 1

static linking, effect on data collection, 1

storage requirements, estimating for experiments, 1

summary metrics

displaying in the Summary tab, 1

for a single function, printing in er_print, 1

for all functions, printing in er_print, 1

SUN_PROFDATA environment variable, 1

SUN_PROFDATA_DIR environment variable, 1

symbol tables, load-object, 1

synchronization delay events

data in profile packet, 1

defined, 1

metric defined, 1

synchronization wait time

defined, 1, 2

metric, defined, 1

with unbound threads, 1

synchronization wait tracing

collecting data in dbx, 1

collecting data with collect, 1

data in profile packet, 1

defined, 1

example, 1

limitations, 1

metrics, 1

preloading the Collector library, 1

wait time, 1, 2

syntax

er_archive utility, 1

er_export utility, 1

er_print utility, 1

er_src utility, 1

T

tail-call optimization, 1

tcov

annotated source code, 1

compiling a program for, 1

errors reported by, 1

limitations, 1

lock file management, 1

output, interpreting, 1

profiled shared libraries, creating, 1

summary, 1

using, 1

tcov Enhanced

advantages of, 1

compiling a program for, 1

lock file management, 1

profile bucket, 1, 2

profiled shared libraries, creating, 1

using, 1

TCOVDIR environment variable, 1, 2

threads

bound and unbound, 1, 2

creation of, 1

library, 1, 2, 3, 4

listing selected, in er_print, 1

main, 1

scheduling of, 1, 2

selecting in er_print, 1

selecting in the Performance Analyzer, 1

system, 1, 2

wait mode, 1

worker, 1, 2

threshold, highlighting

defined, 1

in annotated disassembly code, er_print, 1

in annotated source code, er_print, 1

selecting for the Source and Disassembly tabs, 1

threshold, synchronization wait tracing

calibration, 1

defined, 1

effect on collection overhead, 1

setting with dbx collector, 1

setting with the collect command, 1

TLB (translation lookaside buffer) misses, 1, 2, 3

<Total> function

comparing times with execution statistics, 1

described, 1

traps, 1

typographic conventions, 1

U

<Unknown> function

callers and callees, 1

mapping of PC to, 1

<Unknown> line, in annotated source code, 1

unwinding the stack, 1

V

version information

for collect, 1

for er_cp, 1

for er_mv, 1

for er_print, 1

for er_rm, 1

for er_src, 1

for the Performance Analyzer, 1

W

warning messages, 1

wrapper functions, 1