Analyzing Program Performance with Sun WorkShop |
Traditional Profiling Tools
The tools discussed in this appendix are standard utilities for timing programs and obtaining performance data to analyze. The profiling tools
prof
andgprof
are provided with Solaris versions 2.6, 7, and 8 of the Solaris Operating Environment SPARC Platform Edition and Solaris Intel Platform Edition.tcov
is a code coverage tool.
Note If you want to track execution counts (how many times a function is called, how often a line of source code executes), use the traditional profiling tools. If you want a detailed analysis of where your program is spending time, you can get more accurate information using the Sampling Collector and Analyzer. See Chapter 3 and Chapter 4 for information on using these applications.
Not all the traditional profiling tools work on modules written in programming languages other than C. See the sections on each tool for more information about languages.
This chapter covers the following topics:
- Basic Concepts
- Using prof to Generate a Program Profile
- Using gprof to Generate a Call Graph Profile
- Using tcov for Statement-Level Analysis
- Using tcov Enhanced for Statement-Level Analysis
- Creating Profiled Shared Libraries
Basic Concepts
prof
,gprof
, andtcov
extend the Sun WorkShop development environment by enabling you to collect and use performance data.
prof
generates a program profile in a flat file.gprof
generates a call graph profile.tcov
generates statement-level information in a copy of the source file, annotated to show which lines are used and how often.TABLE A-1 describes the information generated by these standard performance profiling tools.
Using
prof
to Generate a Program Profile
prof
generates a statistical profile of the CPU time used by a program and counts the number of times each function in a program is entered. Different, or more detailed data, is provided by thegprof
call-graph profile and thetcov
code coverage tools.Using
prof
involves three basic steps:
- Compiling the program for
prof
- Running the program to produce a profile data file
- Using
prof
to generate a report that summarizes the dataTo compile a program for profiling with
prof
, use the-p
option to the compiler. For example, to compile a C source file namedindex.assist.c
for profiling, you use this compiler command:
%cc -p -o index.assist index.assist.c
The compiler produces a program called
index.assist
.Now you run the
index.assist
program. Each time you run the program, profiling data is sent to a profile file calledmon.out
. Every time you run the program a newmon.out
file is created that overwrites the old version.Finally, you would use the
prof
command to generate a report:
%index.assist
%ls mon.out
mon.out%prof index.assist
Output Example
The
prof
output resembles this example.
Sample
prof
OutputThe sample display shows that most of the program execution time is spent in the
compare_strings()
routine; after that, most of the CPU time is spent in the_strlen()
library routine. To make improvements to this program, concentrate on thecompare_strings()
function.The results of the profiling sample run are listed under these column headings:
%Time
--The percentage of the total CPU time consumed by this routine of the program.Seconds
--The total CPU time accounted for by this function.Cumsecs
--A running sum of the number of seconds accounted for by this function and those listed before it.#Calls
--The number of times this routine is called.msecs/call
--The average number of milliseconds this routine consumes each time it is called.Name
--The name of the routine.What results can be derived from the profile data? The
compare_strings()
function consumes nearly 20% of the total CPU time. To improve the runtime ofindex.assist
, you can either improve the algorithm thatcompare_strings()
uses, or cut down the number of calls tocompare_strings()
.It is not obvious from the flat call graph that
compare_strings()
is heavily recursive, but you can deduce this by using the call graph profile described in Using gprof to Generate a Call Graph Profile. In this particular case, improving the algorithm also reduces the number of calls.
Note For Solaris 7 and 8 platforms, the profile of CPU time is accurate for programs that use multiple CPUs, but the fact that the counts are not locked may affect the accuracy of the counts for functions.
Using
gprof
to Generate a Call Graph ProfileWhile the flat profile from
prof
can provide valuable data for performance improvements, a more detailed analysis can be obtained by using a call graph profile to display a list identifying which modules are called by other modules, and which modules call other modules. Sometimes removing calls altogether can result in performance improvements.
Note gprof
allocates the time within a function to the callers in proportion to the number of times that each arc is traversed. Because all calls are not equivalent in performance, this behavior might lead to incorrect assumptions. See "The gprof Fallacy" on page 19 for an example.
Like
prof
,gprof
generates a statistical profile of the CPU time that is used by a program and it counts the number of times that each function is entered.gprof
also counts the number of times that each arc in the program's call graph is traversed. An arc is a caller-callee pair.
Note For Solaris 7 and 8 platforms, the profile of CPU time is accurate for programs that use multiple CPUs, but the fact that the counts are not locked may affect the accuracy of the counts for functions.
Using
gprof
involves three basic steps:
- Compiling the program for
gprof
- Running the program to produce a profile data file
- Using
gprof
to generate a report that summarizes the dataTo compile the program for call graph profiling, use the
-xpg
option for the C compiler or the-pg
option for the Fortran compiler. For example:
%cc -xpg -o index.assist index.assist.c
Now run the
index.assist
program. Each time you run a program compiled forgprof
, call-graph profile data is sent to a file calledgmon.out
. This file is overwritten each time you run the program.Finally, use the
gprof
command to generate a report of the results of the profile. The output fromgprof
can be large. You might find the report easier to read if you redirect it to a file. To redirect the output fromgprof
to a file namedg.output
, use this sequence of commands:
%index.assist
%ls gmon.out
gmon.out%gprof index.assist > g.output
The output from
gprof
consists of two major items:
- The full call graph profile, which shows fragments of output from a profiling run.
- The "flat" profile, which is similar to the summary the
prof
command supplies.The output from
gprof
contains an explanation of what the various parts of the summary mean.gprof
also identifies the granularity of the sampling:
granularity: each sample hit covers 4 byte(s) for 0.07% of 14.74 secondsThe "4 bytes" means resolution to a single instruction. The "0.07% of 14.74
seconds" means that each sample, representing ten milliseconds of CPU time, accounts for 0.07% of the run.
The following example is part of the call graph profile.
If you assume that there are 761 lines of data in the input file to the
index.assist
program, you can conclude:
fgets()
is called 762 times. The last call tofgets()
returns an end-of-file.- The
insert_index_entry()
function is called 760 times frommain()
.- In addition to the 760 times that
insert_index_entry()
is called frommain(), insert_index_entry()
also calls itself 10,392 times.insert_index_entry()
is heavily recursive.compare_entry()
, which is called frominsert_index_entry()
, is called 11,152 times, which is equal to 760+10,392 times. There is one call tocompare_entry()
for every time thatinsert_index_entry()
is called. This is correct. If there were a discrepancy in the number of calls, you could suspect some problem in the program logic.insert_page_entry()
is called 820 times in total: 761 times frommain()
while the program is building index nodes, and 59 times frominsert_index_entry()
. This frequency indicates that there are 59 duplicated index entries, so their page number entries are linked into a chain with the index nodes. The duplicate index entries are then freed; hence the 59 calls tofree()
.Using
tcov
for Statement-Level Analysis
tcov
gives line-by-line information on how a program executes. It produces a copy of the source file, annotated to show which lines are used and how often. It also summarizes information about basic blocks.tcov
does not product any time-based data.Using
tcov
involves three basic steps:
- Compiling the program to produce a
tcov
experiment- Running the experiment
- Using
tcov
to create summaries of execution counts for each statement in the programCompiling for
tcov
To compile a program for code coverage, use the
-xa
option to the C compiler. For example, you compile a program namedindex.assist.c
for use withtcov
with this command:
%
cc -xa -o index.assist index.assist.c
Use the
-a
compiler option instead of-xa
with the C++ orFortran
compilers.The C compiler generates an
index.assist.d
file, containing database entries for the basic blocks present inindex.assist.c
. When the programindex.assist
is run to completion, the compiler updates theindex.assist.d
file.
Note Althoughtcov
works with both C and C++ programs, it does not support files that contain#line
or#file
directives.tcov
does not enable test coverage analysis of the code in the#include
header files. Applications compiled with-xa
(C),-a
(other compilers), and+d
(C++) run more slowly than they normally would. The+d
option inhibits expansion of C++ inline functions, and updating the.d
file for each execution takes considerable time.
The
index.assist.d
file is created in the directory specified by the environment variableTCOVDIR
. IfTCOVDIR
is not set,index.assist.d
is created in the current directory.Having compiled
index.assist.c
, you can runindex.assist
:
%index.assist
%ls *.d
index.assist.dNow you can run
tcov
to produce a file containing the summaries of execution counts for each statement in the program.tcov
uses theindex.assist.d
file to generate anindex.assist.tcov
file containing an annotated list of your code. The output shows the number of times each source statement is executed. At the end of the file, there is a short summary.
%tcov index.assist.c
%ls *.tcov
index.assist.tcovThis small fragment of the C code from one of the modules of
index.assist
shows theinsert_index_entry()
function, which is called so recursively.
The numbers to the left of the C code show how many times each statement was executed. The
insert_index_entry
function is called 11,152 times
.
tcov
places a summary like the following at the end of the annotated program listing forindex.assist.tcov
:
A program compiled for code coverage analysis can be run multiple times (with potentially varying input);
tcov
can be used on the program after each run to compare behavior.Creating
tcov
Profiled Shared LibrariesIt is possible to create a
tcov
profiled shareable library and use it in place of one where binaries have already been linked. Include the-xa
(C) or-a
(other compilers) option when creating the shareable libraries. For example:
%cc -G -xa -o foo.so.1 foo.o
This command includes a copy of the
tcov
profiling subroutines in the shareable libraries, so that clients of the library do not need to relink. If a client of the library is also linked for profiling, then the version of thetcov
subroutines used by the client is used to profile the shareable library.Locking Files
tcov
uses a simple file-locking mechanism for updating the block coverage database in the.d
files. It employs a single file,tcov.lock
, for this purpose. Consequently, only one executable compiled with-xa
(C) or-a
(other compilers) should be running on the system at a time. If the execution of the program compiled with the-xa
(or-a
) option is manually terminated, then thetcov.lock
file must be deleted manually.Files compiled with the
-xa
or-a
option call the profiling tool subroutines automatically when a program is linked fortcov
profiling. At program exit, these subroutines combine the information collected at runtime for filexyz.f
(for example) with the existing profiling information stored in filexyz.d
. To ensure this information is not corrupted by several people simultaneously running a profiled binary, axyz.d.lock
lock file is created forxyz.d
for the duration of the update. If there are any errors in opening or readingxyz.d
or its lock file, or if there are inconsistencies between the runtime information and the stored information, the information stored inxyz.d
is not changed.An edit and recompile of
xyz.d
may change the number of counters inxyz.d
. This is detected if an old profiled binary is run.If too many people are running a profiled binary, some of them cannot obtain a lock. An error message similar to the following is displayed after a delay of several seconds:
The stored information is not updated. This locking is safe across a network. Since locking is performed on a file-by-file basis, other files may be correctly updated.
The profiling subroutines attempt to deal with automounted file systems that have become inaccessible. They still fail if the file system containing a coverage data file is mounted with different names on different machines, or if the user running the profiled binary does not have permission to write to either the coverage data file or the directory containing it. Be sure all the directories are uniformly named and writable by anyone expected to run the binary.
Errors Reported by
tcov
Runtime RoutinesThe following error messages may be reported by the
tcov
runtime routines:
- The user running the binary lacks permission to read or write to the coverage data file. The problem also occurs if the coverage data file has been deleted.
tcov_exit: Could not open coverage data file 'coverage_data_file_name' because 'system_error_message_string'.- The user running the binary lacks permission to write to the directory containing the coverage data file. The problem also occurs if the directory containing the coverage data file is not mounted on the machine where the binary is being run.
tcov_exit: Could not write coverage data file'coverage_data_file_name' because'system_error_message_string'.- Too many users are trying to update a coverage data file at the same time. The problem also occurs if a machine has crashed while a coverage data file is being updated, leaving behind a lock file. In the event of a crash, the longer of the two files should be used as the post-crash coverage data file. Manually remove the lock file.
tcov_exit: Failed to create lock file 'lock_file_name' for coverage data file 'coverage_data_file_name' after 5 tries. Is someone else running this executable?- No memory is available, and the standard I/O package will not work. You cannot update the coverage data file at this point.
tcov_exit: Stdio failure, probably no memory left.- The lock file name is longer by six characters than the coverage data file name. Therefore, the derived lock file name may not be legal.
tcov_exit: Coverage data file path name too long (length characters) 'coverage_data_file_name'.- A library or binary that has
tcov
profiling enabled is simultaneously being run, edited, and recompiled. The old binary expects a coverage data file of a certain size, but the editing often changes that size. If the compiler creates a new coverage data file at the same time that the old binary is trying to update the old coverage data file, the binary may see an apparently empty or corrupt coverage file.
tcov_exit: Coverage data file 'coverage_data_file_name' is too short. Is it out of date?Using
tcov
Enhanced for Statement-Level AnalysisLike the original
tcov
,tcov
Enhanced gives line-by-line information on how a program executes. It produces a copy of the source file, annotated to show which lines are used and how often. It also gives a summary of information about basic blocks.tcov
Enhanced works with both C and C++ source files.Advantages of
tcov
Enhanced
tcov
Enhanced overcomes some of the shortcomings of the originaltcov
:
- It provides more complete support for C++.
- It supports code found in
#include
header files and corrects a flaw that obscured coverage numbers for template classes and functions.- Its runtime is more efficient than the original
tcov
runtime.- It is supported for all the platforms that the compilers support.
Compiling for
tcov
EnhancedTo use
tcov
Enhanced, follow the same basic steps as the originaltcov
:
- Compile a program for a
tcov
Enhanced experiment.- Run the experiment.
- Analyze the results using
tcov
.For the original
tcov
, you compile with the-xa
option. To compile a program for code coverage withtcov
Enhanced, you use the-xprofile=tcov
option for all compilers. Using a program namedindex.assist
as an example, you would compile for use withtcov
Enhanced with this command:
%cc -xprofile=tcov -o index.assist index.assist.c
tcov
Enhanced, unliketcov
, does not produce a.d
file. The coverage data file is not created until the program is run. Then one coverage data file is produced as opposed to one file for each module compiled for coverage analysis.After you compile
index.assist.c
, you runindex.assist
to create the profile file:
%index.assist
%ls -dF *.profile
index.assist.profile/%ls *.profile
tcovdBy default, the name of the directory where the
tcovd
file is stored is derived from the name of the executable. Furthermore, that directory is created in the directory the executable was run in (the originaltcov
created the.d
files in the directory where the modules were compiled).The directory where the
tcovd
file is stored is also known as the profile bucket. The profile bucket can be overridden by using theSUN_PROFDATA
environment variable. Doing this may be useful if the name of the executable is not the same as the value inargv[0]
(for example, the invocation of the executable was through a symbolic link with a different name).You can also override the directory where the profile bucket is created. To specify a location different from the run directory, specify the path using the
SUN_PROFDATA_DIR
environment variable. Absolute or relative pathnames can be specified in this variable. Relative pathnames are relative to the program's current working directory at program completion.
TCOVDIR
is supported as a synonym forSUN_PROFDATA_DIR
for backward compatibility. Any setting ofSUN_PROFDATA_DIR
causesTCOVDIR
to be ignored. If bothSUN_PROFDATA_DIR
andTCOVDIR
are set, a warning is displayed when the profile bucket is generated.SUN_PROFDATA_DIR
takes precedence overTCOVDIR
. The variables are used at runtime by a program compiled with-xprofile=tcov,
and are used by thetcov
command.
Note This scheme is also used by the profile feedback mechanism.
Now that some coverage data has been produced, you could generate a report that relates the raw data back to the source files:
%tcov -x index.profile index.assist.c
%ls *.tcov
index.assist.c.tcovThe output of this report is identical to the one from the previous example (for the original
tcov
).Creating Profiled Shared Libraries
Creating shared libraries for use with
tcov
Enhanced is accomplished by using the analogous compiler options:
%cc -G -xprofile=tcov -o foo.so.1 doo.o
Locking Files
tcov
Enhanced uses a simple file-locking mechanism for updating the block coverage data file. It employs a single file created in the same directory as thetcovd
file. The file name istcovd.temp.lock
. If execution of the program compiled for coverage analysis is manually terminated, then the lock file must be deleted manually.The locking scheme does an exponential back-off if there is a contention for the lock. If, after five tries, the
tcov
runtime cannot acquire the lock, it exits, and the data is lost for that run. In this case, the following message is displayed:
tcov_exit: temp file exists, is someone else running this executable?
tcov
Directories and Environment VariablesWhen you compile a program for
tcov
and run the program, the running program generates a profile bucket. If a previous profile bucket exists, the program uses that profile bucket. If a profile bucket does not exist, it creates the profile bucket.The profile bucket specifies the directory where the profile output is generated. The name and location of the profile output are controlled by defaults that you can modify with environment variables.
Note tcov
uses the same defaults and environment variables that are used by the compiler options that you use to gather profile feedback:-xprofile=collect
and-xprofile=use
. For more information about these compiler options, see the documentation for your compiler.
The default profile bucket the program creates is named after the executable with a
.profile
extension and is created in the directory where the executable is run. Therefore, if you are in/home/userdir
, and run a program called/usr/bin/xyz
, the default behavior is to create a profile bucket calledxyz.profile
in/home/userdir
.You set the following environment variables to modify the defaults:
SUN_PROFDATA
- Can be used to specify the name of the profile bucket at runtime. The value of this variable is always appended to the value of
SUN_PROFDATA_DIR
if both variables are set.SUN_PROFDATA_DIR
- Can be used to specify the name of the directory containing the profile bucket. It is used at runtime and in the
tcov
command.TCOVDIR
TCOVDIR
is supported as a synonym forSUN_PROFDATA_DIR
to maintain backward compatibility. Any setting ofSUN_PROFDATA_DIR
causesTCOVDIR
to be ignored. If bothSUN_PROFDATA_DIR
andTCOVDIR
are set, a warning is displayed when the profile bucket is generated.TCOVDIR
is used at runtime by a program compiled with-xprofile=tcov
and it is used by thetcov
command.Overriding the Default Definitions
To override the default, use the environment variables to change the profile bucket:
1. Change the name of the profile bucket by using theSUN_PROFDATA
environment variable.2. Change the directory where the profile-bucket is placed by using theSUN_PROFDATA_DIR
environment variable.The environment variables override the default location and name of the profile bucket. Both can be overridden independently. For example, if you only choose to set
SUN_PROFDATA_DIR
, the profile bucket will go into the directory where you setSUN_PROFDATA_DIR
. The default name (which is the executable name followed by a.profile
extension) will still be the name used for the profile bucket.Absolute and Relative Pathnames
There are two forms of directories you can specify by using
SUN_PROFDATA_DIR
on the Profile Feedback compile line: absolute pathnames (which start with a `/
'), and relative pathnames. If you use an absolute pathname, the profile bucket is dropped into that directory. If you specify a relative pathname, then it is relative to the current working directory where the executable is being run.For example, if you are in
/home/userdir
and run a program called/usr/bin/xyz
withSUN_PROFDATA_DIR
set to..
, then the profile bucket is called/home/userdir/../xyz.profile
. The value specified in the environment variable was relative, and therefore, it was relative to/home/userdir
. Also, the default profile bucket name is used, which is named after the executable.
TCOVDIR
andSUN_PROFDATA_DIR
The previous version of
tcov
(enabled by compiling with the-xa
or-a
flag) used an environment variable calledTCOVDIR
.TCOVDIR
specified the directory where thetcov
counter files go to instead of next to the source files. To retain compatibility with this environment variable, the newSUN_PROFDATA_DIR
environment variable behaves like theTCOVDIR
environment variable. If both variables are set, a warning is output andSUN_PROFDATA_DIR
takes precedence overTCOVDIR
.For
-xprofile=tcov
By default the profile bucket is called <
argv[0]
>.profile
in the current directory.If you set
SUN_PROFDATA
, the profile bucket is called$SUN_PROFDATA
, wherever it is located.If you set
SUN_PROFDATA_DIR
, the profile bucket is placed in the specified directory.
SUN_PROFDATA
andSUN_PROFDATA_DIR
are independent. If both are specified, the profile bucket name is generated by usingSUN_PROFDATA_DIR
to find the profile bucket and,SUN_PROFDATA
is used to name the profile bucket in that directory.A UNIX process can change its current working directory during the execution of a program. The current working directory used to generate the profile bucket is the current working directory of the program at exit. In the rare case where a program actually does change its current working directory during execution, you can use the environment variables to control where the profile bucket is generated.
For the
tcov
ProgramThe
-xprofile=bucket
option specifies the name of the profile bucket to use for thetcov
profile.SUN_PROFDATA_DIR
orTCOVDIR
are prepended to this argument, if they are set.
Sun Microsystems, Inc. Copyright information. All rights reserved. Feedback |
Library | Contents | Previous | Next | Index |