Linker and Libraries Guide

Profiling Shared Objects

The runtime linker can generate profiling information for any shared object processed while an application runs. This is possible because the runtime linker is responsible for binding shared objects to an application and can therefore intercept any global function bindings. These bindings take place through .plt entries; see "When Relocations Are Performed" for details of this mechanism.

To profile a shared object, specify its name with the LD_PROFILE environment variable. You can analyze only one shared object at a time with this variable. However, a single setting can be used to analyze the use of that shared object by one or more applications. In the following example, the use of libc by a single invocation of the command ls(1) is analyzed:


$ LD_PROFILE=libc.so.1  ls -l

In the following example, the environment variable setting causes every application's use of libc to be accumulated in the profile data for as long as the environment variable remains set:


$ LD_PROFILE=libc.so.1; export LD_PROFILE
$ ls -l
$ make
$ ...
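
Because profile data accumulates for as long as the variable is set, it can be useful to limit the scope of the setting. One way to do this, sketched below using only standard shell features, is to set LD_PROFILE inside a subshell so that the setting disappears when the subshell exits. Running unset LD_PROFILE has the same effect in the current shell.

```shell
# Start from a clean state so the final check is meaningful.
unset LD_PROFILE

# Set LD_PROFILE only inside a subshell; commands run here are profiled.
(
    LD_PROFILE=libc.so.1
    export LD_PROFILE
    ls -l > /dev/null
)

# The subshell's setting does not leak into the parent shell, so commands
# run from here on are not profiled.
if [ -z "${LD_PROFILE+set}" ]; then
    echo "LD_PROFILE is not set"
fi
```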

When profiling is enabled, a profile data file is created, if it doesn't already exist, and is mapped by the runtime linker. In the above examples, this data file is /var/tmp/libc.so.1.profile. 64-bit libraries require an extended profile format, and their profile data files are written using the .profilex suffix. You can also specify an alternative directory in which to store the profile data using the LD_PROFILE_OUTPUT environment variable.
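
The naming rules above can be captured in a small shell helper. This is only a sketch: profile_path is a hypothetical function, not part of the runtime linker, and it assumes that LD_PROFILE_OUTPUT changes only the directory, with the file name still built from the shared object name plus the .profile or .profilex suffix.

```shell
# Hypothetical helper: print the profile data file path expected for a
# shared object, following the naming rules described in the text.
#   $1  name given to LD_PROFILE, for example libc.so.1
#   $2  output directory (defaults to /var/tmp)
#   $3  64 for a 64-bit library; anything else is treated as 32-bit
profile_path() {
    dir=${2:-/var/tmp}
    if [ "${3:-32}" = "64" ]; then
        suffix=profilex     # extended format used for 64-bit libraries
    else
        suffix=profile
    fi
    printf '%s/%s.%s\n' "$dir" "$1" "$suffix"
}

profile_path libc.so.1                  # prints /var/tmp/libc.so.1.profile
profile_path libc.so.1 /tmp/prof 64     # prints /tmp/prof/libc.so.1.profilex
```

Under the same assumption, running a command with LD_PROFILE=libc.so.1 and LD_PROFILE_OUTPUT=/tmp/prof would be expected to leave its data in /tmp/prof/libc.so.1.profile. The directory /tmp/prof is an arbitrary name chosen for illustration.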

This profile data file is used to deposit profil(2) data and call count information related to the use of the specified shared object. This profile data can be examined directly with gprof(1).


Note -

gprof(1) is most commonly used to analyze the gmon.out profile data created by an executable that has been compiled with the -xpg option of cc(1). The runtime linker's profile analysis does not require any code to be compiled with this option. Applications whose dependent shared objects are being profiled should not make calls to profil(2), because this system call does not provide for multiple invocations within the same process. For the same reason, these applications must not be compiled with the -xpg option of cc(1), as this compiler-generated mechanism of profiling is also built on top of profil(2).


One of the most powerful features of this profiling mechanism is that it allows the analysis of a shared object as used by multiple applications. Frequently, profiling analysis is carried out using one or two applications. However, a shared object, by its very nature, can be used by a multitude of applications. Analyzing how these applications use the shared object can offer insights into where energy might be spent to improve the overall performance of the shared object.

The following example shows a performance analysis of libc over a build of several applications within a source hierarchy:


$ LD_PROFILE=libc.so.1 ; export LD_PROFILE
$ make
$ gprof -b /usr/lib/libc.so.1 /var/tmp/libc.so.1.profile
.....

granularity: each sample hit covers 4 byte(s) ....

                                  called/total     parents
index  %time    self descendents  called+self    name      index
                                  called/total     children
.....
-----------------------------------------------
                0.33        0.00      52/29381     _gettxt [96]
                1.12        0.00     174/29381     _tzload [54]
               10.50        0.00    1634/29381     <external>
               16.14        0.00    2512/29381     _opendir [15]
              160.65        0.00   25009/29381     _endopen [3]
[2]     35.0  188.74        0.00   29381         _open [2]
-----------------------------------------------
.....
granularity: each sample hit covers 4 byte(s) ....

   %  cumulative    self              self    total         
 time   seconds   seconds    calls  ms/call  ms/call name   
 35.0     188.74   188.74    29381     6.42     6.42  _open [2]
 13.0     258.80    70.06    12094     5.79     5.79  _write [4]
  9.9     312.32    53.52    34303     1.56     1.56  _read [6]
  7.1     350.53    38.21     1177    32.46    32.46  _fork [9]
 ....

The special name <external> indicates a reference from outside of the address range of the shared object being profiled. Thus, in the above example, 1634 calls to the function open(2) within libc occurred from the dynamic executables, or from other shared objects, bound with libc while the profiling analysis was in progress.
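
The figures in the _open [2] entry can be cross-checked with some shell arithmetic. The sketch below relies on the usual gprof convention that a function's time is shared among its callers in proportion to their call counts. The helper names attributed and ms_per_call are made up for this illustration.

```shell
# Figures copied from the _open [2] entry in the gprof output above.
total_calls=29381
total_self=188.74

# The per-caller call counts add up to the total number of calls to _open.
sum=$((52 + 174 + 1634 + 2512 + 25009))
echo "calls from all parents: $sum"     # prints 29381

# Self time attributed to one caller:
#   total self time * (calls from that caller / total calls)
attributed() {
    awk -v t="$total_self" -v c="$1" -v n="$total_calls" \
        'BEGIN { printf "%.2f\n", t * c / n }'
}

# Average self time per call, in milliseconds.
ms_per_call() {
    awk -v t="$total_self" -v n="$total_calls" \
        'BEGIN { printf "%.2f\n", t * 1000 / n }'
}

attributed 1634     # <external>: prints 10.50
attributed 25009    # _endopen:   prints 160.65
ms_per_call         # prints 6.42, matching the self ms/call column
```

Read this way, the <external> line says that about 10.5 of the 188.74 seconds spent in _open were on behalf of callers outside libc.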


Note -

The profiling of shared objects is multithread safe, except in the case where one thread calls fork(2) while another thread is updating the profile data information. The use of fork1(2) removes this restriction.