This section describes the developer interfaces for using CPU Performance Counters (CPC). Solaris applications can use CPC independently of the underlying counter architecture.
This section covers recent additions to the libcpc(3LIB) library. Please see the libcpc man page for information on older interfaces.
An application preparing to use the CPC facility initializes the library with a call to the cpc_open() function. This function returns a cpc_t * handle that the other interfaces take as an argument. The syntax for the cpc_open() function is as follows:
cpc_t *cpc_open(int ver);
The value of the ver parameter identifies the version of the interface that the application is using. The cpc_open() function fails if the underlying counters are inaccessible or unavailable.
uint_t cpc_npic(cpc_t *cpc);
uint_t cpc_caps(cpc_t *cpc);
void cpc_walk_events_all(cpc_t *cpc, void *arg,
          void (*action)(void *arg, const char *event));
void cpc_walk_events_pic(cpc_t *cpc, uint_t picno, void *arg, 
          void(*action)(void *arg, uint_t picno, const char *event));
void cpc_walk_attrs(cpc_t *cpc, void *arg,
          void (*action)(void *arg, const char *attr));
The cpc_npic() function returns the number of physical counters on the underlying processor.
The cpc_caps() function returns a uint_t value that is the bitwise inclusive-OR of the capabilities that the underlying processor supports. There are two capabilities. CPC_CAP_OVERFLOW_INTERRUPT indicates that the processor can generate an interrupt when a counter overflows. CPC_CAP_OVERFLOW_PRECISE indicates that the processor can determine which counter generated an overflow interrupt.
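A capability check might look like the following minimal sketch. So that it builds on any system, it uses stand-in mask values named DEMO_CAP_*. These names and values are hypothetical. Real code would test the value returned by cpc_caps() against the CPC_CAP_OVERFLOW_INTERRUPT and CPC_CAP_OVERFLOW_PRECISE constants from <libcpc.h>.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the CPC_CAP_* masks defined in <libcpc.h>. */
enum {
    DEMO_CAP_OVERFLOW_INTERRUPT = 0x1,
    DEMO_CAP_OVERFLOW_PRECISE   = 0x2
};

/* True if the processor can raise an interrupt when a counter overflows. */
bool can_interrupt_on_overflow(unsigned int caps)
{
    return (caps & DEMO_CAP_OVERFLOW_INTERRUPT) != 0;
}

/*
 * True only if overflow interrupts are available and the processor can
 * also tell which counter overflowed.
 */
bool can_attribute_overflow(unsigned int caps)
{
    unsigned int need =
        DEMO_CAP_OVERFLOW_INTERRUPT | DEMO_CAP_OVERFLOW_PRECISE;

    return (caps & need) == need;
}
```

Because the capabilities are combined with a bitwise OR, each one is tested with a mask rather than compared for equality.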
The kernel maintains a list of the events that the underlying processor supports. Different physical counters on a single chip do not have to use the same list of events. The cpc_walk_events_all() function calls the action() routine for each processor-supported event without regard to physical counter. The cpc_walk_events_pic() function calls the action() routine for each processor-supported event on a specific physical counter. Both of these functions pass the arg parameter uninterpreted from the caller to each invocation of the action() function.
The platform maintains a list of attributes that the underlying processor supports. These attributes enable access to advanced processor-specific features of the performance counters. The cpc_walk_attrs() function calls the action() routine for each attribute name.
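The walker interfaces can be sketched as follows. This example assumes a Solaris-derived system and is compiled only when __sun is defined. Elsewhere the function simply reports failure. CPC_VER_CURRENT and cpc_close() are taken from the libcpc man pages rather than from this section, and the names count_event() and list_all_events() are illustrative only.

```c
#include <stdio.h>

#if defined(__sun)
#include <libcpc.h>

/* action() callback: arg points at a running count of events seen. */
static void count_event(void *arg, const char *event)
{
    (void) printf("%s\n", event);
    (*(unsigned int *)arg)++;
}

/* Prints every supported event name; returns the count, or -1 on failure. */
int list_all_events(void)
{
    unsigned int n = 0;
    cpc_t *cpc = cpc_open(CPC_VER_CURRENT);

    if (cpc == NULL) {
        perror("cpc_open");
        return -1;
    }
    /* &n is handed to count_event() unchanged on every invocation. */
    cpc_walk_events_all(cpc, &n, count_event);
    (void) cpc_close(cpc);
    return (int)n;
}
#else
/* libcpc is not available on this platform; report failure. */
int list_all_events(void)
{
    return -1;
}
#endif
```

On Solaris, a program that uses this function links against the library with -lcpc. The same callback shape works for cpc_walk_attrs(), and cpc_walk_events_pic() adds a picno argument to the callback.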
cpc_set_t *cpc_set_create(cpc_t *cpc);
int cpc_set_destroy(cpc_t *cpc, cpc_set_t *set);
int cpc_set_add_request(cpc_t *cpc, cpc_set_t *set, const char *event,
          uint64_t preset, uint_t flags, uint_t nattrs,
          const cpc_attr_t *attrs);
int cpc_set_request_preset(cpc_t *cpc, cpc_set_t *set, int index,
          uint64_t preset);
The opaque data type cpc_set_t represents collections of requests. The collections are called sets. The cpc_set_create() function creates an empty set. The cpc_set_destroy() function destroys a set and frees all the memory used by the set. Destroying a set releases the hardware resources the set uses.
The cpc_set_add_request() function adds requests to a set. The following list describes the parameters of a request.
event: A string that specifies the name of the event to count.
preset: A 64-bit unsigned integer that is used as the initial value of the counter.
flags: The result of the bitwise OR operation applied to a group of request flags.
nattrs: The number of attributes in the array that attrs points to.
attrs: A pointer to an array of cpc_attr_t structures.
The following list describes the valid request flags.
CPC_COUNT_USER: This flag enables counting of events that occur while the CPU is executing in user mode.
CPC_COUNT_SYSTEM: This flag enables counting of events that occur while the CPU is executing in privileged mode.
CPC_OVF_NOTIFY_EMT: This flag requests notification of hardware counter overflow.
The CPC interfaces pass attributes as an array of cpc_attr_t structures.
On success, the cpc_set_add_request() function returns an index. Later calls use this index to refer to the data that the newly added request generates.
The cpc_set_request_preset() function changes the preset value of a request. This enables the re-binding of an overflowed set with new presets.
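Building a set might look like the following sketch. It assumes a Solaris-derived system and is compiled only when __sun is defined. The helper name try_request() is illustrative only. The event name comes from the caller because valid names depend on the processor. CPC_VER_CURRENT and cpc_close() are taken from the libcpc man pages rather than from this section.

```c
#include <stddef.h>

#if defined(__sun)
#include <libcpc.h>

/*
 * Creates a set, adds one user-mode request for `event`, and tears the
 * set down again. Returns the request index, or -1 on any failure.
 */
int try_request(const char *event)
{
    cpc_t *cpc;
    cpc_set_t *set;
    int index;

    if ((cpc = cpc_open(CPC_VER_CURRENT)) == NULL)
        return -1;
    if ((set = cpc_set_create(cpc)) == NULL) {
        (void) cpc_close(cpc);
        return -1;
    }
    /* preset 0, count in user mode only, no attributes. */
    index = cpc_set_add_request(cpc, set, event, 0, CPC_COUNT_USER,
        0, NULL);

    (void) cpc_set_destroy(cpc, set);
    (void) cpc_close(cpc);
    return index;
}
#else
/* libcpc is not available on this platform; report failure. */
int try_request(const char *event)
{
    (void) event;
    return -1;
}
#endif
```

A real application would keep the set and the returned index instead of destroying the set immediately, because the index is what later identifies the request's data.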
The cpc_walk_requests() function calls a user-provided action() routine on each request in the specified set. The value of the arg parameter is passed to the user routine without interpretation. The cpc_walk_requests() function allows applications to print the configuration of each request in a set. The syntax for the cpc_walk_requests() function is as follows:
void cpc_walk_requests(cpc_t *cpc, cpc_set_t *set, void *arg,
          void (*action)(void *arg, int index, const char *event,
          uint64_t preset, uint_t flags, int nattrs,
          const cpc_attr_t *attrs));
The interfaces in this section bind the requests in a set to the physical hardware and set the counters to a starting position.
int cpc_bind_curlwp(cpc_t *cpc, cpc_set_t *set, uint_t flags);
int cpc_bind_pctx(cpc_t *cpc, pctx_t *pctx, id_t id, cpc_set_t *set,
          uint_t flags);
int cpc_bind_cpu(cpc_t *cpc, processorid_t id, cpc_set_t *set, 
          uint_t flags);
int cpc_unbind(cpc_t *cpc, cpc_set_t *set);
The cpc_bind_curlwp() function binds the set to the calling LWP. The set's counters are virtualized to this LWP and count the events that occur on the CPU while the calling LWP runs. The only flag that is valid for the cpc_bind_curlwp() routine is CPC_BIND_LWP_INHERIT.
The cpc_bind_pctx() function binds the set to an LWP in a process that is captured with libpctx(3LIB). This function has no valid flags.
The cpc_bind_cpu() function binds the set to the processor specified in the id parameter. Binding a set to a CPU invalidates existing performance counter contexts on the system. This function has no valid flags.
The cpc_unbind() function stops the performance counters and releases the hardware that is associated with the bound set. If a set is bound to a CPU, the cpc_unbind() function unbinds the LWP from the CPU and releases the CPC pseudo-device.
The interfaces described in this section enable the return of data from the counters to the application. Counter data resides in an opaque data structure named cpc_buf_t. This data structure takes a snapshot of the state of counters in use by a bound set and includes the following information:
The 64-bit value of each counter
The timestamp of the most recent hardware snapshot
A cumulative CPU cycle counter that counts the number of CPU cycles the processor has used on the bound set
cpc_buf_t *cpc_buf_create(cpc_t *cpc, cpc_set_t *set);
int cpc_buf_destroy(cpc_t *cpc, cpc_buf_t *buf);
int cpc_set_sample(cpc_t *cpc, cpc_set_t *set, cpc_buf_t *buf);
The cpc_buf_create() function creates a buffer that stores data from the specified set. The cpc_buf_destroy() function frees the memory that is associated with the given cpc_buf_t. The cpc_set_sample() function takes a snapshot of the counters that are counting on behalf of the specified set. The set must already be bound, and a buffer must already exist for it, before the cpc_set_sample() function is called.
Sampling into a buffer does not update the preset of the requests that are associated with that set. When a set is sampled with the cpc_set_sample() function, then unbound and bound again, counts start from the request's preset as in the original call to the cpc_set_add_request() function.
The following routines provide access to the data in a cpc_buf_t structure.
int cpc_buf_get(cpc_t *cpc, cpc_buf_t *buf, int index, uint64_t *val);
int cpc_buf_set(cpc_t *cpc, cpc_buf_t *buf, int index, uint64_t *val);
hrtime_t cpc_buf_hrtime(cpc_t *cpc, cpc_buf_t *buf);
uint64_t cpc_buf_tick(cpc_t *cpc, cpc_buf_t *buf);
int cpc_buf_sub(cpc_t *cpc, cpc_buf_t *result, cpc_buf_t *left,
      cpc_buf_t *right);
int cpc_buf_add(cpc_t *cpc, cpc_buf_t *result, cpc_buf_t *left,
      cpc_buf_t *right);
int cpc_buf_copy(cpc_t *cpc, cpc_buf_t *dest, cpc_buf_t *src);
void cpc_buf_zero(cpc_t *cpc, cpc_buf_t *buf);
The cpc_buf_get() function retrieves the value of the counter that is identified by the index parameter. The index parameter is a value that is returned by the cpc_set_add_request() function before the set is bound. The cpc_buf_get() function stores the value of the counter at the location indicated by the val parameter.
The cpc_buf_set() function sets the value of the counter that is identified by the index parameter. The index parameter is a value that is returned by the cpc_set_add_request() function before the set is bound. The cpc_buf_set() function sets the counter's value to the value at the location indicated by the val parameter. Neither the cpc_buf_get() function nor the cpc_buf_set() function changes the preset of the corresponding CPC request.
The cpc_buf_hrtime() function returns the high resolution timestamp that indicates when the hardware was sampled. The cpc_buf_tick() function returns the number of CPU clock cycles that have elapsed while the LWP is running.
The cpc_buf_sub() function computes the difference between the counter and tick values of the buffers that the left and right parameters specify, and stores the results in result. All cpc_buf_t values in a given invocation of the cpc_buf_sub() function must originate from the same cpc_set_t structure. For each request index, the result buffer contains the value of left - right. The result buffer also contains the tick difference. The cpc_buf_sub() function sets the high-resolution timestamp of the result buffer to the more recent of the timestamps in the left and right buffers.
The cpc_buf_add() function computes the total of the counter and tick values of the buffers that the left and right parameters specify, and stores the results in result. All cpc_buf_t values in a given invocation of the cpc_buf_add() function must originate from the same cpc_set_t structure. For each request index, the result buffer contains the value of left + right. The result buffer also contains the tick total. The cpc_buf_add() function sets the high-resolution timestamp of the result buffer to the more recent of the timestamps in the left and right buffers.
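The subtraction rules can be illustrated with a small model in plain C. This is not the library's implementation. The cpc_buf_t type is opaque, so demo_buf_t, DEMO_NREQ, and demo_buf_sub() are hypothetical stand-ins that only mirror the behavior described above for a set with two requests.

```c
#include <stdint.h>

#define DEMO_NREQ 2     /* hypothetical: number of requests in the set */

/* Hypothetical stand-in for the opaque cpc_buf_t. */
typedef struct demo_buf {
    uint64_t counts[DEMO_NREQ]; /* one counter value per request index */
    uint64_t tick;              /* cumulative CPU cycle count */
    int64_t  hrtime;            /* time of the hardware snapshot */
} demo_buf_t;

/* result = left - right, following the rules given for cpc_buf_sub(). */
void demo_buf_sub(demo_buf_t *result, const demo_buf_t *left,
    const demo_buf_t *right)
{
    for (int i = 0; i < DEMO_NREQ; i++)
        result->counts[i] = left->counts[i] - right->counts[i];

    result->tick = left->tick - right->tick;

    /* The result carries the more recent of the two timestamps. */
    result->hrtime = (left->hrtime > right->hrtime) ?
        left->hrtime : right->hrtime;
}
```

For example, subtracting a snapshot with counts {100, 7}, tick 1000, and timestamp 50 from a later snapshot with counts {350, 9}, tick 4000, and timestamp 90 yields counts {250, 2}, tick 3000, and timestamp 90. An addition model would differ only in using + for the counts and the tick.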
The cpc_buf_copy() function makes dest identical to src.
The cpc_buf_zero() function sets everything in buf to zero.
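Taken together, the set, bind, and buffer interfaces might be combined as in the following sketch, which counts one event on the calling LWP while a caller-supplied routine runs. It assumes a Solaris-derived system and is compiled only when __sun is defined. The name count_event_during() is illustrative only, and CPC_VER_CURRENT and cpc_close() are taken from the libcpc man pages rather than from this section.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__sun)
#include <libcpc.h>

/*
 * Counts user-mode occurrences of `event` on the calling LWP while work()
 * runs. Stores the count in *count and returns 0, or returns -1 on failure.
 */
int count_event_during(const char *event, void (*work)(void),
    uint64_t *count)
{
    cpc_t *cpc;
    cpc_set_t *set = NULL;
    cpc_buf_t *before = NULL, *after = NULL, *diff = NULL;
    int index, bound = 0, rc = -1;

    if ((cpc = cpc_open(CPC_VER_CURRENT)) == NULL)
        return -1;
    if ((set = cpc_set_create(cpc)) == NULL)
        goto done;
    index = cpc_set_add_request(cpc, set, event, 0, CPC_COUNT_USER,
        0, NULL);
    if (index == -1)
        goto done;

    before = cpc_buf_create(cpc, set);
    after = cpc_buf_create(cpc, set);
    diff = cpc_buf_create(cpc, set);
    if (before == NULL || after == NULL || diff == NULL)
        goto done;

    if (cpc_bind_curlwp(cpc, set, 0) == -1)
        goto done;
    bound = 1;

    if (cpc_set_sample(cpc, set, before) == -1)
        goto done;
    work();
    if (cpc_set_sample(cpc, set, after) == -1)
        goto done;

    (void) cpc_buf_sub(cpc, diff, after, before); /* diff = after - before */
    if (cpc_buf_get(cpc, diff, index, count) == 0)
        rc = 0;
done:
    if (bound)
        (void) cpc_unbind(cpc, set);
    if (diff != NULL)
        (void) cpc_buf_destroy(cpc, diff);
    if (after != NULL)
        (void) cpc_buf_destroy(cpc, after);
    if (before != NULL)
        (void) cpc_buf_destroy(cpc, before);
    if (set != NULL)
        (void) cpc_set_destroy(cpc, set);
    (void) cpc_close(cpc);
    return rc;
}
#else
/* libcpc is not available on this platform; report failure. */
int count_event_during(const char *event, void (*work)(void),
    uint64_t *count)
{
    (void) event;
    (void) work;
    (void) count;
    return -1;
}
#endif
```

Sampling before and after the workload and subtracting the two buffers isolates the events that the workload itself generated, independent of the request's preset.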
This section describes activation interfaces for CPC.
int cpc_enable(cpc_t *cpc);
int cpc_disable(cpc_t *cpc);
These two interfaces respectively enable and disable counters of any set that is bound to the executing LWP. Use of these interfaces enables an application to designate code of interest while deferring the counter configuration to a controlling process by using libpctx.
This section describes CPC's error handling interfaces.
typedef void (cpc_errhndlr_t)(const char *fn, int subcode, const char *fmt,
          va_list ap);
void cpc_seterrhndlr(cpc_t *cpc, cpc_errhndlr_t *errhndlr);
The cpc_seterrhndlr() function takes a cpc_t handle and the error handler to install for that handle. A cpc_errhndlr_t handler takes an integer subcode in addition to a string. The integer subcode describes the specific error that was encountered by the function that the fn argument names. The integer subcode simplifies an application's recognition of error conditions. The string value of the fmt argument contains an internationalized description of the error subcode and is suitable for printing.
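A handler with the cpc_errhndlr_t signature might look like the following sketch. It needs only the standard C library, so it builds on any system. The names log_cpc_error(), last_cpc_subcode, and demo_raise() are hypothetical, and demo_raise() exists only to stand in for the library invoking the handler.

```c
#include <stdarg.h>
#include <stdio.h>

/* Most recent subcode seen; -1 until the handler runs (sketch convention). */
int last_cpc_subcode = -1;

/* Has the cpc_errhndlr_t signature: records the subcode, prints the text. */
void log_cpc_error(const char *fn, int subcode, const char *fmt, va_list ap)
{
    last_cpc_subcode = subcode;
    (void) fprintf(stderr, "%s (subcode %d): ", fn, subcode);
    (void) vfprintf(stderr, fmt, ap);
}

/* Hypothetical driver that stands in for the library calling the handler. */
void demo_raise(const char *fn, int subcode, const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    log_cpc_error(fn, subcode, fmt, ap);
    va_end(ap);
}
```

On Solaris, the handler would be installed with a call such as cpc_seterrhndlr(cpc, log_cpc_error). Recording the subcode lets the application branch on the specific error instead of parsing the formatted message.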