Solaris Dynamic Tracing Guide

Chapter 18 `lockstat` Provider

The lockstat provider makes available probes that can be used to discern lock contention statistics, or to understand virtually any aspect of locking behavior. The lockstat(1M) command is actually a DTrace consumer that uses the lockstat provider to gather its raw data.

Overview

The lockstat provider makes available two kinds of probes: contention-event probes and hold-event probes.

Contention-event probes correspond to contention on a synchronization primitive, and fire when a thread is forced to wait for a resource to become available. Solaris is generally optimized for the non-contention case, so prolonged contention is not expected. These probes should be used to understand those cases where contention does arise. Because contention is relatively rare, enabling contention-event probes generally doesn't substantially affect performance.
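
As a minimal sketch of how contention can be surveyed, the following script counts firings of each contention-event probe described later in this chapter. The aggregation name @contention is chosen here for illustration only.

```d
lockstat:::adaptive-block,
lockstat:::adaptive-spin,
lockstat:::spin-spin,
lockstat:::thread-spin,
lockstat:::rw-block
{
	/* One count per contention-event probe name. */
	@contention[probename] = count();
}
```

Because only contention-event probes are enabled, a script like this should have little effect on a system that is not heavily contended.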

Hold-event probes correspond to acquiring, releasing, or otherwise manipulating a synchronization primitive. These probes can be used to answer arbitrary questions about the way synchronization primitives are manipulated. Because Solaris acquires and releases synchronization primitives very often (on the order of millions of times per second per CPU on a busy system), enabling hold-event probes has a much higher probe effect than does enabling contention-event probes. While the probe effect induced by enabling them can be substantial, it is not pathological; they may still be enabled with confidence on production systems.
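
One question that hold-event probes can answer is how long locks are held. The following is a minimal sketch, not taken from lockstat(1M) itself, that pairs the adaptive-acquire and adaptive-release probes described later in this chapter. It assumes that arg0 identifies the lock, as stated for the adaptive lock probes, and the names self->acquired and @hold are illustrative.

```d
lockstat:::adaptive-acquire
{
	/* Remember when this thread acquired the lock at arg0. */
	self->acquired[arg0] = timestamp;
}

lockstat:::adaptive-release
/self->acquired[arg0]/
{
	/* Distribution of hold times, in nanoseconds. */
	@hold["adaptive hold time (ns)"] =
	    quantize(timestamp - self->acquired[arg0]);
	self->acquired[arg0] = 0;
}
```

Because this script enables hold-event probes for every adaptive lock on the system, expect a noticeable probe effect while it runs.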

The lockstat provider makes available probes that correspond to the different synchronization primitives in Solaris; these primitives and the probes that correspond to them are discussed in the remainder of this chapter.

Adaptive Lock Probes

Adaptive locks enforce mutual exclusion to a critical section, and may be acquired in most contexts in the kernel. Because adaptive locks have few context restrictions, they comprise the vast majority of synchronization primitives in the Solaris kernel. These locks are adaptive in their behavior with respect to contention: when a thread attempts to acquire a held adaptive lock, it will determine if the owning thread is currently running on a CPU. If the owner is running on another CPU, the acquiring thread will spin. If the owner is not running, the acquiring thread will block.

The four lockstat probes pertaining to adaptive locks are in Table 18–1. For each probe, arg0 contains a pointer to the kmutex_t structure that represents the adaptive lock.

Table 18–1 Adaptive Lock Probes


`adaptive-acquire`	Hold-event probe that fires immediately after an adaptive lock is acquired.
`adaptive-block`	Contention-event probe that fires after a thread that has blocked on a held adaptive mutex has reawakened and has acquired the mutex. If both probes are enabled, `adaptive-block` fires before `adaptive-acquire`. At most one of `adaptive-block` and `adaptive-spin` will fire for a single lock acquisition. `arg1` for `adaptive-block` contains the sleep time in nanoseconds.
`adaptive-spin`	Contention-event probe that fires after a thread that has spun on a held adaptive mutex has successfully acquired the mutex. If both are enabled, `adaptive-spin` fires before `adaptive-acquire`. At most one of `adaptive-spin` and `adaptive-block` will fire for a single lock acquisition. `arg1` for `adaptive-spin` contains the spin count: the number of iterations that were taken through the spin loop before the lock was acquired. The spin count has little meaning on its own, but can be used to compare spin times.
`adaptive-release`	Hold-event probe that fires immediately after an adaptive lock is released.
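
As an illustration of the arg1 values described in Table 18–1, the following sketch records the distribution of sleep times reported by adaptive-block and of spin counts reported by adaptive-spin. The aggregation names and key strings are arbitrary labels chosen for this example.

```d
lockstat:::adaptive-block
{
	/* arg1 is the sleep time in nanoseconds. */
	@sleep["adaptive-block sleep time (ns)"] = quantize(arg1);
}

lockstat:::adaptive-spin
{
	/* arg1 is the spin count; useful only for comparison. */
	@spins["adaptive-spin spin count"] = quantize(arg1);
}
```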

Spin Lock Probes

Threads cannot block in some contexts in the kernel, such as high-level interrupt context and any context manipulating dispatcher state. In these contexts, this restriction prevents the use of adaptive locks. Spin locks are instead used to effect mutual exclusion to critical sections in these contexts. As the name implies, the behavior of these locks in the presence of contention is to spin until the lock is released by the owning thread. The three probes pertaining to spin locks are in Table 18–2.

Table 18–2 Spin Lock Probes


`spin-acquire`	Hold-event probe that fires immediately after a spin lock is acquired.
`spin-spin`	Contention-event probe that fires after a thread that has spun on a held spin lock has successfully acquired the spin lock. If both are enabled, `spin-spin` fires before `spin-acquire`. `arg1` for `spin-spin` contains the spin count: the number of iterations that were taken through the spin loop before the lock was acquired. The spin count has little meaning on its own, but can be used to compare spin times.
`spin-release`	Hold-event probe that fires immediately after a spin lock is released.
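
Because the spin count is useful mainly for comparison, one way to apply it is to total the spin counts by kernel stack trace and compare code paths. The following is a sketch under that assumption; the aggregation name @spins and the stack depth of five frames are arbitrary choices.

```d
lockstat:::spin-spin
{
	/* arg1 is the spin count for this acquisition. */
	@spins[stack(5)] = sum(arg1);
}
```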

Adaptive locks are much more common than spin locks. The following script displays totals for both lock types to provide data to support this observation.

lockstat:::adaptive-acquire
/execname == "date"/
{
	@locks["adaptive"] = count();
}

lockstat:::spin-acquire
/execname == "date"/
{
	@locks["spin"] = count();
}

Run this script in one window, and a date(1) command in another. When you terminate the DTrace script, you will see output similar to the following example:

# dtrace -s ./whatlock.d
dtrace: script './whatlock.d' matched 5 probes 
^C
spin                                                             26
adaptive                                                       2981

As this output indicates, over 99 percent of the locks acquired in running the date command are adaptive locks. It may be surprising that so many locks are acquired in doing something as simple as running date. The large number of locks is a natural artifact of the fine-grained locking required of an extremely scalable system like the Solaris kernel.

Thread Locks

Thread locks are a special kind of spin lock used to lock a thread for the purpose of changing thread state. Thread lock hold events are available as spin lock hold-event probes (that is, spin-acquire and spin-release), but contention events have their own probe specific to thread locks. The thread lock contention-event probe is in Table 18–3.

Table 18–3 Thread Lock Probe


`thread-spin`	Contention-event probe that fires after a thread has spun on a thread lock. Like other contention-event probes, if both the contention-event probe and the hold-event probe are enabled, `thread-spin` fires before `spin-acquire`. Unlike other contention-event probes, however, `thread-spin` fires before the lock is actually acquired. As a result, multiple `thread-spin` probe firings may correspond to a single `spin-acquire` probe firing.
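
A simple way to see where thread lock contention arises is to count thread-spin firings by kernel stack trace, as in the following sketch. The aggregation name @spun is illustrative.

```d
lockstat:::thread-spin
{
	/* Count firings per kernel stack trace. */
	@spun[stack()] = count();
}
```

Because thread-spin can fire more than once for a single acquisition, these counts reflect probe firings rather than the number of contended acquisitions.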

Readers/Writer Lock Probes

Readers/writer locks enforce a policy of allowing multiple readers or a single writer — but not both — to be in a critical section. These locks are typically used for structures that are searched more frequently than they are modified and for which there is substantial time in the critical section. If critical section times are short, readers/writer locks will implicitly serialize over the shared memory used to implement the lock, giving them no advantage over adaptive locks. See rwlock(9F) for more details on readers/writer locks.

The probes pertaining to readers/writer locks are in Table 18–4. For each probe, arg0 contains a pointer to the krwlock_t structure that represents the readers/writer lock.

Table 18–4 Readers/Writer Lock Probes


`rw-acquire`	Hold-event probe that fires immediately after a readers/writer lock is acquired. `arg1` contains the constant `RW_READER` if the lock was acquired as a reader, and `RW_WRITER` if the lock was acquired as a writer.
`rw-block`	Contention-event probe that fires after a thread that has blocked on a held readers/writer lock has reawakened and has acquired the lock. `arg1` contains the length of time (in nanoseconds) that the current thread had to sleep to acquire the lock. `arg2` contains the constant `RW_READER` if the lock was acquired as a reader, and `RW_WRITER` if the lock was acquired as a writer. `arg3` and `arg4` contain more information on the reason for blocking. `arg3` is non-zero if and only if the lock was held as a writer when the current thread blocked. `arg4` contains the readers count when the current thread blocked. If both the `rw-block` and `rw-acquire` probes are enabled, `rw-block` fires before `rw-acquire`.
`rw-upgrade`	Hold-event probe that fires after a thread has successfully upgraded a readers/writer lock from a reader to a writer. Upgrades do not have an associated contention event because they are only possible through a non-blocking interface, rw_tryupgrade(9F).
`rw-downgrade`	Hold-event probe that fires after a thread has downgraded its ownership of a readers/writer lock from writer to reader. Downgrades do not have an associated contention event because they always succeed without contention.
`rw-release`	Hold-event probe that fires immediately after a readers/writer lock is released. `arg1` contains the constant `RW_READER` if the released lock was held as a reader, and `RW_WRITER` if the released lock was held as a writer. Due to upgrades and downgrades, the lock may not have been released as it was acquired.
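
The arguments in Table 18–4 can be combined to separate reader activity from writer activity. The following is a sketch only: it assumes that the kernel enumerator RW_READER resolves by name in D, and the aggregation names and label strings are chosen for illustration.

```d
lockstat:::rw-acquire
{
	/* arg1 tells whether the lock was acquired as reader or writer. */
	@acquired[arg1 == RW_READER ? "reader" : "writer"] = count();
}

lockstat:::rw-block
{
	/*
	 * arg1 is the sleep time in nanoseconds, arg2 is the requested
	 * mode, and arg3 is non-zero if a writer held the lock.
	 */
	@blocked[arg2 == RW_READER ? "reader" : "writer",
	    arg3 != 0 ? "held by writer" : "held by readers"] = quantize(arg1);
}
```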

Stability

The lockstat provider uses DTrace's stability mechanism to describe its stabilities as shown in the following table. For more information about the stability mechanism, see Chapter 39, Stability.

Element	Name stability	Data stability	Dependency class
Provider	Evolving	Evolving	Common
Module	Private	Private	Unknown
Function	Private	Private	Unknown
Name	Evolving	Evolving	Common
Arguments	Evolving	Evolving	Common