Solaris Dynamic Tracing Guide

Chapter 13 Speculative Tracing

This chapter discusses the DTrace facility for speculative tracing, the ability to tentatively trace data and then later decide whether to commit the data to a tracing buffer or discard it. In DTrace, the primary mechanism for filtering out uninteresting events is the predicate mechanism, discussed in Chapter 4, D Program Structure. Predicates are useful when you know at the time that a probe fires whether or not the probe event is of interest. For example, if you are only interested in activity associated with a certain process or a certain file descriptor, you know when the probe fires if it is associated with the process or file descriptor of interest. However, in other situations, you might not know whether a given probe event is of interest until some time after the probe fires.

For example, if a system call is occasionally failing with a common error code (for example, EIO or EINVAL), you might want to examine the code path leading to the error condition. To capture the code path, you could enable every probe — but only if the failing call can be isolated in such a way that a meaningful predicate can be constructed. If the failures are sporadic or nondeterministic, you would be forced to trace all events that might be interesting, and later postprocess the data to filter out the ones that were not associated with the failing code path. In this case, even though the number of interesting events may be reasonably small, the number of events that must be traced is very large, making postprocessing difficult.

You can use the speculative tracing facility in these situations to tentatively trace data at one or more probe locations, and then decide to commit the data to the principal buffer at another probe location. As a result, your trace data contains only the output of interest, no postprocessing is required, and the DTrace overhead is minimized.

Speculation Interfaces

The following table describes the DTrace speculation functions:

Table 13–1 DTrace Speculation Functions


Function Name	Args	Description
`speculation`	None	Returns an identifier for a new speculative buffer
`speculate`	ID	Denotes that the remainder of the clause should be traced to the speculative buffer specified by ID
`commit`	ID	Commits the speculative buffer associated with ID
`discard`	ID	Discards the speculative buffer associated with ID

Creating a Speculation

The speculation() function allocates a speculative buffer, and returns a speculation identifier. The speculation identifier should be used in subsequent calls to the speculate() function. Speculative buffers are a finite resource: if no speculative buffer is available when speculation() is called, an ID of zero is returned and a corresponding DTrace error counter is incremented. An ID of zero is always invalid, but may be passed to speculate(), commit() or discard(). If a call to speculation() fails, a dtrace message similar to the following example is generated:

dtrace: 2 failed speculations (no speculative buffer space available)

The number of speculative buffers defaults to one, but may be optionally tuned higher. See Speculation Options and Tuning for more information.

Using a Speculation

To use a speculation, an identifier returned from speculation() must be passed to the speculate() function in a clause before any data-recording actions. All subsequent data-recording actions in a clause containing a speculate() will be speculatively traced. The D compiler will generate a compile-time error if a call to speculate() follows data recording actions in a D probe clause. Therefore, clauses may contain speculative tracing or non-speculative tracing requests, but not both.

Aggregating actions, destructive actions, and the exit action may never be speculative. Any attempt to take one of these actions in a clause containing a speculate() results in a compile-time error. A speculate() may not follow a speculate(): only one speculation is permitted per clause. A clause that contains only a speculate() will speculatively trace the default action, which is defined to trace only the enabled probe ID. See Chapter 10, Actions and Subroutines for a description of the default action.

Typically, you assign the result of speculation() to a thread-local variable and then use that variable as a subsequent predicate to other probes as well as an argument to speculate(). For example:

syscall::open:entry
{
	self->spec = speculation();
}

syscall:::
/self->spec/
{
	speculate(self->spec);
	printf("this is speculative");
}

Committing a Speculation

You commit speculations using the commit() function. When a speculative buffer is committed, its data is copied into the principal buffer. If there is more data in the specified speculative buffer than there is available space in the principal buffer, no data is copied and the drop count for the buffer is incremented. If the buffer has been speculatively traced to on more than one CPU, the speculative data on the committing CPU is copied immediately, while speculative data on other CPUs is copied some time after the commit(). Thus, some time might elapse between a commit() beginning on one CPU and the data being copied from speculative buffers to principal buffers on all CPUs. This time is guaranteed to be no longer than the time dictated by the cleaning rate. See Speculation Options and Tuning for more details.

A committing speculative buffer will not be made available to subsequent speculation() calls until each per-CPU speculative buffer has been completely copied into its corresponding per-CPU principal buffer. Similarly, subsequent calls to speculate() to the committing buffer will be silently discarded, and subsequent calls to commit() or discard() will silently fail. Finally, a clause containing a commit() cannot contain a data recording action, but a clause may contain multiple commit() calls to commit disjoint buffers.

Discarding a Speculation

You discard speculations using the discard() function. When a speculative buffer is discarded, its contents are thrown away. If the speculation has only been active on the CPU calling discard(), the buffer is immediately available for subsequent calls to speculation(). If the speculation has been active on more than one CPU, the discarded buffer will be available for subsequent speculation() some time after the call to discard(). The time between a discard() on one CPU and the buffer being made available for subsequent speculations is guaranteed to be no longer than the time dictated by the cleaning rate. If, at the time speculation() is called, no buffer is available because all speculative buffers are currently being discarded or committed, adtrace message similar to the following example is generated:

dtrace: 905 failed speculations (available buffer(s) still busy)

The likelihood of all buffers being unavailable can be reduced by tuning the number of speculation buffers or the cleaning rate. See Speculation Options and Tuning, for details.

Speculation Example

One potential use for speculations is to highlight a particular code path. The following example shows the entire code path under the open(2) system call only when the open() fails:

Example 13–1 `specopen.d`: Code Flow for Failed open(2)

#!/usr/sbin/dtrace -Fs

syscall::open:entry,
syscall::open64:entry
{
	/*
	 * The call to speculation() creates a new speculation.  If this fails,
	 * dtrace(1M) will generate an error message indicating the reason for
	 * the failed speculation(), but subsequent speculative tracing will be
	 * silently discarded.
	 */
	self->spec = speculation();
	speculate(self->spec);

	/*
	 * Because this printf() follows the speculate(), it is being 
	 * speculatively traced; it will only appear in the data buffer if the
	 * speculation is subsequently commited.
	 */
	printf("%s", stringof(copyinstr(arg0)));
}

fbt:::
/self->spec/
{
	/*
	 * A speculate() with no other actions speculates the default action:
	 * tracing the EPID.
	 */
	speculate(self->spec);
}

syscall::open:return,
syscall::open64:return
/self->spec/
{
	/*
	 * To balance the output with the -F option, we want to be sure that
	 * every entry has a matching return.  Because we speculated the
	 * open entry above, we want to also speculate the open return.
	 * This is also a convenient time to trace the errno value.
	 */
	speculate(self->spec);
	trace(errno);
}

syscall::open:return,
syscall::open64:return
/self->spec && errno != 0/
{
	/*
	 * If errno is non-zero, we want to commit the speculation.
	 */
	commit(self->spec);
	self->spec = 0;
}

syscall::open:return,
syscall::open64:return
/self->spec && errno == 0/
{
	/*
	 * If errno is not set, we discard the speculation.
	 */
	discard(self->spec);
	self->spec = 0;
}

Running the above script produces output similar to the following example:

# ./specopen.d
dtrace: script './specopen.d' matched 24282 probes
CPU FUNCTION                                 
  1  => open                                  /var/ld/ld.config
  1    -> open                                
  1      -> copen                             
  1        -> falloc                          
  1          -> ufalloc                       
  1            -> fd_find                     
  1              -> mutex_owned               
  1              <- mutex_owned               
  1            <- fd_find                     
  1            -> fd_reserve                  
  1              -> mutex_owned               
  1              <- mutex_owned               
  1              -> mutex_owned               
  1              <- mutex_owned               
  1            <- fd_reserve                  
  1          <- ufalloc                       
  1          -> kmem_cache_alloc              
  1            -> kmem_cache_alloc_debug      
  1              -> verify_and_copy_pattern   
  1              <- verify_and_copy_pattern   
  1              -> file_cache_constructor    
  1                -> mutex_init              
  1                <- mutex_init              
  1              <- file_cache_constructor    
  1              -> tsc_gethrtime             
  1              <- tsc_gethrtime             
  1              -> getpcstack                
  1              <- getpcstack                
  1              -> kmem_log_enter            
  1              <- kmem_log_enter            
  1            <- kmem_cache_alloc_debug      
  1          <- kmem_cache_alloc              
  1          -> crhold                        
  1          <- crhold                        
  1        <- falloc                          
  1        -> vn_openat                       
  1          -> lookupnameat                  
  1            -> copyinstr                   
  1            <- copyinstr                   
  1            -> lookuppnat                  
  1              -> lookuppnvp                
  1                -> pn_fixslash             
  1                <- pn_fixslash             
  1                -> pn_getcomponent         
  1                <- pn_getcomponent         
  1                -> ufs_lookup              
  1                  -> dnlc_lookup           
  1                    -> bcmp                
  1                    <- bcmp                
  1                  <- dnlc_lookup           
  1                  -> ufs_iaccess           
  1                    -> crgetuid            
  1                    <- crgetuid            
  1                    -> groupmember         
  1                      -> supgroupmember    
  1                      <- supgroupmember
  1                    <- groupmember         
  1                  <- ufs_iaccess           
  1                <- ufs_lookup              
  1                -> vn_rele                 
  1                <- vn_rele                 
  1                -> pn_getcomponent         
  1                <- pn_getcomponent         
  1                -> ufs_lookup              
  1                  -> dnlc_lookup           
  1                    -> bcmp                
  1                    <- bcmp                
  1                  <- dnlc_lookup           
  1                  -> ufs_iaccess           
  1                    -> crgetuid            
  1                    <- crgetuid            
  1                  <- ufs_iaccess           
  1                <- ufs_lookup              
  1                -> vn_rele                 
  1                <- vn_rele                 
  1                -> pn_getcomponent         
  1                <- pn_getcomponent         
  1                -> ufs_lookup              
  1                  -> dnlc_lookup           
  1                    -> bcmp                
  1                    <- bcmp                
  1                  <- dnlc_lookup           
  1                  -> ufs_iaccess           
  1                    -> crgetuid            
  1                    <- crgetuid            
  1                  <- ufs_iaccess           
  1                  -> vn_rele               
  1                  <- vn_rele               
  1                <- ufs_lookup              
  1                -> vn_rele                 
  1                <- vn_rele                 
  1              <- lookuppnvp                
  1            <- lookuppnat                  
  1          <- lookupnameat                  
  1        <- vn_openat                       
  1        -> setf                            
  1          -> fd_reserve                    
  1            -> mutex_owned                 
  1            <- mutex_owned                 
  1            -> mutex_owned                 
  1            <- mutex_owned                 
  1          <- fd_reserve                    
  1          -> cv_broadcast                  
  1          <- cv_broadcast                  
  1        <- setf                            
  1        -> unfalloc                        
  1          -> mutex_owned                   
  1          <- mutex_owned                   
  1          -> crfree                        
  1          <- crfree                        
  1          -> kmem_cache_free               
  1            -> kmem_cache_free_debug       
  1              -> kmem_log_enter            
  1              <- kmem_log_enter            
  1              -> tsc_gethrtime             
  1              <- tsc_gethrtime             
  1              -> getpcstack                
  1              <- getpcstack                
  1              -> kmem_log_enter            
  1              <- kmem_log_enter
  1              -> file_cache_destructor     
  1                -> mutex_destroy           
  1                <- mutex_destroy           
  1              <- file_cache_destructor     
  1              -> copy_pattern              
  1              <- copy_pattern              
  1            <- kmem_cache_free_debug       
  1          <- kmem_cache_free               
  1        <- unfalloc                        
  1        -> set_errno                       
  1        <- set_errno                       
  1      <- copen                             
  1    <- open                                
  1  <= open                                          2

Speculation Options and Tuning

If a speculative buffer is full when a speculative tracing action is attempted, no data is stored in the buffer and a drop count is incremented. If this situation, a dtrace message similar to the following example is generated:

dtrace: 38 speculative drops

Speculative drops will not prevent the full speculative buffer from being copied into the principal buffer when the buffer is committed. Similarly, speculative drops can occur even if drops were experienced on a speculative buffer that was ultimately discarded. Speculative drops can be reduced by increasing the speculative buffer size, which is tuned using the specsize option. The specsize option may be specified with any size suffix. The resizing policy of this buffer is dictated by the bufresize option.

Speculative buffers might be unavailable when speculation() is called. If buffers exist that have not yet been committed or discarded, a dtrace message similar to the following example is generated:

dtrace: 1 failed speculation (no speculative buffer available)

You can reduce the likelihood of failed speculations of this nature by increasing the number of speculative buffers with the nspec option. The value of nspec defaults to one.

Alternatively, speculation() may fail because all speculative buffers are busy. In this case, a dtrace message similar to the following example is generated:

dtrace: 1 failed speculation (available buffer(s) still busy)

This message indicates that speculation() was called after commit() was called for a speculative buffer, but before that buffer was actually committed on all CPUs. You can reduce the likelihood of failed speculations of this nature by increasing the rate at which CPUs are cleaned with the cleanrate option. The value of cleanrate defaults to 101.