Analyzing Program Performance With Sun WorkShop

Chapter 5 Lock Analysis Tool

LockLint is a command line utility that analyzes the use of mutex and multiple readers/single writer locks, and looks for inconsistent use of these locking techniques.

Basic Concepts

In the multithreading model, a process consists of one or more threads of control that share a common address space and most other process resources. Threads must acquire and release locks associated with the data they share. If they fail to do so, a data race may ensue--a situation in which a program may produce different results when run repeatedly with the same input.

Data races are easy to introduce: simply accessing a variable without first acquiring the appropriate lock can cause one. They are also generally very difficult to find. Symptoms manifest themselves only if two threads access the improperly protected data at nearly the same time; hence a program containing a data race may run correctly for months without showing any signs of a problem. It is extremely difficult to exhaustively test all concurrent states of even a simple multithreaded program, so conventional testing and debugging are not an adequate defense against data races.
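As a minimal sketch of the locking discipline described above, the following C fragment has several threads increment a shared counter, each taking the same mutex around the access. The names counter_lock, counter, worker, and run_workers are hypothetical and chosen for illustration only. Removing the lock and unlock calls would turn the increment into exactly the kind of unprotected access that can lose updates on some runs and not on others.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NTHREADS 4
#define NITERS   100000

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;            /* protected by counter_lock */

/* Each worker increments the shared counter NITERS times under the lock. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < NITERS; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;              /* without the lock, increments could be lost */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs NTHREADS workers to completion and returns the final count. */
long run_workers(void)
{
    pthread_t tid[NTHREADS];

    counter = 0;
    for (int i = 0; i < NTHREADS; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, NULL);
        assert(rc == 0);        /* thread creation is not expected to fail here */
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(tid[i], NULL);

    /* All workers have been joined, so no other thread touches counter now. */
    return counter;
}
```

With the mutex in place, run_workers() returns NTHREADS * NITERS on every run.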

Most processes share several resources. Operations within the application may require access to more than one of those resources. This means that the operation needs to grab a lock for each of the resources before performing the operation. If different operations use a common set of resources, but the order in which they acquire the locks is inconsistent, there is a potential for deadlock. For example, the simplest case of deadlock occurs when two threads hold locks for different resources and each thread tries to acquire the lock for the resource held by the other thread.
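The usual defense against this kind of deadlock is a fixed acquisition order. The sketch below assumes two hypothetical resources, res_a and res_b, each with its own mutex, and two operations that need both. Both operations take res_a_lock before res_b_lock, so neither can hold one lock while waiting for a thread that holds the other. All names here are illustrative.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t res_a_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t res_b_lock = PTHREAD_MUTEX_INITIALIZER;
static int res_a = 10;          /* protected by res_a_lock */
static int res_b = 0;           /* protected by res_b_lock */

/* Convention for this sketch: res_a_lock is always acquired before res_b_lock. */
static void lock_both(void)
{
    pthread_mutex_lock(&res_a_lock);
    pthread_mutex_lock(&res_b_lock);
}

static void unlock_both(void)
{
    pthread_mutex_unlock(&res_b_lock);
    pthread_mutex_unlock(&res_a_lock);
}

/* Moves n units from resource A to resource B. */
void move_a_to_b(int n)
{
    assert(n >= 0);
    lock_both();
    res_a -= n;
    res_b += n;
    unlock_both();
}

/* Moves n units from B to A. It still locks A first, even though it
 * "starts" from B; locking B first here would create the inconsistent
 * order that makes deadlock possible. */
void move_b_to_a(int n)
{
    assert(n >= 0);
    lock_both();
    res_b -= n;
    res_a += n;
    unlock_both();
}

/* Copies both values out while holding both locks. */
void snapshot(int *a, int *b)
{
    lock_both();
    *a = res_a;
    *b = res_b;
    unlock_both();
}
```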

LockLint Overview

When analyzing locks and how they are used, LockLint detects a common cause of data races: failure to hold the appropriate lock while accessing a variable.

Table 5-1, Table 5-2, and Table 5-3 list the routines of the Solaris and POSIX libthread APIs recognized by LockLint.

Table 5-1 Readers-Writer Locks


Solaris	Kernel (Solaris only)
`rw_rdlock` `rw_wrlock` `rw_unlock` `rw_tryrdlock` `rw_trywrlock`	`rw_enter` `rw_exit` `rw_tryenter` `rw_downgrade` `rw_tryupgrade`

Table 5-2 Condition Variables


Solaris	POSIX	Kernel (Solaris only)
`cond_broadcast` `cond_wait` `cond_timedwait` `cond_signal`	`pthread_cond_broadcast` `pthread_cond_wait` `pthread_cond_timedwait` `pthread_cond_signal`	`cv_broadcast` `cv_wait` `cv_wait_sig` `cv_wait_sig_swap` `cv_timedwait` `cv_timedwait_sig` `cv_signal`

Table 5-3 Mutex (Mutual Exclusion) Locks


Solaris	POSIX	Kernel (Solaris only)
`mutex_lock` `mutex_unlock` `mutex_trylock`	`pthread_mutex_lock` `pthread_mutex_unlock` `pthread_mutex_trylock`	`mutex_enter` `mutex_exit` `mutex_tryenter`

Additionally, LockLint recognizes the structure types shown in Table 5-4.

Table 5-4 Lock Structures


Solaris	POSIX	Kernel (Solaris only)
mutex_t	pthread_mutex_t	kmutex_t
rwlock_t		krwlock_t

LockLint reports several kinds of basic information about the modules it analyzes, including:

Locking side effects of functions. Unknown side effects can lead to data races or deadlocks.

Accesses to variables that are not consistently protected by at least one lock, and accesses that violate assertions about which locks protect them. This information can point to a potential data race.

Cycles and inconsistent lock-order acquisitions. This information can point to potential deadlocks.

Variables that were protected by a given lock. This can assist in judging the appropriateness of the chosen granularity, that is, which variables are protected by which locks.

LockLint provides subcommands for specifying assertions about the application. During the analysis phase, LockLint reports any violation of the assertions.

Note -

Add assertions liberally, and use the analysis phase to refine assertions and to make sure that new code does not violate the established locking conventions of the program.

Collecting Information for LockLint

The compiler gathers the information used by LockLint. More specifically, you specify a command-line option, -Zll, to the C compiler to generate a .ll file for each .c source code file. The .ll file contains information about the flow of control in each function and about each access to a variable or operation on a mutex or readers-writer lock.

Note -

No .o file is produced when you compile with the -Zll flag.

LockLint User Interface

There are two ways for you to interact with LockLint: source code annotations and the command-line interface.

Source code annotations are assertions and NOTEs that you place in your source code to pass information to LockLint. LockLint can verify certain assertions about the states of locks at specific points in your code, and annotations can be used either to verify that locking behavior is correct or to avoid unnecessary warnings.

See "Source Code Annotations" for more information.

Alternatively, you can use LockLint subcommands to load the relevant .ll files and make assertions. This interface to LockLint consists of a lock_lint command and a set of subcommands that you specify on the lock_lint command line.

The important features of the lock_lint subcommands are:

You can exercise a few additional controls that have no corresponding annotations.
You can make a number of useful queries about the functions, variables, function pointers, and locks in your program.

LockLint subcommands help you analyze your code and discover which variables are not consistently protected by locks. You may make assertions about which variables are supposed to be protected by a lock and which locks are supposed to be held whenever a function is called. Running the analysis with such assertions in place will show you where the assertions are violated.

See Appendix A, LockLint Command Reference.

Most programmers report that they find source code annotations preferable to command-line subcommands. However, there is not always a one-to-one correspondence between the two.

How to Use LockLint

Using LockLint consists of three steps:

Setting up the environment for using LockLint
Compiling the source code to be analyzed, producing the LockLint database files (.ll files)
Using the lock_lint command to run a LockLint session

These steps are described in the rest of this section.

Figure 5-1 shows the control flow of the tasks involved in using LockLint:

Figure 5-1 LockLint Control Flow

Use LockLint to refine the set of assertions you maintain for the implementation of your system. A rich set of assertions enables LockLint to validate existing and new source code as you work.

Managing LockLint's Environment

The LockLint interface consists of the lock_lint command, which is executed in a shell, and the lock_lint subcommands. By default, LockLint uses the shell given by the environment variable $SHELL. Alternatively, you can have LockLint execute any shell by naming that shell on the lock_lint start command. This example starts a LockLint session in the Korn shell:

% lock_lint start /bin/ksh

LockLint creates an environment variable called LL_CONTEXT, which is visible in the child shell. If you are using a shell that provides for initialization, you can arrange to have the lock_lint command source a .ll_init file in your home directory, and then execute a .ll_init file in the current directory if it exists. If you use csh, you can do this by inserting the following code into your .cshrc file:

if ($?LL_CONTEXT) then
	if ( -x $HOME/.ll_init ) source $HOME/.ll_init
endif

It is better not to have your .cshrc source the file in your current working directory, since others may want to run LockLint on those same files, and they may not use the same shell you do. Since you are the only one who is going to use your $HOME/.ll_init, you should source that one, so that you can change the prompt and define aliases for use during your LockLint session. The following version of ~/.ll_init does this for csh:

# Cause analyze subcommand to save state before analysis.
alias analyze "lock_lint save before analyze;\
	lock_lint analyze"
# Change prompt to show we are in lock_lint.
set prompt="lock_lint~$prompt"

Also see "start".

When executing subcommands, remember that you can use pipes, redirection, backward quotes (`), and so on to accomplish your aims. For example, the following command asserts that lock foo protects all global variables (the formal name for a global variable begins with a colon):

% lock_lint assert foo protects `lock_lint vars | grep ^:`

In general, the subcommands are set up for easy use with filters such as grep and sed. This is particularly true for vars and funcs, which put out a single line of information for each variable or function. Each line contains the attributes (defined and derived) for that variable or function. The following example shows which members of struct bar are supposed to be protected by member lock:

% lock_lint vars -a `lock_lint members bar` | grep =bar::lock

Since you are using a shell interface, a log of user commands can be obtained by using the shell's history function (the history level may need to be made large in the .ll_init file).

Temporary Files

LockLint puts temporary files in /var/tmp unless $TMPDIR is set.

Makefile Rules

To modify your makefile to produce .ll files, first use the rule for creating a .o from a .c to write a rule to create a .ll from a .c. For example, from:

# Rule for making .o from .c in ../src.
%.o: ../src/%.c
	$(COMPILE.c) -o $@ $<

you might write:

# Rule for making .ll from .c in ../src.
%.ll: ../src/%.c
	cc $(CFLAGS) $(CPPFLAGS) $(FOO) $<

In the above example, the -Zll flag would have to be specified in the make macros for compiler options (CFLAGS and CPPFLAGS).

If you use a suffix rule, you will need to define .ll as a suffix. For that reason some prefer to use % rules.

If the appropriate .o files are contained in a make variable FOO_OBJS, you can create FOO_LLS with the line:

FOO_LLS = ${FOO_OBJS:%.o=%.ll}

or, if they are in a subdirectory ll:

FOO_LLS = ${FOO_OBJS:%.o=ll/%.ll}

If you want to keep the .ll files in subdirectory ll/, you can have the makefile create this directory automatically with the following target:

.INIT:
	@if [ ! -d ll ]; then mkdir ll; fi

Compiling Code

For LockLint to analyze your source code, you must first compile it using the -Zll option of the Sun WorkShop ANSI C compiler. The compiler then produces the LockLint database files (.ll files), one for each .c file compiled. Later you load the .ll files into LockLint with the load subcommand.

LockLint sometimes needs a simpler view of the code to return meaningful results during analysis. To allow you to provide this simpler view, the -Zll option automatically defines the preprocessor symbol __lock_lint; further discussions of the likely uses of __lock_lint can be found in "Limitations of LockLint".

LockLint Subcommands

The interface to LockLint consists of a set of subcommands that can be specified with the lock_lint command:

lock_lint [subcommand]

In this example subcommand is one of a set of subcommands used to direct the analysis of the source code for data races and deadlocks. More information about subcommands can be found in Appendix A, LockLint Command Reference.

Starting and Exiting LockLint

The first subcommand of any LockLint session must be start, which starts a subshell of your choice with the appropriate LockLint context. Since a LockLint session is started within a subshell, you exit by exiting that subshell. For example, to exit LockLint when using the C shell, use the command exit.

Setting the Tool State

LockLint's state consists of the set of databases loaded and the specified assertions. Iteratively modifying that state and rerunning the analysis can provide optimal information on potential data races and deadlocks. Since the analysis can be done only once for any particular state, the save, restore, and refresh subcommands are provided as a means to reestablish a state, modify that state, and retry the analysis.

Checking an Application

1. Annotate your source code and compile it to create .ll files.

   See "Source Code Annotations".

2. Load the .ll files using the load subcommand.

3. Make assertions about locks protecting functions and variables using the assert subcommand.

   Note -
   These specifications may also be conveyed using source code annotations. See "Source Code Annotations".

4. Make assertions about the order in which locks should be acquired in order to avoid deadlocks, using the assert order subcommand.

   Note -
   These specifications may also be conveyed using source code annotations. See "Source Code Annotations".

5. Check that LockLint has the right idea about which functions are roots.

   If the funcs -o subcommand does not show a root function as root, use the declare root subcommand to fix it. If funcs -o shows a non-root function as root, it's likely that the function should be listed as a function target using the declare ... targets subcommand. See "declare root func" for a discussion of root functions.

6. Describe any hierarchical lock relationships (if you have any--they are rare) using the assert rwlock subcommand.

   Note -
   These specifications may also be conveyed using source code annotations. See "Source Code Annotations".

7. Tell LockLint to ignore any functions or variables you want to exclude from the analysis using the ignore subcommand.

   Be conservative in your use of the ignore command. Make sure you should not be using one of the source code annotations instead (for example, NO_COMPETING_THREADS_NOW).

8. Run the analysis using the analyze subcommand.

9. Deal with the errors.

   This may involve modifying the source using #ifdef __lock_lint (see "Limitations of LockLint") or adding source code annotations to accomplish steps 3, 4, 6, and 7 (see "Source Code Annotations").

10. Restore LockLint to the state it was in before the analysis and rerun the analysis as necessary.

    Note -
    It is best to handle the errors in order. Otherwise, problems with locks not being held on entry to a function, or locks being released while not held, can cause lots of misleading messages about variables not being properly protected.

11. Run the analysis using the analyze -v subcommand and repeat the above step.

12. When the errors from the analyze subcommand are gone, check for variables that are not properly protected by any lock.

    Use the command: lock_lint vars -h | fgrep \*

13. Rerun the analysis using appropriate assertions to find out where the variables are being accessed without holding the proper locks.

Remember that you cannot run analyze twice for a given state, so it will probably help to save the state of LockLint using the save subcommand before running analyze. Then restore that state using refresh or restore before adding more assertions. You may want to set up an alias for analyze that automatically does a save before analyzing.

Program Knowledge Management

LockLint acquires its information on the sources to be analyzed with a set of databases produced by the C compiler. The LockLint database for each source file is stored in a separate file. To analyze a set of source files, use the load subcommand to load their associated database files. The files subcommand can be used to display a list of the source files represented by the loaded database files. Once a file is loaded, LockLint knows about all the functions, global data, and external functions referenced in the associated source files.

Function Management

As part of the analysis phase, LockLint builds a call graph for all the loaded sources. Information about the functions defined is available via the funcs subcommand. It is extremely important for a meaningful analysis that LockLint have the correct call graph for the code to be analyzed.

All functions that are not called by any of the loaded files are called root functions. You may want to treat certain functions as root functions even though they are called within the loaded modules (for example, the function is an entry point for a library that is also called from within the library). Do this by using the declare root subcommand. You may also remove functions from the call graph by issuing the ignore subcommand.

LockLint knows about all the references to function pointers and most of the assignments made to them. Information about the function pointers in the currently loaded files is available through the funcptrs subcommand. Information about the calls made via function pointers is available via the pointer calls subcommand. If there are function pointer assignments that LockLint could not discover, they may be specified with the declare ... targets subcommand.

By default, LockLint tries to examine all possible execution paths. If the code uses function pointers, it's possible that many of the execution paths are not actually followed in normal operation of the code. This can result in the reporting of deadlocks that do not really occur. To prevent this, use the disallow and reallow subcommands to inform LockLint of execution paths that never occur. To print out existing constraints, use the reallows and disallows subcommands.
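The following sketch illustrates the kind of code where following every possible function pointer target overstates the problem. It assumes a hypothetical two-entry table of function pointers in which entry 0 is only ever called without table_lock held and entry 1 only with it held. A tool that considers both targets at both call sites would presumably see grow_taking_lock reached with table_lock already held and report a potential deadlock, although that path never executes. All names are illustrative, and how LockLint reports this particular case is not something the text above specifies.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static int table_size;          /* protected by table_lock */

typedef void (*grow_fn)(void);

/* Takes table_lock itself; must be called with the lock NOT held. */
static void grow_taking_lock(void)
{
    pthread_mutex_lock(&table_lock);
    table_size++;
    pthread_mutex_unlock(&table_lock);
}

/* Relies on the caller already holding table_lock. */
static void grow_lock_held(void)
{
    table_size++;
}

/* Entry 0 is only called unlocked; entry 1 only with table_lock held. */
static const grow_fn grow_table[2] = { grow_taking_lock, grow_lock_held };

/* Grows the table once, or twice under a single lock acquisition. */
void grow(int batch)
{
    if (batch) {
        pthread_mutex_lock(&table_lock);
        grow_table[1]();        /* never grow_taking_lock at run time */
        grow_table[1]();
        pthread_mutex_unlock(&table_lock);
    } else {
        grow_table[0]();
    }
}

/* Reads the size under the lock. */
int current_table_size(void)
{
    int n;

    pthread_mutex_lock(&table_lock);
    n = table_size;
    pthread_mutex_unlock(&table_lock);
    assert(n >= 0);
    return n;
}
```

In a case like this, the disallow subcommand described above is the mechanism for telling LockLint that the path from the locked call site to grow_taking_lock never occurs.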

Variable Management

The LockLint database also contains information about all global variables accessed in the source code. Information about these variables is available via the vars subcommand.

One of LockLint's jobs is to determine if variable accesses are consistently protected. If you are unconcerned about accesses to a particular variable, you can remove it from consideration by using the ignore subcommand.

You may also consider using one of the following source code annotations, as appropriate.

SCHEME_PROTECTS_DATA

READ_ONLY_DATA

DATA_READABLE_WITHOUT_LOCK

NOW_INVISIBLE_TO_OTHER_THREADS

NOW_VISIBLE_TO_OTHER_THREADS

For more information, see "Source Code Annotations".

Lock Management

Source code annotations are an efficient way to refine the assertions you make about the locks in your code. There are three types of assertions: protection, order, and side effects.

Protection assertions state what is protected by a given lock. For example, the following source code annotations can be used to assert how data is protected.

MUTEX_PROTECTS_DATA

RWLOCK_PROTECTS_DATA

SCHEME_PROTECTS_DATA

DATA_READABLE_WITHOUT_LOCK

RWLOCK_COVERS_LOCK

A variation of the assert subcommand is used to assert that a given lock protects some piece of data or a function. Another variation, assert ... covers, asserts that a given lock protects another lock; this is used for hierarchical locking schemes.

Order assertions specify the order in which the given locks must be acquired. The source code annotation LOCK_ORDER or the assert order subcommand can be used to specify lock ordering.

Side effect assertions state that a function has the side effect of releasing or acquiring a given lock. Use the following source code annotations:

MUTEX_ACQUIRED_AS_SIDE_EFFECT

READ_LOCK_ACQUIRED_AS_SIDE_EFFECT

WRITE_LOCK_ACQUIRED_AS_SIDE_EFFECT

LOCK_RELEASED_AS_SIDE_EFFECT

LOCK_UPGRADED_AS_SIDE_EFFECT

LOCK_DOWNGRADED_AS_SIDE_EFFECT

NO_COMPETING_THREADS_AS_SIDE_EFFECT

COMPETING_THREADS_AS_SIDE_EFFECT

You can also use the assert side effect subcommand to specify side effects. In some cases you may want to make side effect assertions about an external function whose lock is not visible from the loaded module (for example, it is static to the module of the external function). In such a case, you can "create" a lock by using a form of the declare subcommand.

Analysis of Lock Usage

LockLint's primary role is to report on lock usage inconsistencies that may lead to data races and deadlocks. The analysis of lock usage occurs when you use the analyze subcommand. The result is a report on the following problems:

Functions that produce side effects on locks or violate assertions made about side effects on locks (for example, a function that changes the state of a mutex lock from locked to unlocked). The most common unintentional side effect occurs when a function acquires a lock on entry, and then fails to release it at some return point. That path through the function is said to acquire the lock as a side effect. This type of problem may lead to both data races and deadlocks.
Functions that have inconsistent side effects on locks (that is, different paths through the function yield different side effects). This is both a limitation of LockLint (see "Limitations of LockLint") and a common cause of errors. LockLint cannot handle such functions: it always reports them as errors and does not correctly interpret them. For example, one of the returns from a function may forget to unlock a lock acquired in the function.
Violations of assertions about which locks should be held upon entry to a function. This problem may lead to a data race.
Violations of assertions that a lock should be held when a variable is accessed. This problem may lead to a data race.
Violations of assertions that specify the order in which locks are to be acquired. This problem may lead to a deadlock.
Failure to use the same, or asserted, mutex lock for all waits on a particular condition variable.
Miscellaneous problems related to analysis of the source code in relation to assertions and locks.
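The first two problems in the list above are easiest to see side by side. The sketch below uses a hypothetical bounded queue: enqueue_leaky returns early on the "queue full" path without unlocking, so that one path acquires queue_lock as a side effect while the other does not. The enqueue function fixes this with a single exit point. The helper queue_lock_is_free exists only so the difference can be observed at run time. All names are illustrative.

```c
#include <assert.h>
#include <pthread.h>

#define QUEUE_MAX 2

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static int queue_len;           /* protected by queue_lock */

/* Leaky: the "queue full" return skips the unlock, so the two paths
 * through this function have different side effects on queue_lock. */
int enqueue_leaky(void)
{
    pthread_mutex_lock(&queue_lock);
    if (queue_len >= QUEUE_MAX)
        return -1;              /* BUG: returns with queue_lock held */
    queue_len++;
    pthread_mutex_unlock(&queue_lock);
    return 0;
}

/* Fixed: one exit point, so every path releases the lock. */
int enqueue(void)
{
    int rc = 0;

    pthread_mutex_lock(&queue_lock);
    if (queue_len >= QUEUE_MAX)
        rc = -1;
    else
        queue_len++;
    assert(queue_len <= QUEUE_MAX);
    pthread_mutex_unlock(&queue_lock);
    return rc;
}

/* Returns 1 if queue_lock is currently free, 0 if it is held. */
int queue_lock_is_free(void)
{
    if (pthread_mutex_trylock(&queue_lock) != 0)
        return 0;
    pthread_mutex_unlock(&queue_lock);
    return 1;
}
```

After a failed enqueue() the lock is free again; after a failed enqueue_leaky() it is still held, and the next thread to call either function would block.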

Post-analysis Queries

After analysis, you can use LockLint subcommands for:

Finding additional locking inconsistencies.

Forming appropriate declare, assert, and ignore subcommands. These can be specified after you've restored LockLint's state, prior to rerunning the analysis.

One such subcommand is order, which you can use to make inquiries about the order in which locks have been acquired. This information is particularly useful in understanding lock ordering problems and making assertions about those orders so that LockLint can more accurately diagnose potential deadlocks.

Another such subcommand is vars. The vars subcommand reports which locks are consistently held when a variable is read or written (if any). This information can be useful in determining the protection conventions in code where the original conventions were never documented, or the documentation has become outdated.

Limitations of LockLint

There are limitations to LockLint's powers of analysis. At the root of many of its difficulties is the fact that LockLint doesn't know the values of your variables.

LockLint solves some of these problems by ignoring the likely cause or making simplifying assumptions. You can avoid some other problems by using conditionally compiled code in the application. Towards this end, the compiler always defines the preprocessor macro __lock_lint when you compile with the -Zll option. You can use this macro to make your code less ambiguous.

LockLint has trouble deducing:

Which functions your function pointers point to. There are some assignments LockLint cannot deduce (see "declare"). The declare subcommand can be used to add new possible assignments to the function pointer.

When LockLint sees a call through a function pointer, it tests that call path for every possible value of that function pointer. If you know or suspect that some calling sequences are never executed, use the disallow and reallow subcommands to specify which sequences are executed.

Whether or not you locked a lock in code like this:
```
if (x) pthread_mutex_lock(&lock1);
```
In this case, two execution paths are created, one holding the lock, and one not holding the lock, which will probably cause the generation of a side effect message at the unlock call. You may be able to work around this problem by using the __lock_lint macro to force LockLint to treat a lock as unconditionally taken. For example:
```
#ifdef __lock_lint
pthread_mutex_lock(&lock1);
#else
if (x) pthread_mutex_lock(&lock1);
#endif
```
LockLint has no problem analyzing code like this:
```
if (x) {
	pthread_mutex_lock(&lock1);
	foo();
	pthread_mutex_unlock(&lock1);
}
```
In this case, there is only one execution path, along which the lock is acquired and released, causing no side effects.

Whether or not a lock was acquired in code like this:
```
rc = pthread_mutex_trylock(&lock1);
if (rc) ...
```

Which lock is being locked in code like this:
```
pthread_mutex_t* lockp;
pthread_mutex_lock(lockp);
```
In such cases, the lock call is ignored.

Which variables and locks are being used in code where elements of a structure are used (see "Lock Inversions"):
```
struct foo* p;
pthread_mutex_lock(p->lock);
p->bar = 0;
```

Which element of an array is being accessed. This is treated analogously to the previous case; the index is ignored.

Anything about longjmps.

When you would exit a loop or break out of a recursion (so it just stops proceeding down a path as soon as it finds itself looping or after one recursion).
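The trylock case in the list above comes down to a value that exists only at run time. The sketch below uses hypothetical names (stats_lock, stats_updates, try_record_update) to show why: whether the lock is held after the call depends entirely on the return code, which a static tool that does not track variable values cannot know. The sketch keeps the locked region on a single path that both begins and ends inside the function. It makes no claim about how LockLint interprets this code.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t stats_lock = PTHREAD_MUTEX_INITIALIZER;
static int stats_updates;       /* protected by stats_lock */

/* Tries to record an update without blocking.
 * Returns 1 if the update was recorded, 0 if the lock was busy. */
int try_record_update(void)
{
    int rc = pthread_mutex_trylock(&stats_lock);

    if (rc != 0) {
        /* Lock not acquired: EBUSY is the only failure expected here. */
        assert(rc == EBUSY);
        return 0;
    }

    /* The lock is held only on this path and is released before returning. */
    stats_updates++;
    pthread_mutex_unlock(&stats_lock);
    return 1;
}

/* Reads the update count under the lock. */
int recorded_updates(void)
{
    int n;

    pthread_mutex_lock(&stats_lock);
    n = stats_updates;
    pthread_mutex_unlock(&stats_lock);
    return n;
}
```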

Some other LockLint difficulties:

LockLint only analyzes the use of mutex locks and readers-writer locks. LockLint performs limited consistency checks of mutex locks as used with condition variables. However, semaphores and condition variables are not recognized as locks by LockLint. Even with this analysis, there are limits to what LockLint can make sense of.

There are situations where LockLint thinks two different variables are the same variable, or that a single variable is two different variables. (See "Lock Inversions".)

It is possible to share automatic variables between threads (via pointers), but LockLint assumes that automatics are unshared, and generally ignores them (the only situation in which they are of interest to LockLint is when they are function pointers).

LockLint complains about any functions that are not consistent in their side effects on locks. #ifdef's and assertions must be used to give LockLint a simpler view of functions that may or may not have such a side effect.

During analysis, LockLint may produce messages about a lock operation called rw_upgrade. Such a call does not really exist, but LockLint rewrites code like

if (rw_tryupgrade(&lock1)) { ... }

into

if () { rw_upgrade(&lock1); ... }

such that, wherever rw_tryupgrade() occurs, LockLint always assumes it succeeds.

One of the errors LockLint flags is an attempt to acquire a lock that is already held. However, if the lock is unnamed (for example, foo::lock), this error is suppressed, since the name refers not to a single lock but to a set of locks. However, if the unnamed lock always refers to the same lock, use the declare one subcommand so that LockLint can report this type of potential deadlock.

If you have constructed your own locks out of these locks (for example, recursive mutexes are sometimes built from ordinary mutexes), LockLint will not know about them. Generally you can use #ifdef to make it appear to LockLint as though an ordinary mutex is being manipulated. For recursive locks, use an unnamed lock for this deception, since errors won't be generated when it is recursively locked. For example:

void get_lock() {
#ifdef __lock_lint
	struct bogus *p;
	pthread_mutex_lock(p->lock);
#else
	<the real recursive locking code>
#endif
}
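For readers unfamiliar with home-built recursive locks, the following is one possible shape for the "real recursive locking code" that the placeholder above stands for. It is a sketch under stated assumptions, not the implementation the text has in mind: the type rec_lock_t, its members, and the functions rec_lock and rec_unlock are all hypothetical. An ordinary mutex guards an owner and a depth count, and a condition variable lets other threads wait until the depth drops to zero.

```c
#include <assert.h>
#include <pthread.h>

/* A recursive lock built from an ordinary mutex (hypothetical type). */
typedef struct {
    pthread_mutex_t guard;      /* protects owner and depth */
    pthread_cond_t  released;   /* signaled when depth drops to 0 */
    pthread_t       owner;      /* meaningful only while depth > 0 */
    int             depth;      /* 0 means the lock is free */
} rec_lock_t;

#define REC_LOCK_INITIALIZER \
    { .guard = PTHREAD_MUTEX_INITIALIZER, .released = PTHREAD_COND_INITIALIZER }

/* Acquires the lock; the owning thread may call this again without blocking. */
void rec_lock(rec_lock_t *r)
{
    pthread_mutex_lock(&r->guard);
    if (r->depth > 0 && pthread_equal(r->owner, pthread_self())) {
        r->depth++;             /* re-entry by the current owner */
    } else {
        while (r->depth > 0)    /* wait for the current owner to finish */
            pthread_cond_wait(&r->released, &r->guard);
        r->owner = pthread_self();
        r->depth = 1;
    }
    pthread_mutex_unlock(&r->guard);
}

/* Releases one level; the lock becomes free when depth reaches 0. */
void rec_unlock(rec_lock_t *r)
{
    pthread_mutex_lock(&r->guard);
    assert(r->depth > 0 && pthread_equal(r->owner, pthread_self()));
    if (--r->depth == 0)
        pthread_cond_signal(&r->released);
    pthread_mutex_unlock(&r->guard);
}
```

Because the logical lock here is the owner/depth pair rather than the guard mutex, a tool that recognizes only mutex and readers-writer operations has nothing to track, which is why the text above suggests substituting an ordinary mutex under #ifdef __lock_lint.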

Source Code Annotations

An annotation is some piece of text inserted into your source code. You use annotations to tell LockLint things about your program that it cannot deduce for itself, either to keep it from excessively flagging problems or to have LockLint test for certain conditions. Annotations also serve to document code, in much the same way that comments do. There are two types of source code annotations: assertions and NOTEs.

Annotations are similar to some of the LockLint subcommands described in Appendix A, LockLint Command Reference. In general, it's preferable to use source code annotations over these subcommands, as explained in "Reasons to Use Source Code Annotations".

Reasons to Use Source Code Annotations

There are several reasons to use source code annotations. In many cases, such annotations are preferable to using a script of LockLint subcommands.

Annotations, being mixed in with the code that they describe, are generally better maintained than a script of LockLint subcommands.
With annotations, you can make assertions about lock state at any point within a function--wherever you put the assertion is where the check occurs. With subcommands, the finest granularity you can achieve is to check an assertion on entry to a function.
Functions mentioned in subcommands can change. If someone changes the name of a function from func1 to func2, a subcommand mentioning func1 fails (or worse, might work but do the wrong thing, if a different function is given the name func1).
Some annotations, such as NOTE(NO_COMPETING_THREADS_NOW), have no subcommand equivalents.
Annotations provide a good way to document your program. In fact, even if you are not using LockLint often, annotations are worthwhile just for this purpose. For example, a header file declaring a variable can document what lock or convention protects the variable, or a function that acquires a lock and deliberately returns without releasing it can have that behavior clearly declared in an annotation.

The Annotations Scheme

LockLint shares the source code annotations scheme with several other tools. When you install the Sun WorkShop ANSI C Compiler, you automatically install the file SUNW_SPRO-cc-ssbd, which contains the names of all the annotations that LockLint understands. The file is located in <installation_directory>/SUNWspro/SC5.0/lib/note.

You may specify a location other than the default by setting the environment variable NOTEPATH, as in

setenv NOTEPATH other_location:$NOTEPATH

The default value for NOTEPATH is <installation_directory>/SUNWspro/SC5.0/lib/note:/usr/lib/note.

To use source code annotations, include the file note.h in your source or header files:

#include <note.h>

Using LockLint `NOTE`s

Many of the note-style annotations accept names--of locks or variables--as arguments. Names are specified using the syntax shown in Table 5-5.

Table 5-5 Specifying Names With LockLint NOTEs


Syntax	Meaning
`Var`	Named variable
`Var`.`Mbr`.`Mbr`...	Member of a named `struct`/`union` variable
`Tag`	Unnamed `struct`/`union` (with this tag)
`Tag`::`Mbr`.`Mbr`...	Member of an unnamed `struct`/`union`
`Type`	Unnamed `struct`/`union` (with this typedef)
`Type`::`Mbr`.`Mbr`...	Member of an unnamed `struct`/`union`

In C, structure tags and types are kept in separate namespaces, making it possible to have two different structs by the same name as far as LockLint is concerned. When LockLint sees foo::bar, it first looks for a struct with tag foo; if it does not find one, it looks for a type foo and checks that it represents a struct.

However, the proper operation of LockLint requires that a given variable or lock be known by exactly one name. Therefore type will be used only when no tag is provided for the struct, and even then only when the struct is defined as part of a typedef.

For example, Foo would serve as the type name in this example:

typedef struct { int a, b; } Foo;

These restrictions ensure that there is only one name by which the struct is known.

Name arguments do not accept general expressions. It is not valid, for example, to write:

NOTE(MUTEX_PROTECTS_DATA(p->lock, p->a p->b))

However, some of the annotations do accept expressions (rather than names); they are clearly marked.

In many cases an annotation accepts a list of names as an argument. Members of a list should be separated by white space. To simplify the specification of lists, a generator mechanism similar to that of many shells is understood by all annotations taking such lists. The notation for this is:

Prefix{A B ...}Suffix

where Prefix, Suffix, A, B, ... are nothing at all, or any text containing no white space. The above notation is equivalent to:

PrefixASuffix PrefixBSuffix ...

For example, the notation:

struct_tag::{a b c d}

is equivalent to the far more cumbersome text:

struct_tag::a struct_tag::b struct_tag::c struct_tag::d

This construct may be nested, as in:

foo::{a b.{c d} e}

which is equivalent to:

foo::a

foo::b.c

foo::b.d

foo::e
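The expansion rule can be checked mechanically. The following is a minimal sketch of a recursive expander, not LockLint's own implementation. The functions `emit` and `expand` are hypothetical, white space is assumed to be plain space characters, and names are assumed to fit in a 256-byte buffer.

```c
#include <stdio.h>
#include <string.h>

/* Append one fully expanded name to out, separated by a single space. */
static void emit(char *out, size_t cap, const char *name) {
    size_t used = strlen(out);
    snprintf(out + used, cap - used, "%s%s", used ? " " : "", name);
}

/* Expand the first {...} group in s, recursing until no group remains.
   The caller must pass out as an empty string. */
static void expand(const char *s, char *out, size_t cap) {
    const char *lbrace = strchr(s, '{');
    if (lbrace == NULL) {          /* no generator left: s is a plain name */
        emit(out, cap, s);
        return;
    }
    /* Find the brace that closes this group, allowing nested groups. */
    int depth = 0;
    const char *rbrace = lbrace;
    for (; *rbrace; rbrace++) {
        if (*rbrace == '{') depth++;
        else if (*rbrace == '}' && --depth == 0) break;
    }
    if (*rbrace == '\0') return;   /* unbalanced input: produce nothing */

    /* Walk the alternatives, which are separated by top-level spaces. */
    const char *p = lbrace + 1;
    while (p < rbrace) {
        while (p < rbrace && *p == ' ') p++;   /* skip separators */
        if (p == rbrace) break;
        const char *start = p;
        for (depth = 0; p < rbrace; p++) {
            if (*p == '{') depth++;
            else if (*p == '}') depth--;
            else if (*p == ' ' && depth == 0) break;
        }
        /* Build Prefix + alternative + Suffix, then expand what remains. */
        char buf[256];
        snprintf(buf, sizeof buf, "%.*s%.*s%s",
                 (int)(lbrace - s), s, (int)(p - start), start, rbrace + 1);
        expand(buf, out, cap);
    }
}
```

Called on `foo::{a b.{c d} e}` with an empty output buffer, this sketch produces the four names `foo::a`, `foo::b.c`, `foo::b.d`, and `foo::e`, separated by spaces.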

Where an annotation refers to a lock or another variable, a declaration or definition for that lock or variable should already have been seen.

If a name for data represents a structure, it refers to all non-lock (mutex or readers-writer) members of the structure. If one of those members is itself a structure, then all of its non-lock members are implied, and so on. However, LockLint understands the abstraction of a condition variable and therefore does not break it down into its constituent members.

`NOTE` and `_NOTE`

The NOTE interface enables you to insert information for LockLint into your source code without affecting the compiled object code. The basic syntax of a note-style annotation is either:

NOTE(NoteInfo)

or:

_NOTE(NoteInfo)

NOTE is preferred over _NOTE. However, header files that are to be used in multiple, unrelated projects should use _NOTE to avoid conflicts. If the name NOTE is already in use in your code and you do not want to change it, define some other macro (such as ANNOTATION) in terms of _NOTE. For example, you might define an include file (say, annotation.h) that contains the following:

#define ANNOTATION _NOTE
#include <sys/note.h>

The NoteInfo that gets passed to the NOTE interface must syntactically fit one of the following:

NoteName

NoteName(Args)

NoteName is simply an identifier indicating the type of annotation. Args can be anything, so long as it can be tokenized properly and any parenthesis tokens are matched (so that the closing parenthesis can be found). Each distinct NoteName will have its own requirements regarding arguments.

This text uses NOTE to mean both NOTE and _NOTE, unless explicitly stated otherwise.

Where `NOTE` May Be Used

NOTE may be invoked only at certain well-defined places in source code:

At the top level; that is, outside of all function definitions, type and struct definitions, variable declarations, and other constructs. For example:
```
struct foo { int a, b; mutex_t lock; };
NOTE(MUTEX_PROTECTS_DATA(foo::lock, foo))
bar() {...}
```

At the top level within a block, among declarations or statements. Here too, the annotation must be outside of all type and struct definitions, variable declarations, and other constructs. For example:
```
foo() { ...; NOTE(...) ...; ...; }
```

At the top level within a struct or union definition, among the declarations. For example:
```
struct foo { int a; NOTE(...) int b; };
```

Where `NOTE` May Not Be Used

NOTE() may be used only in the locations described above. For example, the following are invalid:

a = b NOTE(...) + 1;

typedef NOTE(...) struct foo Foo;

for (i=0; NOTE(...) i<10; i++) ...

A note-style annotation is not a statement; NOTE() may not be used inside an if/else/for/while body unless braces are used to make a block. For example, the following causes a syntax error:

if (x) NOTE(...)
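By contrast, the following sketch shows a placement that should be acceptable: the braces create a block, and the annotation sits at the top level of that block among the declarations and statements. It assumes an empty `NOTE` macro in place of `<note.h>` and POSIX mutexes in place of `mutex_t`; the function `count_if` and its variables are hypothetical.

```c
#include <pthread.h>

/* Assumption: stand-in for <note.h>; NOTE expands to nothing when compiled. */
#ifndef NOTE
#define NOTE(info)
#endif

int count_if(int x) {
    int seen = 0;
    if (x) {   /* braces make a block, so NOTE is at block top level */
        static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
        static int hits;
        NOTE(MUTEX_PROTECTS_DATA(lock, hits))
        pthread_mutex_lock(&lock);
        seen = ++hits;
        pthread_mutex_unlock(&lock);
    }
    return seen;
}
```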

How Data Is Protected

The following annotations are allowed both outside and inside a function definition. Remember that any name mentioned in an annotation must already have been declared.

NOTE(MUTEX_PROTECTS_DATA(Mutex, DataNameList))

NOTE(RWLOCK_PROTECTS_DATA(Rwlock, DataNameList))

NOTE(SCHEME_PROTECTS_DATA("description", DataNameList))

The first two annotations tell LockLint that the lock should be held whenever the specified data is accessed.

The third annotation, SCHEME_PROTECTS_DATA, describes how data is protected when it is not protected by a mutex or readers-writer lock. The description supplied for the scheme is simply text and is not semantically significant; LockLint responds by ignoring the specified data altogether. You may make the description anything you like.

Some examples help show how these annotations are used. The first example is very simple, showing a lock that protects two variables:

mutex_t lock1;
int a,b;
NOTE(MUTEX_PROTECTS_DATA(lock1, a b))

In the next example, a number of different possibilities are shown. Some members of struct foo are protected by a static lock, while others are protected by the lock on foo. Another member of foo is protected by some convention regarding its use.

mutex_t lock1;
struct foo {
	mutex_t lock;
	int mbr1, mbr2;
	struct {
		int mbr1, mbr2;
		char* mbr3;
	} inner;
	int mbr4;
};
NOTE(MUTEX_PROTECTS_DATA(lock1, foo::{mbr1 inner.mbr1}))
NOTE(MUTEX_PROTECTS_DATA(foo::lock, foo::{mbr2 inner.mbr2}))
NOTE(SCHEME_PROTECTS_DATA("convention XYZ", foo::inner.mbr3))
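To show what these annotations imply for the code that touches the data, here is a hedged sketch of two accessor functions that each hold the lock named for the members they modify. It assumes POSIX mutexes in place of `mutex_t` and an empty `NOTE` macro in place of `<note.h>`; the functions `set_first` and `set_second` are hypothetical.

```c
#include <pthread.h>

/* Assumption: stand-in for <note.h>; NOTE expands to nothing when compiled. */
#ifndef NOTE
#define NOTE(info)
#endif

pthread_mutex_t lock1 = PTHREAD_MUTEX_INITIALIZER;
struct foo {
    pthread_mutex_t lock;
    int mbr1, mbr2;
    struct {
        int mbr1, mbr2;
        char *mbr3;
    } inner;
    int mbr4;
};
NOTE(MUTEX_PROTECTS_DATA(lock1, foo::{mbr1 inner.mbr1}))
NOTE(MUTEX_PROTECTS_DATA(foo::lock, foo::{mbr2 inner.mbr2}))

/* mbr1 and inner.mbr1 are guarded by the static lock1. */
int set_first(struct foo *p, int v) {
    pthread_mutex_lock(&lock1);
    p->mbr1 = v;
    p->inner.mbr1 = v;
    pthread_mutex_unlock(&lock1);
    return v;
}

/* mbr2 and inner.mbr2 are guarded by the lock inside the same struct. */
int set_second(struct foo *p, int v) {
    pthread_mutex_lock(&p->lock);
    p->mbr2 = v;
    p->inner.mbr2 = v;
    pthread_mutex_unlock(&p->lock);
    return v;
}
```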

A datum can only be protected in one way. If multiple annotations about protection (not only these three but also READ_ONLY_DATA) are used for a single datum, later annotations silently override earlier annotations. This allows for easy description of a structure in which all but one or two members are protected in the same way. For example, most of the members of struct BAR below are protected by the lock on struct foo, but one is protected by a global lock.

mutex_t lock1;
typedef struct {
	int mbr1, mbr2, mbr3, mbr4;
} BAR;
NOTE(MUTEX_PROTECTS_DATA(foo::lock, BAR))
NOTE(MUTEX_PROTECTS_DATA(lock1, BAR::mbr3))

Read-Only Variables

NOTE(READ_ONLY_DATA(DataNameList))

This annotation is allowed both outside and inside a function definition. It tells LockLint that the specified data should only be read, never written.

Note -

No error is signaled if read-only data is written while it is considered invisible. Data is considered invisible when other threads cannot access it; for example, if other threads do not know about it.

This annotation is often used with data that is initialized and never changed thereafter. If the initialization is done at runtime before the data is visible to other threads, use annotations to let LockLint know that the data is invisible during that time.

LockLint knows that const data is read-only.
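The run-time initialization case might look like the following sketch, in which a table is filled in once while it is declared invisible and is only read afterward. It assumes an empty `NOTE` macro in place of `<note.h>`. The table `squares` and both functions are hypothetical, and applying the visibility annotations to a file-scope array is itself an assumption about how they would be used here.

```c
/* Assumption: stand-in for <note.h>; NOTE expands to nothing when compiled. */
#ifndef NOTE
#define NOTE(info)
#endif

/* Hypothetical lookup table: written once at startup, read-only afterward. */
static int squares[8];
NOTE(READ_ONLY_DATA(squares))

void init_squares(void) {
    /* No other thread uses the table yet, so writing it is tolerated. */
    NOTE(NOW_INVISIBLE_TO_OTHER_THREADS(squares))
    for (int i = 0; i < 8; i++)
        squares[i] = i * i;
    NOTE(NOW_VISIBLE_TO_OTHER_THREADS(squares))
}

/* Readers need no lock because the data never changes after startup. */
int square_of(int i) {
    return squares[i];
}
```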

Allowing Unprotected Reads

NOTE(DATA_READABLE_WITHOUT_LOCK(DataNameList))

This annotation is allowed both outside and inside a function definition. It informs LockLint that the specified data may be read without holding the protecting locks. This is useful with an atomically readable datum that stands alone (as opposed to a set of data whose values are used together), since it is valid to peek at the unprotected data if you do not intend to modify it.
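A typical candidate is a stand-alone counter that is updated under a lock but occasionally sampled without one. The sketch below assumes POSIX mutexes in place of `mutex_t` and an empty `NOTE` macro in place of `<note.h>`; `request_count` and the two functions are hypothetical.

```c
#include <pthread.h>

/* Assumption: stand-in for <note.h>; NOTE expands to nothing when compiled. */
#ifndef NOTE
#define NOTE(info)
#endif

static pthread_mutex_t stats_lock = PTHREAD_MUTEX_INITIALIZER;
static int request_count;
NOTE(MUTEX_PROTECTS_DATA(stats_lock, request_count))
NOTE(DATA_READABLE_WITHOUT_LOCK(request_count))

/* Writers still hold the protecting lock. */
int record_request(void) {
    pthread_mutex_lock(&stats_lock);
    int n = ++request_count;
    pthread_mutex_unlock(&stats_lock);
    return n;
}

/* A reader that only peeks at the single value may skip the lock. */
int peek_request_count(void) {
    return request_count;
}
```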

Hierarchical Lock Relationships

NOTE(RWLOCK_COVERS_LOCKS(RwlockName, LockNameList))

This annotation is allowed both outside and inside a function definition. It tells LockLint that a hierarchical relationship exists between a readers-writer lock and a set of other locks. Under these rules, holding the cover lock for write access affords a thread access to all data protected by the covered locks. Also, a thread must hold the cover lock for read access whenever holding any of the covered locks.

Using a readers-writer lock to cover another lock in this way is simply a convention; there is no special lock type. However, if LockLint is not told about this coverage relationship, it assumes that the locks are being used according to the usual conventions and generates error messages.

The following example specifies that member lock of unnamed foo structures covers member lock of unnamed bar and zot structures:

NOTE(RWLOCK_COVERS_LOCKS(foo::lock, {bar zot}::lock))

Functions With Locking Side Effects

NOTE(MUTEX_ACQUIRED_AS_SIDE_EFFECT(MutexExpr))

NOTE(READ_LOCK_ACQUIRED_AS_SIDE_EFFECT(RwlockExpr))

NOTE(WRITE_LOCK_ACQUIRED_AS_SIDE_EFFECT(RwlockExpr))

NOTE(LOCK_RELEASED_AS_SIDE_EFFECT(LockExpr))

NOTE(LOCK_UPGRADED_AS_SIDE_EFFECT(RwlockExpr))

NOTE(LOCK_DOWNGRADED_AS_SIDE_EFFECT(RwlockExpr))

NOTE(NO_COMPETING_THREADS_AS_SIDE_EFFECT)

NOTE(COMPETING_THREADS_AS_SIDE_EFFECT)

These annotations are allowed only inside a function definition. Each tells LockLint that the function has the specified side effect on the specified lock--that is, that the function deliberately leaves the lock in a different state on exit than it was in when the function was entered. In the case of the last two of these annotations, the side effect is not about a lock but rather about the state of concurrency.

When stating that a readers-writer lock is acquired as a side effect, you must specify whether the lock was acquired for read or write access.

A lock is said to be upgraded if it changes from being acquired for read-only access to being acquired for read/write access. Downgraded means a transformation in the opposite direction.

LockLint analyzes each function for its side effects on locks (and concurrency). Ordinarily, LockLint expects that a function will have no such effects; if the code has such effects intentionally, you must inform LockLint of that intent using annotations. If it finds that a function has different side effects from those expressed in the annotations, an error message results.

The annotations described in this section refer generally to the function's characteristics and not to a particular point in the code. Thus, these annotations are probably best written at the top of the function. There is, for example, no difference (other than readability) between this:

foo() {
	NOTE(MUTEX_ACQUIRED_AS_SIDE_EFFECT(lock_foo))
	...
	if (x && y) {
		...
	}
}

and this:

foo() {
	...
	if (x && y) {
		NOTE(MUTEX_ACQUIRED_AS_SIDE_EFFECT(lock_foo))
		...
	}
}

If a function has such a side effect, the effect should be the same on every path through the function. LockLint complains about and refuses to analyze paths through the function that have side effects other than those specified.
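A matched pair of wrapper functions is a common source of such side effects. The following sketch assumes POSIX mutexes in place of `mutex_t` and an empty `NOTE` macro in place of `<note.h>`. The functions are hypothetical, and writing the lock argument as the pointer expression `&lock_foo` is an assumption about the expected form.

```c
#include <pthread.h>

/* Assumption: stand-in for <note.h>; NOTE expands to nothing when compiled. */
#ifndef NOTE
#define NOTE(info)
#endif

static pthread_mutex_t lock_foo = PTHREAD_MUTEX_INITIALIZER;
static int foo_value;
NOTE(MUTEX_PROTECTS_DATA(lock_foo, foo_value))

/* Returns with lock_foo held on every path through the function. */
void foo_enter(void) {
    NOTE(MUTEX_ACQUIRED_AS_SIDE_EFFECT(&lock_foo))
    pthread_mutex_lock(&lock_foo);
}

/* Returns with lock_foo released on every path through the function. */
void foo_exit(void) {
    NOTE(LOCK_RELEASED_AS_SIDE_EFFECT(&lock_foo))
    pthread_mutex_unlock(&lock_foo);
}

/* Net effect on locks is nil, so this function needs no annotation. */
int foo_update(int v) {
    foo_enter();
    foo_value = v;
    int result = foo_value;
    foo_exit();
    return result;
}
```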

Single-Threaded Code

NOTE(COMPETING_THREADS_NOW)

NOTE(NO_COMPETING_THREADS_NOW)

These two annotations are allowed only inside a function definition. The first annotation tells LockLint that after this point in the code, other threads exist that might try to access the same data that this thread will access. The second annotation specifies that this is no longer the case; either no other threads are running or whatever threads are running will not be accessing data that this thread will access. While there are no competing threads, LockLint does not complain if the code accesses data without holding the locks that ordinarily protect that data.

These annotations are useful in functions that initialize data without holding locks before starting up any additional threads. Such functions may access data without holding locks, after waiting for all other threads to exit. So one might see something like this:

main() {
	<initialize data structures>
	NOTE(COMPETING_THREADS_NOW)
	<create several threads>
	<wait for all of those threads to exit>
	NOTE(NO_COMPETING_THREADS_NOW)
	<look at data structures and print results>
}

Note -

If a NOTE is present in main(), LockLint assumes that when main() starts, no other threads are running. If main() does not include a NOTE, LockLint does not assume that no other threads are running.

LockLint does not issue a warning if, during analysis, it encounters a COMPETING_THREADS_NOW annotation when it already thinks competing threads are present. The condition simply nests. No warning is issued because the annotation may mean different things in each use (that is, the notion of which threads compete may differ from one piece of code to the next). On the other hand, a NO_COMPETING_THREADS_NOW annotation that does not match a prior COMPETING_THREADS_NOW (explicit or implicit) causes a warning.

Unreachable Code

NOTE(NOT_REACHED)

This annotation is allowed only inside a function definition. It tells LockLint that a particular point in the code cannot be reached, and therefore LockLint should ignore the condition of locks held at that point. This annotation need not be used after every call to exit(), for example, as the lint annotation /* NOTREACHED */ is used. Simply use it in definitions for exit() and the like (primarily in LockLint libraries), and LockLint will determine that code following calls to such functions is not reached. This annotation should seldom appear outside LockLint libraries. An example of its use (in a LockLint library) would be:

exit(int code) { NOTE(NOT_REACHED) }

Lock Order

NOTE(LOCK_ORDER(LockNameList))

This annotation, which is allowed either outside or inside a function definition, specifies the order in which locks should be acquired. It is similar to the assert order and order subcommands. See Appendix A, LockLint Command Reference.

To avoid deadlocks, LockLint assumes that whenever multiple locks must be held at once they are always acquired in a well-known order. If LockLint has been informed of such ordering using this annotation, an informative message is produced whenever the order is violated.

This annotation may be used multiple times, and the semantics will be combined appropriately. For example, given the annotations

NOTE(LOCK_ORDER(a b c))

NOTE(LOCK_ORDER(b d))

LockLint will deduce the ordering:

NOTE(LOCK_ORDER(a d))

It is not possible to deduce anything about the order of c with respect to d in this example.

If a cycle exists in the ordering, an appropriate error message will be generated.
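One way to picture how several orderings combine is as the transitive closure of a "must be acquired before" relation. The sketch below is only a model of that reasoning, not LockLint's internal algorithm; the lock indices, the `before` matrix, and the three functions are hypothetical.

```c
#include <stdbool.h>

enum { A, B, C, D, NLOCKS };

/* before[x][y] is true when lock x must be acquired before lock y. */
static bool before[NLOCKS][NLOCKS];

/* Record one LOCK_ORDER list: each lock precedes the next one listed. */
static void lock_order(const int *locks, int n) {
    for (int i = 0; i + 1 < n; i++)
        before[locks[i]][locks[i + 1]] = true;
}

/* Combine all recorded orderings by taking the transitive closure
   (Warshall's algorithm). */
static void combine(void) {
    for (int k = 0; k < NLOCKS; k++)
        for (int i = 0; i < NLOCKS; i++)
            for (int j = 0; j < NLOCKS; j++)
                if (before[i][k] && before[k][j])
                    before[i][j] = true;
}

/* A lock that must precede itself indicates a cycle in the ordering. */
static bool has_cycle(void) {
    for (int i = 0; i < NLOCKS; i++)
        if (before[i][i])
            return true;
    return false;
}
```

Recording the lists (a b c) and (b d) and then combining them yields `before[A][D]`, while neither `before[C][D]` nor `before[D][C]` is set, which matches the statement that nothing can be deduced about c and d.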

Variables Invisible to Other Threads

NOTE(NOW_INVISIBLE_TO_OTHER_THREADS(DataExpr, ...))

NOTE(NOW_VISIBLE_TO_OTHER_THREADS(DataExpr, ...))

These annotations, which are allowed only within a function definition, tell LockLint whether or not the variables represented by the specified expressions are visible to other threads; that is, whether or not other threads could access the variables.

A common use of these annotations is to inform LockLint that variables it would ordinarily assume are visible are in fact not visible, because no other thread has a pointer to them. This frequently occurs when allocating data off the heap--you can safely initialize the structure without holding a lock, since no other thread can yet see the structure.

Foo* p = (Foo*) malloc(sizeof(*p));
NOTE(NOW_INVISIBLE_TO_OTHER_THREADS(*p))
p->a = bar;
p->b = zot;
NOTE(NOW_VISIBLE_TO_OTHER_THREADS(*p))
add_entry(&global_foo_list, p);

Calling a function never has the side effect of making variables visible or invisible. Upon return from the function, all changes in visibility caused by the function are reversed.

Assuming Variables Are Protected

NOTE(ASSUMING_PROTECTED(DataExpr, ...))

This annotation, which is allowed only within a function definition, tells LockLint that this function assumes that the variables represented by the specified expressions are protected in one of the following ways:

The appropriate lock is held for each variable
The variables are invisible to other threads
There are no competing threads when the call is made

LockLint issues an error if none of these conditions is true.

f(Foo* p, Bar* q) {
	NOTE(ASSUMING_PROTECTED(*p, *q))
	p->a++;
	...
}

Assertions Recognized by LockLint

LockLint recognizes some assertions as relevant to the state of threads and locks. (For more information, see the assert man page.)

Assertions may be made only within a function definition, where a statement is allowed.

Note -

ASSERT() is used in kernel and driver code, whereas assert() is used in user (application) code. For simplicity's sake, this document uses assert() to refer to either one, unless explicitly stated otherwise.

Making Sure All Locks Are Released

assert(NO_LOCKS_HELD);

LockLint recognizes this assertion to mean that, when this point in the code is reached, no locks should be held by the thread executing this test. Violations are reported during analysis. A routine that blocks might want to use such an assertion to ensure that no locks are held when a thread blocks or exits.

The assertion also clearly serves as a reminder to someone modifying the code that any locks acquired must be released at that point.

It is really only necessary to use this assertion in leaf-level functions that block. If a function blocks only inasmuch as it calls another function that blocks, the caller need not contain this assertion as long as the callee does. Therefore this assertion probably sees its heaviest use in versions of libraries (for example, libc) written specifically for LockLint (like lint libraries).

The file synch.h defines NO_LOCKS_HELD as 1 if it has not already been otherwise defined, causing the assertion to succeed; that is, the assertion is effectively ignored at runtime. You can override this default runtime meaning by defining NO_LOCKS_HELD before you include either note.h or synch.h (which may be included in either order). For example, if a body of code uses only two locks called a and b, the following definition would probably suffice:

#define NO_LOCKS_HELD (!MUTEX_HELD(&a) && !MUTEX_HELD(&b))
#include <note.h>
#include <synch.h>

Doing so does not affect LockLint's testing of the assertion; that is, LockLint still complains if any locks are held (not just a or b).
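A leaf-level blocking routine that uses this assertion might be sketched as follows. The example assumes POSIX mutexes and supplies a fallback definition of `NO_LOCKS_HELD` that mirrors the default described above, for systems without the Solaris `synch.h`; `wait_for_input`, `take_and_wait`, and `pending` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Assumption: fallback mirroring the default value, for systems
   without the Solaris synch.h. */
#ifndef NO_LOCKS_HELD
#define NO_LOCKS_HELD 1
#endif

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static int pending;

/* Hypothetical leaf-level routine that may block. */
static void wait_for_input(void) {
    assert(NO_LOCKS_HELD);   /* callers must have dropped every lock */
    /* ... a blocking call would go here ... */
}

/* Releases queue_lock before calling the routine that blocks. */
int take_and_wait(void) {
    pthread_mutex_lock(&queue_lock);
    int n = pending++;
    pthread_mutex_unlock(&queue_lock);
    wait_for_input();
    return n;
}
```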

Making Sure No Other Threads Are Running

assert(NO_COMPETING_THREADS);

LockLint recognizes this assertion to mean that, when this point in the code is reached, no other threads should be competing with the one running this code. Violations (based on information provided by certain NOTE-style annotations) are reported during analysis. Any function that accesses variables without holding their protecting locks, on the assumption that no other relevant threads are touching the same data, should be so marked.

By default, this assertion is ignored at runtime--that is, it always succeeds. No generic runtime meaning for NO_COMPETING_THREADS is possible, since the notion of which threads compete involves knowledge of the application. For example, a driver might make such an assertion to say that no other threads are running in this driver for the same device. Because no generic meaning is possible, synch.h defines NO_COMPETING_THREADS as 1 if it has not already been otherwise defined.

However, you can override the default meaning for NO_COMPETING_THREADS by defining it before including either note.h or synch.h (which may be included in either order). For example, if the program keeps a count of the number of running threads in a variable called num_threads, the following definition might suffice:

#define NO_COMPETING_THREADS (num_threads == 1)
#include <note.h>
#include <synch.h>

Doing so does not affect LockLint's testing of the assertion.
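A function that relies on this assertion might look like the sketch below. It omits the Solaris headers so that it compiles elsewhere, and it reuses the `num_threads` definition from the text; `config_value` and `set_config` are hypothetical.

```c
#include <assert.h>

/* Hypothetical count of running threads, maintained by the program. */
static int num_threads = 1;

#ifndef NO_COMPETING_THREADS
#define NO_COMPETING_THREADS (num_threads == 1)
#endif

static int config_value;

/* Writes config_value without a lock, relying on being the only thread. */
int set_config(int v) {
    assert(NO_COMPETING_THREADS);
    config_value = v;
    return config_value;
}
```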

Asserting Lock State

assert(MUTEX_HELD(lock_expr) && ...);

This assertion is widely used within the kernel. It performs runtime checking if assertions are enabled. The same capability exists in user code.

This code does roughly the same thing during LockLint analysis as it does when the code is actually run with assertions enabled; that is, it reports an error if the executing thread does not hold the lock as described.

Note -

The thread library performs a weaker test, only checking that some thread holds the lock. LockLint performs the stronger test.

LockLint recognizes the use of MUTEX_HELD(), RW_READ_HELD(), RW_WRITE_HELD(), and RW_LOCK_HELD() macros, and negations thereof. Such macro calls may be combined using the && operator. For example, the following assertion causes LockLint to check that a mutex is not held and that a readers-writer lock is write-held:

assert(p && !MUTEX_HELD(&p->mtx) && RW_WRITE_HELD(&p->rwlock));

LockLint also recognizes expressions like:

MUTEX_HELD(&foo) == 0
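On a system that lacks `MUTEX_HELD()`, a runtime stand-in can be approximated with `pthread_mutex_trylock()`. The sketch below is such an approximation, not the Solaris macro: it performs only the weaker test (some thread holds the lock) and assumes a default, non-recursive POSIX mutex. The names `mutex_held_by_someone`, `struct account`, `deposit_locked`, and `deposit` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in: reports only that SOME thread holds the mutex.
   Assumes a default, non-recursive mutex. */
static int mutex_held_by_someone(pthread_mutex_t *m) {
    if (pthread_mutex_trylock(m) == 0) {   /* acquired, so it was free */
        pthread_mutex_unlock(m);
        return 0;
    }
    return 1;                              /* busy: some thread holds it */
}

#ifndef MUTEX_HELD
#define MUTEX_HELD(m) mutex_held_by_someone(m)
#endif

struct account {
    pthread_mutex_t mtx;
    int balance;
};

/* The caller must already hold a->mtx. */
static int deposit_locked(struct account *a, int amount) {
    assert(a && MUTEX_HELD(&a->mtx));
    a->balance += amount;
    return a->balance;
}

int deposit(struct account *a, int amount) {
    pthread_mutex_lock(&a->mtx);
    int balance = deposit_locked(a, amount);
    pthread_mutex_unlock(&a->mtx);
    return balance;
}
```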
