Analyzing Program Performance With Sun WorkShop

How to Use LockLint

Using LockLint consists of three steps:

Setting up the environment for using LockLint
Compiling the source code to be analyzed, producing the LockLint database files (.ll files)
Using the lock_lint command to run a LockLint session

These steps are described in the rest of this section.

Figure 5-1 shows the control flow of the tasks involved in using LockLint:

Figure 5-1 LockLint Control Flow

Use LockLint to refine the set of assertions you maintain for the implementation of your system. A rich set of assertions enables LockLint to validate existing and new source code as you work.

Managing LockLint's Environment

The LockLint interface consists of the lock_lint command, which is executed in a shell, and the lock_lint subcommands. By default, LockLint uses the shell given by the environment variable $SHELL. Alternatively, you can have LockLint execute any shell by naming that shell on the lock_lint start command. This example starts a LockLint session in the Korn shell:

% lock_lint start /bin/ksh

LockLint creates an environment variable called LL_CONTEXT, which is visible in the child shell. If you are using a shell that provides for initialization, you can arrange to have the lock_lint command source a .ll_init file in your home directory, and then execute a .ll_init file in the current directory if it exists. If you use csh, you can do this by inserting the following code into your .cshrc file:

if ($?LL_CONTEXT) then
	if ( -x $HOME/.ll_init ) source $HOME/.ll_init
endif

It is better not to have your .cshrc source the file in your current working directory, since others may want to run LockLint on those same files, and they may not use the same shell you do. Since you are the only one who is going to use your $HOME/.ll_init, source that one instead, so that you can change the prompt and define aliases for use during your LockLint session. The following version of ~/.ll_init does this for csh:

# Cause analyze subcommand to save state before analysis.
alias analyze "lock_lint save before analyze;\
	lock_lint analyze"
# Change prompt to show we are in lock_lint.
set prompt="lock_lint~$prompt"

Also see "start".

When executing subcommands, remember that you can use pipes, redirection, backward quotes (`), and so on to accomplish your aims. For example, the following command asserts that lock foo protects all global variables (the formal name for a global variable begins with a colon):

% lock_lint assert foo protects `lock_lint vars | grep ^:`

In general, the subcommands are set up for easy use with filters such as grep and sed. This is particularly true for vars and funcs, which put out a single line of information for each variable or function. Each line contains the attributes (defined and derived) for that variable or function. The following example shows which members of struct bar are supposed to be protected by member lock:

% lock_lint vars -a `lock_lint members bar` | grep =bar::lock

Since you are using a shell interface, a log of user commands can be obtained by using the shell's history function (the history level may need to be made large in the .ll_init file).

Temporary Files

LockLint puts temporary files in /var/tmp unless $TMPDIR is set.

Makefile Rules

To modify your makefile to produce .ll files, first use the rule for creating a .o from a .c to write a rule to create a .ll from a .c. For example, from:

# Rule for making .o from .c in ../src.
%.o: ../src/%.c
	$(COMPILE.c) -o $@ $<

you might write:

# Rule for making .ll from .c in ../src.
%.ll: ../src/%.c
	cc $(CFLAGS) $(CPPFLAGS) $(FOO) $<

In the above example, the -Zll flag would have to be specified in the make macros for compiler options (CFLAGS and CPPFLAGS).

If you use a suffix rule, you will need to define .ll as a suffix. For that reason some prefer to use % rules.

If the appropriate .o files are contained in a make variable FOO_OBJS, you can create FOO_LLS with the line:

FOO_LLS = ${FOO_OBJS:%.o=%.ll}

or, if they are in a subdirectory ll:

FOO_LLS = ${FOO_OBJS:%.o=ll/%.ll}

If you want to keep the .ll files in the subdirectory ll/, you can have the makefile create this directory automatically with the .INIT target:

.INIT:
	@if [ ! -d ll ]; then mkdir ll; fi

Compiling Code

For LockLint to analyze your source code, you must first compile it using the -Zll option of the Sun WorkShop ANSI C compiler. The compiler then produces the LockLint database files (.ll files), one for each .c file compiled. Later you load the .ll files into LockLint with the load subcommand.

LockLint sometimes needs a simpler view of the code to return meaningful results during analysis. To allow you to provide this simpler view, the -Zll option automatically defines the preprocessor symbol __lock_lint; further discussions of the likely uses of __lock_lint can be found in "Limitations of LockLint".

LockLint Subcommands

The interface to LockLint consists of a set of subcommands that can be specified with the lock_lint command:

lock_lint [subcommand]

In this example subcommand is one of a set of subcommands used to direct the analysis of the source code for data races and deadlocks. More information about subcommands can be found in Appendix A, LockLint Command Reference.

Starting and Exiting LockLint

The first subcommand of any LockLint session must be start, which starts a subshell of your choice with the appropriate LockLint context. Since a LockLint session is started within a subshell, you exit by exiting that subshell. For example, to exit LockLint when using the C shell, use the command exit.

Setting the Tool State

LockLint's state consists of the set of databases loaded and the specified assertions. Iteratively modifying that state and rerunning the analysis can provide optimal information on potential data races and deadlocks. Since the analysis can be done only once for any particular state, the save, restore, and refresh subcommands are provided as a means to reestablish a state, modify that state, and retry the analysis.

Checking an Application

1. Annotate your source code and compile it to create .ll files.

    See "Source Code Annotations".

2. Load the .ll files using the load subcommand.

3. Make assertions about locks protecting functions and variables using the assert subcommand.

    Note - These specifications may also be conveyed using source code annotations. See "Source Code Annotations".

4. Make assertions about the order in which locks should be acquired in order to avoid deadlocks, using the assert order subcommand.

    Note - These specifications may also be conveyed using source code annotations. See "Source Code Annotations".

5. Check that LockLint has the right idea about which functions are roots.

    If the funcs -o subcommand does not show a root function as root, use the declare root subcommand to fix it. If funcs -o shows a non-root function as root, it's likely that the function should be listed as a function target using the declare ... targets subcommand. See "declare root func" for a discussion of root functions.

6. Describe any hierarchical lock relationships (if you have any; they are rare) using the assert rwlock subcommand.

    Note - These specifications may also be conveyed using source code annotations. See "Source Code Annotations".

7. Tell LockLint to ignore any functions or variables you want to exclude from the analysis using the ignore subcommand.

    Be conservative in your use of the ignore command. Make sure you should not be using one of the source code annotations instead (for example, NO_COMPETING_THREADS_NOW).

8. Run the analysis using the analyze subcommand.

9. Deal with the errors.

    This may involve modifying the source using #ifdef __lock_lint (see "Limitations of LockLint") or adding source code annotations to accomplish steps 3, 4, 6, and 7 (see "Source Code Annotations").

10. Restore LockLint to the state it was in before the analysis and rerun the analysis as necessary.

    Note - It is best to handle the errors in order. Otherwise, problems with locks not being held on entry to a function, or locks being released while not held, can cause lots of misleading messages about variables not being properly protected.

11. Run the analysis using the analyze -v subcommand and repeat the above step.

12. When the errors from the analyze subcommand are gone, check for variables that are not properly protected by any lock.

    Use the command: lock_lint vars -h | fgrep \*

13. Rerun the analysis using appropriate assertions to find out where the variables are being accessed without holding the proper locks.

Remember that you cannot run analyze twice for a given state, so it will probably help to save the state of LockLint using the save subcommand before running analyze. Then restore that state using refresh or restore before adding more assertions. You may want to set up an alias for analyze that automatically does a save before analyzing.

Program Knowledge Management

LockLint acquires its information on the sources to be analyzed with a set of databases produced by the C compiler. The LockLint database for each source file is stored in a separate file. To analyze a set of source files, use the load subcommand to load their associated database files. The files subcommand can be used to display a list of the source files represented by the loaded database files. Once a file is loaded, LockLint knows about all the functions, global data, and external functions referenced in the associated source files.

Function Management

As part of the analysis phase, LockLint builds a call graph for all the loaded sources. Information about the functions defined is available via the funcs subcommand. It is extremely important for a meaningful analysis that LockLint have the correct call graph for the code to be analyzed.

All functions that are not called by any of the loaded files are called root functions. You may want to treat certain functions as root functions even though they are called within the loaded modules (for example, the function is an entry point for a library that is also called from within the library). Do this by using the declare root subcommand. You may also remove functions from the call graph by issuing the ignore subcommand.

LockLint knows about all the references to function pointers and most of the assignments made to them. Information about the function pointers in the currently loaded files is available through the funcptrs subcommand. Information about the calls made via function pointers is available via the pointer calls subcommand. If there are function pointer assignments that LockLint could not discover, they may be specified with the declare ... targets subcommand.

By default, LockLint tries to examine all possible execution paths. If the code uses function pointers, it's possible that many of the execution paths are not actually followed in normal operation of the code. This can result in the reporting of deadlocks that do not really occur. To prevent this, use the disallow and reallow subcommands to inform LockLint of execution paths that never occur. To print out existing constraints, use the reallows and disallows subcommands.
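The following is a minimal sketch of the kind of code where this matters; it is illustrative only, and every name in it (table_lock, hits, handler_fn, and the functions) is hypothetical. Two handlers are reachable through the same function pointer type, but each is only ever passed to one dispatcher. An analysis that pairs every call through fn with every possible target would also consider dispatch_with_lock() calling count_unlocked(), which would try to acquire table_lock while it is already held. That path never runs, which makes it a candidate for the disallow subcommand.

```c
#include <pthread.h>

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static int hits;

typedef int (*handler_fn)(int);

/* Handler that expects the caller to already hold table_lock. */
static int count_locked(int n)
{
    hits += n;
    return hits;
}

/* Handler that acquires table_lock itself. */
static int count_unlocked(int n)
{
    int total;

    pthread_mutex_lock(&table_lock);
    hits += n;
    total = hits;
    pthread_mutex_unlock(&table_lock);
    return total;
}

/* Calls the handler with table_lock held. */
static int dispatch_with_lock(handler_fn fn, int n)
{
    int total;

    pthread_mutex_lock(&table_lock);
    total = fn(n);
    pthread_mutex_unlock(&table_lock);
    return total;
}

/* Calls the handler without holding any lock. */
static int dispatch_without_lock(handler_fn fn, int n)
{
    return fn(n);
}

/* The only pairings that occur at run time. */
int record_hit(int n)      { return dispatch_with_lock(count_locked, n); }
int record_hit_slow(int n) { return dispatch_without_lock(count_unlocked, n); }
```

Whether LockLint reports the impossible pairing for code shaped exactly like this depends on which assignments it discovers; the funcptrs and pointer calls subcommands show what it actually found.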

Variable Management

The LockLint database also contains information about all global variables accessed in the source code. Information about these variables is available via the vars subcommand.

One of LockLint's jobs is to determine if variable accesses are consistently protected. If you are unconcerned about accesses to a particular variable, you can remove it from consideration by using the ignore subcommand.

You may also consider using one of the following source code annotations, as appropriate.

SCHEME_PROTECTS_DATA

READ_ONLY_DATA

DATA_READABLE_WITHOUT_LOCK

NOW_INVISIBLE_TO_OTHER_THREADS

NOW_VISIBLE_TO_OTHER_THREADS

For more information, see "Source Code Annotations".

Lock Management

Source code annotations are an efficient way to refine the assertions you make about the locks in your code. There are three types of assertions: protection, order, and side effects.

Protection assertions state what is protected by a given lock. For example, the following source code annotations can be used to assert how data is protected.

MUTEX_PROTECTS_DATA

RWLOCK_PROTECTS_DATA

SCHEME_PROTECTS_DATA

DATA_READABLE_WITHOUT_LOCK

RWLOCK_COVERS_LOCK

A variation of the assert subcommand is used to assert that a given lock protects some piece of data or a function. Another variation, assert ... covers, asserts that a given lock protects another lock; this is used for hierarchical locking schemes.

Order assertions specify the order in which the given locks must be acquired. The source code annotation LOCK_ORDER or the assert order subcommand can be used to specify lock ordering.
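As a rough illustration of protection and order assertions written as annotations, consider the sketch below. It assumes the annotations are wrapped in a NOTE() macro supplied by a note.h header and that LOCK_ORDER takes a space-separated list of locks; check "Source Code Annotations" for the exact forms. The fallback #define exists only so that the sketch compiles where that header is absent, and the variable and function names are hypothetical.

```c
#include <pthread.h>

#ifdef __lock_lint
#include <note.h>          /* assumed to provide NOTE() for LockLint */
#else
#define NOTE(annotation)   /* fallback: expands to nothing in a normal build */
#endif

static pthread_mutex_t list_lock  = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t stats_lock = PTHREAD_MUTEX_INITIALIZER;
static int list_len;   /* number of items on the (hypothetical) list */
static int updates;    /* number of times the list was modified */

/* Protection assertions: which lock guards which variable. */
NOTE(MUTEX_PROTECTS_DATA(list_lock, list_len))
NOTE(MUTEX_PROTECTS_DATA(stats_lock, updates))

/* Order assertion: list_lock is acquired before stats_lock. */
NOTE(LOCK_ORDER(list_lock stats_lock))

/* Adds one item, honoring both the protection and the order assertions. */
int add_item(void)
{
    int len;

    pthread_mutex_lock(&list_lock);
    len = ++list_len;
    pthread_mutex_lock(&stats_lock);
    updates++;
    pthread_mutex_unlock(&stats_lock);
    pthread_mutex_unlock(&list_lock);
    return len;
}
```

A function that took stats_lock first and list_lock second would contradict the order assertion, which is the kind of inconsistency the analysis is meant to report.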

Side effect assertions state that a function has the side effect of releasing or acquiring a given lock. Use the following source code annotations:

MUTEX_ACQUIRED_AS_SIDE_EFFECT

READ_LOCK_ACQUIRED_AS_SIDE_EFFECT

WRITE_LOCK_ACQUIRED_AS_SIDE_EFFECT

LOCK_RELEASED_AS_SIDE_EFFECT

LOCK_UPGRADED_AS_SIDE_EFFECT

LOCK_DOWNGRADED_AS_SIDE_EFFECT

NO_COMPETING_THREADS_AS_SIDE_EFFECT

COMPETING_THREADS_AS_SIDE_EFFECT

You can also use the assert side effect subcommand to specify side effects. In some cases you may want to make side effect assertions about an external function whose lock is not visible from the loaded module (for example, the lock is static to the module of the external function). In such a case, you can "create" a lock by using a form of the declare subcommand.

Analysis of Lock Usage

LockLint's primary role is to report on lock usage inconsistencies that may lead to data races and deadlocks. The analysis of lock usage occurs when you use the analyze subcommand. The result is a report on the following problems:

Functions that produce side effects on locks or violate assertions made about side effects on locks (for example, a function that changes the state of a mutex lock from locked to unlocked). The most common unintentional side effect occurs when a function acquires a lock on entry, and then fails to release it at some return point. That path through the function is said to acquire the lock as a side effect. This type of problem may lead to both data races and deadlocks.
Functions that have inconsistent side effects on locks (that is, different paths through the function yield different side effects). This may be a limitation of LockLint (see "Limitations of LockLint") and a common cause of errors. LockLint cannot handle such functions. It always reports them as errors and does not correctly interpret them. For example, one of the returns from a function may forget to unlock a lock acquired in the function.
Violations of assertions about which locks should be held upon entry to a function. This problem may lead to a data race.
Violations of assertions that a lock should be held when a variable is accessed. This problem may lead to a data race.
Violations of assertions that specify the order in which locks are to be acquired. This problem may lead to a deadlock.
Failure to use the same, or asserted, mutex lock for all waits on a particular condition variable.
Miscellaneous problems related to analysis of the source code in relation to assertions and locks.
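The unintentional side effect described in the first two items can be sketched as follows. This is an illustrative example with hypothetical names (acct_lock, balance, and the functions): withdraw_leaky() returns early on one path with the lock still held, so its paths have different side effects, while withdraw() releases the lock on every path. The lock_is_held() helper is there only to make the difference observable at run time.

```c
#include <pthread.h>

static pthread_mutex_t acct_lock = PTHREAD_MUTEX_INITIALIZER;
static int balance = 100;

/* Leaky version: the early return leaves acct_lock held. */
int withdraw_leaky(int amount)
{
    pthread_mutex_lock(&acct_lock);
    if (amount > balance)
        return -1;                  /* acct_lock is never released here */
    balance -= amount;
    pthread_mutex_unlock(&acct_lock);
    return 0;
}

/* Consistent version: a single exit releases the lock on every path. */
int withdraw(int amount)
{
    int rc = -1;

    pthread_mutex_lock(&acct_lock);
    if (amount <= balance) {
        balance -= amount;
        rc = 0;
    }
    pthread_mutex_unlock(&acct_lock);
    return rc;
}

/* Returns 1 if acct_lock could not be acquired, 0 if it was free. */
int lock_is_held(void)
{
    if (pthread_mutex_trylock(&acct_lock) == 0) {
        pthread_mutex_unlock(&acct_lock);
        return 0;
    }
    return 1;
}
```

After a failed withdraw_leaky() call, any later attempt to take acct_lock blocks, which is how this kind of side effect turns into a deadlock.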

Post-analysis Queries

After analysis, you can use LockLint subcommands for:

Finding additional locking inconsistencies.

Forming appropriate declare, assert, and ignore subcommands. These can be specified after you've restored LockLint's state, prior to rerunning the analysis.

One such subcommand is order, which you can use to make inquiries about the order in which locks have been acquired. This information is particularly useful in understanding lock ordering problems and making assertions about those orders so that LockLint can more accurately diagnose potential deadlocks.

Another such subcommand is vars. The vars subcommand reports which locks are consistently held when a variable is read or written (if any). This information can be useful in determining the protection conventions in code where the original conventions were never documented, or the documentation has become outdated.

Limitations of LockLint

There are limitations to LockLint's powers of analysis. At the root of many of its difficulties is the fact that LockLint doesn't know the values of your variables.

LockLint solves some of these problems by ignoring the likely cause or making simplifying assumptions. You can avoid some other problems by using conditionally compiled code in the application. Towards this end, the compiler always defines the preprocessor macro __lock_lint when you compile with the -Zll option. You can use this macro to make your code less ambiguous.

LockLint has trouble deducing:

Which functions your function pointers point to. There are some assignments LockLint cannot deduce (see "declare"). The declare subcommand can be used to add new possible assignments to the function pointer.

When LockLint sees a call through a function pointer, it tests that call path for every possible value of that function pointer. If you know or suspect that some calling sequences are never executed, use the disallow and reallow subcommands to specify which sequences are executed.

Whether or not you locked a lock in code like this:
```
if (x) pthread_mutex_lock(&lock1);
```
In this case, two execution paths are created, one holding the lock, and one not holding the lock, which will probably cause the generation of a side effect message at the unlock call. You may be able to work around this problem by using the __lock_lint macro to force LockLint to treat a lock as unconditionally taken. For example:
```
#ifdef __lock_lint
pthread_mutex_lock(&lock1);
#else
if (x) pthread_mutex_lock(&lock1);
#endif
```
LockLint has no problem analyzing code like this:
```
if (x) {
	pthread_mutex_lock(&lock1);
	foo();
	pthread_mutex_unlock(&lock1);
}
```
In this case, there is only one execution path, along which the lock is acquired and released, causing no side effects.

Whether or not a lock was acquired in code like this:
```
rc = pthread_mutex_trylock(&lock1);
if (rc) ...
```

Which lock is being locked in code like this:
```
pthread_mutex_t* lockp;
pthread_mutex_lock(lockp);
```
In such cases, the lock call is ignored.

Which variables and locks are being used in code where elements of a structure are used (see "Lock Inversions"):
```
struct foo* p;
pthread_mutex_lock(p->lock);
p->bar = 0;
```

Which element of an array is being accessed. This is treated analogously to the previous case; the index is ignored.

Anything about longjmps.

When you would exit a loop or break out of a recursion (so LockLint just stops proceeding down a path as soon as it finds itself looping, or after one recursion).
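The conditional-locking workaround described earlier in this list can be sketched as a complete function. The example is a minimal illustration: the function bump() and the variable counter are hypothetical, and it assumes the matching unlock needs the same treatment as the lock so that LockLint sees a single path that acquires and then releases lock1.

```c
#include <pthread.h>

static pthread_mutex_t lock1 = PTHREAD_MUTEX_INITIALIZER;
static int counter;

/*
 * Increments counter and returns the new value.  In a normal build,
 * lock1 is taken only when 'shared' is nonzero.  When compiled with
 * -Zll, __lock_lint is defined and the lock is shown to LockLint as
 * unconditionally acquired and released.
 */
int bump(int shared)
{
    int value;

#ifdef __lock_lint
    pthread_mutex_lock(&lock1);
#else
    if (shared) pthread_mutex_lock(&lock1);
#endif

    value = ++counter;

#ifdef __lock_lint
    pthread_mutex_unlock(&lock1);
#else
    if (shared) pthread_mutex_unlock(&lock1);
#endif

    return value;
}
```

The cost of this approach is that LockLint analyzes something slightly different from what runs, so the unlocked path still has to be justified by other means.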

Some other LockLint difficulties:

LockLint only analyzes the use of mutex locks and readers-writer locks. LockLint performs limited consistency checks of mutex locks as used with condition variables. However, semaphores and condition variables are not recognized as locks by LockLint. Even with this analysis, there are limits to what LockLint can make sense of.

There are situations where LockLint thinks two different variables are the same variable, or that a single variable is two different variables. (See "Lock Inversions".)

It is possible to share automatic variables between threads (via pointers), but LockLint assumes that automatics are unshared, and generally ignores them (the only situation in which they are of interest to LockLint is when they are function pointers).

LockLint complains about any functions that are not consistent in their side effects on locks. #ifdef's and assertions must be used to give LockLint a simpler view of functions that may or may not have such a side effect.

During analysis, LockLint may produce messages about a lock operation called rw_upgrade. Such a call does not really exist, but LockLint rewrites code like

if (rw_tryupgrade(&lock1)) {
	...
}

as

if () {
	rw_tryupgrade(&lock1);
	...
}

such that, wherever rw_tryupgrade() occurs, LockLint always assumes it succeeds.

One of the errors LockLint flags is an attempt to acquire a lock that is already held. If the lock is unnamed (for example, foo::lock), this error is suppressed, since the name refers not to a single lock but to a set of locks. If the unnamed lock always refers to the same lock, however, use the declare one subcommand so that LockLint can report this type of potential deadlock.

If you have constructed your own locks out of these locks (for example, recursive mutexes are sometimes built from ordinary mutexes), LockLint will not know about them. Generally you can use #ifdef to make it appear to LockLint as though an ordinary mutex is being manipulated. For recursive locks, use an unnamed lock for this deception, since errors won't be generated when it is recursively locked. For example:

void get_lock() {
#ifdef __lock_lint
	struct bogus *p;
	pthread_mutex_lock(p->lock);
#else
	<the real recursive locking code>
#endif
}
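To make the outline above concrete, here is one possible way to fill it in. This is a sketch under stated assumptions, not the only way to build a recursive lock: struct rec_lock, big_lock, release_lock(), lock_depth(), and the definition of struct bogus are all hypothetical. The recursive lock is built from an ordinary mutex plus a condition variable, and LockLint sees only the unnamed-lock stand-in.

```c
#include <pthread.h>

/* Hypothetical recursive lock built from an ordinary mutex. */
struct rec_lock {
    pthread_mutex_t guard;     /* protects owner and depth */
    pthread_cond_t  released;  /* signaled when depth drops to 0 */
    pthread_t       owner;     /* meaningful only while depth > 0 */
    int             depth;     /* number of nested acquisitions */
};

static struct rec_lock big_lock = {
    .guard    = PTHREAD_MUTEX_INITIALIZER,
    .released = PTHREAD_COND_INITIALIZER,
    .depth    = 0
};

#ifdef __lock_lint
struct bogus { pthread_mutex_t *lock; };  /* stand-in seen only by LockLint */
#endif

void get_lock(void)
{
#ifdef __lock_lint
    struct bogus *p;
    pthread_mutex_lock(p->lock);
#else
    pthread_mutex_lock(&big_lock.guard);
    if (big_lock.depth > 0 && pthread_equal(big_lock.owner, pthread_self())) {
        big_lock.depth++;                 /* nested acquisition by the owner */
    } else {
        while (big_lock.depth > 0)        /* wait for the owner to finish */
            pthread_cond_wait(&big_lock.released, &big_lock.guard);
        big_lock.owner = pthread_self();
        big_lock.depth = 1;
    }
    pthread_mutex_unlock(&big_lock.guard);
#endif
}

void release_lock(void)
{
#ifdef __lock_lint
    struct bogus *p;
    pthread_mutex_unlock(p->lock);
#else
    pthread_mutex_lock(&big_lock.guard);
    if (--big_lock.depth == 0)
        pthread_cond_signal(&big_lock.released);
    pthread_mutex_unlock(&big_lock.guard);
#endif
}

/* Returns the current nesting depth (for inspection only). */
int lock_depth(void)
{
    int depth;

    pthread_mutex_lock(&big_lock.guard);
    depth = big_lock.depth;
    pthread_mutex_unlock(&big_lock.guard);
    return depth;
}
```

The release path gets the same #ifdef treatment as the acquire path so that, from LockLint's point of view, every acquisition of the stand-in lock has a matching release.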