Analyzing Program Performance With Sun WorkShop

Source Code Annotations

An annotation is some piece of text inserted into your source code. You use annotations to tell LockLint things about your program that it cannot deduce for itself, either to keep it from excessively flagging problems or to have LockLint test for certain conditions. Annotations also serve to document code, in much the same way that comments do. There are two types of source code annotations: assertions and NOTEs.

Annotations are similar to some of the LockLint subcommands described in Appendix A, LockLint Command Reference. In general, it's preferable to use source code annotations over these subcommands, as explained in "Reasons to Use Source Code Annotations".

Reasons to Use Source Code Annotations

There are several reasons to use source code annotations. In many cases, such annotations are preferable to using a script of LockLint subcommands.

The Annotations Scheme

LockLint shares the source code annotations scheme with several other tools. When you install the Sun WorkShop ANSI C Compiler, you automatically install the file SUNW_SPRO-cc-ssbd, which contains the names of all the annotations that LockLint understands. The file is located in <installation_directory>/SUNWspro/SC5.0/lib/note.

You may specify a location other than the default by setting the environment variable NOTEPATH, as in


setenv NOTEPATH other_location:$NOTEPATH

:

The default value for NOTEPATH is installation_directory</SUNWSPRO/SC5.0/lib/note:/usr/lib/note.>

To use source code annotations, include the file note.h in your source or header files:


#include <note.h>

Using LockLint NOTEs

Many of the note-style annotations accept names--of locks or variables--as arguments. Names are specified using the syntax shown in Table 5-5.

Table 5-5 Specifying Names With LockLint NOTEs

Syntax 

Meaning 

Var

Named variable 

Var.Mbr.Mbr...

Member of a named struct/union variable

Tag

Unnamed struct/union (with this tag)

Tag::Mbr.Mbr...

Member of an unnamed struct/union

Type

Unnamed struct/union (with this typedef)

Type::Mbr.Mbr...

Member of an unnamed struct/union

In C, structure tags and types are kept in separate namespaces, making it possible to have two different structs by the same name as far as LockLint is concerned. When LockLint sees foo::bar, it first looks for a struct with tag foo; if it does not find one, it looks for a type foo and checks that it represents a struct.

However, the proper operation of LockLint requires that a given variable or lock be known by exactly one name. Therefore type will be used only when no tag is provided for the struct, and even then only when the struct is defined as part of a typedef.

For example, Foo would serve as the type name in this example:

typedef struct { int a, b; } Foo;

These restrictions ensure that there is only one name by which the struct is known.

Name arguments do not accept general expressions. It is not valid, for example, to write:

NOTE(MUTEX_PROTECTS_DATA(p->lock, p->a p->b))

However, some of the annotations do accept expressions (rather than names); they are clearly marked.

In many cases an annotation accepts a list of names as an argument. Members of a list should be separated by white space. To simplify the specification of lists, a generator mechanism similar to that of many shells is understood by all annotations taking such lists. The notation for this is:

Prefix{A B ...}Suffix

where Prefix, Suffix, A, B, ... are nothing at all, or any text containing no white space. The above notation is equivalent to:

PrefixASuffix PrefixBSuffix ...

For example, the notation:

struct_tag::{a b c d}

is equivalent to the far more cumbersome text:

struct_tag::a struct_tag::b struct_tag::c struct_tag::d

This construct may be nested, as in:

foo::{a b.{c d} e}

which is equivalent to:

foo::a

foo::b.c

foo::b.d

foo::ae

Where an annotation refers to a lock or another variable, a declaration or definition for that lock or variable should already have been seen.

If a name for data represents a structure, it refers to all non-lock (mutex or readers-writer) members of the structure. If one of those members is itself a structure, then all of its non-lock members are implied, and so on. However, LockLint understands the abstraction of a condition variable and therefore does not break it down into its constituent members.

NOTE and _NOTE

The NOTE interface enables you to insert information for LockLint into your source code without affecting the compiled object code. The basic syntax of a note-style annotation is either:

NOTE(NoteInfo)

or:

_NOTE(NoteInfo)

The preferred use is NOTE rather than _NOTE. Header files that are to be used in multiple, unrelated projects, should use _NOTE to avoid conflicts. If NOTE has already been used, and you do not want to change, you should define some other macro (such as ANNOTATION) using _NOTE. For example, you might define an include file (say, annotation.h) that contains the following:


#define ANNOTATION _NOTE
#include <sys/note.h>

The NoteInfo that gets passed to the NOTE interface must syntactically fit one of the following:

NoteName

NoteName(Args)

NoteName is simply an identifier indicating the type of annotation. Args can be anything, so long as it can be tokenized properly and any parenthesis tokens are matched (so that the closing parenthesis can be found). Each distinct NoteName will have its own requirements regarding arguments.

This text uses NOTE to mean both NOTE and _NOTE, unless explicitly stated otherwise.

Where NOTE May Be Used

NOTE may be invoked only at certain well-defined places in source code:

Where NOTE May Not Be Used

NOTE() may be used only in the locations described above. For example, the following are invalid:

a = b NOTE(...) + 1;

typedef NOTE(...) struct foo Foo;

for (i=0; NOTE(...) i<10; i++) ...

A note-style annotation is not a statement; NOTE() may not be used inside an if/else/for/while body unless braces are used to make a block. For example, the following causes a syntax error:

if (x) NOTE(...)

How Data Is Protected

The following annotations are allowed both outside and inside a function definition. Remember that any name mentioned in an annotation must already have been declared.

NOTE(MUTEX_PROTECTS_DATA(Mutex, DataNameList))

NOTE(RWLOCK_PROTECTS_DATA(Rwlock, DataNameList))

NOTE(SCHEME_PROTECTS_DATA("description", DataNameList))

The first two annotations tell LockLint that the lock should be held whenever the specified data is accessed.

The third annotation, SCHEME_PROTECTS_DATA, describes how data are protected if it does not have a mutex or readers-writer lock. The description supplied for the scheme is simply text and is not semantically significant; LockLint responds by ignoring the specified data altogether. You may make description anything you like.

Some examples help show how these annotations are used. The first example is very simple, showing a lock that protects two variables:


mutex_t lock1;
int a,b;
NOTE(MUTEX_PROTECTS_DATA(lock1, a b))

In the next example, a number of different possibilities are shown. Some members of struct foo are protected by a static lock, while others are protected by the lock on foo. Another member of foo is protected by some convention regarding its use.


mutex_t lock1;
struct foo {
	mutex_t lock;
	int mbr1, mbr2;
	struct {
		int mbr1, mbr2;
		char* mbr3;
	} inner;
	int mbr4;
};
NOTE(MUTEX_PROTECTS_DATA(lock1, foo::{mbr1 inner.mbr1}))
NOTE(MUTEX_PROTECTS_DATA(foo::lock, foo::{mbr2 inner.mbr2}))
NOTE(SCHEME_PROTECTS_DATA("convention XYZ", inner.mbr3))

A datum can only be protected in one way. If multiple annotations about protection (not only these three but also READ_ONLY_DATA) are used for a single datum, later annotations silently override earlier annotations. This allows for easy description of a structure in which all but one or two members are protected in the same way. For example, most of the members of struct BAR below are protected by the lock on struct foo, but one is protected by a global lock.


mutex_t lock1;
typedef struct {
	int mbr1, mbr2, mbr3, mbr4;
} BAR;
NOTE(MUTEX_PROTECTS_DATA(foo::lock, BAR))
NOTE(MUTEX_PROTECTS_DATA(lock1, BAR::mbr3))

Read-Only Variables

NOTE(READ_ONLY_DATA(DataNameList))

This annotation is allowed both outside and inside a function definition. It tells LockLint how data should be protected. In this case, it tells LockLint that the data should only be read, and not written.


Note -

No error is signaled if read-only data is written while it is considered invisible. Data is considered invisible when other threads cannot access it; for example, if other threads do not know about it.


This annotation is often used with data that is initialized and never changed thereafter. If the initialization is done at runtime before the data is visible to other threads, use annotations to let LockLint know that the data is invisible during that time.

LockLint knows that const data is read-only.

Allowing Unprotected Reads

NOTE(DATA_READABLE_WITHOUT_LOCK(DataNameList))

This annotation is allowed both outside and inside a function definition. It informs LockLint that the specified data may be read without holding the protecting locks. This is useful with an atomically readable datum that stands alone (as opposed to a set of data whose values are used together), since it is valid to peek at the unprotected data if you do not intend to modify it.

Hierarchical Lock Relationships

NOTE(RWLOCK_COVERS_LOCKS(RwlockName, LockNameList))

This annotation is allowed both outside and inside a function definition. It tells LockLint that a hierarchical relationship exists between a readers-writer lock and a set of other locks. Under these rules, holding the cover lock for write access affords a thread access to all data protected by the covered locks. Also, a thread must hold the cover lock for read access whenever holding any of the covered locks.

Using a readers-writer lock to cover another lock in this way is simply a convention; there is no special lock type. However, if LockLint is not told about this coverage relationship, it assumes that the locks are being used according to the usual conventions and generates error messages.

The following example specifies that member lock of unnamed foo structures covers member lock of unnamed bar and zot structures:

NOTE(RWLOCK_COVERS_LOCKS(foo::lock, {bar zot}::lock))

Functions With Locking Side Effects

NOTE(MUTEX_ACQUIRED_AS_SIDE_EFFECT(MutexExpr))

NOTE(READ_LOCK_ACQUIRED_AS_SIDE_EFFECT(RwlockExpr))

NOTE(WRITE_LOCK_ACQUIRED_AS_SIDE_EFFECT(RwlockExpr))

NOTE(LOCK_RELEASED_AS_SIDE_EFFECT(LockExpr))

NOTE(LOCK_UPGRADED_AS_SIDE_EFFECT(RwlockExpr))

NOTE(LOCK_DOWNGRADED_AS_SIDE_EFFECT(RwlockExpr))

NOTE(NO_COMPETING_THREADS_AS_SIDE_EFFECT)

NOTE(COMPETING_THREADS_AS_SIDE_EFFECT)

These annotations are allowed only inside a function definition. Each tells LockLint that the function has the specified side effect on the specified lock--that is, that the function deliberately leaves the lock in a different state on exit than it was in when the function was entered. In the case of the last two of these annotations, the side effect is not about a lock but rather about the state of concurrency.

When stating that a readers-writer lock is acquired as a side effect, you must specify whether the lock was acquired for read or write access.

A lock is said to be upgraded if it changes from being acquired for read-only access to being acquired for read/write access. Downgraded means a transformation in the opposite direction.

LockLint analyzes each function for its side effects on locks (and concurrency). Ordinarily, LockLint expects that a function will have no such effects; if the code has such effects intentionally, you must inform LockLint of that intent using annotations. If it finds that a function has different side effects from those expressed in the annotations, an error message results.

The annotations described in this section refer generally to the function's characteristics and not to a particular point in the code. Thus, these annotations are probably best written at the top of the function. There is, for example, no difference (other than readability) between this:


foo() {
	NOTE(MUTEX_ACQUIRED_AS_SIDE_EFFECT(lock_foo))
	...
	if (x && y) {
		...
	}
}

and this:


foo() {
	...
	if (x && y) {
	NOTE(MUTEX_ACQUIRED_AS_SIDE_EFFECT(lock_foo))
		...
	}
}

If a function has such a side effect, the effect should be the same on every path through the function. LockLint complains about and refuses to analyze paths through the function that have side effects other than those specified.

Single-Threaded Code

NOTE(COMPETING_THREADS_NOW)

NOTE(NO_COMPETING_THREADS_NOW)

These two annotations are allowed only inside a function definition. The first annotation tells LockLint that after this point in the code, other threads exist that might try to access the same data that this thread will access. The second function specifies that this is no longer the case; either no other threads are running or whatever threads are running will not be accessing data that this thread will access. While there are no competing threads, LockLint does not complain if the code accesses data without holding the locks that ordinarily protect that data.

These annotations are useful in functions that initialize data without holding locks before starting up any additional threads. Such functions may access data without holding locks, after waiting for all other threads to exit. So one might see something like this:


main() {
	<initialize data structures>
	NOTE(COMPETING_THREADS_NOW)
	<create several threads>
	<wait for all of those threads to exit>
	NOTE(NO_COMPETING_THREADS_NOW)
	<look at data structures and print results>
}


Note -

If a NOTE is present in main(), LockLint assumes that when main() starts, no other threads are running. If main() does not include a NOTE, LockLint does not assume that no other threads are running.


LockLint does not issue a warning if, during analysis, it encounters a COMPETING_THREADS_NOW annotation when it already thinks competing threads are present. The condition simply nests. No warning is issued because the annotation may mean different things in each use (that is the notion of which threads compete may differ from one piece of code to the next). On the other hand, a NO_COMPETING_THREADS_NOW annotation that does not match a prior COMPETING_THREADS_NOW (explicit or implicit) causes a warning.

Unreachable Code

NOTE(NOT_REACHED)

This annotation is allowed only inside a function definition. It tells LockLint that a particular point in the code cannot be reached, and therefore LockLint should ignore the condition of locks held at that point. This annotation need not be used after every call to exit(), for example, as the lint annotation /* NOTREACHED */ is used. Simply use it in definitions for exit() and the like (primarily in LockLint libraries), and LockLint will determine that code following calls to such functions is not reached. This annotation should seldom appear outside LockLint libraries. An example of its use (in a LockLint library) would be:


exit(int code) { NOTE(NOT_REACHED) }

Lock Order

NOTE(LOCK_ORDER(LockNameList))

This annotation, which is allowed either outside or inside a function definition, specifies the order in which locks should be acquired. It is similar to the assert order and order subcommands. See Appendix A, LockLint Command Reference.

To avoid deadlocks, LockLint assumes that whenever multiple locks must be held at once they are always acquired in a well-known order. If LockLint has been informed of such ordering using this annotation, an informative message is produced whenever the order is violated.

This annotation may be used multiple times, and the semantics will be combined appropriately. For example, given the annotations

NOTE(LOCK_ORDER(a b c))

NOTE(LOCK_ORDER(b d))

LockLint will deduce the ordering:

NOTE(LOCK_ORDER(a d))

It is not possible to deduce anything about the order of c with respect to d in this example.

If a cycle exists in the ordering, an appropriate error message will be generated.

Variables Invisible to Other Threads

NOTE(NOW_INVISIBLE_TO_OTHER_THREADS(DataExpr, ...))

NOTE(NOW_VISIBLE_TO_OTHER_THREADS(DataExpr, ...))

These annotations, which are allowed only within a function definition, tell LockLint whether or not the variables represented by the specified expressions are visible to other threads; that is, whether or not other threads could access the variables.

Another common use of these annotations is to inform LockLint that variables it would ordinarily assume are visible are in fact not visible, because no other thread has a pointer to them. This frequently occurs when allocating data off the heap--you can safely initialize the structure without holding a lock, since no other thread can yet see the structure.


Foo* p = (Foo*) malloc(sizeof(*p));
NOTE(NOW_INVISIBLE_TO_OTHER_THREADS(*p))
p->a = bar;
p->b = zot;
NOTE(NOW_VISIBLE_TO_OTHER_THREADS(*p))
add_entry(&global_foo_list, p);

Calling a function never has the side effect of making variables visible or invisible. Upon return from the function, all changes in visibility caused by the function are reversed.

Assuming Variables Are Protected

NOTE(ASSUMING_PROTECTED(DataExpr, ...))

This annotation, which is allowed only within a function definition, tells LockLint that this function assumes that the variables represented by the specified expressions are protected in one of the following ways:

LockLint issues an error if none of these conditions is true.


f(Foo* p, Bar* q) {
	NOTE(ASSUMING_PROTECTED(*p, *q))
	p->a++;
	...
}

Assertions Recognized by LockLint

LockLint recognizes some assertions as relevant to the state of threads and locks. (For more information, see the assert man page.)

Assertions may be made only within a function definition, where a statement is allowed.


Note -

ASSERT() is used in kernel and driver code, whereas assert() is used in user (application) code. For simplicity's sake, this document uses assert() to refer to either one, unless explicitly stated otherwise.


Making Sure All Locks Are Released

assert(NO_LOCKS_HELD);

LockLint recognizes this assertion to mean that, when this point in the code is reached, no locks should be held by the thread executing this test. Violations are reported during analysis. A routine that blocks might want to use such an assertion to ensure that no locks are held when a thread blocks or exits.

The assertion also clearly serves as a reminder to someone modifying the code that any locks acquired must be released at that point.

It is really only necessary to use this assertion in leaf-level functions that block. If a function blocks only inasmuch as it calls another function that blocks, the caller need not contain this assertion as long as the callee does. Therefore this assertion probably sees its heaviest use in versions of libraries (for example, libc) written specifically for LockLint (like lint libraries).

The file synch.h defines NO_LOCKS_HELD as 1 if it has not already been otherwise defined, causing the assertion to succeed; that is, the assertion is effectively ignored at runtime. You can override this default runtime meaning by defining NO_LOCKS_HELD before you include either note.h or synch.h (which may be included in either order). For example, if a body of code uses only two locks called a and b, the following definition would probably suffice:


#define NO_LOCKS_HELD (!MUTEX_HELD(&a) && !MUTEX_HELD(&b))
#include <note.h>
#include <synch.h>

Doing so does not affect LockLint's testing of the assertion; that is, LockLint still complains if any locks are held (not just a or b).

Making Sure No Other Threads Are Running

assert(NO_COMPETING_THREADS);

LockLint recognizes this assertion to mean that, when this point in the code is reached, no other threads should be competing with the one running this code. Violations (based on information provided by certain NOTE-style assertions) are reported during analysis. Any function that accesses variables without holding their protecting locks (operating under the assumption that no other relevant threads are out there touching the same data), should be so marked.

By default, this assertion is ignored at runtime--that is, it always succeeds. No generic runtime meaning for NO_COMPETING_THREADS is possible, since the notion of which threads compete involves knowledge of the application. For example, a driver might make such an assertion to say that no other threads are running in this driver for the same device. Because no generic meaning is possible, synch.h defines NO_COMPETING_THREADS as 1 if it has not already been otherwise defined.

However, you can override the default meaning for NO_COMPETING_THREADS by defining it before including either note.h or synch.h (which may be included in either order). For example, if the program keeps a count of the number of running threads in a variable called num_threads, the following definition might suffice:


#define NO_COMPETING_THREADS (num_threads == 1)
#include <note.h>
#include <synch.h>

Doing so does not affect LockLint's testing of the assertion.

Asserting Lock State

assert(MUTEX_HELD(lock_expr) && ...);

This assertion is widely used within the kernel. It performs runtime checking if assertions are enabled. The same capability exists in user code.

This code does roughly the same thing during LockLint analysis as it does when the code is actually run with assertions enabled; that is, it reports an error if the executing thread does not hold the lock as described.


Note -

The thread library performs a weaker test, only checking that some thread holds the lock. LockLint performs the stronger test.


LockLint recognizes the use of MUTEX_HELD(), RW_READ_HELD(), RW_WRITE_HELD(), and RW_LOCK_HELD() macros, and negations thereof. Such macro calls may be combined using the && operators. For example, the following assertion causes LockLint to check that a mutex is not held and that a readers-writer lock is write-held:


assert(p && !MUTEX_HELD(&p->mtx) && RW_WRITE_HELD(&p->rwlock));

LockLint also recognizes expressions like:

MUTEX_HELD(&foo) == 0