Writing Device Drivers

Locking Changes

Starting with the SunOS 4.1.2 system, only one processor can be in the kernel at any one time. This is accomplished by using a master lock around the entire kernel. When a processor needs to execute kernel code, it needs to acquire the lock (this excludes other processors from running the code protected by the lock) and then release the lock when it is through. Because of this master lock, drivers written for uniprocessor systems did not change for multiprocessor systems. Two processors could not execute driver code at the same time.

In the SunOS 5.7 system, instead of one master lock, there are many smaller locks that protect smaller regions of code. For example, there may be a kernel lock that protects access to a particular vnode, and one that protects an inode. Only one processor can be running code dealing with that vnode at a time, but another could be accessing an inode. This allows a greater degree of concurrency.

However, because the kernel is multithreaded, two (or more) threads can be in driver code at the same time. This can happen in two ways:

One thread could be in an entry point, and another in the interrupt routine. The driver had to handle this in the SunOS 4.1 system, but with the restriction that the interrupt routine blocked the user context routine while it ran.
Two threads could be in a routine at the same time. This could not happen in the SunOS 4.1 system.

Both of these cases are similar to situations present in the SunOS 4.1 system, but now these threads could run at the same time on different CPUs. The driver must be prepared to handle these types of occurrences.

Mutual Exclusion Locks

In the SunOS 4.1 system, a driver had to be careful when accessing data shared between the top half and the interrupt routine. Because the interrupt could occur asynchronously, the interrupt routine could corrupt data or simply hang. To prevent this, portions of the top half of the driver used the various spl routines to raise the interrupt priority level of the CPU, which blocked the interrupt from being handled:

	s = splr(pritospl(6));
	/* access shared data */
	(void)splx(s);

In the SunOS 5.7 system, this no longer works. Changing the interrupt priority level of one CPU does not necessarily prevent another CPU from handling the interrupt. Also, two top-half routines may be running simultaneously with the interrupt running on a third CPU.

To solve this problem, the SunOS 5.7 system provides:

A uniform model of execution: even interrupts run as threads. This blurs the distinction between the top half and the bottom half, as effectively every routine is a bottom-half routine.
A number of locking mechanisms. A common mechanism is to use mutual exclusion locks (mutexes):

	mutex_enter(&mu);
	/* access shared data */
	mutex_exit(&mu);

A subtle difference from the SunOS 4.1 system is that, because everything is run by kernel threads, the interrupt routine needs to explicitly acquire and release the mutex. In the SunOS 4.1 system, this was implicit since the interrupt handler automatically ran at an elevated priority.

See "Multithreading Additions to the State Structure" for more information on locking.
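The rule that both the entry point and the interrupt routine must take the same mutex can be illustrated outside the kernel. The following is only a user-level analogue and makes several assumptions: POSIX threads stand in for kernel threads, pthread_mutex_lock() and pthread_mutex_unlock() stand in for mutex_enter(9F) and mutex_exit(9F), and the names touch_shared(), run_two_threads(), and shared_count are hypothetical. It is a sketch of the locking pattern, not driver code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

#define ITERATIONS 100000

static pthread_mutex_t mu = PTHREAD_MUTEX_INITIALIZER;
static long shared_count;	/* data shared by both threads */

/*
 * Stands in for either a driver entry point or the interrupt routine.
 * Both paths take the same lock around every access to the shared data.
 */
static void *
touch_shared(void *arg)
{
	int i;

	(void) arg;
	for (i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&mu);	/* analogue of mutex_enter() */
		shared_count++;			/* access shared data */
		pthread_mutex_unlock(&mu);	/* analogue of mutex_exit() */
	}
	return (NULL);
}

/*
 * Run an "entry point" thread and an "interrupt" thread concurrently
 * and return the final value of the shared counter.
 */
long
run_two_threads(void)
{
	pthread_t top, intr;
	int rc;

	shared_count = 0;
	rc = pthread_create(&top, NULL, touch_shared, NULL);
	assert(rc == 0);
	rc = pthread_create(&intr, NULL, touch_shared, NULL);
	assert(rc == 0);
	pthread_join(top, NULL);
	pthread_join(intr, NULL);
	return (shared_count);
}
```

Because every increment happens with the lock held, no update is lost and run_two_threads() returns twice ITERATIONS. If either thread skipped the lock, the two threads could interleave their read-modify-write sequences and the total could come up short.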

Condition Variables

In the SunOS 4.1 system, when the driver needed the current process to wait for something (such as a data transfer to complete), it called sleep(), specifying a channel and a dispatch priority. The interrupt routine then called wakeup() on that channel to notify all processes waiting on that channel that something happened. Because the interrupt could occur at any time, the interrupt priority was usually raised to ensure that the wakeup could not occur until the process was asleep.

Example A-1 SunOS 4.1 Synchronization Method

int		busy;	/* global device busy flag */

int xxread(dev, uio)
dev_t		dev;
struct uio	*uio;
{
	int		s;

	s = splr(pritospl(6));
	while (busy)
		sleep(&busy, PRIBIO + 1);
	busy = 1;
	(void)splx(s);
	/* do the read */
}

int xxintr()
{
	busy = 0;
	wakeup(&busy);
}

The SunOS 5.7 system provides similar functionality with condition variables. Threads are blocked on condition variables until they are notified that the condition has occurred. The driver must acquire the mutex that protects the condition before blocking the thread. cv_wait(9F) then releases the mutex as it blocks the thread and reacquires it before returning (similar to blocking/unblocking interrupts in the SunOS 4.1 system).

Example A-2 Synchronization in SunOS 5.7 Similar to SunOS 4.1

int		busy;		/* global device busy flag */
kmutex_t	busy_mu;	/* mutex protecting busy flag */
kcondvar_t	busy_cv;	/* condition variable for busy flag */

static int
xxread(dev_t dev, struct uio *uiop, cred_t *credp)
{
	mutex_enter(&busy_mu);
	while (busy)
		cv_wait(&busy_cv, &busy_mu);
	busy = 1;
	mutex_exit(&busy_mu);
	/* do the read */
	return (0);	/* or an error number if the read fails */
}

static u_int
xxintr(caddr_t arg)
{
	mutex_enter(&busy_mu);
	busy = 0;
	cv_broadcast(&busy_cv);
	mutex_exit(&busy_mu);
	return (DDI_INTR_CLAIMED);
}

Like wakeup(), cv_broadcast(9F) unblocks all threads waiting on the condition variable. To wake up one thread, use cv_signal(9F) (there was no documented equivalent for cv_signal(9F) in the SunOS 4.1 system).

Note - There is no equivalent to the dispatch priority passed to sleep().

Though the sleep() and wakeup() calls exist, do not use them, since the result would be an MT-unsafe driver.

See "Thread Synchronization" for more information.
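The busy-flag pattern of Example A-2 can be tried at user level. The sketch below is an analogue under stated assumptions, not DDI code: pthread_cond_wait() and pthread_cond_broadcast() stand in for cv_wait(9F) and cv_broadcast(9F), and the names fake_read(), fake_intr(), run_readers(), reads_done, and NREADERS are hypothetical. It shows why the wait sits in a while loop: a broadcast wakes every waiter, but only one of them finds the device idle.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>

#define NREADERS 4

static int busy = 1;		/* device busy flag; starts out busy */
static int reads_done;		/* completed reads, protected by busy_mu */
static pthread_mutex_t busy_mu = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t busy_cv = PTHREAD_COND_INITIALIZER;

/* Analogue of xxread(): wait until the device is idle, then claim it. */
static void *
fake_read(void *arg)
{
	(void) arg;
	pthread_mutex_lock(&busy_mu);
	while (busy)	/* recheck after every wakeup */
		pthread_cond_wait(&busy_cv, &busy_mu);
	busy = 1;
	reads_done++;
	pthread_mutex_unlock(&busy_mu);
	return (NULL);
}

/* Analogue of xxintr(): mark the device idle and wake all waiters. */
static void
fake_intr(void)
{
	pthread_mutex_lock(&busy_mu);
	busy = 0;
	pthread_cond_broadcast(&busy_cv);
	pthread_mutex_unlock(&busy_mu);
}

/*
 * Start NREADERS readers against a busy device, then deliver
 * "interrupts" until every reader has claimed the device once.
 * Returns the number of completed reads.
 */
int
run_readers(void)
{
	pthread_t tid[NREADERS];
	int i, rc, done;

	for (i = 0; i < NREADERS; i++) {
		rc = pthread_create(&tid[i], NULL, fake_read, NULL);
		assert(rc == 0);
	}
	do {
		fake_intr();
		sched_yield();
		pthread_mutex_lock(&busy_mu);
		done = reads_done;
		pthread_mutex_unlock(&busy_mu);
	} while (done < NREADERS);
	for (i = 0; i < NREADERS; i++)
		pthread_join(tid[i], NULL);
	return (done);
}
```

Each call to fake_intr() lets at most one reader through, because the first reader to wake sets busy again before releasing the mutex and the others go back to waiting.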

Catching Signals

The driver could accidentally wait for an event that will never occur, or the event might not happen for a long time. In either case, the user might want to abort the process by sending it a signal (or typing a character that causes a signal to be sent to the process). Whether the signal causes the driver to wake up depends upon the driver.

In the SunOS 4.1 system, whether sleep() was signal-interruptible depended upon the dispatch priority passed to it. If the priority was greater than PZERO, the driver was signal-interruptible; otherwise the driver would not be awakened by a signal. Normally, a signal interrupt caused sleep() to return to the user, without notifying the driver that the signal had occurred. Drivers that needed to release resources before returning to the user passed the PCATCH flag to sleep(), then looked at the return value of sleep() to determine why they awoke:

	while (busy) {
		if (sleep(&busy, PCATCH | (PRIBIO + 1))) {
			/* awakened because of a signal */
			/* free resources */
			return (EINTR);
		}
	}

In the SunOS 5.7 system, the driver can use cv_wait_sig(9F) to wait on the condition variable and still be signal-interruptible. Note that cv_wait_sig(9F) returns zero to indicate the return was due to a signal, but sleep() in the SunOS 4.1 system returned a nonzero value:

	while (busy) {
		if (cv_wait_sig(&busy_cv, &busy_mu) == 0) {
			/* returned because of signal */
			/* free resources */
			/* cv_wait_sig() returns with the mutex held */
			mutex_exit(&busy_mu);
			return (EINTR);
		}
	}

cv_timedwait()

Another solution drivers used to avoid blocking on events that would not occur was to set a timeout before the call to sleep(). This timeout would expire far enough in the future that the event should have happened by then, and if the timeout function did run, it would awaken the blocked process. The driver would then check whether the timeout function had run and, if so, return some sort of error.

This can still be done in the SunOS 5.7 system, but the same thing may be accomplished with cv_timedwait(9F). An absolute time to wait is passed to cv_timedwait(9F), which will return -1 if the time is reached and the event has not occurred. See Example 4-3 for an example usage of cv_timedwait(9F). Also see "cv_wait_sig()" for information on cv_timedwait_sig(9F).
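Waiting against an absolute deadline can also be sketched at user level. This is an analogue under stated assumptions rather than DDI code: pthread_cond_timedwait() stands in for cv_timedwait(9F), the deadline is a struct timespec instead of a kernel tick count, and the name wait_idle() and its millisecond argument (assumed non-negative) are hypothetical. The POSIX call reports an expired deadline by returning ETIMEDOUT, which is not the return convention of cv_timedwait(9F), so the two must not be confused when porting.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

static int busy = 1;		/* device busy flag; starts out busy */
static pthread_mutex_t busy_mu = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t busy_cv = PTHREAD_COND_INITIALIZER;

/*
 * Wait up to timeout_ms milliseconds for the device to become idle.
 * Returns 0 after claiming the device, or ETIMEDOUT if the deadline
 * passed while the device was still busy.
 */
int
wait_idle(long timeout_ms)
{
	struct timespec deadline;
	int rc = 0;

	/* Build an absolute deadline from the current time. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}
	assert(deadline.tv_nsec < 1000000000L);

	pthread_mutex_lock(&busy_mu);
	/* The deadline is absolute, so it is not reset by early wakeups. */
	while (busy && rc == 0)
		rc = pthread_cond_timedwait(&busy_cv, &busy_mu, &deadline);
	if (!busy) {
		busy = 1;	/* device is idle: claim it */
		rc = 0;
	}
	pthread_mutex_unlock(&busy_mu);
	return (rc);
}
```

Using an absolute deadline means the loop can go back to waiting after a wakeup that did not clear the flag without extending the total wait, which is the same reason cv_timedwait(9F) takes an absolute time.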

Other Locks

Semaphores and readers/writers locks are also available. See semaphore(9F) and rwlock(9F).

Lock Granularity

Generally, start with one lock, and add more depending upon the abilities of the device. See "Choosing a Locking Scheme" and Appendix G, Advanced Topics, for more information.