Writing Device Drivers

Avoiding Data Loss on a Test System

A driver bug can sometimes render a system incapable of booting. By taking precautions, you can avoid system reinstallation in this event, as described in this section.

Back Up Critical System Files

A number of driver-related system files are difficult, if not impossible, to reconstruct. Files such as /etc/name_to_major, /etc/driver_aliases, /etc/driver_classes, and /etc/minor_perm can be corrupted if the driver crashes the system during installation. See the add_drv(1M) man page.

To be safe, make a backup copy of the root file system after the test machine is in the proper configuration. If you plan to modify the /etc/system file, make a backup copy of the file before making modifications.

ProcedureTo Boot With an Alternate Kernel

To avoid rendering a system inoperable, you should boot from a copy of the kernel and associated binaries rather than from the default kernel.

  1. Make a copy of the drivers in /platform/*.


    # cp -r /platform/`uname -i`/kernel /platform/`uname -i`/kernel.test
    
  2. Place the driver module in /platform/`uname -i`/kernel.test/drv.

  3. Boot the alternate kernel instead of the default kernel.

    After you have created and stored the alternate kernel, you can boot this kernel in a number of ways.

    • You can boot the alternate kernel by rebooting:


      # reboot -- kernel.test/unix
      
    • On a SPARC-based system, you can also boot from the PROM:


      ok boot kernel.test/sparcv9/unix
      

      Note –

      To boot with the kmdb debugger, use the -k option as described in Getting Started With the Modular Debugger.


    • On an x86-based system, when the Select (b)oot or (i)nterpreter: message is displayed in the boot process, type the following:


      boot kernel.test/unix
      

Example 22–4 Booting an Alternate Kernel

The following example demonstrates booting with an alternate kernel.


ok boot kernel.test/sparcv9/unix
Rebooting with command: boot kernel.test/sparcv9/unix
Boot device: /sbus@1f,0/espdma@e,8400000/esp@e,8800000/sd@0,0:a File and \
    args:
kernel.test/sparcv9/unix


Example 22–5 Booting an Alternate Kernel With the -a Option

Alternatively, the module path can be changed by booting with the ask (-a) option. This option results in a series of prompts for configuring the boot method.


ok boot -a
Rebooting with command: boot -a
Boot device: /sbus@1f,0/espdma@e,8400000/esp@e,8800000/sd@0,0:a File and \
args: -a
Enter filename [kernel/sparcv9/unix]: kernel.test/sparcv9/unix
Enter default directory for modules
[/platform/sun4u/kernel.test /kernel /usr/kernel]: <CR>
Name of system file [etc/system]: <CR>
SunOS Release 5.10 Version Generic 64-bit
Copyright 1983-2002 Sun Microsystems, Inc. All rights reserved.
root filesystem type [ufs]: <CR>
Enter physical name of root device
[/sbus@1f,0/espdma@e,8400000/esp@e,8800000/sd@0,0:a]: <CR>

Consider Alternative Back-Up Plans

If the system is attached to a network, the test machine can be added as a client of a server. If a problem occurs, the system can be booted from the network. The local disks can then be mounted, and any fixes can be made. Alternatively, the system can be booted directly from the Solaris system CD-ROM.

Another way to recover from disaster is to have another bootable root file system. Use format(1M) to make a partition that is the exact size of the original. Then use dd(1M) to copy the bootable root file system. After making a copy, run fsck(1M) on the new file system to ensure its integrity.

Subsequently, if the system cannot boot from the original root partition, boot the backup partition. Use dd(1M) to copy the backup partition onto the original partition. You might have a situation where the system cannot boot even though the root file system is undamaged. For example, the damage might be limited to the boot block or the boot program. In such a case, you can boot from the backup partition with the ask (-a) option. You can then specify the original file system as the root file system.

Capture System Crash Dumps

When a system panics, the system writes an image of kernel memory to the dump device. The dump device is by default the most suitable swap device. The dump is a system crash dump, similar to core dumps generated by applications. On rebooting after a panic, savecore(1M) checks the dump device for a crash dump. If a dump is found, savecore makes a copy of the kernel's symbol table, which is called unix.n. The savecore utility then dumps a core file that is called vmcore.n in the core image directory. By default, the core image directory is /var/crash/machine_name. If /var/crash has insufficient space for a core dump, the system displays the needed space but does not actually save the dump. The mdb(1) debugger can then be used on the core dump and the saved kernel.

In the Solaris operating system, crash dump is enabled by default. The dumpadm(1M) command is used to configure system crash dumps. Use the dumpadm command to verify that crash dumps are enabled and to determine the location of core files that have been saved.


Note –

You can prevent the savecore utility from filling the file system. Add a file that is named minfree to the directory in which the dumps are to be saved. In this file, specify the number of kilobytes to remain free after savecore has run. If insufficient space is available, the core file is not saved.