System Administration Guide: Advanced Administration

System Crashes (Overview)

System crashes can occur due to hardware malfunctions, I/O problems, and software errors. If the system crashes, it displays an error message on the console and then writes a copy of its physical memory to the dump device. The system then reboots automatically. When the system reboots, the savecore command is executed to retrieve the data from the dump device and write the saved crash dump to your savecore directory. The saved crash dump files provide invaluable information to your support provider to aid in diagnosing the problem.

The crash dump information is written in a compressed format to the vmdump.n file, where n is an integer that identifies the crash dump. Afterwards, the savecore command can be invoked on the same system or another system to expand the compressed crash dump to a pair of files that are named unix.n and vmcore.n. The directory in which the crash dump is saved upon reboot can also be configured by using the dumpadm command.

For systems that have a UFS root file system, the default dump device is configured as an appropriate swap partition. Swap partitions are disk partitions that are reserved as virtual memory backing storage for the operating system. Therefore, no permanent information resides in the swap area that is to be overwritten by the crash dump. For systems that have an Oracle Solaris ZFS root file system, dedicated ZFS volumes are used for swap and dump areas. See Oracle Solaris ZFS Support for Swap Area and Dump Devices for more information.

Oracle Solaris ZFS Support for Swap Area and Dump Devices

If you select an Oracle Solaris ZFS root file system during an initial software installation, or if you use Oracle Solaris Live Upgrade to migrate from a UFS root file system to a ZFS root file system, a swap area is created on a ZFS volume in the ZFS root pool. Swap volume size is calculated as half the size of physical memory, but no more than 2 Gbytes and no less than 512 Mbytes. Dump volume size is calculated by the kernel based on dumpadm information and the size of physical memory. You can adjust the sizes of your swap and dump volumes in a JumpStart profile or during an initial installation, as long as the new sizes support system operation. For more information, see ZFS Support for Swap and Dump Devices in Oracle Solaris ZFS Administration Guide.
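The swap sizing rule described above can be worked through as a small calculation. The following Python sketch is illustrative only and is not the installer's code: the function name default_swap_size_mb is hypothetical, and the sketch assumes that 1 Gbyte equals 1024 Mbytes and that half of physical memory is rounded down to a whole Mbyte.

```python
def default_swap_size_mb(physical_memory_mb: int) -> int:
    """Estimate the default ZFS swap volume size in Mbytes.

    Rule from the text: half of physical memory, but no more than
    2 Gbytes and no less than 512 Mbytes.
    Assumption: 1 Gbyte = 1024 Mbytes, with integer rounding down.
    """
    if physical_memory_mb <= 0:
        raise ValueError("physical memory must be positive")
    half = physical_memory_mb // 2
    # Clamp the result to the range 512 Mbytes .. 2048 Mbytes.
    return max(512, min(half, 2048))


if __name__ == "__main__":
    for mem_mb in (768, 2048, 16384):
        print(mem_mb, "Mbytes of memory ->", default_swap_size_mb(mem_mb), "Mbytes of swap")
```

Under these assumptions, a system with 768 Mbytes of memory is raised to the 512-Mbyte minimum, and a system with 16 Gbytes of memory is capped at 2 Gbytes.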

If you need to modify your ZFS swap area or dump area after installation, use the swap or dumpadm commands, as in previous releases.

For information about managing dump devices in this document, see Managing System Crash Dump Information.

x86: System Crashes in the GRUB Boot Environment

If a system crash occurs on an x86 based system in the GRUB boot environment, it is possible that the SMF service that manages the GRUB boot archive, svc:/system/boot-archive:default, might fail on the next system reboot. For more information on GRUB based booting, see Booting an x86 Based System by Using GRUB (Task Map) in System Administration Guide: Basic Administration.

System Crash Dump Files

The savecore command runs automatically after a system crash to retrieve the crash dump information from the dump device and writes a pair of files called unix.X and vmcore.X, where X identifies the dump sequence number. Together, these files represent the saved system crash dump information.

Crash dump files are sometimes confused with core files, which are images of user applications that are written when the application terminates abnormally.

Crash dump files are saved in a predetermined directory, which, by default, is /var/crash/hostname. In previous releases, crash dump files were overwritten when a system rebooted, unless you manually enabled the system to save the images of physical memory in a crash dump file. Now, the saving of crash dump files is enabled by default.
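The file naming convention described in this guide (vmdump.n for a compressed dump, and unix.n paired with vmcore.n for an expanded dump) can be used to take inventory of a savecore directory. The following Python sketch is one possible approach and is not part of the operating system: the function names are hypothetical, and the sketch works on a list of file names so that it does not depend on any particular directory.

```python
import re
from collections import defaultdict

# Matches unix.N, vmcore.N, and vmdump.N, where N is the dump sequence number.
_DUMP_FILE = re.compile(r"^(unix|vmcore|vmdump)\.(\d+)$")


def group_crash_dump_files(filenames):
    """Map each dump sequence number to the set of file kinds found for it."""
    dumps = defaultdict(set)
    for name in filenames:
        match = _DUMP_FILE.match(name)
        if match:
            kind, seq = match.groups()
            dumps[int(seq)].add(kind)
    return dict(dumps)


def complete_dumps(filenames):
    """Return the sorted sequence numbers that have both unix.N and vmcore.N."""
    groups = group_crash_dump_files(filenames)
    return sorted(n for n, kinds in groups.items() if {"unix", "vmcore"} <= kinds)
```

For example, passing the result of os.listdir() on the savecore directory to complete_dumps() would list the dumps that are already expanded, while sequence numbers that appear only as vmdump still need to be expanded with savecore.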

System crash information is managed with the dumpadm command. For more information, see The dumpadm Command.

Saving Crash Dumps

You can examine the control structures, active tables, memory images of a live or crashed system kernel, and other information about the operation of the kernel by using the mdb utility. Using mdb to its full potential requires a detailed knowledge of the kernel, and is beyond the scope of this manual. For information on using this utility, see the mdb(1) man page.

Additionally, crash dumps saved by savecore can be useful to send to a customer service representative for analysis of why the system is crashing.

The dumpadm Command

Use the dumpadm command to manage system crash dump information in the Oracle Solaris OS.

The following table describes dumpadm's configuration parameters.

Dump Parameter         Description

dump device            The device that stores dump data temporarily as the system crashes. When the dump device is not the swap area, savecore runs in the background, which speeds up the boot process.
savecore directory     The directory that stores system crash dump files.
dump content           Type of memory data to dump.
minimum free space     Minimum amount of free space required in the savecore directory after saving crash dump files. If no minimum free space has been configured, the default is one Mbyte.

For more information, see dumpadm(1M).
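The minimum free space parameter can be read as a simple inequality: the free space that remains after the dump is saved must be at least the configured minimum. The following Python sketch illustrates that reading only. It is not how savecore is implemented, the function name fits_with_minfree is hypothetical, and it assumes that one Mbyte equals 1024 x 1024 bytes.

```python
MBYTE = 1024 * 1024  # Assumption: 1 Mbyte = 2**20 bytes.


def fits_with_minfree(free_bytes, dump_bytes, minfree_bytes=MBYTE):
    """Return True if saving the dump leaves at least minfree_bytes free.

    free_bytes    -- free space in the savecore directory before saving
    dump_bytes    -- size of the crash dump files to be saved
    minfree_bytes -- required free space after saving (default: one Mbyte)
    """
    if min(free_bytes, dump_bytes, minfree_bytes) < 0:
        raise ValueError("sizes must be non-negative")
    return free_bytes - dump_bytes >= minfree_bytes
```

With the default of one Mbyte, a 900-Mbyte dump fits in a directory with 1000 Mbytes free, but a dump that would leave less than one Mbyte free does not.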

Dump configuration parameters are managed by the dumpadm command.

How the dumpadm Command Works

During system startup, the dumpadm command is invoked by the svc:/system/dumpadm:default service to configure crash dump parameters.

Specifically, dumpadm initializes the dump device and the dump content through the /dev/dump interface.

After the dump configuration is complete, the savecore script looks for the location of the crash dump file directory. Then, savecore is invoked to check for crash dumps and check the content of the minfree file in the crash dump directory.

Dump Devices and Volume Managers

For accessibility and performance reasons, do not configure a dedicated dump device that is under the control of a volume management product such as Solaris Volume Manager. You can keep your swap areas under the control of Solaris Volume Manager, and this is a recommended practice, but keep your dump device separate.