This chapter describes how to manage system crash information in the Oracle Solaris OS.
For information on the procedures associated with managing system crash information, see Managing System Crash Information (Task Map).
This section describes new or changed features for managing system resources in this Oracle Solaris release.
Oracle Solaris 10 9/10: This feature enhancement enables the system to save crash dumps in less time, using less space. A crash dump now completes 2 to 10 times faster, depending on the platform. The amount of disk space that is required to save crash dumps in the savecore directory is reduced by the same factors. To accelerate the creation and compression of the crash dump file, the fast crash dump facility utilizes lightly used CPUs on large systems. A new crash dump file, vmdump.n, is a compressed version of the vmcore.n and unix.n files. Compressed crash dumps can be moved over the network more quickly and then analyzed off-site. Note that the dump file must first be uncompressed before you can use it with tools like the mdb utility. You can uncompress a dump file by using the savecore command, either locally or remotely.
To support the new crash dump facility, the -z option has been added to the dumpadm command. Use this option to specify whether to save dumps in a compressed or an uncompressed format. The default format is compressed.
For more detailed information, see the dumpadm(1M) and the savecore(1M) man pages.
The following task map identifies the procedures needed to manage system crash information.
Task | Description | For Instructions |
---|---|---|
1. Display the current crash dump configuration. | Display the current crash dump configuration by using the dumpadm command. | |
2. Modify the crash dump configuration. | Use the dumpadm command to specify the type of data to dump, whether or not the system will use a dedicated dump device, the directory for saving crash dump files, and the amount of space that must remain available after crash dump files are written. | |
3. Examine a crash dump file. | Use the mdb command to view crash dump files. | |
4. (Optional) Recover from a full crash dump directory. | The system crashes, but no room is available in the savecore directory, and you want to save some critical system crash dump information. | |
5. (Optional) Disable or enable the saving of crash dump files. | Use the dumpadm command to disable or enable the saving of crash dump files. Saving crash dump files is enabled by default. | |
System crashes can occur due to hardware malfunctions, I/O problems, and software errors. If the system crashes, it displays an error message on the console, and then writes a copy of its physical memory to the dump device. The system then reboots automatically. When the system reboots, the savecore command is executed to retrieve the data from the dump device and write the saved crash dump to your savecore directory. The saved crash dump files provide invaluable information to your support provider to aid in diagnosing the problem.
The crash dump information is written in a compressed format to the vmdump.n file, where n is an integer that identifies the crash dump. Afterwards, the savecore command can be invoked on the same system or another system to expand the compressed crash dump to a pair of files that are named unix.n and vmcore.n. The directory in which the crash dump is saved upon reboot can also be configured by using the dumpadm command.
For systems that have a UFS root file system, the default dump device is configured as an appropriate swap partition. Swap partitions are disk partitions that are reserved as virtual memory backing storage for the operating system. Therefore, no permanent information resides in the swap area that is to be overwritten by the crash dump. For systems that have an Oracle Solaris ZFS root file system, dedicated ZFS volumes are used for swap and dump areas. See Oracle Solaris ZFS Support for Swap Area and Dump Devices for more information.
If you select an Oracle Solaris ZFS root file system during an initial software installation, or if you use the Oracle Solaris Live Upgrade to migrate from a UFS root file system to a ZFS root file system, a swap area is created on a ZFS volume in the ZFS root pool. Swap volume size is calculated as half the size of physical memory, but no more than 2 Gbytes and no less than 512 Mbytes. Dump volume size is calculated by the kernel based on dumpadm information and the size of physical memory. You can adjust the sizes of your swap and dump volumes in a JumpStart profile or during an initial installation to sizes of your choosing as long as the new sizes support system operation. For more information, see ZFS Support for Swap and Dump Devices in Oracle Solaris ZFS Administration Guide.
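As a rough illustration of the swap sizing rule described above, the following shell sketch computes the default swap volume size from the amount of physical memory. The function name swap_volume_mb is hypothetical, and the sketch assumes that 1 Gbyte equals 1024 Mbytes and that a POSIX shell such as ksh or bash is used. The exact rounding that the installer applies might differ.

```shell
# Sketch only: swap_volume_mb is a hypothetical helper, not a Solaris command.
# Argument: physical memory in Mbytes. Output: swap volume size in Mbytes.
swap_volume_mb() {
    mem_mb=$1
    size=$((mem_mb / 2))          # half the size of physical memory
    if [ "$size" -gt 2048 ]; then # but no more than 2 Gbytes (assumed 2048 Mbytes)
        size=2048
    fi
    if [ "$size" -lt 512 ]; then  # and no less than 512 Mbytes
        size=512
    fi
    echo "$size"
}
```

For example, swap_volume_mb 2048 prints 1024, while any system with 4 Gbytes of memory or more is capped at 2048.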
If you need to modify your ZFS swap area or dump area after installation, use the swap or dumpadm commands, as in previous releases.
For information about managing dump devices in this document, see Managing System Crash Dump Information.
If a system crash occurs on an x86 based system in the GRUB boot environment, it is possible that the SMF service that manages the GRUB boot archive, svc:/system/boot-archive:default, might fail on the next system reboot. For more information on GRUB based booting, see Booting an x86 Based System by Using GRUB (Task Map) in System Administration Guide: Basic Administration.
The savecore command runs automatically after a system crash to retrieve the crash dump information from the dump device and writes a pair of files called unix.X and vmcore.X, where X identifies the dump sequence number. Together, these files represent the saved system crash dump information.
Crash dump files are sometimes confused with core files, which are images of user applications that are written when the application terminates abnormally.
Crash dump files are saved in a predetermined directory, which, by default, is /var/crash/hostname. In previous releases, crash dump files were overwritten when a system rebooted, unless you manually enabled the system to save the images of physical memory in a crash dump file. Now, the saving of crash dump files is enabled by default.
System crash information is managed with the dumpadm command. For more information, see The dumpadm Command.
You can examine the control structures, active tables, memory images of a live or crashed system kernel, and other information about the operation of the kernel by using the mdb utility. Using mdb to its full potential requires a detailed knowledge of the kernel, and is beyond the scope of this manual. For information on using this utility, see the mdb(1) man page.
Additionally, crash dumps saved by savecore can be useful to send to a customer service representative for analysis of why the system is crashing.
Use the dumpadm command to manage system crash dump information in the Oracle Solaris OS.
The dumpadm command enables you to configure crash dumps of the operating system. The dumpadm configuration parameters include the dump content, dump device, and the directory in which crash dump files are saved.
Dump data is stored in compressed format on the dump device. Kernel crash dump images can be as big as 4 Gbytes or more. Compressing the data means faster dumping and less disk space needed for the dump device.
The saving of crash dump files runs in the background when a dedicated dump device, not the swap area, is part of the dump configuration. This means that a booting system does not wait for the savecore command to complete before going to the next step. On large memory systems, the system can be available before savecore completes.
System crash dump files, generated by the savecore command, are saved by default.
The savecore -L command is a new feature that enables you to get a crash dump of a live, running Oracle Solaris OS. This command is intended for troubleshooting a running system by taking a snapshot of memory during some bad state, such as a transient performance problem or service outage. If the system is up and you can still run some commands, you can execute the savecore -L command to save a snapshot of the system to the dump device, and then immediately write out the crash dump files to your savecore directory. Because the system is still running, you can only use the savecore -L command if you have configured a dedicated dump device.
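The dedicated dump device requirement can be checked before a live dump is attempted. The following sketch assumes that dumpadm marks a dedicated device with (dedicated) on its Dump device line, as in the sample output in this chapter. The helper name has_dedicated_dump_device is hypothetical.

```shell
# Sketch only: has_dedicated_dump_device is a hypothetical helper.
# Reads dumpadm output on standard input and succeeds only if the
# "Dump device" line is marked (dedicated).
has_dedicated_dump_device() {
    grep 'Dump device:.*(dedicated)' >/dev/null
}

# Possible use on a live system, as superuser:
#   if dumpadm | has_dedicated_dump_device; then
#       savecore -L
#   else
#       echo "Configure a dedicated dump device with dumpadm -d first." >&2
#   fi
```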
The following table describes dumpadm's configuration parameters.
Dump Parameter | Description |
---|---|
dump device | The device that stores dump data temporarily as the system crashes. When the dump device is not the swap area, savecore runs in the background, which speeds up the boot process. |
savecore directory | The directory that stores system crash dump files. |
dump content | Type of memory data to dump. |
minimum free space | Minimum amount of free space required in the savecore directory after saving crash dump files. If no minimum free space has been configured, the default is one Mbyte. |
For more information, see dumpadm(1M).
Dump configuration parameters are managed by the dumpadm command.
During system startup, the dumpadm command is invoked by the svc:/system/dumpadm:default service to configure crash dump parameters.
Specifically, dumpadm initializes the dump device and the dump content through the /dev/dump interface.
After the dump configuration is complete, the savecore script looks for the location of the crash dump file directory. Then, savecore is invoked to check for crash dumps and check the content of the minfree file in the crash dump directory.
For accessibility and performance reasons, do not configure a dedicated dump device that is under the control of a volume management product such as Solaris Volume Manager. You can keep your swap areas under the control of Solaris Volume Manager, which is a recommended practice, but keep your dump device separate.
Keep the following key points in mind when you are working with system crash information:
You must be superuser or assume an equivalent role to access and manage system crash information.
Do not disable the option of saving system crash dumps. System crash dump files provide an invaluable way to determine what is causing the system to crash.
Do not remove important system crash information until it has been sent to your customer service representative.
Become superuser or assume an equivalent role.
Roles contain authorizations and privileged commands. For more information about roles, see Configuring RBAC (Task Map) in System Administration Guide: Security Services.
Display the current crash dump configuration.
# dumpadm
      Dump content: kernel pages
       Dump device: /dev/dsk/c0t3d0s1 (swap)
Savecore directory: /var/crash/venus
  Savecore enabled: yes
   Save compressed: on
The preceding example output means:
The dump content is kernel memory pages.
Kernel memory will be dumped on a swap device, /dev/dsk/c0t3d0s1. You can identify all your swap areas with the swap -l command.
System crash dump files will be written in the /var/crash/venus directory.
Saving crash dump files is enabled.
Crash dumps are saved in compressed format.
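If you need one of these values in a script, a small filter can extract it from the dumpadm output. The following is a minimal sketch that assumes the "label: value" layout shown in the preceding example. The function name dumpadm_field is hypothetical, and on Oracle Solaris 10 you might need nawk or /usr/xpg4/bin/awk in place of awk for the -v option.

```shell
# Sketch only: dumpadm_field is a hypothetical helper.
# Reads dumpadm output on standard input and prints the value that
# follows the given label, for example "Dump device".
dumpadm_field() {
    awk -v label="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)          # drop leading alignment spaces
            prefix = label ": "
            if (index(line, prefix) == 1) {   # line starts with "label: "
                print substr(line, length(prefix) + 1)
                exit
            }
        }'
}

# Possible use on a live system:
#   dumpadm | dumpadm_field "Savecore directory"
```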
Become superuser or assume an equivalent role.
Roles contain authorizations and privileged commands. For more information about roles, see Configuring RBAC (Task Map) in System Administration Guide: Security Services.
Identify the current crash dump configuration.
# dumpadm
      Dump content: kernel pages
       Dump device: /dev/dsk/c0t3d0s1 (swap)
Savecore directory: /var/crash/pluto
  Savecore enabled: yes
   Save compressed: on
This output identifies the default dump configuration for a system running the Oracle Solaris 10 release.
Modify the crash dump configuration.
# /usr/sbin/dumpadm [-nuy] [-c content-type] [-d dump-device] [-m mink | minm | min%] [-s savecore-dir] [-r root-dir] [-z on | off]
-c content-type
Specifies the type of data to dump. Use kernel to dump all of kernel memory, all to dump all of memory, or curproc to dump kernel memory and the memory pages of the process whose thread was executing when the crash occurred. The default dump content is kernel memory.
-d dump-device
Specifies the device that stores dump data temporarily as the system crashes. The primary swap device is the default dump device.
-m mink | minm | min%
Specifies the minimum free disk space for saving crash dump files by creating a minfree file in the current savecore directory. This parameter can be specified in Kbytes (nnnk), Mbytes (nnnm), or file system size percentage (nnn%). The savecore command consults this file prior to writing the crash dump files. If writing the crash dump files, based on their size, would decrease the amount of free space below the minfree threshold, the dump files are not written and an error message is logged. For information on recovering from this scenario, see How to Recover From a Full Crash Dump Directory (Optional).
-n
Specifies that savecore should not be run when the system reboots. This dump configuration is not recommended. If system crash information is written to the swap device, and savecore is not enabled, the crash dump information is overwritten when the system begins to swap.
-s savecore-dir
Specifies an alternate directory for storing crash dump files. The default directory is /var/crash/hostname, where hostname is the output of the uname -n command.
-u
Forcibly updates the kernel dump configuration based on the contents of the /etc/dumpadm.conf file.
-y
Modifies the dump configuration to automatically execute the savecore command upon reboot, which is the default for this dump setting.
-z on | off
Modifies the dump configuration to control the operation of the savecore command upon reboot. The on setting enables the saving of the crash dump file in a compressed format. The off setting automatically uncompresses the crash dump file. Because crash dump files can be extremely large and therefore require less file system space if they are saved in a compressed format, the default is on.
In this example, all of memory is dumped to the dedicated dump device, /dev/dsk/c0t1d0s1, and the minimum free space that must be available after the crash dump files are saved is 10% of the file system space.
# dumpadm
      Dump content: kernel pages
       Dump device: /dev/dsk/c0t3d0s1 (swap)
Savecore directory: /var/crash/pluto
  Savecore enabled: yes
   Save compressed: on
# dumpadm -c all -d /dev/dsk/c0t1d0s1 -m 10%
      Dump content: all pages
       Dump device: /dev/dsk/c0t1d0s1 (dedicated)
Savecore directory: /var/crash/pluto (minfree = 77071KB)
  Savecore enabled: yes
   Save compressed: on
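The three forms that the -m option accepts (nnnk, nnnm, and nnn%), and the free space check that savecore performs against the minfree threshold, can be sketched as follows. The function names minfree_kb and dump_fits are hypothetical, the 770710-Kbyte file system size in the usage note is an assumption chosen so that 10% matches the minfree value in the example, and the actual rounding used by dumpadm might differ. The sketch assumes a POSIX shell such as ksh or bash.

```shell
# Sketch only: minfree_kb and dump_fits are hypothetical helpers.

# minfree_kb SPEC FS_KB
# Converts a -m value (nnnk, nnnm, or nnn%) to Kbytes.
# FS_KB is the file system size in Kbytes, used only for the % form.
minfree_kb() {
    spec=$1
    fs_kb=$2
    num=${spec%?}                           # strip the trailing unit character
    case $spec in
        *k) echo "$num" ;;                  # already in Kbytes
        *m) echo $((num * 1024)) ;;         # Mbytes to Kbytes
        *%) echo $((fs_kb * num / 100)) ;;  # percentage of file system size
        *)  echo "unrecognized minfree value: $spec" >&2
            return 1 ;;
    esac
}

# dump_fits FREE_KB DUMP_KB MIN_KB
# Succeeds if writing a dump of DUMP_KB leaves at least MIN_KB free.
dump_fits() {
    free=$1
    dump=$2
    min=$3
    [ $((free - dump)) -ge "$min" ]
}
```

Under these assumptions, minfree_kb 10% 770710 prints 77071, and dump_fits 150000 100000 77071 fails because only 50000 Kbytes would remain free.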
Become superuser or assume an equivalent role.
Roles contain authorizations and privileged commands. For more information about roles, see Configuring RBAC (Task Map) in System Administration Guide: Security Services.
Examine a crash dump by using the mdb utility.
# /usr/bin/mdb [-k] crashdump-file
-k
Specifies kernel debugging mode by assuming the file is an operating system crash dump file.
crashdump-file
Specifies the operating system crash dump file.
Display crash status information.
# /usr/bin/mdb file-name
> ::status
.
.
.
> ::system
.
.
.
The following example shows sample output from the mdb utility, which includes system information and identifies the tunables that are set in this system's /etc/system file.
# /usr/bin/mdb -k unix.0
Loading modules: [ unix krtld genunix ip nfs ipc ptm ]
> ::status
debugging crash dump /dev/mem (64-bit) from ozlo
operating system: 5.10 Generic (sun4u)
> ::system
set ufs_ninode=0x9c40 [0t40000]
set ncsize=0x4e20 [0t20000]
set pt_cnt=0x400 [0t1024]
In this scenario, the system crashes but no room is left in the savecore directory, and you want to save some critical system crash dump information.
After the system reboots, log in as superuser or assume an equivalent role.
Clear out the savecore directory, usually /var/crash/hostname, by removing existing crash dump files that have already been sent to your service provider.
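Before you remove anything, it can help to see which crash dump files are present and how much space each one uses. The following sketch lists files that follow the unix.n, vmcore.n, and vmdump.n naming described in this chapter. The function name list_crash_dumps is hypothetical, and the sketch only reports files; it does not delete them.

```shell
# Sketch only: list_crash_dumps is a hypothetical helper.
# Prints the size in Kbytes and the path of each crash dump file
# (unix.n, vmcore.n, vmdump.n) found in the given savecore directory.
list_crash_dumps() {
    dir=$1
    for f in "$dir"/unix.[0-9]* "$dir"/vmcore.[0-9]* "$dir"/vmdump.[0-9]*; do
        if [ -f "$f" ]; then      # skip patterns that matched nothing
            du -k "$f"
        fi
    done
}

# Possible use on a live system:
#   list_crash_dumps /var/crash/`uname -n`
```

Remove only those files that have already been sent to your service provider.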
Become superuser or assume an equivalent role.
Roles contain authorizations and privileged commands. For more information about roles, see Configuring RBAC (Task Map) in System Administration Guide: Security Services.
Disable or enable the saving of crash dumps on your system.
# dumpadm -n | -y
This example illustrates how to disable the saving of crash dumps on your system.
# dumpadm -n
      Dump content: all pages
       Dump device: /dev/dsk/c0t1d0s1 (dedicated)
Savecore directory: /var/crash/pluto (minfree = 77071KB)
  Savecore enabled: no
   Save compressed: on
This example illustrates how to enable the saving of crash dumps on your system.
# dumpadm -y
      Dump content: all pages
       Dump device: /dev/dsk/c0t1d0s1 (dedicated)
Savecore directory: /var/crash/pluto (minfree = 77071KB)
  Savecore enabled: yes
   Save compressed: on