System Administration Guide: Advanced Administration Oracle Solaris 10 8/11 Information Library |
17. Managing System Crash Information (Tasks)
What's New in Managing System Crash Information
Managing System Crash Information (Task Map)
Oracle Solaris ZFS Support for Swap Area and Dump Devices
Managing System Crash Dump Information
How to Display the Current Crash Dump Configuration
How to Modify a Crash Dump Configuration
How to Recover From a Full Crash Dump Directory (Optional)
How to Disable or Enable Saving Crash Dumps
System crashes can occur due to hardware malfunctions, I/O problems, and software errors. If the system crashes, it will display an error message on the console, and then write a copy of its physical memory to the dump device. The system will then reboot automatically. When the system reboots, the savecore command is executed to retrieve the data from the dump device and write the saved crash dump to your savecore directory. The saved crash dump files provide invaluable information to your support provider to aid in diagnosing the problem.
The crash dump information is written in a compressed format to the vmdump.n file, where n is an integer that identifies the crash dump. Afterwards, the savecore command can be invoked on the same system or another system to expand the compressed crash dump to a pair of files that are named unix.n and vmcore.n. The directory in which the crash dump is saved upon reboot can also be configured by using the dumpadm command.
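As a minimal sketch of the expansion step described above, the following function runs savecore with the -f option against a compressed dump file. The function name expand_dump and the example directory are hypothetical, and a POSIX-style shell such as ksh or bash is assumed.

```shell
# Expand a compressed crash dump (vmdump.N) into unix.N and vmcore.N.
# Usage: expand_dump <savecore-directory> <N>
# expand_dump is a hypothetical helper name, not part of Oracle Solaris.
expand_dump() {
    dir=$1
    n=$2
    if [ ! -f "$dir/vmdump.$n" ]; then
        echo "no compressed dump at $dir/vmdump.$n" >&2
        return 1
    fi
    # -f reads the dump from a file instead of from the dump device;
    # the trailing argument is the directory that receives the output.
    savecore -f "$dir/vmdump.$n" "$dir"
}

# Example (as root, assuming dump number 0 in the default directory):
#   expand_dump /var/crash/`hostname` 0
```

The file check comes first so that a mistyped dump number fails with a clear message before savecore is invoked.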
For systems that have a UFS root file system, the default dump device is configured as an appropriate swap partition. Swap partitions are disk partitions that are reserved as virtual memory backing storage for the operating system. Therefore, no permanent information resides in the swap area that is to be overwritten by the crash dump. For systems that have an Oracle Solaris ZFS root file system, dedicated ZFS volumes are used for swap and dump areas. See Oracle Solaris ZFS Support for Swap Area and Dump Devices for more information.
If you install an Oracle Solaris ZFS root file system or if you use the Oracle Solaris Live Upgrade program to migrate from a UFS root file system to a ZFS root file system, swap and dump devices are created on two ZFS volumes. For example, with a default root pool name, rpool, the /rpool/swap and /rpool/dump volumes are created automatically. You can adjust the sizes of your swap and dump volumes, as long as the new sizes support system operation. For more information, see ZFS Support for Swap and Dump Devices in Oracle Solaris ZFS Administration Guide.
If you need to modify your ZFS swap device or dump device after installation, use the swap or dumpadm commands, as in previous releases.
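The following sketch shows one way to inspect and change the size of the ZFS dump volume through the volsize property. It assumes the default pool name rpool, so the dataset is rpool/dump (dataset names are written without a leading slash). The function name set_dump_volsize and the 2G size are illustrative only; choose a size that suits your system's memory.

```shell
# Show, then change, the size of the dump volume in a ZFS root pool.
# Usage: set_dump_volsize <pool> <size>
# set_dump_volsize is a hypothetical helper name.
set_dump_volsize() {
    if [ $# -ne 2 ]; then
        echo "usage: set_dump_volsize <pool> <size>" >&2
        return 2
    fi
    pool=$1
    size=$2
    zfs get volsize "$pool/dump"           # display the current size
    zfs set volsize="$size" "$pool/dump"   # apply the new size
}

# Example (as root, size chosen for illustration only):
#   set_dump_volsize rpool 2G
```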
For information about managing dump devices in this document, see Managing System Crash Dump Information.
If a system crash occurs on an x86 based system in the GRUB boot environment, it is possible that the SMF service that manages the GRUB boot archive, svc:/system/boot-archive:default, might fail on the next system reboot. For more information on GRUB based booting, see Booting an x86 Based System by Using GRUB (Task Map) in System Administration Guide: Basic Administration.
The savecore command runs automatically after a system crash to retrieve the crash dump information from the dump device and writes a pair of files called unix.X and vmcore.X, where X identifies the dump sequence number. Together, these files represent the saved system crash dump information.
Crash dump files are sometimes confused with core files, which are images of user applications that are written when the application terminates abnormally.
Crash dump files are saved in a predetermined directory, which, by default, is /var/crash/hostname. In previous releases, crash dump files were overwritten when a system rebooted, unless you manually enabled the system to save the images of physical memory in a crash dump file. Now, the saving of crash dump files is enabled by default.
System crash information is managed with the dumpadm command. For more information, see The dumpadm Command.
You can examine the control structures, active tables, memory images of a live or crashed system kernel, and other information about the operation of the kernel by using the mdb utility. Using mdb to its full potential requires a detailed knowledge of the kernel, and is beyond the scope of this manual. For information on using this utility, see the mdb(1) man page.
Additionally, you can send crash dumps saved by savecore to a customer service representative, who can analyze why the system is crashing.
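For a first look at a saved crash dump with mdb, a sketch along the following lines might be enough before the files are sent to support. It pipes two mdb dcmds, ::status and ::msgbuf, into mdb for an expanded unix.N and vmcore.N pair. The function name dump_summary is hypothetical, and the files are assumed to be in the savecore directory.

```shell
# Print a short summary of a saved crash dump by using mdb.
# Usage: dump_summary <savecore-directory> <N>
# dump_summary is a hypothetical helper name.
dump_summary() {
    dir=$1
    n=$2
    if [ ! -f "$dir/unix.$n" ] || [ ! -f "$dir/vmcore.$n" ]; then
        echo "missing unix.$n or vmcore.$n in $dir" >&2
        return 1
    fi
    # ::status summarizes the dump, including the panic message;
    # ::msgbuf prints the kernel message buffer.
    printf '::status\n::msgbuf\n' | mdb "$dir/unix.$n" "$dir/vmcore.$n"
}

# Example (as root, assuming dump number 0):
#   dump_summary /var/crash/`hostname` 0
```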
Use the dumpadm command to manage system crash dump information in the Oracle Solaris OS.
The dumpadm command enables you to configure crash dumps of the operating system. The dumpadm configuration parameters include the dump content, dump device, and the directory in which crash dump files are saved.
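The three parameters named above map onto the dumpadm options -c (dump content), -d (dump device), and -s (savecore directory). The sketch below wraps them in a hypothetical configure_dump function. The example values are assumptions, and the content check covers the three values that dumpadm accepts: kernel, all, and curproc.

```shell
# Set dump content, dump device, and savecore directory in one call.
# Usage: configure_dump <content> <device> <savecore-directory>
# configure_dump is a hypothetical helper name.
configure_dump() {
    content=$1
    device=$2
    dir=$3
    case $content in
        kernel|all|curproc) ;;
        *) echo "content must be kernel, all, or curproc" >&2
           return 1 ;;
    esac
    dumpadm -c "$content" -d "$device" -s "$dir"
}

# Example (as root): dump kernel pages to the swap device and
# save them in the default directory.
#   configure_dump kernel swap /var/crash/`hostname`
```

Running dumpadm with no arguments afterward prints the resulting configuration.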
Dump data is stored in compressed format on the dump device. Kernel crash dump images can be as big as 4 Gbytes or more. Compressing the data means faster dumping and less disk space needed for the dump device.
The saving of crash dump files runs in the background when a dedicated dump device, not the swap area, is part of the dump configuration. This means that a booting system does not wait for the savecore command to complete before going to the next step. On large memory systems, the system can be available before savecore completes.
System crash dump files, generated by the savecore command, are saved by default.
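If you need to change this default, dumpadm provides the -n option to disable the saving of crash dumps on reboot and the -y option to enable it again. The wrapper name set_savecore below is hypothetical; it is only a sketch of how the two options might be used.

```shell
# Enable or disable the automatic saving of crash dumps on reboot.
# Usage: set_savecore on|off
# set_savecore is a hypothetical helper name.
set_savecore() {
    case $1 in
        on)  dumpadm -y ;;   # run savecore automatically on reboot
        off) dumpadm -n ;;   # do not run savecore on reboot
        *)   echo "usage: set_savecore on|off" >&2
             return 2 ;;
    esac
}

# Example (as root):
#   set_savecore off
```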
The savecore -L command is a new feature that enables you to get a crash dump of a live, running Oracle Solaris system. This command is intended for troubleshooting a running system by taking a snapshot of memory during some bad state, such as a transient performance problem or service outage. If the system is up and you can still run some commands, you can execute the savecore -L command to save a snapshot of the system to the dump device, and then immediately write out the crash dump files to your savecore directory. Because the system is still running, you can use the savecore -L command only if you have configured a dedicated dump device.
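Because savecore -L depends on a dedicated dump device, a script might check the dumpadm output before it attempts a live dump. The sketch below assumes that dumpadm labels the device with "(dedicated)" on its "Dump device:" line, as in typical dumpadm output. Both function names are hypothetical.

```shell
# Succeed if dumpadm output (read from standard input) reports a
# dedicated dump device. Assumes the "(dedicated)" label that dumpadm
# prints on its "Dump device:" line.
has_dedicated_dump_device() {
    grep 'Dump device:' | grep '(dedicated)' >/dev/null
}

# Take a live crash dump only when a dedicated dump device is configured.
live_dump() {
    if dumpadm | has_dedicated_dump_device; then
        savecore -L
    else
        echo "savecore -L needs a dedicated dump device" >&2
        return 1
    fi
}

# Example (as root):
#   live_dump
```

Keeping the check in a separate function that reads standard input lets you exercise it against captured dumpadm output without touching the dump configuration.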
The following table describes dumpadm's configuration parameters.
For more information, see dumpadm(1M).
Dump configuration parameters are managed by the dumpadm command.
During system startup, the dumpadm command is invoked by the svc:/system/dumpadm:default service to configure crash dump parameters.
Specifically, dumpadm initializes the dump device and the dump content through the /dev/dump interface.
After the dump configuration is complete, the savecore script looks for the location of the crash dump file directory. Then, savecore is invoked to check for crash dumps and check the content of the minfree file in the crash dump directory.
For accessibility and performance reasons, do not configure a dedicated dump device that is under the control of a volume management product, such as Solaris Volume Manager. Keeping your swap areas under the control of Solaris Volume Manager is a recommended practice, but keep your dump device separate.