This section shows you how to obtain a sample crash dump, and how to invoke MDB in order to examine it.
The kernel memory allocator contains many advanced debugging features, but these are not enabled by default because they can cause performance degradation. To follow the examples in this guide, you should turn on these features. Enable them only on a test system, as they can degrade performance or expose latent problems.
The allocator's debugging functionality is controlled by the kmem_flags tunable. To get started, make sure kmem_flags is set properly:
# mdb -k
> kmem_flags/X
kmem_flags:
kmem_flags:     f
If kmem_flags is not set to f, you should add the following line to the /etc/system file:
set kmem_flags=0xf
Then reboot the system. When the system reboots, confirm that kmem_flags is set to f. Remember to remove your /etc/system modifications before returning this system to production use.
The next step is to make sure crash dumps are properly configured. First, confirm that dumpadm is configured to save kernel crash dumps and that savecore is enabled. See the dumpadm(1M) man page for more information on crash dump parameters.
# dumpadm
Dump content      : kernel with ZFS metadata
Dump device       : /dev/zvol/dsk/rpool/dump (dedicated)
Savecore directory: /var/crash
Savecore enabled  : yes
Save compressed   : on
Starting with the Oracle Solaris 11.2 release, dump content is organized into sections. By default, dump content includes the following two sections:
Core kernel pages
ZFS metadata pages
You can optionally disable dumping of ZFS metadata pages by running the dumpadm command as shown in the following example:
# dumpadm -c kernel-zfs
Dump content      : kernel without ZFS metadata
Dump device       : /dev/zvol/dsk/rpool/dump (dedicated)
Savecore directory: /var/crash
Savecore enabled  : yes
Save compressed   : on
See the dumpadm(1M) man page for further details about other optional components of dump content that can be configured. The rest of this example assumes the default dump content configuration, that is, dump content that includes both core kernel pages and ZFS metadata pages.
Next, reboot the system using the -d flag to reboot(1M), which forces the kernel to panic and save a crash dump.
# reboot -d
Oct 15 12:54:30 testsystem reboot: initiated by jack on /dev/console
updating /platform/sun4v/boot_archive

panic[cpu101]/thread=4c078b08f80: forced crash dump initiated at user request

000002a10a3b7930 genunix:kadmin+600 (fc, 0, 10, 4, 5, 1)
  %l0-3: 00000000012ec6f8 00000000012ec400 0000000000000004 0000000000000004
  %l4-7: 00000000000005cc 0000000000000010 0000000000000004 0000000000000004
000002a10a3b7a00 genunix:uadmin+1d0 (1, 4c07a1b5088, 0, 6d7000, ff00, 5)
  %l0-3: 0000040000923280 000000000003787c 0000000000000004 000000000003787c
  %l4-7: 000000000003787b 0000000000000000 0000000000000000 0000000000000000

syncing file systems... done
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel sections: zfs
 0:10  96% done (kernel)
 0:11 100% done (zfs)
100% done: 404632 (kernel) + 14302 (zfs) pages dumped, dump succeeded
rebooting...
Resetting...
When the system reboots, savecore runs automatically to preserve each section of the crash dump in a separate file. In this example, we have two sections in the dump: core kernel and ZFS metadata. Therefore, two compressed dump files are produced when savecore finishes. The two files are vmdump.n and vmdump-zfs.n.
When finished, a message similar to the following is displayed on the system console:
Oct 15 12:57:42 testsystem savecore: Decompress all crash dump files with
'(cd /var/crash/data/cbc9822c-2f13-63c6-d440-d2f118516775 && savecore -v 0)'
or individual files with
'savecore -vf /var/crash/data/cbc9822c-2f13-63c6-d440-d2f118516775/vmdump{,-<secname>}.0'
If the message does not appear immediately, check if savecore(1M) is still running.
# pgrep savecore
864
# cd /var/crash/
# ls
0  bounds  data
# ls -l 0
lrwxrwxrwx 1 root root 41 Oct 15 12:57 0 -> data/cbc9822c-2f13-63c6-d440-d2f118516775
# ls data/cbc9822c-2f13-63c6-d440-d2f118516775
vmdump-zfs.0  vmdump.0
#
savecore performs the following two tasks:
Creates a subdirectory named data/uuid under the configured save directory and produces the dump files in that subdirectory. In the above example, cbc9822c-2f13-63c6-d440-d2f118516775 is the uuid of the operating system image for which the crash dump was generated.
Creates a symbolic link to the data/uuid directory, using a numerical suffix as the link name. In the above example, 0 is the name of the symbolic link created by savecore.
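The resulting layout can be sketched with ordinary shell commands. This is an illustration only, not what savecore itself runs: the temporary directory stands in for the save directory (/var/crash), and the uuid is copied from the example above.

```shell
# Sketch of the directory layout savecore produces (illustrative only;
# a real savecore derives the uuid from the running OS image).
savedir=$(mktemp -d)                   # stands in for /var/crash
uuid=cbc9822c-2f13-63c6-d440-d2f118516775

mkdir -p "$savedir/data/$uuid"         # task 1: data/uuid subdirectory
touch "$savedir/data/$uuid/vmdump.0" "$savedir/data/$uuid/vmdump-zfs.0"

ln -s "data/$uuid" "$savedir/0"        # task 2: numeric symbolic link
ls "$savedir/0/"                       # dump files are reachable via the link
```

Because the link target is relative, the whole save directory can be moved without breaking the 0 -> data/uuid mapping.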
If your dump directory contains multiple crash dumps, the ones you just created are those with the most recent modification time, in one of the following file formats:
vmcore.n
vmcore-<section>.n
vmdump.n
vmdump-<section>.n
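The four naming patterns above can be matched from the shell. The helper below is a hypothetical convenience, not part of savecore; it simply lists matching files, newest first, so the most recent dump set appears at the top.

```shell
# dump_files: print the files in a directory that match the crash dump
# naming patterns vmcore.n, vmcore-<section>.n, vmdump.n, and
# vmdump-<section>.n, newest first. (Hypothetical helper, illustration only.)
dump_files() {
  ls -t "$1" 2>/dev/null | grep -E '^vm(core|dump)(-[A-Za-z0-9_]+)?\.[0-9]+$'
}
```

For example, `dump_files /var/crash/0` would list the dump files shown in the transcripts above while skipping unrelated entries such as bounds.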
When the system panics, or when you enter reboot -d, messages similar to the following are displayed on the system console:
Oct 15 12:57:42 testsystem savecore: Decompress all crash dump files with
'(cd /var/crash/data/cbc9822c-2f13-63c6-d440-d2f118516775 && savecore -v 0)'
or individual files with
'savecore -vf /var/crash/data/cbc9822c-2f13-63c6-d440-d2f118516775/vmdump{,-<secname>}.0'
Enter the following command to decompress all the compressed dump files:
root@testsystem# (cd /var/crash/data/cbc9822c-2f13-63c6-d440-d2f118516775 && savecore -v 0)
savecore: System dump time: Tue Oct 15 12:54:49 2013
savecore: saving system crash dump in /var/crash/data/cbc9822c-2f13-63c6-d440-d2f118516775/vmcore.0
Constructing corefile /var/crash/data/cbc9822c-2f13-63c6-d440-d2f118516775/vmcore.0
 0:31 100% done: 404632 of 404632 pages saved
119246 (29%) zero pages were not written
dump decompression took 0 minutes and 31 seconds
savecore: saving system crash dump in /var/crash/data/cbc9822c-2f13-63c6-d440-d2f118516775/vmcore-zfs.0
Constructing corefile /var/crash/data/cbc9822c-2f13-63c6-d440-d2f118516775/vmcore-zfs.0
 0:00 100% done: 14302 of 14302 pages saved
82 (0%) zero pages were not written
dump decompression took 307.711 milliseconds
Starting with the Oracle Solaris 11.2 release, the kernel symbol table file unix.n is not created during the decompression of a compressed dump file. The required symbol table is already embedded in the vmcore.n file, so the unix.n file is not needed for loading the crash dump with mdb.
Now you can use mdb.
root@testsystem# cd /var/crash/0
root@testsystem# ls
vmcore-zfs.0  vmcore.0  vmdump-zfs.0  vmdump.0
root@testsystem# mdb 0
Loading modules: [ unix genunix specfs dtrace zfs scsi_vhci sd mpt mac px ldc ds ip hook neti arp usba kssl fctl random sockfs idm cpc crypto fcip ufs logindmux ptm sppp nfs ]
>
You can copy the vmdump*.n file to another system for analysis, and you can run savecore either locally or remotely to uncompress the dump file. Use the dumpadm command to control the dump content, the path of the dump device, and the savecore directory.
You can use the file command to examine files in the directory.
root@testsystem# pwd
/var/crash/0
root@testsystem# file *
vmcore-zfs.0: SunOS 5.11 11.2 64-bit SPARC crash dump from 'testsystem'
vmcore.0:     SunOS 5.11 11.2 64-bit SPARC crash dump from 'testsystem'
vmdump-zfs.0: SunOS 5.11 11.2 64-bit SPARC compressed crash dump from 'testsystem'
vmdump.0:     SunOS 5.11 11.2 64-bit SPARC compressed crash dump from 'testsystem'
Now, run mdb on the crash dump you created, and check its status. To load all vmcore*.n files using mdb, you need to provide the suffix n as an argument to mdb.
root@testsystem# pwd
/var/crash/0
root@testsystem# mdb 0
Loading modules: [ unix genunix specfs dtrace zfs scsi_vhci sd mpt mac px ldc ds ip hook neti arp usba kssl fctl random sockfs idm cpc crypto fcip ufs logindmux ptm sppp nfs ]
> ::status
debugging crash dump vmcore.0 (64-bit) from testsystem
operating system: 5.11 11.2 (sun4v)
usr/src version: 19659:c7a2c30fcc60:0.175.2.0.0.24.0:on11u2_24+3
usr/closed version: 1797:4b89b1471513:0.175.2.0.0.24.0:on11u2_24+2
image uuid: cbc9822c-2f13-63c6-d440-d2f118516775
panic message: forced crash dump initiated at user request
complete: yes, all pages present as configured
dump content: kernel [LOADED,UNVERIFIED] (core kernel pages)
              zfs [LOADED,UNVERIFIED] (ZFS metadata (ZIO buffers))
panicking PID: 3667 (not dumped)
>
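Because mdb 0 selects the dump set by its numeric suffix, it can be handy to see which suffixes a dump directory actually offers. The helper below is a hypothetical sketch, not an mdb or savecore feature:

```shell
# dump_suffixes: list the numeric suffixes n for which a vmcore.n exists
# in the given directory, one per line, ascending. (Illustrative sketch;
# each printed n is a candidate argument for "mdb n" in that directory.)
dump_suffixes() {
  ls "$1" 2>/dev/null | sed -n 's/^vmcore\.\([0-9][0-9]*\)$/\1/p' | sort -un
}
```

Section files such as vmcore-zfs.n are deliberately ignored; mdb locates them itself once the suffix is chosen.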
If you want to load the vmcore.n crash dump using mdb, explicitly specify the file name as an argument to mdb.
root@testsystem# mdb vmcore.0
Loading modules: [ unix genunix specfs dtrace zfs scsi_vhci sd mpt mac px ldc ds ip hook neti arp usba kssl fctl random sockfs idm cpc crypto fcip ufs logindmux ptm sppp nfs ]
> ::status
debugging crash dump vmcore.0 (64-bit) from testsystem
operating system: 5.11 11.2 (sun4v)
usr/src version: 19659:c7a2c30fcc60:0.175.2.0.0.24.0:on11u2_24+3
usr/closed version: 1797:4b89b1471513:0.175.2.0.0.24.0:on11u2_24+2
image uuid: cbc9822c-2f13-63c6-d440-d2f118516775
panic message: forced crash dump initiated at user request
complete: yes, all pages present as configured
dump content: kernel [LOADED,UNVERIFIED] (core kernel pages)
              zfs [MISSING] (ZFS metadata (ZIO buffers))
panicking PID: 3667 (not dumped)
>
You cannot load only vmcore-zfs.n using mdb; vmcore.n is mandatory. Thus, the following invocation fails:
root@testsystem# mdb vmcore-zfs.0
mdb: vmcore-zfs.0 doesn't contain core kernel pages, ./vmcore.0 expected
mdb: failed to initialize target: Error 0
root@testsystem#