This section shows you how to obtain a sample crash dump, and how to invoke MDB in order to examine it.
The kernel memory allocator contains many advanced debugging features, but these are not enabled by default because they can cause performance degradation. To follow the examples in this guide, you should turn on these features. Enable them only on a test system, as they can degrade performance or expose latent problems.
The allocator's debugging functionality is controlled by the kmem_flags tunable. To get started, make sure kmem_flags is set properly:
# mdb -k
> kmem_flags/X
kmem_flags:
kmem_flags:     f
If kmem_flags is not set to f, you should add the following line to the /etc/system file:
set kmem_flags=0xf
Then reboot the system. When the system reboots, confirm that kmem_flags is set to f. Remember to remove your /etc/system modifications before returning this system to production use.
The next step is to make sure crash dumps are properly configured. First, confirm that dumpadm is configured to save kernel crash dumps and that savecore is enabled. See the dumpadm(1M) man page for more information on crash dump parameters.
# dumpadm
Dump content      : kernel with ZFS metadata
Dump device       : /dev/zvol/dsk/rpool/dump (dedicated)
Savecore directory: /var/crash
Savecore enabled  : yes
Save compressed   : on
Starting with the Oracle Solaris 11.2 release, dump content is organized into sections. By default, dump content includes the following two sections:
Core kernel pages
ZFS metadata pages
You can optionally disable dumping of ZFS metadata pages by running the dumpadm command as shown in the following example:
# dumpadm -c kernel-zfs
Dump content      : kernel without ZFS metadata
Dump device       : /dev/zvol/dsk/rpool/dump (dedicated)
Savecore directory: /var/crash
Savecore enabled  : yes
Save compressed   : on
See the dumpadm(1M) man page for further details about other optional components of dump content that can be configured. The rest of this example assumes the default dump content configuration, that is, dump content that includes both core kernel pages and ZFS metadata pages.
Next, reboot the system using the -d flag to reboot(1M), which forces the kernel to panic and save a crash dump.
# reboot -d
Oct 15 12:54:30 testsystem reboot: initiated by jack on /dev/console
updating /platform/sun4v/boot_archive

panic[cpu101]/thread=4c078b08f80: forced crash dump initiated at user request

000002a10a3b7930 genunix:kadmin+600 (fc, 0, 10, 4, 5, 1)
  %l0-3: 00000000012ec6f8 00000000012ec400 0000000000000004 0000000000000004
  %l4-7: 00000000000005cc 0000000000000010 0000000000000004 0000000000000004
000002a10a3b7a00 genunix:uadmin+1d0 (1, 4c07a1b5088, 0, 6d7000, ff00, 5)
  %l0-3: 0000040000923280 000000000003787c 0000000000000004 000000000003787c
  %l4-7: 000000000003787b 0000000000000000 0000000000000000 0000000000000000

syncing file systems... done
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel sections: zfs
 0:10  96% done (kernel)
 0:11 100% done (zfs)
100% done: 404632 (kernel) + 14302 (zfs) pages dumped, dump succeeded
rebooting...
Resetting...
When the system reboots, savecore runs automatically to preserve each section of the crash dump in a separate file. In this example, we have two sections in the dump: core kernel and ZFS metadata. Therefore, two compressed dump files are produced when savecore finishes. The two files are vmdump.n and vmdump-zfs.n.
When finished, a message similar to the following is displayed on the system console:
Oct 15 12:57:42 testsystem savecore: Decompress all crash dump files with
'(cd /var/crash/data/cbc9822c-2f13-63c6-d440-d2f118516775 && savecore -v 0)'
or individual files with
'savecore -vf /var/crash/data/cbc9822c-2f13-63c6-d440-d2f118516775/vmdump{,-<secname>}.0'
If the message does not appear immediately, check if savecore(1M) is still running.
# pgrep savecore
864
# cd /var/crash/
# ls
0  bounds  data
# ls -l 0
lrwxrwxrwx 1 root root 41 Oct 15 12:57 0 -> data/cbc9822c-2f13-63c6-d440-d2f118516775
# ls data/cbc9822c-2f13-63c6-d440-d2f118516775
vmdump-zfs.0  vmdump.0
#
savecore performs the following two tasks:
Creates a subdirectory named data/uuid under the configured save directory and produces the dump files in that subdirectory. In the above example, cbc9822c-2f13-63c6-d440-d2f118516775 is the uuid of the operating system image for which the crash dump was generated.
Creates a symbolic link to the data/uuid directory, using a numerical suffix as the link name. In the above example, 0 is the name of the symbolic link created by savecore.
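The resulting layout can be sketched with ordinary shell commands. This is an illustration only, not what savecore itself runs: the temporary directory stands in for the save directory (/var/crash), and the uuid is copied from the example above.

```shell
# Sketch of the directory layout savecore produces (illustrative only;
# a real savecore derives the uuid from the running OS image).
savedir=$(mktemp -d)                   # stands in for /var/crash
uuid=cbc9822c-2f13-63c6-d440-d2f118516775

mkdir -p "$savedir/data/$uuid"         # task 1: data/uuid subdirectory
touch "$savedir/data/$uuid/vmdump.0" "$savedir/data/$uuid/vmdump-zfs.0"

ln -s "data/$uuid" "$savedir/0"        # task 2: numeric symbolic link
ls "$savedir/0/"                       # dump files are reachable via the link
```

Because the link target is relative, the whole save directory can be moved without breaking the 0 -> data/uuid mapping.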
If your dump directory contains multiple crash dumps, the ones you just created are those with the most recent modification time, in one of the following file formats:
vmcore.n
vmcore-<section>.n
vmdump.n
vmdump-<section>.n
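The four naming patterns above can be matched from the shell. The helper below is a hypothetical convenience, not part of savecore; it simply lists matching files, newest first, so the most recent dump set appears at the top.

```shell
# dump_files: print the files in a directory that match the crash dump
# naming patterns vmcore.n, vmcore-<section>.n, vmdump.n, and
# vmdump-<section>.n, newest first. (Hypothetical helper, illustration only.)
dump_files() {
  ls -t "$1" 2>/dev/null | grep -E '^vm(core|dump)(-[A-Za-z0-9_]+)?\.[0-9]+$'
}
```

For example, `dump_files /var/crash/0` would list the dump files shown in the transcripts above while skipping unrelated entries such as bounds.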
When the system panics, or when you enter reboot -d, messages similar to the following are displayed on the system console:
Oct 15 12:57:42 testsystem savecore: Decompress all crash dump files with
'(cd /var/crash/data/cbc9822c-2f13-63c6-d440-d2f118516775 && savecore -v 0)'
or individual files with
'savecore -vf /var/crash/data/cbc9822c-2f13-63c6-d440-d2f118516775/vmdump{,-<secname>}.0'
Enter the following command to decompress all the compressed dump files:
root@testsystem# (cd /var/crash/data/cbc9822c-2f13-63c6-d440-d2f118516775 && savecore -v 0)
savecore: System dump time: Tue Oct 15 12:54:49 2013
savecore: saving system crash dump in /var/crash/data/cbc9822c-2f13-63c6-d440-d2f118516775/vmcore.0
Constructing corefile /var/crash/data/cbc9822c-2f13-63c6-d440-d2f118516775/vmcore.0
 0:31 100% done: 404632 of 404632 pages saved
119246 (29%) zero pages were not written
dump decompression took 0 minutes and 31 seconds
savecore: saving system crash dump in /var/crash/data/cbc9822c-2f13-63c6-d440-d2f118516775/vmcore-zfs.0
Constructing corefile /var/crash/data/cbc9822c-2f13-63c6-d440-d2f118516775/vmcore-zfs.0
 0:00 100% done: 14302 of 14302 pages saved
82 (0%) zero pages were not written
dump decompression took 307.711 milliseconds
Starting with the Oracle Solaris 11.2 release, the kernel symbol table file unix.n is not created during the decompression of a compressed dump file. The required symbol table is already embedded in the vmcore.n file, so the unix.n file is not needed for loading the crash dump with mdb.
Now you can use mdb.
root@testsystem# cd /var/crash/0
root@testsystem# ls
vmcore-zfs.0  vmcore.0  vmdump-zfs.0  vmdump.0
root@testsystem# mdb 0
Loading modules: [ unix genunix specfs dtrace zfs scsi_vhci sd mpt mac px ldc ds ip hook neti arp usba kssl fctl random sockfs idm cpc crypto fcip ufs logindmux ptm sppp nfs ]
>
You can copy the vmdump*.n file to another system for analysis, and you can run savecore either locally or remotely to uncompress the dump file. Use the dumpadm command to control the dump content, the path of the dump device, and the savecore directory.
You can use the file command to examine files in the directory.
root@testsystem# pwd
/var/crash/0
root@testsystem# file *
vmcore-zfs.0: SunOS 5.11 11.2 64-bit SPARC crash dump from 'testsystem'
vmcore.0:     SunOS 5.11 11.2 64-bit SPARC crash dump from 'testsystem'
vmdump-zfs.0: SunOS 5.11 11.2 64-bit SPARC compressed crash dump from 'testsystem'
vmdump.0:     SunOS 5.11 11.2 64-bit SPARC compressed crash dump from 'testsystem'
Now, run mdb on the crash dump you created, and check its status. To load all vmcore*.n files using mdb, you need to provide the suffix n as an argument to mdb.
root@testsystem# pwd
/var/crash/0
root@testsystem# mdb 0
Loading modules: [ unix genunix specfs dtrace zfs scsi_vhci sd mpt mac px ldc ds ip hook neti arp usba kssl fctl random sockfs idm cpc crypto fcip ufs logindmux ptm sppp nfs ]
> ::status
debugging crash dump vmcore.0 (64-bit) from testsystem
operating system: 5.11 11.2 (sun4v)
usr/src version: 19659:c7a2c30fcc60:0.175.2.0.0.24.0:on11u2_24+3
usr/closed version: 1797:4b89b1471513:0.175.2.0.0.24.0:on11u2_24+2
image uuid: cbc9822c-2f13-63c6-d440-d2f118516775
panic message: forced crash dump initiated at user request
complete: yes, all pages present as configured
dump content: kernel [LOADED,UNVERIFIED] (core kernel pages)
              zfs [LOADED,UNVERIFIED] (ZFS metadata (ZIO buffers))
panicking PID: 3667 (not dumped)
>
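Because mdb 0 selects the dump set by its numeric suffix, it can be handy to see which suffixes a dump directory actually offers. The helper below is a hypothetical sketch, not an mdb or savecore feature:

```shell
# dump_suffixes: list the numeric suffixes n for which a vmcore.n exists
# in the given directory, one per line, ascending. (Illustrative sketch;
# each printed n is a candidate argument for "mdb n" in that directory.)
dump_suffixes() {
  ls "$1" 2>/dev/null | sed -n 's/^vmcore\.\([0-9][0-9]*\)$/\1/p' | sort -un
}
```

Section files such as vmcore-zfs.n are deliberately ignored; mdb locates them itself once the suffix is chosen.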
If you want to load the vmcore.n crash dump using mdb, explicitly specify the file name as an argument to mdb.
root@testsystem# mdb vmcore.0
Loading modules: [ unix genunix specfs dtrace zfs scsi_vhci sd mpt mac px ldc ds ip hook neti arp usba kssl fctl random sockfs idm cpc crypto fcip ufs logindmux ptm sppp nfs ]
> ::status
debugging crash dump vmcore.0 (64-bit) from testsystem
operating system: 5.11 11.2 (sun4v)
usr/src version: 19659:c7a2c30fcc60:0.175.2.0.0.24.0:on11u2_24+3
usr/closed version: 1797:4b89b1471513:0.175.2.0.0.24.0:on11u2_24+2
image uuid: cbc9822c-2f13-63c6-d440-d2f118516775
panic message: forced crash dump initiated at user request
complete: yes, all pages present as configured
dump content: kernel [LOADED,UNVERIFIED] (core kernel pages)
              zfs [MISSING] (ZFS metadata (ZIO buffers))
panicking PID: 3667 (not dumped)
>
You cannot load only vmcore-zfs.n using mdb; vmcore.n is mandatory. Thus, the following invocation fails:
root@testsystem# mdb vmcore-zfs.0
mdb: vmcore-zfs.0 doesn't contain core kernel pages, ./vmcore.0 expected
mdb: failed to initialize target: Error 0
root@testsystem#