System Administration Guide, Volume 2

Chapter 39 Managing System Crash Information

This section contains information about managing system crash information.

System Crashes

System crashes can occur due to hardware malfunctions, I/O problems, and software errors. If the system crashes, it will display an error message on the console, and then write a copy of its physical memory to the dump device. The system will then reboot automatically. When the system reboots, the savecore command is executed to retrieve the data from the dump device and write the saved crash dump to your savecore directory. The saved crash dump files provide invaluable information to your support provider to aid in diagnosing the problem.

System Crash Files and Core Files

The savecore command runs automatically after a system crash to retrieve the crash dump information from the dump device and writes a pair of files called unix.X and vmcore.X, where X identifies the dump sequence number. Together, these files represent the saved system crash dump information.
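Because a saved crash dump is only complete when both files of a pair are present, it can help to check the savecore directory for matching sequence numbers. The following Python sketch is illustrative only and is not part of savecore. The function name complete_dumps is hypothetical, and the sketch assumes the file names follow the unix.X and vmcore.X convention described above.

```python
import re

# Matches the two file names that savecore writes: unix.X and vmcore.X.
DUMP_NAME = re.compile(r"^(unix|vmcore)\.(\d+)$")


def complete_dumps(filenames):
    """Return the sorted sequence numbers X that have both unix.X and vmcore.X."""
    seen = {}
    for name in filenames:
        match = DUMP_NAME.match(name)
        if match:
            kind, seq = match.group(1), int(match.group(2))
            seen.setdefault(seq, set()).add(kind)
    # Keep only the sequence numbers for which both halves of the pair exist.
    return sorted(seq for seq, kinds in seen.items() if kinds == {"unix", "vmcore"})
```

To check a real directory, you could pass the result of os.listdir() on your savecore directory to this function.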

Crash dump files are sometimes confused with core files, which are images of user applications that are written when the application terminates abnormally.

Crash dump files are saved in a predetermined directory, which, by default, is /var/crash/hostname. In previous Solaris releases, crash dump files were overwritten when a system rebooted unless you had manually enabled the system to save the images of physical memory in a crash dump file. Now the saving of crash dump files is enabled by default.

System crash information is managed with the dumpadm command. See "Managing System Crash Dump Information (dumpadm)" for more information.

Core files are managed with the coreadm command. See "Managing Core Files (coreadm)" for more information.

Managing Core Files (`coreadm`)

The coreadm command enables you to manage core files. For example, you can use the coreadm command to configure a system so that all process core files are placed in a single system directory. This means it is easier to track problems by examining the core files in a specific directory whenever a Solaris process or daemon terminates abnormally.

Limitations of the previous Solaris process core dump features are:

- Process core files are placed in their current working directory, and thus all Solaris daemons, which typically chdir to the root (/) directory as part of their initialization, overwrite each other's core files.
- Many system daemons, such as statd, perform setuid operations but do not produce core files in the event of a problem, for security reasons.

Configurable Core File Paths

Two new configurable core file paths can be enabled or disabled independently of each other:

- A per-process core file path, which defaults to core and is enabled by default. If enabled, the per-process core file path causes a core file to be produced when the process terminates abnormally. The per-process path is inherited by a new process from its parent process.

  When generated, a per-process core file is owned by the owner of the process with read/write permissions for the owner. Only the owning user can view this file.
- A global core file path, which defaults to core and is disabled by default. If enabled, an additional core file with the same content as the per-process core file is produced by using the global core file path.

  When generated, a global core file is owned by superuser with read/write permissions for superuser only. Non-privileged users cannot view this file.

When a process terminates abnormally, it produces a core file in the current directory as in previous Solaris releases. But if the global core file path is enabled and set to /corefiles/core, for example, then each process that terminates abnormally produces two core files: one in the current working directory and one in the /corefiles directory.

By default, the Solaris core paths and core file retention remain the same:

- A setuid process does not produce core files using either the global or per-process path.
- The global core file path is disabled.
- The per-process core file path is enabled.
- The per-process core file path is set to core.

Expanded Core File Names

If a global core file directory is enabled, core files can be distinguished from one another by using the variables described in the following table.

Variable Name	Variable Definition
`%p`	Process ID
`%u`	Effective user ID
`%g`	Effective group ID
`%f`	Executable file name
`%n`	System node name, equivalent to the `uname -n` output
`%m`	Machine name, equivalent to the `uname -m` output
`%t`	Decimal value of time(2) system call
`%%`	Literal %

For example, if the global core file path is set to:

/var/core/core.%f.%p

and a sendmail process with PID 12345 terminates abnormally, it produces the following core file:

/var/core/core.sendmail.12345
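The expansion shown above can be modeled in a few lines of Python. This is a minimal sketch of the substitution rules in the table, not the code the system uses. The function name expand_core_pattern and the choice to leave unrecognized variables untouched are assumptions made for illustration.

```python
def expand_core_pattern(pattern, values):
    """Expand %-variables in a core file name pattern.

    values maps a variable letter (p, u, g, f, n, m, t) to its replacement.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "%" and i + 1 < len(pattern):
            key = pattern[i + 1]
            if key == "%":
                out.append("%")  # %% stands for a literal percent sign
            elif key in values:
                out.append(str(values[key]))
            else:
                # Assumption of this sketch: unknown variables are kept verbatim.
                out.append("%" + key)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

For example, expand_core_pattern("/var/core/core.%f.%p", {"f": "sendmail", "p": 12345}) reproduces the file name shown above.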

Setting the Core File Name Pattern

You can set a core file name pattern on a global basis or a per-process basis, and you can specify whether you want these settings saved across a system reboot.

For example, the following coreadm command sets the default per-process core file pattern for the init process, which is inherited by all processes that init starts. This pattern will persist across system reboots.

# coreadm -i /var/core/core.%f.%p

Global core values are stored in the /etc/coreadm.conf file, which means these settings are saved across a system reboot.

This coreadm command sets the per-process core file name pattern for the current shell:

$ coreadm -p /var/core/core.%f.%p $$

The $$ symbols represent a placeholder for the process ID of the currently running shell. The per-process core file name pattern is inherited by all child processes.

Once a global or per-process core file name pattern is set, it must be enabled with the coreadm -e command. See the procedures below for more information.

You can set the core file name pattern for all processes run during a user's login session by putting the command in a user's $HOME/.profile or .login file.

Enabling `setuid` Programs to Produce Core Files

You can use the coreadm command to control whether setuid programs produce core files, either for all system processes or on a per-process basis, by setting the following options:

- If the global setuid option is enabled, a global core file path allows all setuid programs on a system to produce core files.
- If the per-process setuid option is enabled, a per-process core file path allows specific setuid processes to produce core files.

By default, both flags are disabled. For security reasons, the global core file path must be a full pathname, starting with a leading /. If superuser disables per-process core files, individual users cannot obtain core files.

The setuid core files are owned by superuser with read/write permissions for superuser only. Ordinary users cannot access them even if the process that produced the setuid core file was owned by an ordinary user.
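The interaction between the two core file paths and the two setuid options can be summarized as a small decision function. The Python sketch below is a simplified model of the rules described in this section, not the kernel's implementation. The function and parameter names are hypothetical, and the parameter defaults mirror the default settings listed earlier.

```python
def cores_produced(is_setuid, per_process=True, global_path=False,
                   per_process_setid=False, global_setid=False):
    """Return which core files an abnormally terminating process produces."""
    produced = []
    # An ordinary process only needs the path enabled; a setuid process
    # also needs the matching setuid option enabled.
    if per_process and (not is_setuid or per_process_setid):
        produced.append("per-process")
    if global_path and (not is_setuid or global_setid):
        produced.append("global")
    return produced
```

With the defaults, an ordinary process yields only a per-process core file and a setuid process yields none. Enabling the global path adds a second core file for ordinary processes.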

See coreadm(1M) for more information.

How to Display the Current Core Dump Configuration

Use the coreadm command without any options to display the current core dump configuration.

$ coreadm
               global core file pattern: /var/core/core.%f.%p
                 init core file pattern: core
                      global core dumps: enabled
                 per-process core dumps: enabled
                global setid core dumps: enabled
           per-process setid core dumps: disabled
               global core dump logging: disabled

How to Set a Core File Name Pattern

Determine whether you want to set a per-process or global core file name pattern and select one of the following:
1. Set a per-process file name pattern.
  $ coreadm -p $HOME/corefiles/%f.%p $$
2. Become superuser, then set a global file name pattern.
  # coreadm -g /var/corefiles/%f.%p

How to Display a Core File Name Pattern

Use the following coreadm command to inquire about the core file settings of the current process. The $$ symbols represent a placeholder for the process ID of the running shell.

$ coreadm $$
278:    core.%f.%p

Superuser can inquire about any user's core file settings by running coreadm followed by the process ID. Ordinary users can only inquire about the core file settings of their own processes.

How to Enable a Per-Process Core File Path

Become superuser.

Enable a per-process core file path.
# coreadm -e process

Display the current process core file path to verify the configuration.
$ coreadm $$
1180: /home/kryten/corefiles/%f.%p

How to Enable a Global Core File Path

Become superuser.

Enable a global core file path.

# coreadm -e global -g /var/core/core.%f.%p

Display the current core dump configuration to verify the change.

# coreadm
      global core file pattern: /var/core/core.%f.%p
        init core file pattern: core
             global core dumps: enabled
        per-process core dumps: enabled
       global setid core dumps: disabled
  per-process setid core dumps: disabled
      global core dump logging: disabled

Troubleshooting Core File Problems

Error Message

 
NOTICE: 'set allow_setid_core = 1' in /etc/system is obsolete
NOTICE: Use the coreadm command instead of 'allow_setid_core'

Cause

You have an obsolete parameter that allows setuid core files in your /etc/system file.

Solution

Remove allow_setid_core=1 from the /etc/system file. Then use the coreadm command to enable global setuid core file paths.
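Before editing the file, you might want to locate the offending entries. The following Python sketch scans the text of an /etc/system file for lines that set allow_setid_core. It is an illustration only. The function name obsolete_lines is hypothetical, and the sketch assumes that lines beginning with an asterisk are comments.

```python
import re

# Matches "set allow_setid_core = 1" with flexible spacing around the tokens.
OBSOLETE = re.compile(r"^\s*set\s+allow_setid_core\s*=")


def obsolete_lines(system_text):
    """Return (line_number, line) pairs that set allow_setid_core."""
    hits = []
    for number, line in enumerate(system_text.splitlines(), start=1):
        if line.lstrip().startswith("*"):
            continue  # assumed comment marker; skip commented-out entries
        if OBSOLETE.match(line):
            hits.append((number, line.strip()))
    return hits
```

You could call obsolete_lines(open("/etc/system").read()) and then remove the reported lines by hand.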

Managing System Crash Dump Information (`dumpadm`)

This section describes how to manage system crash information in the Solaris environment.

System Crash Dump Features

This section describes the system crash dump features that are new in this Solaris release:

- The new dumpadm command allows system administrators to configure crash dumps of the operating system. The dumpadm configuration parameters include the dump content, the dump device, and the directory in which crash dump files are saved. See "The dumpadm Command" for more information about the dumpadm command.
- Dump data is now stored in compressed format on the dump device. Kernel crash dump images can be as big as 4 Gbytes or more. Compressing the data means faster dumping and less disk space needed for the dump device.
- The saving of crash dump files runs in the background when a dedicated dump device, not the swap area, is part of the dump configuration. This means a booting system does not wait for the savecore command to complete before going to the next step. On large memory systems, the system can be available before savecore completes.
- System crash dump files, generated by the savecore command, are now saved by default.
- The savecore -L command is a new feature that enables you to get a crash dump of the live running Solaris operating environment. This command is intended for troubleshooting a running system by taking a snapshot of memory during some bad state, such as a transient performance problem or service outage. If the system is up and you can still run some commands, you can run savecore -L to save a snapshot of the system to the dump device, and then immediately write out the crash dump files to your savecore directory. Because the system is still running, you can use savecore -L only if you have configured a dedicated dump device.

The `dumpadm` Command

The /usr/sbin/dumpadm command manages a system's crash dump configuration parameters. The following table describes dumpadm's configuration parameters.

Dump Parameter	Description
dump device	The device that stores dump data temporarily as the system crashes. When the dump device is not the swap area, `savecore` runs in the background, which speeds up the boot process.
savecore directory	The directory that stores system crash dump files.
dump content	Type of data, kernel memory or all of memory, to dump.
minimum free space	Minimum amount of free space required in the `savecore` directory after saving crash dump files. If no minimum free space has been configured, the default is one megabyte.

See dumpadm(1M) for more information.

The dump configuration parameters managed by the dumpadm command are stored in the /etc/dumpadm.conf file.

Note -

Do not edit /etc/dumpadm.conf manually. This could result in an inconsistent system dump configuration.

How the `dumpadm` Command Works

During system startup, the dumpadm command is invoked by the /etc/init.d/savecore script to configure crash dump parameters based on information in the /etc/dumpadm.conf file.

Specifically, it initializes the dump device and the dump content through the /dev/dump interface.

After the dump configuration is complete, the savecore script looks for the location of the crash dump file directory by parsing the content of the /etc/dumpadm.conf file. Then, savecore is invoked to check for crash dumps. It also checks the content of the minfree file in the crash dump directory.

Saving Crash Dumps

You can examine the control structures, active tables, memory images of a live or crashed system kernel, and other information about the operation of the kernel by using the crash or adb utilities. Using crash or adb to their full potential requires a detailed knowledge of the kernel, and is beyond the scope of this manual. See crash(1M) or adb(1) for more details on using these utilities.

Additionally, crash dumps saved by savecore can be useful to send to a customer service representative for analysis of why the system is crashing. If you will be sending crash dump files to a customer service representative, perform the first two tasks listed in "Managing System Crash Information Task Map".

The next section describes how to manage system crash information with the dumpadm command.

Managing System Crash Information Task Map

Table 39-1 Managing System Crash Information Task Map


Task	Description	For Instructions, Go To
1. Display the Current Crash Dump Configuration	Display the current crash dump configuration by using the `dumpadm` command.	"How to Display the Current Crash Dump Configuration"
2. Modify the Crash Dump Configuration	Use the `dumpadm` command to specify the type of data to dump, whether or not the system will use a dedicated dump device, the directory for saving crash dump files, and the amount of space that must remain available after crash dump files are written.	"How to Modify a Crash Dump Configuration"
3. Examine a Crash Dump File	Use the `crash` command to view crash dump files.	"How to Examine a Crash Dump"
4. Recover From a Full Crash Dump Directory	Optional. The system crashes but there is no room in the `savecore` directory, and you want to save some critical system crash dump information.	"How to Recover From a Full Crash Dump Directory (Optional)"
5. Disable or Enable the Saving of Crash Dump Files	Optional. Use the `dumpadm` command to disable or enable the saving of crash dump files. Saving crash dump files is enabled by default.	"How to Disable or Enable Saving Crash Dumps (Optional)"

How to Display the Current Crash Dump Configuration

Become superuser.

Display the current crash dump configuration by using the dumpadm command without any options.
# dumpadm
      Dump content: kernel pages
       Dump device: /dev/dsk/c0t3d0s1 (swap)
Savecore directory: /var/crash/venus
  Savecore enabled: yes

The above example output means:
- The dump content is kernel memory pages.
- Kernel memory will be dumped on a swap device, /dev/dsk/c0t3d0s1. You can identify all your swap areas with the swap -l command.
- System crash dump files will be written in the /var/crash/venus directory.
- Saving crash dump files is enabled.

How to Modify a Crash Dump Configuration

Become superuser.

Identify the current crash dump configuration by using the dumpadm command.

# dumpadm
      Dump content: kernel pages
       Dump device: /dev/dsk/c0t3d0s1 (swap)
Savecore directory: /var/crash/pluto
  Savecore enabled: yes

This is the default dump configuration for a system running the Solaris 8 release.

Modify the crash dump configuration by using the dumpadm command.

# dumpadm -c content -d dump-device -m nnnk | nnnm | nnn% -n -s savecore-dir

`-c` `content`	Specifies the type of data to dump: kernel memory or all of memory. The default dump content is kernel memory.
`-d` `dump-device`	Specifies the device that stores dump data temporarily as the system crashes. The primary swap device is the default dump device.
`-m` `nnnk` | `nnnm` | `nnn%`	Specifies the minimum free disk space for saving crash dump files by creating a `minfree` file in the current `savecore` directory. This parameter can be specified in kilobytes (`nnnk`), megabytes (`nnnm`), or file system size percentage (`nnn%`). The `savecore` command consults this file prior to writing the crash dump files. If writing the crash dump files, based on their size, would decrease the amount of free space below the `minfree` threshold, the dump files are not written and an error message is logged. See "How to Recover From a Full Crash Dump Directory (Optional)" for recovering from this scenario.
`-n`	Specifies that `savecore` should not be run when the system reboots. This dump configuration is not recommended. If system crash information is written to the swap device, and `savecore` is not enabled, the crash dump information is overwritten when the system begins to swap.
`-s` `savecore-dir`	Specifies an alternate directory for storing crash dump files. The default directory is `/var/crash/hostname` where `hostname` is the output of the `uname -n` command.
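The minimum free space check described for the -m option can be sketched as follows. This Python example is a simplified model rather than the savecore implementation. The function names are hypothetical, and the sketch assumes that one megabyte equals 1024 kilobytes and that percentages are rounded down to whole kilobytes.

```python
def minfree_kb(spec, filesystem_kb):
    """Convert a -m argument (nnnk, nnnm, or nnn%) to a threshold in kilobytes."""
    number, unit = int(spec[:-1]), spec[-1]
    if unit == "k":
        return number
    if unit == "m":
        return number * 1024  # assumption: 1 megabyte = 1024 kilobytes
    if unit == "%":
        return filesystem_kb * number // 100
    raise ValueError("expected nnnk, nnnm, or nnn%: " + spec)


def savecore_would_write(free_kb, dump_kb, threshold_kb):
    """Return True if saving the dump leaves at least threshold_kb free."""
    return free_kb - dump_kb >= threshold_kb
```

For example, on a hypothetical file system of 770710 Kbytes, minfree_kb("10%", 770710) gives a threshold of 77071 Kbytes. A 20000-Kbyte dump would then be saved only if at least 97071 Kbytes are currently free.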

Example--Modifying a Crash Dump Configuration

In this example, all of memory is dumped to the dedicated dump device, /dev/dsk/c0t1d0s1, and the minimum free space that must be available after the crash dump files are saved is 10% of the file system space.

# dumpadm
      Dump content: kernel pages
       Dump device: /dev/dsk/c0t3d0s1 (swap)
Savecore directory: /var/crash/pluto
  Savecore enabled: yes
# dumpadm -c all -d /dev/dsk/c0t1d0s1 -m 10%
      Dump content: all pages
       Dump device: /dev/dsk/c0t1d0s1 (dedicated)
Savecore directory: /var/crash/pluto (minfree = 77071KB)
  Savecore enabled: yes
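To confirm a change like this one from a script, you can separate each value in the dumpadm output from the parenthesized note that follows it, such as (dedicated) or (minfree = 77071KB). The Python sketch below assumes the output layout shown in this example. The function name parse_dumpadm is hypothetical, and the sample text is copied from the example rather than captured from a live system.

```python
import re

# A value optionally followed by a parenthesized note,
# for example "(swap)", "(dedicated)", or "(minfree = 77071KB)".
VALUE = re.compile(r"^(?P<value>.*?)(?:\s+\((?P<note>[^)]*)\))?$")


def parse_dumpadm(output):
    """Return {name: (value, note)} for each 'name: value (note)' line."""
    result = {}
    for line in output.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a 'name: value' line
        match = VALUE.match(rest.strip())
        result[name.strip()] = (match.group("value"), match.group("note"))
    return result


DUMPADM_SAMPLE = """\
      Dump content: all pages
       Dump device: /dev/dsk/c0t1d0s1 (dedicated)
Savecore directory: /var/crash/pluto (minfree = 77071KB)
  Savecore enabled: yes
"""
```

The note is None when a line has no parenthesized part, as with the Dump content and Savecore enabled lines.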

How to Examine a Crash Dump

Become superuser.

Examine a crash dump by using the crash utility.

# /usr/sbin/crash [-d crashdump-file] [-n name-list] [-w output-file]

`-d` `crashdump-file`	Specifies a file to contain the system memory image. The default crash dump file is `/dev/mem`.
`-n` `name-list`	Specifies a text file that contains the symbol table information needed for symbolic access to the system memory image. The default file name is `/dev/ksyms`.
`-w` `output-file`	Specifies a file to contain output from a crash session. The default is standard output.

Display crash status information.

# /usr/sbin/crash
dumpfile = /dev/mem, namelist = /dev/ksyms, outfile = stdout
> status
   .
   .
   .
> size buf proc queue
   .
   .
   .

Example--Examining a Crash Dump

The following example shows sample output from the crash utility. Information about status, and about the buffer, process, and queue size is displayed.

# /usr/sbin/crash
dumpfile = /dev/mem, namelist = /dev/ksyms, outfile = stdout
> status
system name:	SunOS
release:	5.8
node name:	earth
version:	s28_25
machine name:	sun4m
time of crash:	Wed Jun 30 16:02:31 1999
age of system:	18 min.
panicstr:	
panic registers:
	pc: 0      sp: 0
> size buf proc queue
120
1808
96

How to Recover From a Full Crash Dump Directory (Optional)

In this scenario, the system crashes but there is no room in the savecore directory, and you want to save some critical system crash dump information.

Clear out the savecore directory, usually /var/crash/hostname, by removing existing crash dump files that have already been sent to your service provider. Or, run the savecore command and specify an alternate directory that has sufficient disk space. (See the next step.)

Manually run the savecore command and if necessary, specify an alternate savecore directory.
# savecore [ directory ]
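When choosing an alternate directory, the main question is whether it has enough free space for the dump. The following Python sketch picks the first candidate directory with sufficient room. It is an illustration only. The function name pick_savecore_dir is hypothetical, and the size of the dump must be supplied by the caller because the sketch does not inspect the dump device.

```python
import shutil


def pick_savecore_dir(candidates, needed_bytes, free_bytes=None):
    """Return the first candidate directory with at least needed_bytes free.

    Returns None if no candidate has enough space.
    """
    if free_bytes is None:
        # Default: ask the operating system how much space each directory has.
        free_bytes = lambda path: shutil.disk_usage(path).free
    for directory in candidates:
        if free_bytes(directory) >= needed_bytes:
            return directory
    return None
```

The optional free_bytes argument exists so that the selection logic can be exercised without touching real file systems. The directory that is returned can then be passed to the savecore command as shown above.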

How to Disable or Enable Saving Crash Dumps (Optional)

Become superuser.

Disable or enable the saving of crash dumps on your system by using the dumpadm command.

Example--Disabling the Saving of Crash Dumps

This example illustrates how to disable the saving of crash dumps on your system.

# dumpadm -n
      Dump content: all pages
       Dump device: /dev/dsk/c0t1d0s1 (dedicated)
Savecore directory: /var/crash/pluto (minfree = 77071KB)
  Savecore enabled: no

Example--Enabling the Saving of Crash Dumps

This example illustrates how to enable the saving of crash dumps on your system.

# dumpadm -y
      Dump content: all pages
       Dump device: /dev/dsk/c0t1d0s1 (dedicated)
Savecore directory: /var/crash/pluto (minfree = 77071KB)
  Savecore enabled: yes