Managing Faults, Defects, and Alerts in Oracle® Solaris 11.4

Updated: November 2020
 
 

Receiving Notification of Faults, Defects, and Alerts

The Fault Manager daemon notifies you that a fault or defect has been detected and diagnosed and alerts you to other changes to your system.

Configuring When and How You Will Be Notified

This section describes the following capabilities:

  • Using the svccfg command to configure which FMA events trigger a notification and how that notification is delivered

  • Using the coreadm command to enable or disable generating diagnostic core dumps and alerts about diagnostic core dumps

Configuring Which Events to Notify You of and How to Notify You

Use the svcs -n and svccfg listnotify commands to show event notification parameters, as shown in Showing Event Notification Parameters in Managing System Services in Oracle Solaris 11.4. Settings for notification parameters for FMA events are stored in properties in svc:/system/fm/notify-params:default. System-wide notification parameters for SMF state transition events are stored in svc:/system/svc/global:default.

Use the svccfg setnotify command to configure FMA event notification, as shown in Configuring Notification of State Transition and FMA Events in Managing System Services in Oracle Solaris 11.4. For example, the following command creates a notification that sends an SMTP message when an FMA-managed problem is repaired:

$ svccfg setnotify problem-repaired smtp:

You can configure notification of fault management error events to use the Simple Mail Transfer Protocol (SMTP) or the Simple Network Management Protocol (SNMP).

FMA event tags include problem-diagnosed, problem-updated, problem-repaired, and problem-resolved. These tags correspond to the problem lifecycle stages described in Fault Management Overview.
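
As a rough sketch, the following shell function prints one svccfg listnotify command for each of the FMA event tags named above, so that the current notification parameters can be reviewed one tag at a time. The function name print_listnotify_commands and the FMA_TAGS variable are hypothetical names chosen for this illustration. The commands are only printed, not run, so the sketch can be tried on any system.

```shell
#!/bin/sh
# FMA event tags listed in this section.
FMA_TAGS="problem-diagnosed problem-updated problem-repaired problem-resolved"

# Hypothetical helper: print (do not run) one listnotify command per tag.
print_listnotify_commands() {
    for tag in $FMA_TAGS; do
        printf '%s\n' "svccfg listnotify $tag"
    done
}

print_listnotify_commands
```

To run the commands instead of printing them, the printed lines could be piped to sh on a system where svccfg is available.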

Event notification and FMA event tags are also described in the Notification Parameters section in the smf(7) man page. For more information about the notification daemons, see the snmp-notify(8), smtp-notify(8), and asr-notify(8) man pages.

Notification parameters for events generated by SMF state transitions are stored in the service or in the transitioning service instance.

Configuring Reporting of Diagnostic Core Dumps

By default, a diagnostic core file is generated in /var/diag and an FMA alert is generated when a process terminates abnormally. See COREDIAG Alerts for information about diagnostic core files.

To change the default behavior, use options of the coreadm command or modify settings of the coreadm:default service. The diagnostic and alert command options and service properties interact as described in the following table:

Table 1  Effects of Diagnostic Core File Reporting Settings

Command Option and Service Property Settings  Resultant Behavior
--------------------------------------------  -------------------------------------------
$ coreadm -e diagnostic -e alert              Default. Generate a diagnostic core file,
or                                            a JSON summary file, and an FMA alert.
config_params/diagnostic_enabled = true
config_params/diag_alert_enabled = true

$ coreadm -e diagnostic -d alert              Full and silent. Generate a diagnostic core
or                                            file and a JSON summary file. Do not
config_params/diagnostic_enabled = true       generate an FMA alert.
config_params/diag_alert_enabled = false

$ coreadm -d diagnostic -e alert              Limited reporting. Generate an empty
or                                            diagnostic core file. Generate a JSON
config_params/diagnostic_enabled = false      summary file and an FMA alert.
config_params/diag_alert_enabled = true

$ coreadm -d diagnostic -d alert              Turn off all diagnostics. Do not generate a
or                                            diagnostic core file or a JSON summary
config_params/diagnostic_enabled = false      file. Do not generate an FMA alert.
config_params/diag_alert_enabled = false
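
The four combinations in Table 1 can be sketched as a small lookup. The following shell functions are hypothetical helpers, not part of coreadm: given the desired values of diagnostic_enabled and diag_alert_enabled, coreadm_command prints the matching coreadm invocation and resultant_behavior prints a short summary of the outcome described in the table. Nothing is executed against the system.

```shell
#!/bin/sh
# Hypothetical helper: translate "true"/"false" into the coreadm
# enable (-e) or disable (-d) option. printf is used because some
# echo implementations treat "-e" as an option.
flag_for() {
    case "$1" in
        true)  printf '%s\n' "-e" ;;
        false) printf '%s\n' "-d" ;;
        *)     printf '%s\n' "expected true or false, got: $1" >&2; return 1 ;;
    esac
}

# $1: desired config_params/diagnostic_enabled value
# $2: desired config_params/diag_alert_enabled value
coreadm_command() {
    diag_flag=$(flag_for "$1") || return 1
    alert_flag=$(flag_for "$2") || return 1
    printf '%s\n' "coreadm $diag_flag diagnostic $alert_flag alert"
}

# Short summary of the "Resultant Behavior" column of Table 1.
resultant_behavior() {
    case "$1,$2" in
        true,true)   printf '%s\n' "core file, JSON summary, FMA alert (default)" ;;
        true,false)  printf '%s\n' "core file, JSON summary, no FMA alert" ;;
        false,true)  printf '%s\n' "empty core file, JSON summary, FMA alert" ;;
        false,false) printf '%s\n' "no core file, no JSON summary, no FMA alert" ;;
        *)           printf '%s\n' "unknown settings: $1,$2" >&2; return 1 ;;
    esac
}

# Print every combination as "command<TAB>behavior".
for diag in true false; do
    for alert in true false; do
        printf '%s\t%s\n' "$(coreadm_command "$diag" "$alert")" \
            "$(resultant_behavior "$diag" "$alert")"
    done
done
```

Keeping the command string and the behavior summary in separate functions makes it easy to check a planned change against the table before running coreadm.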

Understanding Messages From the Fault Manager Daemon

The Fault Manager daemon sends messages to both the console and the /var/adm/messages file. Messages from the Fault Manager daemon use the format shown in the following example. In the example, long lines are wrapped: a line that does not begin with a date is a continuation of the preceding line that begins with a date:

Apr 17 15:57:35 bur-7430 fmd: [ID 377184 daemon.error] SUNW-MSG-ID: FMD-8000-CV,
TYPE: Alert, VER: 1, SEVERITY: Minor
Apr 17 15:57:35 bur-7430 EVENT-TIME: Fri Apr 17 15:56:28 EDT 2015
Apr 17 15:57:35 bur-7430 PLATFORM: SUN SERVER X4-4, CSN: 1421NM900G, HOSTNAME: bur-7430
Apr 17 15:57:35 bur-7430 SOURCE: software-diagnosis, REV: 0.1
Apr 17 15:57:35 bur-7430 EVENT-ID: b22c3c73-77d7-4f4e-8030-c589bf057bb9
Apr 17 15:57:35 bur-7430 DESC: FRU '/SYS/HDD0' has been removed from the system.
Apr 17 15:57:35 bur-7430 AUTO-RESPONSE: FMD topology will be updated.
Apr 17 15:57:35 bur-7430 IMPACT: System impact depends on the type of FRU.
Apr 17 15:57:35 bur-7430 REC-ACTION: Use 'fmadm faulty' to provide a more detailed
view of this event. Please refer to the associated reference document at
http://support.oracle.com/msg/FMD-8000-CV for the latest service procedures and
policies regarding this diagnosis.

When you are notified of a diagnosis, consult the recommended knowledge article for additional details. The recommended knowledge article is listed in the last line of the output, which is labeled REC-ACTION for recommended action. The knowledge article might contain actions that you or a service provider should take in addition to other actions listed in the REC-ACTION line.
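
The following is a minimal sketch of how such a message could be processed with standard tools, assuming the wrapped layout shown in the example above. The function names join_fmd_lines, fmd_field, fmd_article_url, and sample_message are hypothetical. join_fmd_lines appends each line that does not begin with a date to the preceding dated line, fmd_field prints the text that follows a label such as EVENT-ID, and fmd_article_url prints the first URL found in the REC-ACTION text. The sample input is an excerpt of the example message.

```shell
#!/bin/sh
# Join wrapped lines: a line that does not start with a syslog date
# such as "Apr 17 15:57:35 " is appended to the preceding dated line.
join_fmd_lines() {
    awk '
        /^[A-Z][a-z][a-z] [ 0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9] / {
            if (have) print buf
            buf = $0; have = 1; next
        }
        { buf = (have ? buf " " $0 : $0); have = 1 }
        END { if (have) print buf }
    '
}

# Print the text after "LABEL: " on the first joined line that contains it.
# Intended for labels that occupy their own line (EVENT-ID, DESC, REC-ACTION).
fmd_field() {
    join_fmd_lines | awk -v label="$1: " '
        !found && (i = index($0, label)) > 0 {
            print substr($0, i + length(label))
            found = 1
        }
    '
}

# Print the first URL that appears in the REC-ACTION text.
fmd_article_url() {
    fmd_field REC-ACTION | awk '
        match($0, /https?:\/\/[^ ]+/) { print substr($0, RSTART, RLENGTH) }
    '
}

# Excerpt of the example message, wrapped as it appears in the log.
sample_message() {
    cat <<'EOF'
Apr 17 15:57:35 bur-7430 EVENT-ID: b22c3c73-77d7-4f4e-8030-c589bf057bb9
Apr 17 15:57:35 bur-7430 REC-ACTION: Use 'fmadm faulty' to provide a more detailed
view of this event. Please refer to the associated reference document at
http://support.oracle.com/msg/FMD-8000-CV for the latest service procedures and
policies regarding this diagnosis.
EOF
}

sample_message | fmd_field EVENT-ID
sample_message | fmd_article_url
```

On a live system the same functions could read from /var/adm/messages instead of the sample, for example by redirecting that file into fmd_article_url.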