Managing Faults, Defects, and Alerts in Oracle® Solaris 11.3

Updated: March 2018
 
 

Fault Manager and Module Statistics

The Fault Manager daemon and many of its modules gather statistics. The fmadm config command shows the status of Fault Manager modules. The fmstat command reports statistics gathered by these modules.

Example 8  fmadm config Output
# fmadm config
MODULE                   VERSION STATUS  DESCRIPTION
cpumem-retire            1.1     active  CPU/Memory Retire Agent
disk-diagnosis           0.1     active  Disk Diagnosis engine
disk-transport           2.1     active  Disk Transport Agent
eft                      1.16    active  eft diagnosis engine
ext-event-transport      0.2     active  External FM event transport
fabric-xlate             1.0     active  Fabric Ereport Translater
fmd-self-diagnosis       1.0     active  Fault Manager Self-Diagnosis
fru-monitor              1.1     active  FRU Monitor
io-retire                2.0     active  I/O Retire Agent
network-monitor          1.0     active  Network monitor
sensor-transport         1.2     active  Sensor Transport Agent
ses-log-transport        1.0     active  SES Log Transport Agent
software-diagnosis       0.1     active  Software Diagnosis engine
software-response        0.1     active  Software Response Agent
sysevent-transport       1.0     active  SysEvent Transport Agent
syslog-msgs              1.1     active  Syslog Messaging Agent
zfs-diagnosis            1.0     active  ZFS Diagnosis Engine
zfs-retire               1.0     active  ZFS Retire Agent
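
The module table above is easy to check mechanically. The following is a minimal Python sketch, not part of the Oracle tooling, that parses captured fmadm config text and lists modules whose STATUS is not active. The helper names are hypothetical, and the "failed" row in the sample is invented for illustration; the exact non-active status string is an assumption.

```python
# Sample text in the layout shown in Example 8. The last row is invented
# for illustration; "failed" as a status string is an assumption.
SAMPLE = """\
MODULE                   VERSION STATUS  DESCRIPTION
cpumem-retire            1.1     active  CPU/Memory Retire Agent
disk-diagnosis           0.1     active  Disk Diagnosis engine
zfs-retire               1.0     failed  ZFS Retire Agent
"""


def parse_fmadm_config(text):
    """Return one dict per module row of fmadm config output."""
    rows = []
    for line in text.splitlines():
        # Split into at most four fields so the description keeps its spaces.
        fields = line.strip().split(None, 3)
        # Skip blank lines, the header row, and a leading "# fmadm config" prompt.
        if len(fields) < 3 or fields[0] in ("MODULE", "#"):
            continue
        rows.append({
            "module": fields[0],
            "version": fields[1],
            "status": fields[2],
            "description": fields[3] if len(fields) > 3 else "",
        })
    return rows


def inactive_modules(text):
    """Return the names of modules whose status is anything but 'active'."""
    return [r["module"] for r in parse_fmadm_config(text) if r["status"] != "active"]


if __name__ == "__main__":
    print(inactive_modules(SAMPLE))
```

On a live system, the text would come from running fmadm config and capturing its standard output instead of the SAMPLE string.
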
Example 9  fmstat Output Showing All Loaded Modules

Without options, the fmstat command provides a high-level overview of the events, processing times, and memory usage of all loaded modules.

# fmstat
module             ev_recv ev_acpt wait  svc_t    %w  %b  open solve  memsz  bufsz
cpumem-retire            0       0  0.0 10010.0    0   0     0     0      0      0
disk-diagnosis           0       0  0.0 10007.7    0   0     0     0      0      0
disk-transport           0       0  0.9 1811945.5 92   0     0     0    52b      0
eft                      0       0  0.0 4278.0     0   0     3     0   1.6M    58b
ext-event-transport      6       0  0.0  860.8     0   0     0     0    46b   2.0K
fabric-xlate             0       0  0.0    4.8     0   0     0     0      0      0
fmd-self-diagnosis     393       0  0.0   25.5     0   0     0     0      0      0
fru-monitor              2       0  0.0   42.4     0   0     0     0   880b      0
io-retire                1       0  0.0 5003.8     0   0     0     0      0      0
network-monitor          0       0  0.0   13.2     0   0     0     0   664b      0
sensor-transport         0       0  0.0   38.3     0   0     0     0    40b      0
ses-log-transport        0       0  0.0   23.8     0   0     0     0    40b      0
software-diagnosis       0       0  0.0 10010.0    0   0     0     0   316b      0
software-response        0       0  0.0 10006.8    0   0     0     0    14K    14K
sysevent-transport       0       0  0.0 6125.0     0   0     0     0      0      0
syslog-msgs              2       0  0.0 3337.2     0   0     0     0      0      0
zfs-diagnosis            4       0  0.0 2002.0     0   0     0     0      0      0
zfs-retire               4       0  0.0 2715.1     0   0     0     0     4b      0
ev_recv

The number of telemetry events received by the module.

ev_acpt

The number of telemetry events accepted by the module as relevant to a diagnosis.

wait

The average number of telemetry events waiting to be examined by the module.

svc_t

The average service time for telemetry events received by the module, in milliseconds.

%w

The percentage of time that telemetry events were waiting to be examined by the module.

%b

The percentage of time that the module was busy processing telemetry events.

open

The number of active cases (open problem investigations) owned by the module. The open column applies only to fault management cases, which are created and solved only by diagnosis engines. This column does not apply to other modules, such as response agents.

solve

The total number of cases solved by this module since it was loaded. The solve column applies only to fault management cases, which are created and solved only by diagnosis engines. This column does not apply to other modules, such as response agents.

memsz

The amount of dynamic memory currently allocated by this module.

bufsz

The amount of persistent buffer space currently allocated by this module.
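
The column definitions above can be applied to captured fmstat output with a short script. The following Python sketch is illustrative only: the function names are hypothetical, the sample rows are copied from Example 9, and the size suffixes (b, K, M, G) are assumed to be powers of 1024. It reports diagnosis engines with open cases and modules whose events spent a large share of time waiting.

```python
# Column names in the order shown in the fmstat header of Example 9.
COLUMNS = ["module", "ev_recv", "ev_acpt", "wait", "svc_t",
           "%w", "%b", "open", "solve", "memsz", "bufsz"]

# Assumption: size suffixes are binary multiples.
UNITS = {"b": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

SAMPLE = """\
module             ev_recv ev_acpt wait  svc_t    %w  %b  open solve  memsz  bufsz
disk-transport           0       0  0.9 1811945.5 92   0     0     0    52b      0
eft                      0       0  0.0 4278.0     0   0     3     0   1.6M    58b
ext-event-transport      6       0  0.0  860.8     0   0     0     0    46b   2.0K
"""


def to_bytes(value):
    """Convert a size such as '52b', '2.0K', or '0' to a whole number of bytes."""
    if value[-1] in UNITS:
        return int(float(value[:-1]) * UNITS[value[-1]])
    return int(float(value))


def parse_fmstat(text):
    """Return one dict per module row, with numeric columns converted."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, blank lines, and anything without all 11 columns.
        if len(fields) != len(COLUMNS) or fields[0] == "module":
            continue
        row = dict(zip(COLUMNS, fields))
        for key in ("ev_recv", "ev_acpt", "open", "solve"):
            row[key] = int(row[key])
        for key in ("wait", "svc_t", "%w", "%b"):
            row[key] = float(row[key])
        for key in ("memsz", "bufsz"):
            row[key] = to_bytes(row[key])
        rows.append(row)
    return rows


def modules_with_open_cases(rows):
    """Names of modules that currently own at least one open case."""
    return [r["module"] for r in rows if r["open"] > 0]


def modules_waiting(rows, threshold=50.0):
    """Names of modules whose %w value exceeds the given threshold."""
    return [r["module"] for r in rows if r["%w"] > threshold]


if __name__ == "__main__":
    parsed = parse_fmstat(SAMPLE)
    print(modules_with_open_cases(parsed))
    print(modules_waiting(parsed))
```

The 50 percent threshold is an arbitrary value chosen for the sketch, not a recommendation from the fmstat documentation.
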

Example 10  fmstat Output Showing a Single Module

The statistics and columns that fmstat displays depend on the options that you specify.

To display statistics for an individual module, use the -m option followed by the module name. The -z option suppresses zero-valued statistics. The following example shows that the cpumem-retire response agent successfully processed a request to take a CPU offline.

# fmstat -z -m cpumem-retire
  NAME      VALUE        DESCRIPTION
  cpu_flts  1            cpu faults resolved

See the fmstat(1M) man page for information about other options.
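
The per-module output in Example 10 has a simple NAME, VALUE, DESCRIPTION layout. The following Python sketch, with hypothetical helper names, turns that layout into a dictionary. Values are kept as strings because the sketch makes no assumption about the type of each statistic. The fetch_module_stats helper uses only the -z and -m options shown above and assumes that it runs on a Solaris host with sufficient privileges.

```python
import subprocess

# Output copied from Example 10.
SAMPLE = """\
  NAME      VALUE        DESCRIPTION
  cpu_flts  1            cpu faults resolved
"""


def parse_module_stats(text):
    """Map each statistic name to a (value, description) tuple."""
    stats = {}
    for line in text.splitlines():
        # Split into at most three fields so the description keeps its spaces.
        fields = line.strip().split(None, 2)
        # Skip blank lines, the header row, and a leading "# fmstat ..." prompt.
        if len(fields) < 2 or fields[0] in ("NAME", "#"):
            continue
        description = fields[2] if len(fields) > 2 else ""
        stats[fields[0]] = (fields[1], description)
    return stats


def fetch_module_stats(module):
    """Run fmstat -z -m for one module and parse the result.

    Not called in this sketch; it needs a system where fmstat is installed.
    """
    result = subprocess.run(
        ["fmstat", "-z", "-m", module],
        check=True, capture_output=True, text=True,
    )
    return parse_module_stats(result.stdout)


if __name__ == "__main__":
    print(parse_module_stats(SAMPLE))
```
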