man pages section 1M: System Administration Commands Oracle Solaris 11 Information Library
fmadm - fault management configuration tool
fmadm [-q] [subcommand [arguments]]
The fmadm utility can be used by administrators and service personnel to view and modify system configuration parameters maintained by the Solaris Fault Manager, fmd(1M). fmd receives telemetry information relating to problems detected by the system software, diagnoses these problems, and initiates proactive self-healing activities such as disabling faulty components.
fmadm can be used to:
view the set of diagnosis engines and agents that are currently participating in fault management,
view the list of system components that have been diagnosed as faulty, and
perform administrative tasks related to these entities.
The Fault Manager attempts to automate as many activities as possible, so use of fmadm is typically not required. When the Fault Manager needs help from a human administrator, service repair technician, or Oracle, it produces a message indicating its needs. It also refers you to a knowledge article on Sun's web site. The web site might ask you to use fmadm or one of the other fault management utilities to gather more information or perform additional tasks. The documentation for fmd(1M), fmdump(1M), and fmstat(1M) describes other tools for observing fault management activities.
One responsibility of the Fault Manager is to keep track of the location of components. At the chassis level, the fmadm *-alias subcommands manage the mapping from a chassis product-id.chassis-id to an alias-id. The administered alias-id is intended to describe, in some meaningful way, the physical location of a chassis.
The fmadm utility requires the user to possess the SYS_CONFIG privilege. Refer to Oracle Solaris Administration: Security Services for more information about how to configure Solaris privileges. The fmadm load subcommand requires that the user possess all privileges.
fmadm accepts the following subcommands. Some of the subcommands accept or require additional options and operands. The load, unload, reset, and rotate subcommands are intended for trained technical personnel. We recommend against use of these subcommands without the specific guidance of, for example, a Knowledge Base article.
Notify the Fault Manager that the specified resource is not to be considered to be a suspect in the fault event identified by uuid, or if no UUID is specified, then in any fault or faults that have been detected. The fmadm acquit subcommand should be used only at the direction of a documented Sun repair procedure. Administrators might need to apply additional commands to re-enable a previously faulted resource.
Notify the Fault Manager that the fault event identified by uuid can be safely ignored. The fmadm acquit subcommand should be used only at the direction of a documented Sun repair procedure. Administrators might need to apply additional commands to re-enable any previously faulted resources.
Display the configuration of the Fault Manager itself, including the module name, version, and description of each component module. Fault Manager modules provide services such as automated diagnosis, self-healing, and messaging for hardware and software present on the system.
Display status information for resources that the Fault Manager currently believes to be faulty.
The following options are supported:
Display all faults. By default, the fmadm faulty command only lists output for resources that are currently present and faulty. If you specify the -a option, all resource information cached by the Fault Manager is listed, including faults which have been automatically corrected or where no recovery action is needed. The listing includes information for resources that might no longer be present in the system.
Display faulty FRUs (field-replaceable units).
Group together faults that have the same FRU, class, and fault message.
Display the persistent cache identifier for each resource in the Fault Manager.
If faults or resources are grouped together with the -a or -g options, limit the output to max entries.
Pipe output through a pager, with a form feed between each fault.
Display fault management resources with their identifiers (FMRIs) and their fault management state.
Display a one-line fault summary for each fault event.
Display only the fault with the given uid.
Display full output.
The percentage certainty is displayed if a fault has multiple suspects, either of different classes or on different FRUs. If more than one resource is on the same FRU and it is not 100% certain that the fault is associated with the FRU, the maximum percentage certainty of the possible suspects on the FRU is displayed.
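The rule for suspects that share an FRU can be illustrated with a small sketch. The input layout used here (one suspect per line, with FRU label, resource, and certainty separated by "|") is hypothetical and is not produced by fmadm; the FRU labels and resource names are likewise made up. The sketch only shows the "maximum certainty per FRU" arithmetic.

```shell
#!/bin/sh
# Hypothetical input, one suspect per line: "<fru-label>|<resource>|<certainty>".
# This layout is invented for illustration; fmadm does not emit it.
max_certainty_per_fru() {
    awk -F'|' '
        { pct = $3 + 0 }                       # force a numeric value
        # Keep the highest certainty seen so far for each FRU label.
        !($1 in max) || pct > max[$1] { max[$1] = pct }
        END { for (fru in max) printf "%s %d%%\n", fru, max[fru] }
    ' | sort
}

# Two suspects on one FRU (33% and 50%) and one suspect on another (17%).
printf '%s\n' \
    'MB/P0|cpu-0|33' \
    'MB/P0|cpu-1|50' \
    'MB/P1|cpu-4|17' | max_certainty_per_fru
# prints:
#   MB/P0 50%
#   MB/P1 17%
```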
The Fault Manager associates the following states with every resource for which telemetry information has been received:
The resource is present and in use and has no known problems so far as the Fault Manager is concerned.
The resource is not present or not usable but has no known problems. This might indicate the resource has been disabled or deconfigured by an administrator. Consult appropriate management tools for more information.
The resource is present but is not usable because one or more problems have been diagnosed by the Fault Manager. The resource has been disabled to prevent further damage to the system.
The resource is present and usable, but one or more problems have been diagnosed in the resource by the Fault Manager.
If all affected resources are in the same state, this is reflected in the message at the end of the list. Otherwise the state is given after each affected resource.
Flush the information cached by the Fault Manager for the specified resource, named by its FMRI. This subcommand should only be used when indicated by a documented Sun repair procedure. Typically, the use of this command is not necessary as the Fault Manager keeps its cache up-to-date automatically. If a faulty resource is flushed from the cache, administrators might need to apply additional commands to enable the specified resource.
Load the specified Fault Manager module. path must be an absolute path and must refer to a module present in one of the defined directories for modules. Typically, the use of this command is not necessary as the Fault Manager loads modules automatically when Solaris initially boots or as needed.
Unload the specified Fault Manager module. Specify module using the basename listed in the fmadm config output. Typically, the use of this command is not necessary as the Fault Manager loads and unloads modules automatically based on the system configuration.
Notify the Fault Manager that a repair procedure has been carried out on the specified resource. The fmadm repaired subcommand should be used only at the direction of a documented Sun repair procedure. Administrators might need to apply additional commands to re-enable a previously faulted resource.
Notify the Fault Manager that the specified resource has been replaced. This command should be used in those cases where the Fault Manager is unable to automatically detect the replacement. The fmadm replaced subcommand should be used only at the direction of a documented Sun repair procedure. Administrators might need to apply additional commands to re-enable a previously faulted resource.
Reset the specified Fault Manager module or module subcomponent. If the -s option is present, the specified Soft Error Rate Discrimination (SERD) engine is reset within the module. If the -s option is not present, the entire module is reset and all persistent state associated with the module is deleted. The fmadm reset subcommand should only be used at the direction of a documented Sun repair procedure. The use of this command is typically not necessary as the Fault Manager manages its modules automatically.
The rotate subcommand is a helper command for logadm(1M), so that logadm can rotate live log files correctly. It is not intended to be invoked directly (and invoking it directly is likely to lose log history). Use one of the following commands to cause the appropriate logfile to be rotated, if the current one is not zero in size:
# logadm -p now -s 1b /var/fm/fmd/errlog
# logadm -p now -s 1b /var/fm/fmd/fltlog
# logadm -p now -s 1b /var/fm/fmd/infolog
# logadm -p now -s 1b /var/fm/fmd/infolog_hival
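When all four logs need rotating, the commands above can be generated in a loop. The following is a minimal sketch that only prints the commands so they can be reviewed first; the function name print_rotate_commands is made up for this example.

```shell
#!/bin/sh
# Print (do not run) the logadm command for each Fault Manager log named above.
print_rotate_commands() {
    for log in errlog fltlog infolog infolog_hival; do
        printf 'logadm -p now -s 1b /var/fm/fmd/%s\n' "$log"
    done
}

print_rotate_commands
```

After reviewing the printed commands, a suitably privileged user could run them individually.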
The add-alias subcommand is used to establish alias-id as a managed alias for the product-id.chassis-id chassis. When a managed alias is defined, the /dev/chassis devchassis(7FS) name space representation of the chassis will use the more meaningful alias-id instead of the product-id.chassis-id.
# fmadm add-alias SUN-Storage-J4410.1039QAQ007 RACK29.U25-28
The command shown above will verify that the new mapping does not conflict with existing mappings. In the case of conflict, no mapping change occurs. This subcommand completes when the associated name space updates are complete. If the updated name space does not use the new alias-id, a warning is printed, but the mapping is updated. If the name space update takes too long, a warning is printed.
If an optional comment is provided, the comment is preserved and will be displayed by a subsequent lookup-alias or list-alias command.
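The conflict check that add-alias performs can be pictured with the following sketch. It is not fmadm's implementation. It assumes that a conflict means either the chassis is already mapped to a different alias-id, or the alias-id is already used by a different chassis, and it assumes existing mappings are available as two whitespace-separated columns. Both the definition and the layout are assumptions made for illustration, and the second chassis name in the demonstration is hypothetical.

```shell
#!/bin/sh
# check_alias_conflict CHASSIS ALIAS
# Reads existing mappings on stdin, one per line:
#   "<product-id.chassis-id> <alias-id>"   (layout assumed for this sketch)
# Returns 0 if the proposed mapping is compatible, 1 if it conflicts.
check_alias_conflict() {
    awk -v chassis="$1" -v want_alias="$2" '
        # Appending "" forces string (not numeric) comparison.
        # Chassis already mapped to a different alias.
        ($1 "") == (chassis "") && ($2 "") != (want_alias "") { bad = 1 }
        # Alias already used by a different chassis.
        ($2 "") == (want_alias "") && ($1 "") != (chassis "") { bad = 1 }
        END { exit bad + 0 }
    '
}

existing='SUN-Storage-J4410.1039QAQ007 RACK29.U25-28'

# A second, hypothetical chassis asks for an alias that is already in use.
if printf '%s\n' "$existing" |
    check_alias_conflict SUN-Storage-J4410.1039QAQ008 RACK29.U25-28; then
    echo "no conflict"
else
    echo "conflict: mapping would not be changed"
fi
# prints: conflict: mapping would not be changed
```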
The remove-alias subcommand is used to remove a product-id.chassis-id to alias-id mapping.
# fmadm remove-alias RACK29.U25-28
The subcommand above completes when the associated name space updates are complete.
The lookup-alias subcommand can be used to determine what the current mapping is. The following is an example command.
# fmadm lookup-alias SUN-Storage-J4410.1039QAQ007
The list-alias subcommand is used to display all comments and mappings.
The sync-alias subcommand is used to hand-import a set of mappings in bulk. Two copies of the current mappings are maintained:
/etc/dev/chassis_aliases
/etc/dev/.chassis_aliase
To import a set of mappings in bulk, you can update the /etc/dev/chassis_aliases file and then run sync-alias.
The following options are supported:
Set quiet mode. fmadm does not produce messages indicating the result of successful operations to standard output.
The following operands are supported:
The name of a subcommand listed in SUBCOMMANDS.
One or more options or arguments appropriate for the selected subcommand, as described in SUBCOMMANDS. Among these arguments are fmri, uuid, and label. These identify resources that are the objects of fmadm subcommands. Use fmadm faulty to obtain the fmri, uuid, and label for a targeted resource. See EXAMPLES. In general, label is the most user-friendly of these operands.
Example 1 Invoking faulty Subcommand
The following command invokes the faulty subcommand, which displays the uuid, label, and fmri for a component.
# fmadm faulty
------------ ------------------------------------ ------------ ---------
TIME         EVENT-ID                             MSG-ID       SEVERITY
------------ ------------------------------------ ------------ ---------
Sep 09 16:15 96609fae-113c-e48c-b1cf-ebf4b0902d72 DISK-8000-3E Critical injected

Host        : x4170-brm-02
Platform    : SUN-FIRE-X4170-SERVER   Chassis_id  : 0920XF508B
Product_sn  :

Fault class : fault.io.scsi.cmd.disk.dev.rqs.derr
Affects     : dev:///:devid=id1,sd@n5000c5000940edbb//scsi_vhci/disk@g5000c\
              5000940edbb
              out of service, but associated components no longer faulty
FRU         : "DISK 11" (hc://:product-id=SUN-Storage-J4410:server-id=:chassis-id=:serial=000930\
              G01CN4----3SJ01CN4:part=SEAGATE-ST330057SSUN300G:revision=0205/\
              ses-enclosure=0/bay=11/disk=0)
              replaced
...
...
In the preceding output, the uuid is the first item in the EVENT-ID column, 96609fae-113c-e48c-b1cf-ebf4b0902d72. The label is in the FRU row, DISK 11. The fmris are:
dev:///:devid=id1,sd@n5000c5000940edbb//scsi_vhci/disk@g5000c\
5000940edbb

hc://:product-id=SUN-Storage-J4410:server-id=:chassis-id=:serial=000930\
G01CN4--------3SJ01CN4:part=SEAGATE-ST330057SSUN300G:revision=0205/\
ses-enclosure=0/bay=11/disk=0
The same values are available with fmdump -v:
# fmdump -v
Sep 09 16:15:36.9252 96609fae-113c-e48c-b1cf-ebf4b0902d72 DISK-8000-3E \
Diagnosed
  100%  fault.io.scsi.cmd.disk.dev.rqs.derr

        Problem in: hc://:scheme=:product-id=SUN-Storage-J4410:chassis-id=:\
                    server-id=/ses-enclosure=0/bay=11/disk=0
           Affects: dev:///:devid=id1,sd@n5000c5000940edbb//\
                    scsi_vhci/disk@g5000c5000940edbb
               FRU: hc://:product-id=SUN-Storage-J4410:server-id=:chassis-id=:\
                    serial=000930G01CN4--------3SJ01CN4:part=SEAGATE-ST330057SSUN300G:\
                    revision=0205/ses-enclosure=0/bay=11/disk=0
          Location: DISK 11
Note that label is the easiest-to-use identifier.
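For scripting, the uuid and label can be pulled out of saved fmadm faulty output with standard text tools. The following is a minimal sketch, not part of fmadm: the function names are made up, and the patterns assume the layout shown in this example (a UUID-shaped EVENT-ID and a quoted label on the FRU line). The human-readable output is not a stable interface, so treat this as illustrative only.

```shell
#!/bin/sh
# Print each event UUID found in `fmadm faulty` output read from stdin.
faulty_uuids() {
    sed -n 's/.*\([0-9a-f]\{8\}-[0-9a-f]\{4\}-[0-9a-f]\{4\}-[0-9a-f]\{4\}-[0-9a-f]\{12\}\).*/\1/p'
}

# Print each quoted FRU label found in `fmadm faulty` output read from stdin.
faulty_labels() {
    sed -n 's/^ *FRU *: *"\([^"]*\)".*/\1/p'
}

# Abbreviated sample based on the output in this example (FMRI shortened).
sample='Sep 09 16:15 96609fae-113c-e48c-b1cf-ebf4b0902d72 DISK-8000-3E Critical
Fault class : fault.io.scsi.cmd.disk.dev.rqs.derr
FRU         : "DISK 11" (hc://:product-id=SUN-Storage-J4410:server-id=)'

printf '%s\n' "$sample" | faulty_uuids    # prints 96609fae-113c-e48c-b1cf-ebf4b0902d72
printf '%s\n' "$sample" | faulty_labels   # prints DISK 11
```

On a live system, the same functions could read from a pipe, for example fmadm faulty | faulty_labels.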
Example 2 Obtaining Module Name
The following command displays the module name for each component. The module name is specified as input to the fmadm unload command.
# fmadm config
MODULE                   VERSION STATUS  DESCRIPTION
cpumem-retire            1.1     active  CPU/Memory Retire Agent
disk-transport           1.0     active  Disk Transport Agent
eft                      1.16    active  eft diagnosis engine
..
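To collect just the module names from this listing, the first column can be filtered with awk. This is a small sketch with a made-up function name; it assumes the four-column layout shown above and treats any line whose second field looks like a version number as a module row.

```shell
#!/bin/sh
# Print the MODULE column from `fmadm config` output read from stdin.
# Header and filler lines are skipped because their second field is not
# a dotted version number.
config_modules() {
    awk '$2 ~ /^[0-9]+(\.[0-9]+)*$/ { print $1 }'
}

# Sample rows copied from the example output above.
sample='MODULE                   VERSION STATUS  DESCRIPTION
cpumem-retire            1.1     active  CPU/Memory Retire Agent
disk-transport           1.0     active  Disk Transport Agent
eft                      1.16    active  eft diagnosis engine
..'

printf '%s\n' "$sample" | config_modules
# prints:
#   cpumem-retire
#   disk-transport
#   eft
```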
The following exit values are returned:
Successful completion.
An error occurred. Errors include a failure to communicate with fmd or insufficient privileges to perform the requested operation.
Invalid command-line options were specified.
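A script that wraps fmadm can branch on these outcomes. The sketch below assumes the three outcomes listed above correspond to exit values 0, 1, and 2, in that order. The numeric values are not shown in this text, so verify them on your system before relying on them. The function name is made up for this example.

```shell
#!/bin/sh
# Map an fmadm exit status to the outcomes listed above.
# Assumption: the outcomes correspond to 0, 1, and 2 in the order listed.
describe_fmadm_status() {
    case "$1" in
        0) echo "successful completion" ;;
        1) echo "error (for example, cannot reach fmd or insufficient privileges)" ;;
        2) echo "invalid command-line options" ;;
        *) echo "unexpected status $1" ;;
    esac
}

describe_fmadm_status 0   # prints: successful completion
describe_fmadm_status 2   # prints: invalid command-line options
```

In a wrapper script, the function would typically be called with $? immediately after the fmadm invocation.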
See attributes(5) for descriptions of the following attributes:
The command-line options are Committed. The human-readable output is not-an-interface.
fmd(1M), fmdump(1M), fmstat(1M), logadm(1M), syslogd(1M), attributes(5), devchassis(7FS)