Repairing Faults and Defects
After Fault Management has identified a faulted component in your system, you should repair it. A repair can happen in one of two ways: implicitly or explicitly.
- Implicit repair – An implicit repair can occur when the faulty component is replaced or removed, provided the component has serial number information that the Fault Manager daemon can track. The system's serial number information is included so that the Fault Manager daemon can determine when components have been removed from operation, either through replacement or other means. When such detections occur, the Fault Manager daemon no longer displays the affected resource in fmadm list output. The resource is maintained in the daemon's internal resource cache until the fault event is 30 days old, at which point it is purged.
- Explicit repair – An explicit repair is required if no FRU serial number is available. For example, CPUs have no serial numbers. In these cases, the Fault Manager daemon cannot detect a FRU replacement.
Use the fmadm command to explicitly mark a fault as repaired. The options include:
- fmadm replaced label
- fmadm repaired label
- fmadm acquit label [uuid]
- fmadm acquit uuid
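The command forms listed here can be sketched as a small dry-run helper. This is only an illustration: build_repair_cmd is a hypothetical function name, /SYS/MB/P0 is used as an example label, and the helper prints the fmadm command line rather than running it, so that the result can be reviewed before it is executed on a real system.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) an explicit-repair command.
# build_repair_cmd is a hypothetical helper, not part of fmadm.
#
# build_repair_cmd ACTION TARGET [UUID]
#   ACTION: replaced | repaired | acquit
#   TARGET: a FRU label such as /SYS/MB/P0 (or a UUID for "acquit")
build_repair_cmd() {
    action=$1
    target=$2
    uuid=${3:-}

    case $action in
        replaced|repaired)
            # These two forms take a single label argument.
            printf 'fmadm %s %s\n' "$action" "$target"
            ;;
        acquit)
            # acquit takes a label with an optional UUID, or a UUID alone.
            if [ -n "$uuid" ]; then
                printf 'fmadm acquit %s %s\n' "$target" "$uuid"
            else
                printf 'fmadm acquit %s\n' "$target"
            fi
            ;;
        *)
            echo "unknown action: $action" >&2
            return 1
            ;;
    esac
}

# Example: the FRU that holds CPU P0 was replaced.
build_repair_cmd replaced /SYS/MB/P0
# prints: fmadm replaced /SYS/MB/P0
```

Printing the command first is a deliberate choice in this sketch, because marking a fault as repaired changes what the Fault Manager daemon reports.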
Although these four commands can take UUIDs or labels as arguments, it is better to use the label. For example, the label /SYS/MB/P0 represents the CPU labeled "P0" on the motherboard.
If a FRU has multiple faults against it and you want to replace the FRU only once, use the fmadm replaced command against the FRU.
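Because a repaired resource is no longer displayed in fmadm list output, one way to confirm that a repair took effect is to check whether the label still appears there. The following is a minimal sketch under stated assumptions: label_listed is a hypothetical helper, and it assumes that the label text appears somewhere in the output, which is not a documented interface. It performs a plain substring match on whatever text it reads from standard input.

```shell
#!/bin/sh
# label_listed LABEL
#   Reads text on stdin (intended: the output of "fmadm list") and succeeds
#   if LABEL occurs anywhere in it. The substring match is an assumption of
#   this sketch, not a documented fmadm interface.
label_listed() {
    grep -F -q -e "$1"
}

# Intended use on a system with Fault Management (not run here):
#   if fmadm list | label_listed /SYS/MB/P0; then
#       echo "still listed; an explicit repair may be needed"
#   fi
```

Note that a substring match on /SYS/MB/P0 would also match a longer label that begins with the same text, so a stricter pattern might be needed on systems with many similarly named components.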