An alert is information of interest that is neither a fault nor a defect. An alert might report a problem or might be simply informational. A problem that is reported by an alert is a misconfiguration or other problem that the administrator can resolve without assistance from a response agent. An example of this type of problem is a DIMM plugged into the wrong slot. An example of an informational message reported by an alert is a message that a shadow migration has completed. The following list provides examples of alert messages:
Threshold alerts – Temperature is high, storage is at capacity, a zpool is at 80% or 90% capacity, a quota is exceeded, the path count to a chassis or disk has changed. These kinds of alerts can predict a performance impact.
Configuration checks – An FRU has been added or removed, SAS cabling is incorrect, a DIMM is plugged into the wrong slot, a datalink changed, a link went up or down, ILOM is misconfigured, the TCP/IP maximum transmission unit (MTU) is misconfigured.
Interesting events – A reboot occurred, file system events occurred, firmware has been upgraded, save core failed, ZFS deduplication failed, shadow migration completed.
Alerts can be in one of the following states:
active – The alert has not been cleared.
cleared – The alert has been cleared. The cleared state for alerts can be compared to the resolved state for faults and defects. See the following description of persistent and transient alerts for more information about clearing an alert.
Alerts can be persistent or transient.
A persistent alert is active until it is manually cleared as shown in fmadm clear Command.
A transient alert clears after a specified timeout period or is cleared by a service such as a network monitor.
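The states and clearing rules above can be illustrated with a small model. This is only a sketch of the behavior as described here, not how the fault manager is implemented: the `Alert` class, its field names, the `tick` method, and the example message IDs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alert:
    """Hypothetical model of an alert's lifecycle (not an FMD data structure)."""
    msg_id: str
    raised_at: float                  # time the alert was raised, in seconds
    timeout: Optional[float] = None   # None models a persistent alert
    state: str = "active"             # "active" or "cleared"

    @property
    def persistent(self) -> bool:
        return self.timeout is None

    def clear(self) -> None:
        """Clear the alert manually, or on behalf of a monitoring service."""
        self.state = "cleared"

    def tick(self, now: float) -> None:
        """Clear a transient alert once its timeout period has elapsed."""
        if self.state != "active" or self.timeout is None:
            return  # persistent alerts never expire on their own
        if now - self.raised_at >= self.timeout:
            self.state = "cleared"


# A persistent alert stays active until it is cleared explicitly.
persistent = Alert("EXAMPLE-PERSISTENT", raised_at=0.0)
persistent.tick(now=86400.0)
print(persistent.state)   # active
persistent.clear()
print(persistent.state)   # cleared

# A transient alert clears itself after its timeout (30 seconds here).
transient = Alert("EXAMPLE-TRANSIENT", raised_at=0.0, timeout=30.0)
transient.tick(now=30.0)
print(transient.state)    # cleared
```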
Use the fmadm list-alert command to list all alerts that have not been cleared. The following alert shows that a disk has been removed from the system. The Problem Status has the value open, which is an active state. Problem Status can be open, isolated, repaired, or resolved. The Problem class indicates that the FRU has been removed. The Impact indicates that the severity of the impact depends on the importance of this device in your environment. Perhaps the most useful piece of information in this output is the MSG-ID. Follow the instructions in the Action at the end of the alert to access more information about FMD-8000-CV.
# fmadm list-alert
--------------- ------------------------------------ -------------- ---------
TIME            EVENT-ID                             MSG-ID         SEVERITY
--------------- ------------------------------------ -------------- ---------
Apr 23 02:15:12 a7921317-8ba2-4ab1-b1c3-b0fb8822c000 FMD-8000-CV    Minor

Problem Status    : open
Diag Engine       : software-diagnosis / 0.1
System
    Manufacturer  : Oracle Corporation
    Name          : Sun Netra X4270 M3
    Part_Number   : NILE-P1LRQT-8
    Serial_Number : 1211FM200D

System Component
    Manufacturer  : Oracle
    Name          : Sun Netra X4270 M3
    Part_Number   : NILE-P1LRQT-8
    Serial_Number : 1211FM200D
    Host_ID       : 008167b1

----------------------------------------
Suspect 1 of 1 :
   Problem class : alert.oracle.solaris.fmd.fru-monitor.fru-remove
   Certainty     : 100%

   FRU
     Status           : faulty/not present
     Location         : "/SUN-Storage-J4410.1051QCQ08A/HDD13"
     Manufacturer     : SEAGATE
     Name             : ST330057SSUN300G
     Part_Number      : SEAGATE-ST330057SSUN300G
     Revision         : 0B25
     Serial_Number    : 001117G1LC1S--------6SJ1LC1S
     Chassis
        Manufacturer  : SUN
        Name          : SUN-Storage-J4410
        Part_Number   : 3753659
        Serial_Number : 1051QCQ08A
   Resource
     Status           : faulty/not present

Description : FRU '/SUN-Storage-J4410.1051QCQ08A/HDD13' has been removed from the system.

Response    : FMD topology will be updated.

Impact      : System impact depends on the type of FRU.

Action      : Use 'fmadm faulty' to provide a more detailed view of this event.
              Please refer to the associated reference document at
              http://support.oracle.com/msg/FMD-8000-CV for the latest service
              procedures and policies regarding this diagnosis.
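When alerts are collected by a script, the fields discussed above (the MSG-ID, the severity, the Problem Status, and the Problem class) can be pulled out of the text. The following is a minimal sketch that assumes the output has the same field labels as the example above and that only the first alert in the text is of interest. The function name `summarize_alert` and the `SAMPLE` excerpt are illustrative and not part of any Oracle tool. The reference URL is built from the pattern shown in the Action field.

```python
import re

# Summary row: "<Mon DD HH:MM:SS> <EVENT-ID (UUID)> <MSG-ID> <SEVERITY>"
SUMMARY_ROW = re.compile(
    r"(?P<time>[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<event_id>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\s+"
    r"(?P<msg_id>\S+)\s+"
    r"(?P<severity>\S+)"
)

# Labeled fields of interest; the labels are taken from the example output.
FIELDS = {
    "problem_status": re.compile(r"Problem Status\s*:\s*(\S+)"),
    "problem_class": re.compile(r"Problem class\s*:\s*(\S+)"),
}


def summarize_alert(text):
    """Return key fields of the first alert found in text, or None."""
    row = SUMMARY_ROW.search(text)
    if row is None:
        return None
    summary = row.groupdict()
    for name, pattern in FIELDS.items():
        found = pattern.search(text)
        summary[name] = found.group(1) if found else None
    # URL pattern as it appears in the Action field of the example alert.
    summary["reference_url"] = "http://support.oracle.com/msg/" + summary["msg_id"]
    return summary


# Excerpt of the example output shown above.
SAMPLE = """\
TIME            EVENT-ID                             MSG-ID         SEVERITY
Apr 23 02:15:12 a7921317-8ba2-4ab1-b1c3-b0fb8822c000 FMD-8000-CV    Minor
Problem Status : open
Suspect 1 of 1 :
   Problem class : alert.oracle.solaris.fmd.fru-monitor.fru-remove
"""

for key, value in summarize_alert(SAMPLE).items():
    print(f"{key}: {value}")
```

Because the patterns match on field labels and tolerate any amount of whitespace, the sketch does not depend on column alignment. If the output contains several alerts, only the first is reported.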