CHAPTER 3

SNMP and PET Reference Information

This section describes the Simple Network Management Protocol (SNMP) and Platform Event Trap (PET) messages that are generated by devices monitored by ILOM. The messages described in this section apply to both the Sun Blade 6000 modular system and the Sun Blade 6048 modular system.


SNMP Traps

SNMP traps are generated by the SNMP agents that are installed on the SNMP devices managed by ILOM. ILOM receives the SNMP traps and converts them into SNMP event messages that appear in the event log. For more information about the SNMP event messages that might be generated on your system, see the following table.

 


| SNMP Event | SNMP Trap Sent | Sensor Name | Severity | Description |
|---|---|---|---|---|
| event fault.chassis.device.fail | sunHwTrapIOFault | /CH/NEM | Major | A component in the IO subsystem is suspected of causing a fault. |
| event fault.chassis.device.fail | sunHwTrapIOFaultCleared | /CH/NEM | Informational | An IO subsystem component fault has been cleared. |
| Upper critical threshold exceeded | sunHwTrapTempCritThresholdExceeded | /CH/T_AMB, /CH/PSx/T_AMB | Major | A temperature sensor has reported that its value has gone above an upper critical threshold setting or below a lower critical threshold setting. |
| Upper critical threshold no longer exceeded | sunHwTrapTempCritThresholdDeasserted | /CH/T_AMB, /CH/PSx/T_AMB | Informational | A temperature sensor has reported that its value has gone below an upper critical threshold setting or above a lower critical threshold setting. |
| Upper fatal threshold exceeded | sunHwTrapTempFatalThresholdExceeded | /CH/T_AMB, /CH/PSx/T_AMB | Critical | A temperature sensor has reported that its value has gone above an upper fatal threshold setting or below a lower fatal threshold setting. |
| Upper fatal threshold no longer exceeded | sunHwTrapTempFatalThresholdDeasserted | /CH/T_AMB, /CH/PSx/T_AMB | Informational | A temperature sensor has reported that its value has gone below an upper fatal threshold setting or above a lower fatal threshold setting. |
| Assert | sunHwTrapPowerSupplyError | /CH/P_OVER_WARN | Major | A power supply sensor has detected an error. |
| Deassert | sunHwTrapPowerSupplyOk | /CH/P_OVER_WARN | Informational | A power supply sensor has returned to its normal state. |
| Assert | sunHwTrapComponentError | /CH/HOT, /CH/PSx/Sx/V_OUT_OK | Major | A sensor has detected an error. This generic 'component' trap is generated when the SNMP agent does not recognize the component type. |
| Deassert | sunHwTrapComponentOk | /CH/HOT, /CH/PSx/Sx/V_OUT_OK | Informational | A sensor has returned to its normal state. This generic 'component' trap is generated when the SNMP agent does not recognize the component type. |
| Lower fatal threshold exceeded | sunHwTrapFanSpeedFatalThresholdExceeded | /CH/PSx/FANx/TACH | Critical | A fan speed sensor has reported that its value has gone above an upper fatal threshold setting or below a lower fatal threshold setting. |
| Lower fatal threshold no longer exceeded | sunHwTrapFanSpeedFatalThresholdDeasserted | /CH/PSx/FANx/TACH | Informational | A fan speed sensor has reported that its value has gone below an upper fatal threshold setting or above a lower fatal threshold setting. |
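
A monitoring script that receives these traps might use the severity column to decide which ones need operator attention. The following is a minimal sketch of that lookup, not part of ILOM: the trap names and severities are transcribed from the table above, while the function name `needs_attention`, the numeric ranking, and the choice to flag unlisted traps are assumptions made for illustration.

```python
# Severity of each SNMP trap, transcribed from the table above.
TRAP_SEVERITY = {
    "sunHwTrapIOFault": "Major",
    "sunHwTrapIOFaultCleared": "Informational",
    "sunHwTrapTempCritThresholdExceeded": "Major",
    "sunHwTrapTempCritThresholdDeasserted": "Informational",
    "sunHwTrapTempFatalThresholdExceeded": "Critical",
    "sunHwTrapTempFatalThresholdDeasserted": "Informational",
    "sunHwTrapPowerSupplyError": "Major",
    "sunHwTrapPowerSupplyOk": "Informational",
    "sunHwTrapComponentError": "Major",
    "sunHwTrapComponentOk": "Informational",
    "sunHwTrapFanSpeedFatalThresholdExceeded": "Critical",
    "sunHwTrapFanSpeedFatalThresholdDeasserted": "Informational",
}

# Hypothetical ranking used only by this sketch; the table does not
# define a numeric ordering of severities.
SEVERITY_RANK = {"Informational": 0, "Major": 1, "Critical": 2}


def needs_attention(trap_name: str) -> bool:
    """Return True for Major or Critical traps.

    Traps that are not listed in the table are also flagged, which is a
    conservative assumption of this sketch.
    """
    severity = TRAP_SEVERITY.get(trap_name)
    if severity is None:
        return True
    return SEVERITY_RANK[severity] >= SEVERITY_RANK["Major"]
```

For example, `needs_attention("sunHwTrapTempFatalThresholdExceeded")` returns `True`, while `needs_attention("sunHwTrapPowerSupplyOk")` returns `False`.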



PET Event Messages

Platform Event Trap (PET) events are generated by systems with Alert Standard Format (ASF) or an IPMI baseboard management controller. The PET events provide advance warning of possible system failures. For more information about the PET event messages that might occur on your system, see the following table.

 


| SNMP Event | SNMP Trap Sent | Sensor Name | Severity | Description |
|---|---|---|---|---|
| Temperature Upper critical threshold has been exceeded | petTrapTemperatureUpperNonCriticalGoingHigh | /CH/T_AMB, /CH/PSx/T_AMB | Major | Temperature has increased above upper critical threshold. |
| Temperature Upper critical threshold no longer exceeded | petTrapTemperatureUpperNonCriticalGoingLowDeassert | /CH/T_AMB, /CH/PSx/T_AMB | Warning | Temperature has decreased below upper critical threshold. |
| Temperature Lower fatal threshold has been exceeded | petTrapTemperatureUpperNonRecoverableGoingHigh | /CH/T_AMB, /CH/PSx/T_AMB | Critical | Temperature has increased above upper non-recoverable threshold. |
| Temperature Lower fatal threshold no longer exceeded | petTrapTemperatureUpperNonRecoverableGoingLowDeassert | /CH/T_AMB, /CH/PSx/T_AMB | Major | Temperature has decreased below upper non-recoverable threshold. |
| Temperature sensor ASSERT | petTrapTemperatureStateAssertedAssert | /CH/HOT | Critical | Temperature event occurred. Possible cause: CPU is too hot. |
| Temperature sensor DEASSERT | petTrapTemperatureStateDeassertedAssert | /CH/HOT | Informational | Temperature event occurred. |
| Entity Presence Insert | petTrapEntityPresenceDeviceInsertedAssert | /CH/BLx/PRSNT, /CH/BLx/HDDx/PRSNT, /CH/BLx/FMODx/PRSNT, /CH/BLx/ESM/PRSNT, /CH/NEMx/PRSNT, /CH/PSx/PRSNT | Informational | A device is present or has been inserted. |
| Entity Presence Remove | petTrapEntityPresenceDeviceRemovedAssert | /CH/BLx/PRSNT, /CH/BLx/HDDx/PRSNT, /CH/BLx/FMODx/PRSNT, /CH/BLx/ESM/PRSNT, /CH/NEMx/PRSNT, /CH/PSx/PRSNT | Informational | A device is absent or has been removed. |
| Module Transition to Running assert | petTrapModuleBoardTransitionToRunningAssert | /CH/BLx/STATE, /CH/NEMx/STATE | Informational | A device has transitioned to the normal running state. For a blade, this indicates that the host has powered on. |
| Module Transition to In Test assert | petTrapModuleBoardTransitionToInTestAssert | /CH/BLx/STATE, /CH/NEMx/STATE | Informational | A device is in a transitional state. (Only used for NEMs.) |
| Module Transition to Power Off assert | petTrapModuleBoardTransitionToPowerOffAssert | /CH/BLx/STATE, /CH/NEMx/STATE | Informational | A device has powered off. |
| Module Transition to On Line assert | petTrapModuleBoardTransitionToOnLineAssert | /CH/BLx/STATE, /CH/NEMx/STATE | Informational | A device is online and ready to enter the running state. (Only used for NEMs.) |
| Module Transition to Off Line assert | petTrapModuleBoardTransitionToOffLineAssert | /CH/BLx/STATE, /CH/NEMx/STATE | Informational | Unused. |
| Module Transition to Off Duty assert | petTrapModuleBoardTransitionToOffDutyAssert | /CH/BLx/STATE, /CH/NEMx/STATE | Informational | A device is no longer in use and is ready to be removed. |
| Module Transition to Degraded assert | petTrapModuleBoardTransitionToDegradedAssert | /CH/BLx/STATE, /CH/NEMx/STATE | Informational | A device has entered a state of degraded operation, for example, due to a hardware fault or an over-temperature condition that caused the device to shut itself down. |
| Module Transition to Power Save assert | petTrapModuleBoardTransitionToPowerSaveAssert | /CH/BLx/STATE, /CH/NEMx/STATE | Informational | Unused. |
| Module Install Error assert | petTrapModuleBoardInstallErrorAssert | /CH/BLx/STATE, /CH/NEMx/STATE | Informational | Unused. |
| OEM Reserved reporting Predictive Failure | petTrapOEMPredictiveFailureAsserted 12583937 | /CH/BLx/ERR, /CH/BLx/ESM/ERR, /CH/NEMx/ERR | Major | OEM predictive failure asserted. |
| OEM Reserved Return to normal | petTrapOEMPredictiveFailureDeasserted | /CH/BLx/ERR, /CH/BLx/ESM/ERR, /CH/NEMx/ERR | Informational | OEM predictive failure deasserted. |
| Fan reporting Predictive Failure | petTrapFanPredictiveFailureAsserted | /CH/FMx/ERR, /CH/PSx/FAN_ERR | Major | Fan Predictive Failure detected. |
| Fan Return to normal | petTrapFanPredictiveFailureDeasserted | /CH/FMx/ERR, /CH/PSx/FAN_ERR | Informational | Fan Predictive Failure state has been cleared. |
| Voltage reporting Predictive Failure | petTrapVoltagePredictiveFailureAssertedAssert | /CH/PSx/V_3V3_ERR, /CH/PSx/Sx/V_IN_ERR, /CH/PSx/Sx/V_12V_ERR | Major | Voltage Predictive Failure detected. |
| Voltage Return to normal | petTrapVoltagePredictiveFailureDeassertedAssert | /CH/PSx/V_3V3_ERR, /CH/PSx/Sx/V_IN_ERR, /CH/PSx/Sx/V_12V_ERR | Informational | Predictive failure state due to voltage event has been cleared. |
| Temperature reporting Predictive Failure | petTrapTemperaturePredictiveFailureAsserted | /CH/PSx/TEMP_WRN, /CH/PSx/TEMP_ERR | Major | System is reporting a predictive failure as a result of high temperature. |
| Temperature Return to normal | petTrapTemperaturePredictiveFailureDeasserted | /CH/PSx/TEMP_WRN, /CH/PSx/TEMP_ERR | Informational | Predictive failure state due to high temperature has been cleared. |
| Fan Lower fatal threshold has been exceeded | petTrapFanLowerNonRecoverableGoingLow | /CH/PSx/FANx/TACH | Critical | Fan speed has decreased below lower non-recoverable threshold. Fan failed or removed. |
| Fan Lower fatal threshold no longer exceeded | petTrapFanLowerNonRecoverableGoingHighDeassert | /CH/PSx/FANx/TACH | Major | Fan speed has increased above lower non-recoverable threshold. |
| Voltage sensor ASSERT | petTrapVoltageStateAssertedAssert | /CH/PSx/Sx/V_OUT_OK | Informational | Voltage event occurred. |
| Voltage sensor DEASSERT | petTrapVoltageStateDeassertedAssert | /CH/PSx/Sx/V_OUT_OK | Informational | Voltage event occurred. |
| Current reporting Predictive Failure | petTrapCurrentPredictiveFailureAsserted | /CH/PSx/Sx/I_12V_ERR, /CH/PSx/Sx/I_12V_WRN | Major | Predictive failure due to electric current conditions. |
| Current Return to normal | petTrapCurrentPredictiveFailureDeasserted | /CH/PSx/Sx/I_12V_ERR, /CH/PSx/Sx/I_12V_WRN | Informational | Predictive failure state caused by electric current conditions has been cleared. |
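
The number 12583937 that accompanies petTrapOEMPredictiveFailureAsserted in the table appears to be the trap's specific-trap value. The sketch below unpacks such a value, assuming the field layout described in the IPMI Platform Event Trap Format specification: sensor type in bits 23 to 16, event type in bits 15 to 8, and an event offset byte in bits 7 to 0 whose top bit marks a deassertion. This chapter does not state that layout, and the names `PetSpecificTrap` and `decode_pet_specific_trap` are hypothetical.

```python
from typing import NamedTuple


class PetSpecificTrap(NamedTuple):
    sensor_type: int   # bits 23..16 of the specific-trap value
    event_type: int    # bits 15..8
    offset: int        # low 4 bits of the event offset byte
    deassertion: bool  # bit 7 of the event offset byte


def decode_pet_specific_trap(value: int) -> PetSpecificTrap:
    """Split a PET specific-trap number into its fields.

    The bit positions are an assumption taken from the IPMI PET format,
    not from this chapter.
    """
    sensor_type = (value >> 16) & 0xFF
    event_type = (value >> 8) & 0xFF
    offset_byte = value & 0xFF
    return PetSpecificTrap(
        sensor_type=sensor_type,
        event_type=event_type,
        offset=offset_byte & 0x0F,
        deassertion=bool(offset_byte & 0x80),
    )


decoded = decode_pet_specific_trap(12583937)
print(hex(decoded.sensor_type), hex(decoded.event_type),
      decoded.offset, decoded.deassertion)
# prints: 0xc0 0x4 1 False
```

Under that assumed layout, 12583937 is 0x00C00401: sensor type 0xC0 falls in the range IPMI reserves for OEM sensors, and event type 0x04 with offset 1 is the generic "predictive failure asserted" state, which agrees with the event name in the table.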