CHAPTER 6

Alarm Rules

An alarm is a notification of an abnormal event. Sun Management Center software enables you to monitor your system using alarms that have differing severities.

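This chapter summarizes the alarm rules specific to Sun SPARC Enterprise Mx000 servers. The chapter contains the following sections:

About Alarm Rules
Reference: Platform Administration Module Alarm Rules
Reference: Domain Administration Module Alarm Rules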

For more detailed information about alarms, refer to the Sun Management Center User's Guide.


About Alarm Rules

The add-on software contains a number of alarm rules used by the system to respond to the state of various components. Each alarm rule instance is applied to a specific property of a table in the platform administration module. A single rule can be applied to multiple properties and tables.

An alarm rule takes input from two main sources:

You can assign actions to rule states and state transitions through the Sun Management Center console. Refer to the Sun Management Center Software User's Guide for detailed information.
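As an illustration of the relationship described in this section, in which each rule instance is applied to one property of one table while a single rule can cover several tables, the following Python sketch models rule instances as (rule, table, property) bindings. The class and function names are hypothetical and are not part of the Sun Management Center software. The rule, table, and property names are a small subset taken from TABLE 6-1 and TABLE 6-3 in this chapter.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleInstance:
    """One application of an alarm rule to a single property of a table."""
    rule: str   # rule name, for example "rErrorStatus"
    table: str  # table in the platform administration module
    prop: str   # property of that table read by the rule


# Illustrative subset of the bindings listed later in this chapter.
INSTANCES = [
    RuleInstance("rErrorStatus", "System", "System State"),
    RuleInstance("rErrorStatus", "System", "Firmware State"),
    RuleInstance("rErrorStatus", "CMU Board", "Error Status"),
    RuleInstance("rLEDState", "System", "Check LED"),
]


def tables_by_rule(instances):
    """Group the tables each rule is applied to, keyed by rule name."""
    grouped = defaultdict(set)
    for inst in instances:
        grouped[inst.rule].add(inst.table)
    return dict(grouped)


print(tables_by_rule(INSTANCES))
```

In this model, rErrorStatus appears under two tables and reads more than one property of the System table, which mirrors the statement that a single rule can be applied to multiple properties and tables.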


Reference: Platform Administration Module Alarm Rules

This section lists the alarm rules for properties monitored by the platform administration module.

The first table in each section lists the tables to which the rule applies and the properties that the rule reads.

The alarm rules are also listed in the tables describing the platform administration module properties in Chapter 3.

The second table in each section lists each value of the monitored properties, the alarm level (if any) that the value generates, and its meaning or color.

Error Status Rule (rErrorStatus)

Alarms governed by the error status rule alert you to changes in the status of the system or a component of the system.


TABLE 6-1 Error Status Rule Tables and Properties

Applicable Tables             Properties Read
System                        System State, Firmware State, Hardware State
CMU Board                     Error Status
CPU Module                    Error Status
Memory Board                  Error Status
Memory DIMM                   Error Status
IOU Board                     Error Status
PCI Slot                      Error Status
System Board                  CMU Error Status, IOU Error Status
XSB                           Error Status
System Components             Error Status
Environmental Monitors        Value Status
Domain                        Error Status



TABLE 6-2 Error Status Rule Property Values

Property Value      Alarm Level (if any)    Meaning/Color
NORMAL              no alarm                OK
WARNING             warning                 yellow
ALARM               error                   red
CHANGE              no alarm                OK
NOTICE              info                    blue
UNKNOWN             info                    blue
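The mapping in TABLE 6-2 can be read as a simple lookup from property value to alarm level. The following Python sketch shows one way to express it. The dictionary and function names are hypothetical and do not reflect how the add-on software implements the rule; the values and colors are copied from TABLE 6-2.

```python
# Alarm level for each Error Status Rule property value (from TABLE 6-2).
# None means that no alarm is raised.
ERROR_STATUS_LEVELS = {
    "NORMAL": None,
    "WARNING": "warning",
    "ALARM": "error",
    "CHANGE": None,
    "NOTICE": "info",
    "UNKNOWN": "info",
}

# Color shown for each alarm level in TABLE 6-2.
LEVEL_COLORS = {"info": "blue", "warning": "yellow", "error": "red"}


def error_status_alarm(value):
    """Return (level, color) for a property value, or (None, None) for no alarm."""
    level = ERROR_STATUS_LEVELS[value]  # KeyError for a value not in TABLE 6-2
    return level, LEVEL_COLORS.get(level)


for value in ("NORMAL", "ALARM", "NOTICE"):
    print(value, error_status_alarm(value))
```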


LED State Rule (rLEDState)

Alarms governed by the LED state rule alert you when the system might require service.


TABLE 6-3 LED State Rule Tables and Properties

Applicable Tables             Properties Read
System                        Check LED



TABLE 6-4 LED State Rule Property Values

Property Value      Alarm Level (if any)    Meaning/Color
ON                  error                   red
OFF                 no alarm                OK
BLINKING            info                    blue
UNKNOWN             info                    blue


Test State Rule (rTestState)

Alarms governed by the test state rule alert you when the current test state of an Extended System Board (XSB) is neither PASSED nor UNMOUNTED.


TABLE 6-5 Test State Rule Tables and Properties

Applicable Tables             Properties Read
XSB                           Test



TABLE 6-6 Test State Rule Property Values

Property Value      Alarm Level (if any)    Meaning/Color
PASSED              no alarm                OK
FAILED              error                   red
UNKNOWN             info                    blue
UNMOUNTED           no alarm                OK
TESTING             info                    blue


Domain Status Rule (rDomainStatus)

Alarms governed by the domain status rule alert you when the status of a domain is PANIC or UNKNOWN.


TABLE 6-7 Domain Status Rule Tables and Properties

Applicable Tables             Properties Read
Domain                        Status



TABLE 6-8 Domain Status Rule Property Values

Property Value      Alarm Level (if any)    Meaning/Color
POWER OFF           no alarm                OK
PANIC               error                   red
SHUTDOWN            no alarm                OK
INITIALIZE          no alarm                OK
BOOT                no alarm                OK
RUNNING             no alarm                OK
PROM                no alarm                OK
CHANGE              no alarm                OK
UNKNOWN             warning                 yellow


Valid Status Rule (rValidStatus)

Alarms governed by the valid status rule alert you when the status of an environmental probe is not VALID.


TABLE 6-9 Valid Status Rule Tables and Properties

Applicable Tables             Properties Read
Environmental Monitors        Status



TABLE 6-10 Valid Status Rule Property Values

Property Value      Alarm Level (if any)    Meaning/Color
INVALID             warning                 yellow
VALID               no alarm                OK
UNKNOWN             info                    blue


External I/O Expansion Unit LED State Rule (rIoBoxLEDState)

Alarms governed by the External I/O Expansion Unit LED state rule alert you when the LEDs of the External I/O Expansion Unit indicate an issue relating to external I/O that might require your attention or service.


TABLE 6-11 External I/O Expansion Unit LED State Rule Tables and Properties

Applicable Tables             Properties Read
IO Box Chassis                Over Temperature LED, Service Required LED
IO Boat                       Service Required LED
IO Box Power Supply and Fan   Service Required LED



TABLE 6-12 External I/O Expansion Unit LED State Rule Property Values

Property Value      Alarm Level (if any)    Meaning/Color
OFF                 no alarm                OK
STANDBY BLINK       no alarm                OK
BLINK SLOW          warning                 yellow
BLINK FAST          no alarm                OK
FEEDBACK FLASH      no alarm                OK
ON                  error                   red
UNKNOWN             warning                 yellow


Link Card LED State Rule (rLinkCardLEDState)

Alarms governed by the Link Card LED state rule alert you when the link card LEDs indicate an issue relating to external I/O that might require your attention or service.


TABLE 6-13 Link Card LED State Rule Tables and Properties

Applicable Tables             Properties Read
Link Card                     Data LED, Management LED



TABLE 6-14 Link Card LED State Rule Property Values

Property Value      Alarm Level (if any)    Meaning/Color
OFF                 error                   red
STANDBY BLINK       no alarm                OK
BLINK SLOW          warning                 yellow
BLINK FAST          no alarm                OK
FEEDBACK FLASH      no alarm                OK
ON                  no alarm                OK
UNKNOWN             warning                 yellow


OK To Remove LED Rule (rOKtoRemoveLED)

Alarms governed by the OK To Remove LED rule alert you when the OK To Remove LED property is ON or UNKNOWN.


TABLE 6-15 OK To Remove LED Rule Tables and Properties

Applicable Tables             Properties Read
IO Boat                       OK To Remove LED



TABLE 6-16 OK To Remove LED Rule Property Values

Property Value      Alarm Level (if any)    Meaning/Color
OFF                 no alarm                OK
STANDBY BLINK       no alarm                OK
BLINK SLOW          no alarm                OK
BLINK FAST          no alarm                OK
FEEDBACK FLASH      no alarm                OK
ON                  info                    blue
UNKNOWN             warning                 yellow
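The three external I/O LED rules share the same set of LED values but assign different alarm levels to them. For example, ON is an error for the External I/O Expansion Unit LEDs, no alarm for the link card LEDs, and info for the OK To Remove LED. The following Python sketch places TABLE 6-12, TABLE 6-14, and TABLE 6-16 side by side to make that difference explicit. The dictionary layout and the led_alarm function are hypothetical and only illustrate the tables; they are not taken from the add-on software.

```python
# Alarm level per LED value for each rule; None means no alarm.
LED_RULES = {
    "rIoBoxLEDState": {          # TABLE 6-12
        "OFF": None, "STANDBY BLINK": None, "BLINK SLOW": "warning",
        "BLINK FAST": None, "FEEDBACK FLASH": None,
        "ON": "error", "UNKNOWN": "warning",
    },
    "rLinkCardLEDState": {       # TABLE 6-14
        "OFF": "error", "STANDBY BLINK": None, "BLINK SLOW": "warning",
        "BLINK FAST": None, "FEEDBACK FLASH": None,
        "ON": None, "UNKNOWN": "warning",
    },
    "rOKtoRemoveLED": {          # TABLE 6-16
        "OFF": None, "STANDBY BLINK": None, "BLINK SLOW": None,
        "BLINK FAST": None, "FEEDBACK FLASH": None,
        "ON": "info", "UNKNOWN": "warning",
    },
}


def led_alarm(rule, value):
    """Return the alarm level that a rule assigns to an LED value."""
    return LED_RULES[rule][value]


# The same LED value is interpreted differently by each rule.
for rule in LED_RULES:
    print(rule, "ON ->", led_alarm(rule, "ON"), "| OFF ->", led_alarm(rule, "OFF"))
```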


External I/O Expansion Unit Sensor Rule (rIoBoxSensor)

Alarms governed by the External I/O Expansion Unit sensor rule alert you when a sensed environmental value equals a threshold value, exceeds the maximum threshold value, or falls below the minimum threshold value.


TABLE 6-17 External I/O Expansion Unit Sensor Rule Tables and Properties

Applicable Tables             Properties Read
IO Box Sensor                 Value



TABLE 6-18 External I/O Expansion Unit Sensor Rule Property Values

Sensor Value                  Alarm Level (if any)    Meaning/Color
> minimum threshold value     no alarm                OK
< maximum threshold value     no alarm                OK
= minimum threshold value     warning                 yellow
= maximum threshold value     warning                 yellow
< minimum threshold value     error                   red
> maximum threshold value     error                   red
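The comparisons in TABLE 6-18 can be sketched as a small classification function. The following Python example assumes numeric readings, a minimum threshold that is not greater than the maximum, and that the two "no alarm" rows apply together (the value lies strictly between the two thresholds). The function and parameter names are hypothetical and are not part of the add-on software.

```python
def sensor_alarm(value, minimum, maximum):
    """Classify a sensor reading against its thresholds, following TABLE 6-18."""
    if value < minimum or value > maximum:
        return "error"    # outside the permitted range (red)
    if value == minimum or value == maximum:
        return "warning"  # exactly at a threshold (yellow)
    return None           # strictly between the thresholds: no alarm


# Example with hypothetical thresholds of 5 and 40.
for reading in (4, 5, 25, 40, 41):
    print(reading, sensor_alarm(reading, minimum=5, maximum=40))
```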



Reference: Domain Administration Module Alarm Rules

This section lists the alarm rules for properties monitored by the domain administration module.

The first table in each section lists the tables to which the rule applies and the properties that the rule reads.

The alarm rules are also listed in the tables describing the domain administration module properties in Chapter 4.

The second table in each section lists each value of the monitored properties, the alarm level (if any) that the value generates, and its meaning or color.

CPU Status Rule (oplCPUStatus)

Alarms governed by the CPU status rule alert you to changes in the status of the CPU. A caution alarm is generated if the processor is OFFLINE.


TABLE 6-19 CPU Status Rule Tables and Properties

Applicable Tables             Properties Read
Processor                     Core Status



TABLE 6-20 CPU Status Rule Property Values

Property Value      Alarm Level (if any)    Meaning/Color
ONLINE              no alarm                OK
OFFLINE             caution                 blue
POWEROFF            no alarm                OK
UNKNOWN             no alarm                OK


State Check Rule (oplStateCheck)

Alarms governed by the state check rule alert you to changes in the CS status of a memory controller. A caution alarm is generated if the status is not OKAY.


TABLE 6-21 State Check Rule Tables and Properties

Applicable Tables             Properties Read
Memory Controller             CS0 Status, CS1 Status



TABLE 6-22 State Check Rule Property Values

Property Value      Alarm Level (if any)    Meaning/Color
UNKNOWN             caution                 blue
OKAY                no alarm                OK
DISABLED            caution                 blue
UNDEFINED           caution                 blue
MISCONFIGURED       caution                 blue
FAIL-OBP            caution                 blue
FAIL                caution                 blue
BLACKLISTED         caution                 blue
REDLISTED           caution                 blue
--                  caution                 blue


Disk Error Count Rule (oplDskErrCnt)

Alarms governed by the disk error count rule alert you when an error count threshold is exceeded.


TABLE 6-23 Disk Error Count Rule Tables and Properties

Applicable Tables             Properties Read
Disk Device                   Hardware Errors, Software Errors, Transport Errors



TABLE 6-24 Disk Error Count Rule Property Values

Error Count Threshold    Alarm Level (if any)    Meaning/Color
5                        info                    blue
10                       warning                 yellow
15                       error                   red


Tape Error Count Rule (oplTpeErrCnt)

Alarms governed by the tape error count rule alert you when an error count threshold is exceeded.


TABLE 6-25 Tape Error Count Rule Tables and Properties

Applicable Tables             Properties Read
Tape Device                   Tape Errors



TABLE 6-26 Tape Error Count Rule Property Values

Error Count Threshold    Alarm Level (if any)    Meaning/Color
10                       info                    blue
20                       warning                 yellow
30                       error                   red
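The disk and tape error count rules follow the same pattern with different thresholds, so both can be sketched with one function. The following Python example uses the thresholds from TABLE 6-24 and TABLE 6-26. It assumes that "exceeded" means the count is strictly greater than the threshold; the chapter does not say whether a count equal to a threshold raises the alarm. The constant and function names are hypothetical.

```python
# (threshold, alarm level) pairs, highest threshold first.
DISK_THRESHOLDS = [(15, "error"), (10, "warning"), (5, "info")]   # TABLE 6-24
TAPE_THRESHOLDS = [(30, "error"), (20, "warning"), (10, "info")]  # TABLE 6-26


def error_count_alarm(count, thresholds):
    """Return the most severe level whose threshold the count exceeds, else None."""
    for threshold, level in thresholds:
        if count > threshold:  # assumption: "exceeded" means strictly greater
            return level
    return None


for count in (3, 6, 11, 16):
    print("disk", count, error_count_alarm(count, DISK_THRESHOLDS))
for count in (10, 21, 31):
    print("tape", count, error_count_alarm(count, TAPE_THRESHOLDS))
```

Checking the highest threshold first means a count that exceeds several thresholds is reported once, at the most severe level.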