CHAPTER 4

Alarms

This chapter summarizes the Alarm Rules that are specific to the supported platform components.

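The chapter contains the following sections:

  • Alarm Rules
  • Operational State Rule
  • Availability Rule
  • Non-Numeric Sensor Rule
  • Numeric Sensor Threshold Rule
  • Occupancy Rule
  • Rate or Count Rule
  • Module Status Rule
  • Indicator Status Rule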

Each section provides information about error classes, default alarm levels, and recommended action to take when the alarms are triggered.


Alarm Rules

The hardware common config reader contains a number of alarm rules used by the system to determine the state of various components. Each alarm rule instance is applied to a specific property of a table in the config reader. A single rule can be applied to multiple properties and tables.

An alarm rule takes input from three main sources:

  • Object properties
  • Stored data
  • User-specifiable values

All three of these sources can be modified on a per-object and per-property basis. You can change user-specifiable values, while the rule programmer specifies which object properties and stored data are used.

You can assign actions to rule states and state transitions through the Sun Management Center console. Refer to the Sun Management Center User's Guide for more information. You can also modify the values mentioned in this chapter by editing the file ELP-base_ruleinit-d.x directly.


Operational State Rule

This rule is applied to any node that contains an Operational Status property. It generates an alarm if the operational state is anything other than OK, Starting, Stopping, or double dash (--). The error string incorporates the value of the Additional Information property to give the end user more detail.


TABLE 4-1 Operational Status Rule

Applicable tables     Any that contain Operational Status property
Properties read       Operational Status, Additional Information
Alarm trigger         Operational Status is not OK, Starting, Stopping, or --
Editable parameters   Alarm Severity for the error class associated with the value of Operational Status.


Error Classes and Default Alarm Levels

This rule associates specific values of the Operational Status property with specific error classes. Those error classes in turn determine the level of alarm that is generated for the associated values. TABLE 4-2 lists the possible Operational Status values with their associated error classes and default alarm levels.


TABLE 4-2 Operational Status Values, Error Classes, and Default Alarm Levels

Operational Status Value   Error Class   Default Alarm Level
OK                         None          None
Starting                   None          None
Stopping                   None          None
--                         None          None
Error                      Critical      Critical
Non-Recoverable            Critical      Critical
Degraded                   Degraded      Alert
Predicted Failure          Degraded      Alert
Stressed                   Degraded      Alert
Service                    Service       None
Stopped                    Service       None
All Others                 Unknown       Caution


You can edit the alarm levels associated with each error class. TABLE 4-3 lists the error classes for the Operational Status rule and their default alarm levels.


TABLE 4-3 Default Alarm Levels for Operational Status Rule Error Classes

Error Class   Default Alarm Level
Critical      3 Critical
Degraded      2 Alert
Unknown       1 Caution
Service       0 None
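
The two-step lookup described by TABLE 4-2 and TABLE 4-3 (status value to error class, then error class to alarm level) can be sketched as follows. This is only an illustrative model, not the product's rule code: the function name, the dictionary names, and the format of the error string are assumptions made for the example.

```python
# Illustrative model only; all names are hypothetical, not Sun Management Center APIs.

# Operational Status values that never raise an alarm (TABLE 4-1).
NO_ALARM_STATES = {"OK", "Starting", "Stopping", "--"}

# Operational Status value -> error class (TABLE 4-2).
ERROR_CLASS_BY_STATUS = {
    "Error": "Critical",
    "Non-Recoverable": "Critical",
    "Degraded": "Degraded",
    "Predicted Failure": "Degraded",
    "Stressed": "Degraded",
    "Service": "Service",
    "Stopped": "Service",
}

# Error class -> default alarm level (TABLE 4-3). These are the editable values.
DEFAULT_LEVEL_BY_CLASS = {
    "Critical": "Critical",
    "Degraded": "Alert",
    "Unknown": "Caution",
    "Service": "None",
}


def operational_status_alarm(status, additional_info="", levels=DEFAULT_LEVEL_BY_CLASS):
    """Return (alarm_level, message) for one Operational Status reading."""
    if status in NO_ALARM_STATES:
        return "None", ""
    # Any value not listed in TABLE 4-2 falls into the Unknown error class.
    error_class = ERROR_CLASS_BY_STATUS.get(status, "Unknown")
    level = levels[error_class]
    if level == "None":
        return level, ""
    # The message layout is assumed; the source only says that the error
    # string incorporates the Additional Information value.
    return level, f"Operational Status is {status}: {additional_info}"
```

For example, a status of Stressed yields an Alert, while Stopped maps to the Service error class and therefore raises no alarm by default.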


Action

If an Alert or Critical alarm is generated, contact your Sun service representative.

The Caution alarm is for information only and is not an error. If necessary, contact your Sun service representative to help determine why the Operational Status is Unknown.


Availability Rule

This rule is applied to any table with an Availability property.


TABLE 4-4 Availability Rule

Applicable tables     Any that contain the Availability property
Properties read       Availability
Alarm trigger         Availability is not OK, Running, Not Applicable, or --
Editable parameters   Alarm Severity for the error class associated with the value of Availability.


Error Classes and Default Alarm Levels

This rule associates specific values of the Availability property with specific error classes. Those error classes in turn determine the level of alarm that is generated for the associated values. TABLE 4-5 lists the possible Availability values with their associated error classes and default alarm levels.


TABLE 4-5 Availability Values, Error Classes, and Default Alarm Levels

Availability Value    Error Class   Default Alarm Level
OK                    None          None
Running               None          None
Not Applicable        None          None
-- (double dash)      None          None
Degraded              Degraded      Alert
Warning               Degraded      Alert
PowerSave - Warning   Degraded      Alert
Install Error         Degraded      Alert
Not Configured        Uninstalled   None
Not Installed         Uninstalled   None
Not Ready             Uninstalled   None
All Others            Default       None


You can edit the alarm levels associated with each error class. TABLE 4-6 lists the error classes for the Availability rule and their default alarm levels.


TABLE 4-6 Default Alarm Levels for Availability Rule Error Classes

Error Class   Default Alarm Level
Degraded      2 Alert
Uninstalled   None
Default       None
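
To show what editing an error class's alarm level means in practice, the following sketch models the Availability lookup with an optional set of user overrides. The names and the override mechanism are hypothetical. The numeric levels follow the 0 to 3 scale used elsewhere in this chapter (0 = None, 1 = Caution, 2 = Alert, 3 = Critical).

```python
# Hypothetical model of the Availability rule; not the product's implementation.

LEVEL_NAMES = ["None", "Caution", "Alert", "Critical"]  # index = numeric level

NO_ALARM_VALUES = {"OK", "Running", "Not Applicable", "--"}

# Availability value -> error class (TABLE 4-5).
ERROR_CLASS_BY_AVAILABILITY = {
    "Degraded": "Degraded",
    "Warning": "Degraded",
    "PowerSave - Warning": "Degraded",
    "Install Error": "Degraded",
    "Not Configured": "Uninstalled",
    "Not Installed": "Uninstalled",
    "Not Ready": "Uninstalled",
}

# Error class -> default numeric alarm level (TABLE 4-6).
DEFAULT_LEVELS = {"Degraded": 2, "Uninstalled": 0, "Default": 0}


def availability_alarm(value, overrides=None):
    """Return the alarm level name for an Availability value.

    overrides is an assumed stand-in for user-edited levels, for example
    {"Uninstalled": 1} to raise a Caution for uninstalled components.
    """
    if value in NO_ALARM_VALUES:
        return "None"
    levels = {**DEFAULT_LEVELS, **(overrides or {})}
    # Values not listed in TABLE 4-5 fall into the Default error class.
    error_class = ERROR_CLASS_BY_AVAILABILITY.get(value, "Default")
    return LEVEL_NAMES[levels[error_class]]
```

With the defaults, Not Installed raises no alarm. Passing overrides={"Uninstalled": 1} makes the same value return Caution.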


Action

Contact your Sun service representative for information about correcting the problem.


Non-Numeric Sensor Rule

This rule is applied to any non-numeric sensor. It uses the Current Reading in the error message.


TABLE 4-7 Non-Numeric Sensor Rule

Applicable tables     Non-Numeric Temperature, Voltage, and Current sensors
Properties read       Current Value, Normal Values
Alarm trigger         Current Value is not one of the Normal Values
Editable parameters   Alarm Severity


Error Classes and Default Alarm Levels

This rule generates an alarm if the value of Current Reading does not match any of the values of the Normal Values property. The default alarm level associated with this error is Critical. TABLE 4-8 describes the property value, along with its associated error class and default alarm level.


TABLE 4-8 Current Reading Property Value, Error Class, and Default Alarm Level

Current Reading Value                                            Error Class   Default Alarm Level
Does not match any of the values of the Normal Values property   Alarm         Critical


You can change the alarm level associated with this Alarm error class.
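
The membership test behind this rule can be sketched in a few lines. The function and parameter names are hypothetical, and the example readings are invented for illustration.

```python
def non_numeric_sensor_alarm(current_value, normal_values, level="Critical"):
    """Return the alarm level for a non-numeric sensor reading.

    level stands for the editable Alarm Severity; Critical is the default
    given in TABLE 4-8.
    """
    if current_value in normal_values:
        return "None"
    return level


# Example with made-up readings: a sensor whose Normal Values are "OK" and "Normal".
print(non_numeric_sensor_alarm("OK", ["OK", "Normal"]))
print(non_numeric_sensor_alarm("Over Temperature", ["OK", "Normal"]))
```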

Action

Contact your Sun service representative for information about correcting the problem.


Numeric Sensor Threshold Rule

This rule is applied to any numeric sensor. It reads the various thresholds presented in the sensor, and generates an alarm if the current value is outside the specified ranges.


TABLE 4-9 Numeric Sensor Threshold Rule

Applicable tables     Numeric Temperature, Voltage, and Current Sensors, Tachometers
Properties read       Current Value, Threshold Values
Alarm trigger         Current Value is outside Threshold ranges
Editable parameters   Alarm Severity for the error class associated with the Threshold above or below which the value of Current Reading lies


Error Classes and Default Alarm Levels

This rule generates an alarm when the value of Current Reading falls below any of the Lower Threshold values or rises above any of the Upper Threshold values. The level of alarm generated is determined by the error class associated with the threshold. TABLE 4-10 lists the possible threshold property values with their associated error classes and default alarm levels.



Note - When a Threshold is set to -- (double dash), this rule does not compare the value of Current Reading with it.




TABLE 4-10 Current Reading Property Values, Error Classes, and Default Alarm Levels

Current Reading Value            Error Class    Default Alarm Level
< Lower Non-Critical Threshold   Non-Critical   Caution
> Upper Non-Critical Threshold   Non-Critical   Caution
< Lower Critical Threshold       Critical       Alert
> Upper Critical Threshold       Critical       Alert
< Lower Fatal Threshold          Fatal          Critical
> Upper Fatal Threshold          Fatal          Critical


You can edit the alarm levels associated with each error class. TABLE 4-11 lists the error classes for the Numeric Sensor Threshold rule and their default alarm levels.


TABLE 4-11 Default Alarm Levels for Numeric Sensor Threshold Rule Error Classes

Error Class    Default Alarm Level
Non-Critical   Caution
Critical       Alert
Fatal          Critical
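
A minimal sketch of the threshold comparison follows. It assumes that when a reading breaches more than one threshold, the most severe error class is reported; the source does not state how overlapping breaches are resolved. The function name and the shape of the thresholds argument are hypothetical.

```python
# Hypothetical names throughout; a sketch of the comparison, not the product's code.

# Checked from most to least severe so that the worst breached threshold wins
# (an assumption, see the paragraph above).
THRESHOLD_CLASSES = ["Fatal", "Critical", "Non-Critical"]

# Error class -> default alarm level (TABLE 4-11).
DEFAULT_LEVEL_BY_CLASS = {
    "Non-Critical": "Caution",
    "Critical": "Alert",
    "Fatal": "Critical",
}


def numeric_sensor_alarm(reading, thresholds, levels=DEFAULT_LEVEL_BY_CLASS):
    """Return (error_class, alarm_level), or (None, "None") if within range.

    thresholds maps an error class to a (lower, upper) pair. Either bound
    may be "--", in which case the reading is not compared with it.
    """
    for error_class in THRESHOLD_CLASSES:
        lower, upper = thresholds.get(error_class, ("--", "--"))
        below = lower != "--" and reading < lower
        above = upper != "--" and reading > upper
        if below or above:
            return error_class, levels[error_class]
    return None, "None"
```

For a temperature sensor with thresholds {"Non-Critical": (10, 70), "Critical": (5, 80), "Fatal": ("--", 90)}, a reading of 75 yields a Non-Critical error class with a Caution alarm, and a reading of 95 yields a Fatal error class with a Critical alarm.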


Action

Contact your Sun service representative for information about correcting the problem.


Occupancy Rule

This rule generates an alarm when the occupancy of a location changes.


TABLE 4-12 Occupancy Rule

Applicable tables     Location
Properties read       Name, Occupancy
Alarm trigger         The occupancy changes
Editable parameters   Alarm Severity




Note - You can clear this alarm by acknowledging the alarm in the Sun Management Center console. All other alarms are cleared by a change of state.



Error Classes and Default Alarm Levels

This rule generates an alarm if the value of Occupancy has changed since the last time it was checked. The default alarm level associated with this error is Caution. TABLE 4-13 describes the occupancy property value, along with its associated error class and default alarm level.


TABLE 4-13 Occupancy Property Value, Error Class, and Default Alarm Level

Occupancy Value                                                 Error Class   Default Alarm Level
Does not match the previous value reported for this property.   Alarm         Caution


You can change the alarm level associated with this Alarm error class.
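
Unlike the other rules, this one depends on stored data (the previous Occupancy value) and stays raised until it is acknowledged. The following sketch models that behavior. The class and method names are hypothetical, and it assumes that the first reading for a location only records a baseline and does not raise an alarm.

```python
class OccupancyRule:
    """Raise an alarm when a location's Occupancy value changes."""

    def __init__(self, level="Caution"):
        self.level = level    # editable Alarm Severity (default from TABLE 4-13)
        self.previous = {}    # location Name -> last Occupancy value seen
        self.active = set()   # locations with an unacknowledged alarm

    def check(self, name, occupancy):
        """Record a reading and return the current alarm level for the location."""
        if name in self.previous and self.previous[name] != occupancy:
            self.active.add(name)
        self.previous[name] = occupancy
        return self.level if name in self.active else "None"

    def acknowledge(self, name):
        """Clear the alarm, as acknowledging it in the console would."""
        self.active.discard(name)
```

In this model, a change from Occupied to Empty raises a Caution that persists across later checks until acknowledge() is called for that location.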

Action

The Caution alarm is for information only and is not an error. If necessary, contact your Sun service representative to obtain more information about the value of the Occupancy property.


Rate or Count Rule

This rule enables you to specify a rate or count for any integer property. If the rate or count exceeds the specified values, an alarm is generated. Apply the rule to all properties that count a number of errors, so that you can generate such alarms as required.


TABLE 4-14 Rate or Count Rule

Applicable tables     • Memory Modules table - ECC Error Count
                      • Media Devices table - Hard Error Count, Soft Error Count, Transport Error Count
                      • Network Interfaces table - Output Error Count
Properties read       Error Counts and similar integer properties
Alarm trigger         Rate or Count exceeds user-specified value
Editable parameters   Rate, Count, and Alarm Severity


Error Count, Error Rate, and Default Alarm Levels

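This rule generates an alarm when one or both of the following is true for one of the properties:

  • The error count exceeds the specified Error Count value
  • The error rate exceeds the specified Error Rate value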



Note - When the specified error count or error rate is set to less than zero, the rule does not check the error count or rate. If the alarm level is not greater than zero, no alarm will be generated.



By default, the values are set to -1, so the rule does not check the error count or rate until you set it. You can change the values of the Error Count, Error Rate, and Alarm Level parameters. TABLE 4-15 describes these parameters and lists their default values.


TABLE 4-15 Rate or Count Rule Parameters

Parameter     Units              Default   Meaning
Error Count   Integer            -1        Total number of errors
Error Rate    Float              -1        Number of errors per minute
Alarm Level   Unsigned Integer   2 Alert   0 = None, 1 = Caution, 2 = Alert, 3 = Critical
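
The interaction of the three parameters can be sketched as follows. The function and argument names are hypothetical. The rate is taken as an input in errors per minute, because the source does not describe how the rule measures it.

```python
def rate_or_count_alarm(count, rate, max_count=-1, max_rate=-1.0, alarm_level=2):
    """Return the numeric alarm level (0 to 3) for an error counter.

    count is the total number of errors and rate is errors per minute.
    max_count, max_rate, and alarm_level stand for the editable Error Count,
    Error Rate, and Alarm Level parameters with their defaults from TABLE 4-15.
    """
    # A limit below zero disables the corresponding check.
    count_exceeded = max_count >= 0 and count > max_count
    rate_exceeded = max_rate >= 0 and rate > max_rate
    # No alarm is generated unless the alarm level is greater than zero.
    if (count_exceeded or rate_exceeded) and alarm_level > 0:
        return alarm_level
    return 0
```

With the default limits of -1, rate_or_count_alarm(100, 5.0) returns 0 because neither check is enabled. Setting max_count=10 makes a count of 11 return 2 (Alert).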


Action

Contact your Sun service representative for information about correcting the problem.


Module Status Rule

This rule applies only to the Module Status property in the system object. It is primarily used to report module data acquisition problems.


TABLE 4-16 Module Status Rule

Applicable tables     System
Properties read       Module Status, Module Status Severity
Alarm trigger         Status is not OK
Editable parameters   Alarm Severity for the error class associated with the value of Module Status.


Error Classes and Default Alarm Levels

This rule generates an alarm of a certain level when a problem is encountered during data acquisition. The rule associates specific values of the Module Status property with specific error classes. Those error classes in turn determine the level of alarm that is generated for the associated values. TABLE 4-17 lists possible Module Status values with their associated error classes and default alarm levels.


TABLE 4-17 Module Status Values, Error Classes, and Default Alarm Levels

Module Status Value   Error Class   Default Alarm Level
DAQ Failure           Critical      Critical
Memory Allocation     Warning       Alert
Internal Error        Info          Caution
OK                    None          None


You can edit the alarm levels associated with each error class. TABLE 4-18 lists the error classes for the Module Status rule and their default alarm levels.


TABLE 4-18 Default Alarm Levels for Module Status Rule Error Classes

Error Class   Default Alarm Level
Critical      3 Critical
Warning       2 Alert
Info          1 Caution
None          0 None


Action

If an Alert or Critical alarm is generated, contact your Sun service representative.

A Caution alarm might not be an error. Check the console data and contact your Sun service representative if data is missing or unexpected.


Indicator Status Rule

This rule applies only to the Indicator State property in the Indicator object.


TABLE 4-19 Indicator Status Rule

Applicable tables     Indicator
Properties read       Indicator State, Expected State
Alarm trigger         State does not equal Expected State
Editable parameters   Alarm Severity


Error Classes and Default Alarm Levels

This rule generates an alarm when the value of Indicator State does not match the Expected State. The default alarm level associated with this error is Caution. TABLE 4-20 describes the property value, along with its associated error class and default alarm level.



Note - When the value of Expected State is -- (double dash), this rule does not compare the value of Indicator State with it.




TABLE 4-20 Indicator State Property Value, Error Class, and Default Alarm Level

Indicator State Value                        Error Class   Default Alarm Level
Does not match the value of Expected State   Alarm         1 Caution


You can change the alarm level associated with this Alarm error class.
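
The comparison, including the double-dash exemption from the Note above, can be sketched as follows. The function name and the example state strings are hypothetical.

```python
def indicator_status_alarm(indicator_state, expected_state, level="Caution"):
    """Return the alarm level for an indicator.

    level stands for the editable Alarm Severity; Caution is the default
    given in TABLE 4-20.
    """
    # An Expected State of "--" means there is nothing to compare against.
    if expected_state == "--":
        return "None"
    return "None" if indicator_state == expected_state else level
```

For example, an indicator reporting "ON" against an Expected State of "OFF" yields Caution, while any state against an Expected State of "--" yields no alarm.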

Action

Contact your Sun service representative for information about correcting the problem.