APPENDIX B

Error Messages

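This appendix describes the error messages produced by the various components of the NAS software. It includes the following sections:

  About Error Messages
  About SysMon Error Notification
  Reference: UPS Errors
  Reference: File-System Errors
  Reference: RAID Errors
  Reference: IPMI Events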


About Error Messages

This appendix describes the specific error messages sent through email, Simple Network Management Protocol (SNMP) notification, the liquid crystal display (LCD) panel, and the system log to notify the administrator in the event of a system error. SysMon, the monitoring thread in the NAS software, monitors the status of redundant array of independent disks (RAID) devices, uninterruptible power supplies (UPSs), file systems, NAS servers, controller units, expansion units, and environmental variables. Monitoring and error messages vary depending on model and configuration.


About SysMon Error Notification

SysMon, the monitoring thread in NAS appliances and gateway systems, captures events generated as a result of system errors. It then responds by sending an email, notifying the SNMP server, displaying the error on the LCD panel, writing an error message to the system log, or some combination of these actions. Email notifications and system log entries include the time of the event.
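
The notification behavior described above can be pictured with a small sketch. The Python below is illustrative only and is not SysMon's implementation: the `Event` class, the channel names, and the `format_notifications` function are hypothetical, and the sample timestamp is arbitrary. It shows one event fanning out to any combination of the four channels, with the event time added only to the email and system log output. For simplicity, the same text goes to every channel, although the tables in this appendix show that the real wording differs per channel.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical channel names; the appendix only names the four destinations.
CHANNELS = ("email", "snmp", "lcd", "log")


@dataclass
class Event:
    """Illustrative event record; not the NAS software's data structure."""
    name: str
    text: str
    occurred_at: datetime
    channels: tuple = CHANNELS


def format_notifications(event):
    """Return a dict mapping each selected channel to the text it would carry."""
    stamp = event.occurred_at.strftime("%Y-%m-%d %H:%M:%S")
    out = {}
    for channel in event.channels:
        if channel not in CHANNELS:
            raise ValueError(f"unknown channel: {channel}")
        if channel in ("email", "log"):
            # Email notification and the system log include the event time.
            out[channel] = f"{stamp} {event.text}"
        else:
            out[channel] = event.text
    return out


event = Event(
    name="Power Failure",
    text="AC power failure. System is running on UPS battery.",
    occurred_at=datetime(2024, 1, 1, 12, 0, 0),  # arbitrary sample time
    channels=("email", "lcd", "log"),
)
notes = format_notifications(event)
```

Here `notes` holds one entry per selected channel, and only the `email` and `log` entries begin with the timestamp.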


Reference: UPS Errors

Refer to TABLE B-1 for descriptions of uninterruptible power supply (UPS) error conditions.


TABLE B-1 UPS Error Messages

Event: Power Failure
  Email Subject: AC Power Failure
  Email Text: AC power failure. System is running on UPS battery.
  Severity = Error
  Action: Restore system power.
  SNMP Trap: EnvUpsOnBattery
  LCD Panel: U20 on battery
  Log: UPS: AC power failure. System is running on UPS battery.

Event: Power Restored
  Email Subject: AC power restored
  Email Text: AC power restored. System is running on AC power.
  Severity = Notice
  SNMP Trap: EnvUpsOffBattery
  LCD Panel: U21 power restored
  Log: UPS: AC power restored.

Event: Low Battery
  Email Subject: UPS battery low
  Email Text: UPS battery is low. The system will shut down if AC power is not restored soon.
  Severity = Critical
  Action: Restore AC power as soon as possible.
  SNMP Trap: EnvUpsLowBattery
  LCD Panel: U22 low battery
  Log: UPS: Low battery condition.

Event: Normal Battery
  Email Subject: UPS battery recharged
  Email Text: The UPS battery has been recharged.
  Severity = Notice
  SNMP Trap: EnvUpsNormalBattery
  LCD Panel: U22 battery normal
  Log: UPS: Battery recharged to normal condition.

Event: Replace Battery
  Email Subject: Replace UPS Battery
  Email Text: The UPS battery is faulty.
  Severity = Notice
  Action: Replace the battery.
  SNMP Trap: EnvUpsReplaceBattery
  LCD Panel: U23 battery fault
  Log: UPS: Battery requires replacement.

Event: UPS Alarms - Ambient temperature or humidity outside acceptable thresholds
  Email Subject: UPS abnormal temperature/humidity
  Email Text: Abnormal temperature/humidity detected in the system.
  Severity = Error
  Action:
    1. Check UPS unit installation.
    2. Contact Sun Services.
  SNMP Trap: EnvUpsAbnormal
  LCD Panel: U24 abnormal ambient
  Log: UPS: Abnormal temperature and/or humidity detected.

Event: Write-back cache is disabled.
  Email Subject: Controller Cache Disabled
  Email Text: Either AC power or UPS is not charged completely.
  Severity = Warning
  Action:
    1. If AC power has failed, restore system power.
    2. If after a long time the UPS is not charged completely, check the UPS unit and replace if necessary.
  SNMP Trap: (none)
  LCD Panel: Cache Disabled
  Log: write-back cache for ctrl x disabled

Event: Write-back cache is enabled.
  Email Subject: Controller Cache Enabled
  Email Text: System AC power and UPS are reliable again. Write-back cache is enabled.
  Severity = Notice
  SNMP Trap: (none)
  LCD Panel: Cache Enabled
  Log: write-back cache for ctrl n enabled

Event: UPS is shutting down.
  Email Subject: UPS shutdown
  Email Text: The system is being shut down because there is no AC power and the UPS battery is depleted.
  Severity = Critical
  SNMP Trap: (none)
  LCD Panel: (none)
  Log: !UPS: Shutting down

Event: UPS Failure
  Email Subject: UPS failure
  Email Text: Communication with the UPS unit has failed.
  Severity = Critical
  Action:
    1. Check the serial cable connecting the UPS unit to the NAS server, or
    2. Check the UPS unit and replace if necessary.
  SNMP Trap: EnvUpsFail
  LCD Panel: U25 UPS failure
  Log: UPS: Communication failure.



Reference: File-System Errors

TABLE B-2 describes file-system error messages that occur when the file-system usage exceeds a defined usage threshold. The default usage threshold is 95 percent.


TABLE B-2 File-System Errors

Event: File System Full
  Email Subject: File system full
  Email Text: File system <name> is xx% full.
  Severity = Error
  Action:
    1. Delete any unused or temporary files, or
    2. Extend the partition by using an unused partition, or
    3. Add additional disk drives and extend the partition after creating a new partition.
  SNMP Trap: PartitionFull
  LCD Panel: F40 FileSystemName full
  Log: File system <name> usage capacity is xx%.
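
The threshold check behind the File System Full event can be approximated outside the appliance. The sketch below is an illustration under stated assumptions, not the NAS software's logic: `percent_full` and `is_over_threshold` are hypothetical helpers, and the comparison is assumed to be strictly greater than the threshold because the text says usage "exceeds" it. Only the 95 percent default comes from this appendix.

```python
import shutil

DEFAULT_THRESHOLD_PERCENT = 95  # default usage threshold stated in this appendix


def percent_full(used_bytes, total_bytes):
    """Return usage as a percentage of total capacity."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


def is_over_threshold(used_bytes, total_bytes, threshold=DEFAULT_THRESHOLD_PERCENT):
    """True when usage exceeds the threshold (strict comparison is an assumption)."""
    return percent_full(used_bytes, total_bytes) > threshold


if __name__ == "__main__":
    # Check the file system that holds the current directory.
    usage = shutil.disk_usage(".")
    pct = percent_full(usage.used, usage.total)
    state = "over" if is_over_threshold(usage.used, usage.total) else "under"
    print(f"{pct:.1f}% full, {state} the {DEFAULT_THRESHOLD_PERCENT}% threshold")
```

Run against a local mount point, this reports whether that file system would cross the default threshold; it says nothing about how the appliance itself samples usage.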



Reference: RAID Errors

TABLE B-3 displays events and error messages for the redundant array of independent disks (RAID) subsystem.


TABLE B-3 RAID Error Messages

Event: LUN Failure
  Email Subject: RAID LUN failure
  Email Text: RAID LUN N failed and was taken offline. Slot n is offline.
  Action: Replace bad drives and restore data from backup.
  Severity = Error
  SNMP Trap: RaidLunFail
  LCD Panel: R10 Lun failure
  Log: RAID LUN N failed and was taken offline. Slot n is offline. (Severity=Error)

Event: Disk Failure
  Email Subject: Disk drive failure
  Email Text: Disk drive failure. Failed drives are: Slot no., Vendor, Product ID, Size
  Severity = Error
  SNMP Trap: RaidDiskFail
  LCD Panel: R11 Drive failure
  Log: Disk drive failure. Failed drives are: Slot#, Vendor, Product ID, Size (Severity=Error)

Event: Controller Failure
  Email Subject: RAID controller failure
  Email Text: RAID controller N has failed.
  Action: Contact Sun Services.
  Severity = Error
  SNMP Trap: RaidControllerFail
  LCD Panel: R12 Ctlr failure
  Log: RAID controller N failed.
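
When scanning a saved copy of the system log, the LUN failure entry can be picked apart with a regular expression. This is a sketch under assumptions: the message wording is taken from TABLE B-3, but the placeholders N and n are assumed to be decimal integers, and `parse_lun_failure` is a hypothetical helper rather than part of the NAS software.

```python
import re

# Pattern built from the log text in TABLE B-3.
# Treating the N and n placeholders as integers is an assumption.
LUN_FAILURE = re.compile(
    r"RAID LUN (?P<lun>\d+) failed and was taken offline\. "
    r"Slot (?P<slot>\d+) is offline\."
)


def parse_lun_failure(line):
    """Return (lun, slot) as integers, or None if the line is not a LUN failure."""
    match = LUN_FAILURE.search(line)
    if match is None:
        return None
    return int(match.group("lun")), int(match.group("slot"))


# Made-up sample line following the documented wording.
sample = "RAID LUN 2 failed and was taken offline. Slot 5 is offline. (Severity=Error)"
result = parse_lun_failure(sample)
```

Because the pattern uses `search`, it still matches if the log line carries a leading timestamp, whose exact format this appendix does not specify.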



Reference: IPMI Events

The NAS software uses the Intelligent Platform Management Interface (IPMI) board to monitor environmental systems and to send messages about power supply and temperature anomalies. Device locations are shown in Appendix D.

TABLE B-4 describes the IPMI error messages for the NAS software.


TABLE B-4 IPMI Error Messages

Event: Fan Error
  Email Subject: Fan Failure
  Email Text: Blower fan xx has failed. Fan speed = xx RPM.
  Action: The fan must be replaced as soon as possible. If the temperature begins to rise, the situation could become critical.
  Severity = Error
  SNMP Trap: envFanFail trap
  LCD Panel: P11 Fan xx failed
  Log: Blower fan xx has failed!

Event: Power Supply Module Failure
  Email Subject: Power supply failure
  Email Text: The power supply unit xx has failed.
  Action: The power supply unit must be replaced as soon as possible.
  Severity = Error
  SNMP Trap: envPowerFail trap
  LCD Panel: P12 Power xx failed
  Log: Power supply unit xx has failed.

Event: Power Supply Module Temperature
  Email Subject: Power supply temperature critical
  Email Text: The power supply unit xx is overheating.
  Action: Replace the power supply to avoid any permanent damage.
  Severity = Critical
  SNMP Trap: envPowerTempCritical trap
  LCD Panel: P22 Power xx overheated
  Log: Power supply unit xx is overheating.

Event: Temperature Error
  Email Subject: Temperature critical
  Email Text: Temperature in the system is critical. It is xxx Degrees Celsius.
  Action:
    1. Check for any fan failures, or
    2. Check for blockage of the ventilation, or
    3. Move the system to a cooler place.
  Severity = Error
  SNMP Trap: envTemperatueError trap
  LCD Panel: P51 Temp error
  Log: The temperature is critical.

Event: Primary Power Cord Failure
  Email Subject: Power cord failure
  Email Text: The primary power cord has failed or been disconnected.
  Action:
    1. Check the power cord connections at both ends, or
    2. Replace the power cord.
  Severity = Error
  SNMP Trap: envPrimaryPowerFail trap
  LCD Panel: P31 Fail PWR cord 1
  Log: The primary power cord has failed.

Event: Secondary Power Cord Failure
  Email Subject: Power cord failure
  Email Text: The secondary power cord has failed or been disconnected.
  Action:
    1. Check the power cord connections at both ends, or
    2. Replace the power cord.
  Severity = Error
  SNMP Trap: envSecondaryPowerFail trap
  LCD Panel: P32 Fail PWR cord 2
  Log: The secondary power cord has failed.
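
Across the four tables, the LCD panel codes share a pattern: the codes in TABLE B-1 through TABLE B-4 begin with U, F, R, and P respectively. The sketch below uses that observed pattern to sort an LCD message by subsystem. The mapping is inferred from the tables in this appendix, not from a documented rule, and `classify_lcd_message` is a hypothetical helper. Messages without a code, such as Cache Disabled, are returned as unclassified.

```python
import re

# Prefix letters observed in TABLE B-1 through TABLE B-4.
# This mapping is an inference from the tables, not a documented rule.
SUBSYSTEM_BY_PREFIX = {
    "U": "UPS",
    "F": "File system",
    "R": "RAID",
    "P": "IPMI",
}

# One capital letter, two digits, whitespace, then the message text.
LCD_CODE = re.compile(r"^(?P<prefix>[A-Z])(?P<number>\d{2})\s+(?P<text>.+)$")


def classify_lcd_message(message):
    """Return (subsystem, code, text); subsystem and code are None if no known code is found."""
    cleaned = message.strip()
    match = LCD_CODE.match(cleaned)
    if match is None or match.group("prefix") not in SUBSYSTEM_BY_PREFIX:
        return None, None, cleaned
    code = match.group("prefix") + match.group("number")
    return SUBSYSTEM_BY_PREFIX[match.group("prefix")], code, match.group("text")
```

For example, `classify_lcd_message("U20 on battery")` yields the UPS subsystem with code `U20`. Note that TABLE B-1 lists U22 for both the low-battery and battery-normal events, so the code alone may not identify an event; the message text is needed as well.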