CHAPTER 4

Monitoring the Sun Storage J4000 Array Family

This chapter describes the monitoring process and how to set up monitoring system wide and on individual arrays. It contains the following sections:

For more information about the concepts introduced in this chapter, see the appropriate topic in the online help.


Monitoring Overview

The Fault Management Service (FMS) is a software component of the Sun StorageTek Common Array Manager that is used to monitor and diagnose the storage systems. The primary monitoring and diagnostic functions of the software are:

An FMS agent, which runs as a background process, monitors all devices managed by the Sun StorageTek Common Array Manager.

The high-level steps of a monitoring cycle are as follows.

1. Verify that the agent is idle.

The system generates instrumentation reports by probing the device for all relevant information, and it saves this information. The system then compares the report data to previous reports and evaluates the differences to determine whether health-related events need to be generated.

Events are also created from problems reported by the array. If the array reports a problem, an alarm is generated directly. When the problem is no longer reported by the array, the alarm is removed.

2. Store instrumentation reports for future comparison.

Event logs are available on the Events page for an array, which you open from the navigation pane in the user interface. The software updates the database with the necessary statistics. Some events are generated only after a certain threshold is reached. For example, an increase of one in the cyclic redundancy check (CRC) error count of a switch port is not sufficient to trigger an event.

3. Send the alarms to interested parties.

Alarms are sent only to recipients that have been set up for notification. The types of alarms can be filtered so that only pertinent alarms are sent to each individual.

Note: If they are enabled, the email providers receive notification of all alarms.

Alarms are created when a problem is encountered that requires action. When the root-cause problem of the alarm is corrected, the alarm will either be cleared automatically or you must manually clear the alarm. See the CAM Service Advisor procedures for details.
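The threshold behavior described in the monitoring cycle can be illustrated with a minimal sketch. It is not the FMS implementation: the `Report` fields, the `evaluate` function, and the threshold value of 10 are assumptions chosen for illustration, because this guide does not state the values the agent uses.

```python
from dataclasses import dataclass

# Hypothetical threshold; the guide does not state the value the agent uses.
CRC_DELTA_THRESHOLD = 10


@dataclass
class Report:
    """One instrumentation report for a device (fields are illustrative)."""
    device: str
    crc_errors: int


def evaluate(previous: Report, current: Report,
             threshold: int = CRC_DELTA_THRESHOLD) -> list[str]:
    """Compare two reports and return the events the difference warrants."""
    events = []
    delta = current.crc_errors - previous.crc_errors
    # An increase smaller than the threshold produces no event.
    if delta >= threshold:
        events.append(f"{current.device}: CRC count rose by {delta}")
    return events


old = Report("switch-port-3", crc_errors=100)
print(evaluate(old, Report("switch-port-3", crc_errors=101)))  # below threshold
print(evaluate(old, Report("switch-port-3", crc_errors=125)))  # above threshold
```

In this sketch, the stored report serves as the baseline for the next comparison, which is why reports are saved at the end of each cycle.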

Monitoring Strategy

The following procedure is a typical strategy for monitoring.

1. Monitor the devices.

To get a broad view of the problem, the site administrator or Sun personnel can review reported information in context. This can be done by:

2. Isolate the problem.

For many alarms, information regarding the probable cause and recommended action can be accessed from the alarm view. In most cases, this information enables you to isolate the source of the problem. In cases where the problem is still undetermined, diagnostic tests are necessary.

Once the problem is fixed, in most cases the management software automatically clears the alarm for the device.

The Event Life-Cycle

Most storage network events are based on health transitions. For example, a health transition occurs when the state of a device goes from online to offline. It is the transition from online to offline that generates an event, not the actual offline value. If the state alone were used to generate events, the same events would be generated repeatedly. Transitions cannot be used for monitoring log files, so log events can be repetitive. To minimize this problem, the agent applies predefined thresholds to entries in the log files.

The software includes an event maximums database that keeps track of the number of events generated about the same subject in a single eight-hour time frame. This database prevents the generation of repetitive events. For example, if the port of a switch toggles between offline and online every few minutes, the event maximums database ensures that this toggling is reported only once every eight hours instead of every five minutes.
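The effect of the event maximums database can be sketched as follows. This is a simplified, in-memory illustration only: the `EventMaximums` class and its `should_report` method are hypothetical names, and the sketch assumes a limit of one report per subject in each eight-hour window.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=8)  # the eight-hour time frame described above


class EventMaximums:
    """Remembers when each subject was last reported (in-memory stand-in)."""

    def __init__(self, window: timedelta = WINDOW):
        self.window = window
        self._last_reported: dict[str, datetime] = {}

    def should_report(self, subject: str, now: datetime) -> bool:
        last = self._last_reported.get(subject)
        if last is not None and now - last < self.window:
            return False  # already reported inside the current window
        self._last_reported[subject] = now
        return True


start = datetime(2024, 1, 1, 0, 0)
db = EventMaximums()
# A port that toggles every five minutes for eight hours is reported once.
reports = sum(
    db.should_report("switch-1/port-3", start + timedelta(minutes=m))
    for m in range(0, 8 * 60, 5)
)
print(reports)
```

Keying the record on the event subject keeps a noisy port from suppressing events about other components.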

Event generation usually follows this process:

1. The first time a device is monitored, a discovery event is generated. It is not actionable but is used to set a monitoring baseline. This event describes, in detail, the components of the storage device. Every week after a device is discovered, an audit event is generated with the same content as the discovery event.

2. A log event can be generated when interesting information is found in storage log files. This information is usually associated with storage devices and sent to all users.

3. Events are generated when the software detects a change in the Field Replaceable Unit (FRU) status. The software periodically probes the device and compares the current FRU status to the previously reported FRU status, which is usually only minutes old. ProblemEvent, LogEvent, and ComponentRemovalEvent categories represent most of the events that are generated.
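The comparison of current and previous FRU status can be sketched as a simple difference between two snapshots. The function name, the FRU names, and the message wording below are assumptions for illustration; the status values are those used elsewhere in this chapter.

```python
def fru_transitions(previous: dict[str, str], current: dict[str, str]) -> list[str]:
    """Return one event per FRU whose status differs between two probes."""
    events = []
    for fru, status in current.items():
        before = previous.get(fru)
        if before is None:
            events.append(f"{fru}: discovered with status {status}")
        elif before != status:
            events.append(f"{fru}: status changed from {before} to {status}")
    # FRUs that were present in the last probe but are missing now.
    for fru in sorted(previous.keys() - current.keys()):
        events.append(f"{fru}: removed")
    return events


last_probe = {"disk.0": "OK", "fan.1": "OK"}
this_probe = {"disk.0": "Failed", "fan.1": "OK"}
print(fru_transitions(last_probe, this_probe))
# Probing again with no change yields no events, because only transitions count.
print(fru_transitions(this_probe, this_probe))
```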



Note - Aggregated events and events that require action by service personnel (known as actionable events) are also referred to as alarms. Some alarms are based on a single state change and others are a summary of events where the event determined to be the root cause is advanced to the head of the queue as an alarm. The supporting events are grouped under the alarm and are referred to as aggregated events.



Setting Up Notification for Fault Management

The fault management features of the Sun StorageTek Common Array Manager software enable you to monitor and diagnose your arrays and storage environment. Alarm notification can be provided by:

You can also set up Sun Service notification by enabling Auto Service Request as described in Setting Up Auto Service Request.

1. In the navigation pane, under General Configuration, choose Notification.

The following Notification Setup page is displayed.


TABLE 4-1 describes the fields and buttons on the Notification Setup page.


TABLE 4-1 Fields and Buttons on the Notification Setup Page

Field

Description

Email Notification Setup

Use this SMTP Server for Email

The address of the Simple Mail Transfer Protocol (SMTP) server that will process remote email transmission.

Test Email

Click to send a test email to a test email service.

SMTP Server User Name

The user name used with the SMTP server.

SMTP Server Password

The password used with the SMTP server.

Use secure SMTP connection

Check the box to enable the secure SMTP (SMTPS) protocol. Otherwise, the SMTP protocol will be used.

SMTP Port

The port used by the SMTP server.

Path to Email Program

The server path to the email application that is to be used when the SMTP server is unavailable.

Email Address of Sender

The email address to be specified as the sender for all email transmissions.

Maximum Email Size

The largest size allowed for a single email message.

Remote Notification Setup

Select Providers

Select the check box to enable the SNMP remote notification provider.


The Email Fault Notification Setup screen is where you specify SMTP servers and recipients of email notification.

2. Enable local email.

a. Enter the name of the SMTP server.

If the host running this software has the sendmail daemon running, you can accept the default server, localhost, or the name of this host in the required field.

b. Specify the other optional parameters, as desired.

c. If you have changed or entered any parameters, click Save.

d. (Optional) Click Test Local Email to test your local email setup by sending a test email.

If you need help on any of the fields, click the Help button.

3. (Optional) Set up remote notifications by SNMP traps to an enterprise management application.

a. Select SNMP as the provider.

b. Click Save.

4. Set up local email notification recipients.

a. Click Administration > Notification > Email.

The following Email Notification page is displayed.


TABLE 4-2 describes the fields and buttons on the Email Notification page.


TABLE 4-2 Fields and Buttons on the Email Notification Page

Field

Description

New

Click to add an email recipient.

Delete

Click to delete an email recipient.

Edit

Click to edit an email recipient’s information.

Email Address

The email address of a current email recipient.

Active

Whether the current email recipient is configured as active and receiving email notifications.

Category

The types of devices for which the corresponding email recipient receives email notifications. Options include one, multiple categories, or all categories of device types.

Priority

The alarm types for which the corresponding email recipient receives email notifications. Options include:

  • All
  • Major and Above
  • Critical and Above

b. Click New.

The following Add Email Notification page is displayed.


TABLE 4-3 describes the fields on the Add Email Notification page.


TABLE 4-3 Fields on the Add Email Notification Page

Field

Description

Type

The format of the notification: email or pager.

Email Address

The email address of the new email notification recipient.

Categories

The types of devices for which the email recipient will receive email notifications. Options include one, multiple categories, or all categories of device types.

Alarm Priority

The alarm types for which the email recipient will receive email notifications. Options include:

  • All
  • Major and Above
  • Critical and Above

Active

Select Yes to enable email notification for the new email notification recipient.

Apply Email Filters

Select Yes to apply email filters to this recipient.

Skip Components of Aggregated Events

Select Yes if you do not want notification sent for single events that are also part of aggregated events.

Turn Off Event Advisor

Select Yes if you do not want Event Advisor messages included in email notifications.

Send Configuration Change Events

Select Yes if you want to send configuration change notices in the notifications.


c. Enter an email address for local notification. At least one address is required to begin monitoring events. You can customize emails to specific severity, event type, or product type.

d. Click Save.
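The way the Active and Alarm Priority settings select recipients can be sketched as follows. This is an illustration, not the software's logic: the `Recipient` class and `recipients_for` function are hypothetical, and the ranking of the severity levels (Down highest, Minor lowest) is an assumption.

```python
from dataclasses import dataclass

# Severity names come from the alarm pages; the ranking is an assumption.
SEVERITY_RANK = {"Minor": 0, "Major": 1, "Critical": 2, "Down": 3}
MINIMUM_FOR_PRIORITY = {
    "All": "Minor",
    "Major and Above": "Major",
    "Critical and Above": "Critical",
}


@dataclass
class Recipient:
    address: str
    priority: str = "All"
    active: bool = True


def recipients_for(severity: str, recipients: list[Recipient]) -> list[str]:
    """Return the addresses that should be notified of an alarm."""
    rank = SEVERITY_RANK[severity]
    return [
        r.address
        for r in recipients
        if r.active and rank >= SEVERITY_RANK[MINIMUM_FOR_PRIORITY[r.priority]]
    ]


team = [
    Recipient("admin@example.com"),
    Recipient("oncall@example.com", priority="Critical and Above"),
    Recipient("former@example.com", active=False),
]
print(recipients_for("Major", team))
print(recipients_for("Down", team))
```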

5. (Optional) Set up email filters to prevent email notification about specific events that occur frequently. You can still view filtered events in the event log.

a. Click Administration > Notification > Email Filters.

The following Email Filters page is displayed.


TABLE 4-4 describes the fields and buttons on the Email Filters page.


TABLE 4-4 Fields and Buttons on the Email Filters Page

Field

Description

Add New Filter

Click to add a new email filter.

Delete

Click to delete the selected email filter.

Edit

Click to edit the selected email filter.

Filter ID

The identification (ID) for the email filter.

Event Code

The event code to which this filter applies.

Decreased Severity

Select Information or No Event to prevent email notification for the specified event code.


b. Click Add New Filter.

The following Add Filter page is displayed.


TABLE 4-5 describes the fields on the Add Filter page.


TABLE 4-5 Fields on the Add/Edit Email Filters Page

Field

Description

Event Code

The event code to which this filter applies.

Decreased Severity

The alarm types to which the filter applies. Options include:

  • Information
  • No Event

c. Enter the event code that you want to filter. You can obtain event codes from the Event Details page of the event you want to filter to prevent email notification for events with that event code.

d. Click Save.
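The effect of an email filter can be sketched as follows: a filtered event still appears in the event log but is left out of email notification. The `dispatch` function and the event codes shown are hypothetical and are used only for illustration.

```python
def dispatch(events: list[dict], filters: dict[str, str]) -> tuple[list[dict], list[dict]]:
    """Split events into what is logged and what is emailed.

    `filters` maps an event code to its Decreased Severity setting
    ("Information" or "No Event"); either setting prevents email.
    """
    logged = list(events)  # every event stays in the event log
    emailed = [e for e in events if e["code"] not in filters]
    return logged, emailed


# Hypothetical event codes, for illustration only.
filters = {"12.34.567": "No Event"}
events = [
    {"code": "12.34.567", "text": "frequent, uninteresting event"},
    {"code": "98.76.543", "text": "event that should be emailed"},
]
logged, emailed = dispatch(events, filters)
print(len(logged), len(emailed))
```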

6. (Optional) Set up SNMP trap recipients.

a. Click Administration > Notification > SNMP.

The following SNMP Notification page is displayed.


 

TABLE 4-6 describes the fields and buttons on the SNMP Notification page. See SNMP Trap MIB for more information.


TABLE 4-6 Fields and Buttons on the SNMP Notification Page

Field

Description

New

Click to add a Simple Network Management Protocol (SNMP) recipient.

Delete

Click to delete an SNMP recipient.

Edit

Click to edit an SNMP recipient’s information.

IP Name/Address

The identifying Internet Protocol (IP) address or name of the current SNMP recipient.

Port

Port to which (SNMP) notifications are sent.

Minimum Alert Level

The minimum alarm level for which SNMP notifications are sent to the corresponding SNMP recipient. Options include:

  • Down
  • Critical
  • Major
  • Notice

b. Click New.

The following Add SNMP Notification page is displayed.


TABLE 4-7 describes the fields on the Add SNMP Notification page.


TABLE 4-7 Fields on the Add SNMP Notification Page

Field

Description

IP Name/Address

The identifying Internet Protocol (IP) address or name of the new SNMP recipient.

Port

The port to which SNMP notifications are to be sent.

Minimum Alert Level

The minimum alarm level for which SNMP notifications are to be sent to the new SNMP recipient. Options include:

  • Down
  • Critical
  • Major
  • Notice

Send Configuration Change Events

Select Yes if you want to send configuration change notices in the SNMP notifications.


c. Enter the IP name or address of the SNMP recipient, the port to which notifications are sent, and the minimum alert level, as described in TABLE 4-7.

d. Click Save.

7. (Optional) Set up remote notifications by SNMP traps to an enterprise management application.

a. Click Administration > Notification > SNMP.

The SNMP Notification page is displayed.

b. Click New.

The Add SNMP Notification page is displayed.

c. Enter the following information

d. Click Save.

8. Perform optional fault management setup tasks:


Configuring Array Health Monitoring

To enable array health monitoring, you must configure the Fault Management Service (FMS) agent, which probes devices. Events are generated with content, such as probable cause and recommended action, to help facilitate isolation to a single field-replaceable unit (FRU).

You must also enable array health monitoring for each array you want monitored.


To Configure the FMS Agent

1. In the navigation pane, expand General Configuration.

The navigation tree is expanded.

2. Choose General Health Monitoring.

The following General Health Monitoring Setup page is displayed.


TABLE 4-8 describes the fields and buttons on the General Health Monitoring Setup page.


TABLE 4-8 Fields and Buttons on the General Health Monitoring Page

Field/Button

Description

Activate

Click to activate the health monitoring agent.

Deactivate

Click to deactivate the health monitoring agent.

Run Agent

Click to manually run the health monitoring agent.

Agent Information

Active

The status of the agent.

Categories to Monitor

The type of arrays to be monitored. You can select more than one type of array by using the shift key.

Monitoring Frequency

How often, in minutes, the agent monitors the selected array categories.

Maximum Monitoring Thread Allowed

The maximum number of arrays to be monitored concurrently. If the number of arrays to be monitored exceeds the number selected to be monitored concurrently, the agent will monitor the specified number of additional arrays serially.

Timeout Settings

Agent HTTP

The amount of time for which the agent will attempt to connect to the Internet before generating a timeout.

Ping

The amount of time for which the management station will attempt a ping operation before generating a timeout.

SNMP Access

The amount of time, in seconds, before an SNMP notification will generate a timeout.

Email

The amount of time, in seconds, before an email notification will generate a timeout.


3. Select the types of arrays that you want to monitor from the Categories to Monitor field. Use the shift key to select more than one array type.

4. Specify how often you want to monitor the arrays by selecting a value in the Monitoring Frequency field.

5. Specify the maximum number of arrays to monitor concurrently by selecting a value in the Maximum Monitoring Thread Allowed field.

6. In the Timeout Settings section, set the agent timeout settings.

The default timeout settings are appropriate for most storage area network (SAN) devices. However, network latencies, I/O loads, and other device and network characteristics may require that you customize these settings to meet your configuration requirements. Click in the value field for the parameter and enter the new value.

7. When all required changes are complete, click Save.

The configuration is saved.
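The effect of the Maximum Monitoring Thread Allowed setting can be sketched with a bounded worker pool. This is an illustration of the concept only: `probe` is a placeholder that returns a canned result, and `run_cycle` is a hypothetical name, not part of the software.

```python
from concurrent.futures import ThreadPoolExecutor


def probe(array_name: str) -> str:
    """Placeholder for a real probe; returns a canned health string."""
    return f"{array_name}: OK"


def run_cycle(arrays: list[str], max_threads: int) -> list[str]:
    """Probe every array, running at most max_threads probes at once."""
    # Arrays beyond the limit wait until a worker thread becomes free.
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return list(pool.map(probe, arrays))


print(run_cycle(["J4200-a", "J4400-b", "J4500-c"], max_threads=2))
```

A lower limit reduces the load on the management host at the cost of a longer monitoring cycle.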


To Enable Health Monitoring for an Array

1. In the navigation pane, select an array for which you want to display or edit the health monitoring status.

2. Click Array Health Monitoring.

The following Array Health Monitoring Setup page is displayed.


TABLE 4-9 describes the fields on the Array Health Monitoring Setup page.


TABLE 4-9 Fields on the Array Health Monitoring Setup Page

Field/Button

Description

Health Monitoring Status

Health Monitoring Agent Active

Identifies whether the health monitoring agent is active or inactive.

Device Category Monitored

Identifies whether health monitoring is enabled for this array type.

Monitoring for this Array

Health Monitoring

Enables or disables health monitoring for this array. Select the checkbox to enable health monitoring for the array; deselect the checkbox to disable health monitoring for this array.

Auto Service Request

Enables or disables the Auto Service Request monitoring service for this array. Select the checkbox to enable the Auto Service Request service for this array; deselect the checkbox to disable the Auto Service Request service for this array. Note: to enable Auto Service Request, you must also enable Health Monitoring for this array and the monitoring agent must be active.


3. For the array to be monitored, ensure that the monitoring agent is active and that the Device Category Monitored is set to Yes. If not, go to Configuring Array Health Monitoring.

4. Select the checkbox next to Health Monitoring to enable health monitoring for this array; deselect the checkbox to disable health monitoring for the array.

5. Click Save.
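The dependencies among these settings can be summarized in a short sketch. The `ArrayMonitoring` class and `check_array_settings` function are hypothetical; they restate the conditions described in TABLE 4-9 and in the procedure above.

```python
from dataclasses import dataclass


@dataclass
class ArrayMonitoring:
    agent_active: bool
    category_monitored: bool
    health_monitoring: bool
    auto_service_request: bool = False


def check_array_settings(s: ArrayMonitoring) -> list[str]:
    """Return the settings that still prevent monitoring from working."""
    problems = []
    if s.health_monitoring:
        if not s.agent_active:
            problems.append("Activate the health monitoring agent.")
        if not s.category_monitored:
            problems.append("Add this array type to Categories to Monitor.")
    # Auto Service Request depends on health monitoring and an active agent.
    if s.auto_service_request and not (s.health_monitoring and s.agent_active):
        problems.append(
            "Auto Service Request needs Health Monitoring and an active agent."
        )
    return problems


print(check_array_settings(ArrayMonitoring(True, True, True, True)))
print(check_array_settings(ArrayMonitoring(True, True, False, True)))
```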


Monitoring Alarms and Events

Events are generated to signify a health transition in a monitored device or device component. Events that require action are classified as alarms.

There are four event severity levels: Down, Critical, Major, and Minor.

You can display alarms for all arrays listed or for an individual array. Events are listed for each array only.


To Display Alarm Information

1. To display alarms for all registered arrays, in the navigation pane, choose Alarms.

The following Alarm Summary page for all arrays is displayed.


TABLE 4-10 describes the fields and buttons on the Alarms page and the Alarm Summary page.


TABLE 4-10 Fields and Buttons on the Alarms Page and the Alarm Summary Page

Field

Description

Acknowledge

Click to change the state of any selected alarms from Open to Acknowledged.

Reopen

Click to change the state of any selected alarms from Acknowledged to Open. This button is grayed out until the alarm has been acknowledged.

Delete

Click to remove selected alarms. This button is grayed out for any auto-clear alarm.

Severity

The severity level of the event. Possible severity levels are:

  • Black - Down
  • Red - Critical
  • Yellow - Major
  • Blue - Minor

Alarm Details

Click to display detailed information about the alarm.

Component

The component to which the alarm applies.

Type

The general classification of the alarm.

Date

The date and time when the alarm was generated.

State

The current state of the alarm; for example, open or acknowledged.

Auto Clear

Whether or not this alarm will automatically be cleared when the underlying problem is resolved. Alarms which do not have the auto-clear state will need to be deleted by the user when the underlying problem is resolved.


2. To display alarms that apply to an individual array, in the navigation pane select the array whose alarms you want to view and choose Alarms below it.

The following Alarm Summary page for that array is displayed.


3. To view detailed information about an alarm, in the Alarm Summary page, click Details for the alarm.

The following Alarm Details page is displayed.


TABLE 4-11 describes the fields on the Alarm Details page.


TABLE 4-11 Fields and Buttons on the Alarm Details Page

Field

Description

Acknowledge

Click to change the state of this alarm from Open to Acknowledged.

Reopen

Click to change the state of this alarm from Acknowledged to Open. This button is grayed out until the alarm has been acknowledged.

View Aggregated Events

Click to display all events associated with this alarm.

Details

Severity

The severity level of the event. The possible severity levels are:

  • Down
  • Critical
  • Major
  • Minor

Date

The date and time when the alarm was generated.

State

The current state of the alarm; for example, Open or Acknowledged.

Acknowledged by:

The user who acknowledged the alarm. This field displays only after an alarm has been acknowledged.

Reopened by:

The user who reopened the alarm. This field displays only after an alarm has been acknowledged and then reopened.

Auto Clear

Whether or not this alarm will automatically be cleared when the underlying problem is resolved. Alarms which do not have the auto-clear state will need to be deleted by the user when the underlying problem is resolved.

Description

A technical explanation of the condition that caused the alarm.

Info

A non-technical explanation of the condition that caused the alarm.

Device

The device to which the alarm applies. Click the device name for detailed information about the component; for example, J007(J4200).

Component

The component element to which the alarm applies.

Event Code

The event code used to identify this alarm type.

Aggregated Count

The number of events aggregated for this alarm.

Probable Cause

The most likely reasons that the alarm was generated.

Recommended Action

The procedure, if any, that you can perform to attempt to correct the alarm condition. A link to the Service Advisor is displayed if replacement of a field-replaceable unit (FRU) is recommended.

Notes

Optional. You can specify text to be stored with the alarm detail to document the actions taken to address this alarm.


4. To view a list of events associated with an alarm, from the Alarm Details page, click View Aggregated Events.

The following Aggregated Events page is displayed.



Note - The aggregation of events associated with an alarm can vary based on the time that an individual host probes the device. When not aggregated, the list of events is consistent with all hosts.



Managing Alarms

An alarm that has the Auto Clear function set is automatically deleted from the Alarms page when the underlying fault has been addressed and corrected. To determine whether an alarm will be automatically deleted when it has been resolved, view the Alarm Summary page and examine the Auto Clear column. If the Auto Clear column is set to Yes, the alarm is automatically deleted when the fault has been corrected.

If the Auto Clear column is set to No, the alarm is not automatically deleted when the fault is resolved, and you must manually delete it from the Alarms page after the service operation has been completed.

Acknowledging Alarms

When an alarm is generated, it remains open in the Alarm Summary page until you acknowledge it. Acknowledging an alarm is a way for administrators to indicate that an alarm has been seen and evaluated; it does not affect if or when an alarm will be cleared.


To Acknowledge One or More Alarms

1. Display the Alarm Summary page by doing one of the following in the navigation pane:

2. Select the check box for each alarm you want to acknowledge, and click Acknowledge.

The following Acknowledge Alarms confirmation window is displayed.


3. Enter an identifying name to be associated with this action, and click Acknowledge.

The Alarm Summary page is redisplayed, and the state of the acknowledged alarms is displayed as Acknowledged.

Note: You can also acknowledge an alarm from the Alarm Details page. You can also reopen acknowledged alarms from the Alarm Summary and Alarm Details pages.

Deleting Alarms

When you delete an open or acknowledged alarm, it is permanently removed from the Alarm Summary page.

Note: You cannot delete alarms which are designated as Auto Clear alarms. These alarms are removed from the Alarm Summary page either when the array is removed from the list of managed arrays or when the condition related to the problem is resolved.


To Delete One or More Alarms

1. In the navigation pane, display the Alarm Summary page for all registered arrays or for one particular array:

The Alarm Summary page displays a list of alarms.

2. Select the check box for each acknowledged alarm you want to delete, and click Delete.

The Delete Alarms confirmation window is displayed.

3. Click OK.

The Alarm Summary page is redisplayed without the deleted alarms.
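The alarm states and the rules for acknowledging, reopening, and deleting alarms can be summarized in a small sketch. The `Alarm` class below is hypothetical and models only the behavior described in this section; it is not the software's data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alarm:
    description: str
    auto_clear: bool
    state: str = "Open"
    acknowledged_by: Optional[str] = None

    def acknowledge(self, user: str) -> None:
        if self.state != "Open":
            raise ValueError("only an open alarm can be acknowledged")
        self.state = "Acknowledged"
        self.acknowledged_by = user

    def reopen(self) -> None:
        # Reopen is unavailable until the alarm has been acknowledged.
        if self.state != "Acknowledged":
            raise ValueError("only an acknowledged alarm can be reopened")
        self.state = "Open"

    def can_delete(self) -> bool:
        # Auto Clear alarms are removed by the software, not by the user.
        return not self.auto_clear


alarm = Alarm("Disk in slot 3 failed", auto_clear=False)
alarm.acknowledge("storage-admin")
print(alarm.state, alarm.can_delete())
```

Acknowledging changes only the displayed state; in this sketch, as in the software, it has no effect on whether or when the alarm is cleared.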

Displaying Event Information

To gather additional information about an alarm, you can display the event log to view the underlying events on which the alarm is based.

Note: The event log is a historical representation of events in an array. In some cases the event log may differ when viewed from multiple hosts since the agents run at different times on separate hosts. This has no impact on fault isolation.


To Display Information About Events

1. In the navigation pane select the array for which you want to view the event log and choose Events.

The following Events page displays.


TABLE 4-12 describes the fields on the Events page.


TABLE 4-12 Events Page

Field

Description

Date

The date and time when the event occurred.

Event Details

Click Details to display detailed information for the corresponding event.

Component

The component to which the event applies.

Type

A brief identifier of the nature of the event, such as Log, State Change, or Value Change.


2. To see detailed information about an event, click Details in the row that corresponds to the event.

The Event Details page is displayed for the selected event.


TABLE 4-13 describes the fields on the Event Details page.


TABLE 4-13 Event Details Page

Field

Description

Details

Severity

The severity level of the event. Possible severity levels are:

  • Down
  • Critical
  • Major
  • Minor

Date

The date and time when the event was generated.

Actionable

Whether the event requires user action.

Description

A technical explanation of the condition that caused the event.

Data

Additional event data.

Component

The component to which the event applies.

Type

A brief identifier of the nature of the event, such as Log, State Change, or Value Change.

Info

A non-technical explanation of the condition that caused the event.

Event Code

The event code used to identify this event type.

Aggregated

The number of events aggregated for this event.

Probable Cause

The most likely reasons that the event was generated.

Recommended Action

The procedure, if any, that you can perform to correct the event condition.



Monitoring Field-Replaceable Units (FRUs)

The Common Array Manager software enables you to view a quick listing of the FRU components in the array, and to get detailed information about the health of each type of FRU. For a listing of the FRU components in your system, go to the FRU Summary page.



Note - All FRUs in the J4000 Array Family are also Customer Replaceable Units (CRUs).


For detailed information about each FRU type, refer to the hardware documentation for your array.


To View the Listing of FRUs in the Array

1. In the navigation pane, select the array whose FRUs you want to list and click FRUs.

The FRU Summary page is displayed. It lists the FRU types available and provides basic information about the FRUs. The types of FRU components available depend on the model of your array.

The following figure shows the FRU Summary page for the Sun Storage J4200 array.


TABLE 4-14 describes the fields on the FRU Summary page.


TABLE 4-14 Fields on the FRU Summary Page

Field

Indicates

FRU Type

The type of FRU installed on the array.

Alarms

Alarms on the FRU type.

Installed

The quantity of FRU components of a particular type installed on the array.

Slot Count

The quantity of slots allocated for the particular FRU type.


2. To view the list of FRU components of a particular type, click the name of the FRU type in the FRU Type column.

The Component Summary page displays the list of FRUs available, along with basic information about each FRU component.


TABLE 4-15 describes the fields on the Component Summary page.


TABLE 4-15 Fields on the Component Summary Page

Field

Indicates

Name

Name of the FRU component.

State

The state of the FRU component. Valid values are:

  • Enabled
  • Disabled

Status

Status of the FRU component. Valid values are:

  • OK
  • Uninstalled
  • Degraded
  • Disabled
  • Failed
  • Critical
  • Unknown

Revision

The revision of the FRU component.

Unique Identifier

The unique identifier associated with this FRU component.


3. To view detailed health information about a particular FRU component, click on the component name.

Depending on the FRU type of the selected component, one of the following pages will display:

Disk Health Details Page

The disk drives are used to store data. For detailed information about the disk drives and each of their components, refer to the hardware documentation for your array.

The following figure shows the Disk Health Detail page.


TABLE 4-16 describes the fields on the Disk Health Details page.


TABLE 4-16 Fields on the Disk Health Detail Page

Field

Indicates

Availability

The availability of this disk drive. Valid values are:

  • Running/Full Power
  • Degraded
  • Not Installed
  • Unknown

Capacity

The total capacity of this disk.

Caption

The general name of this FRU type.

Element Status

The operational status of this FRU component. Valid values are:

  • OK
  • Degraded
  • Error
  • Lost Communication

Enabled State

Physical state of this disk drive. Valid values are:

  • Enabled
  • Removed
  • Other
  • Unknown

Host Path

The path where the disk drive is located.

Id

The unique ID assigned to this disk drive.

Name

The name assigned to this disk drive.

Physical ID

The physical ID assigned to this disk drive.

Product Firmware Version

The version of firmware running on this disk drive.

Product Name

Name of the disk drive manufacturer.

Name

Name assigned to this disk drive.

Product Name.

Model number of the array where this disk drive is installed.

SAS Address

SAS address assigned to this disk drive.

Serial Number

The serial number associated with this disk.

Speed

The speed at which this disk is rotating.

Status

Health status of this FRU component. Valid values are:

  • OK
  • Uninstalled
  • Degraded
  • Disabled
  • Failed
  • Critical
  • Unknown

Type

The type of disk drive, such as SAS or SATA.


Fan Health Details Page

The fans in the Sun Storage J4000 Array Family circulate air inside the tray. Some array models, such as the J4200 array, contain two hot-swappable fans to provide redundant cooling. Other array models, such as the J4400, include fans in the power supplies. For detailed information, consult the hardware installation guide for your array.

The following figure shows the Fan Health Detail page.


TABLE 4-17 describes the fields on the Fan Health Details page.


TABLE 4-17 Fields on the Fan Health Details Page

Field

Indicates

Availability

The availability of this fan. Valid values are:

  • Running/Full Power
  • Degraded
  • Not Installed
  • Unknown

Caption

The general name of this FRU type.

Element Status

The operational status of this FRU component. Valid values are:

  • OK
  • Degraded
  • Error
  • Lost Communication

Enabled State

The physical state of this fan. Valid values are:

  • Enabled
  • Removed
  • Other
  • Unknown

ID

The unique ID assigned to this fan.

Name

Name assigned to the fan.

Part Number

The part number assigned to this fan.

Physical ID

The physical ID assigned to this fan.

Position

The location of this fan in the chassis when viewing the chassis from the back. Valid values are:

  • Left
  • Right

Serial Number

Serial number of the fan. The serial number is assigned by the fan manufacturer.

Speed

The speed, in revolutions per minute (RPM), at which the fan is operating.

Status

Health status of this FRU component. Valid values are:

  • OK
  • Uninstalled
  • Degraded
  • Disabled
  • Failed
  • Critical
  • Unknown

Type

The type of FRU.


NEM Health Details Page

The NEM card is attached to the J4500 array. For detailed information about the NEM and each of its components, refer to the hardware documentation for your array.

TABLE 4-18 describes the fields on the NEM Health Details page.


TABLE 4-18 Fields on the NEM Health Details Page

Field

Indicates

Availability

The availability of this component. Valid values are:

  • Running/Full Power
  • Degraded
  • Not Installed
  • Unknown

Caption

The general name of this FRU type.

Element Status

The status of this FRU component. Valid values are:

  • OK
  • Degraded
  • Error
  • Lost Communication

Enabled State

State of this FRU component. Valid values are:

  • Enabled
  • Removed
  • Other
  • Unknown

ID

The unique ID assigned to this component.

Model

The model name of this FRU component.

Name

Name assigned to the component.

Physical ID

The physical ID assigned to this component.

Product Revision

Revision of this FRU component.

Serial Number

Serial number of the component. The serial number is assigned by the component manufacturer.

Status

Status of this FRU component. Valid values are:

  • OK
  • Uninstalled
  • Degraded
  • Disabled
  • Failed
  • Critical
  • Unknown

Power Supply Health Details Page

Each tray in the Sun Storage J4000 Array Family has hot-swappable, redundant power supplies. If one power supply is turned off or malfunctions, the other power supply maintains electrical power to the array.

The following figure shows the Power Supply Health Details page.


TABLE 4-19 describes the fields on the Power Supply Health Details page.


TABLE 4-19 Fields on the Power Supply Health Details Page

Field

Indicates

Availability

The availability of this power supply. Valid values are:

  • Running/Full Power
  • Degraded
  • Not Installed
  • Unknown

Caption

The general name of this FRU type.

Element Status

The operational status of this FRU component. Valid values are:

  • OK
  • Degraded
  • Error
  • Lost Communication

Enabled State

The physical state of this power supply. Valid values are:

  • Enabled
  • Removed
  • Other
  • Unknown

Fan 0 Speed

The speed, in revolutions per minute (RPM), at which this fan is operating. If the fan operation is not within acceptable limits, an alarm is reported.

Fan 1 Speed

The speed, in revolutions per minute (RPM), at which this fan is operating. If the fan operation is not within acceptable limits, an alarm is reported.

ID

Unique identifier assigned to this power supply.

Fan Status

Status of the fan associated with this power supply. Valid values are:

  • Normal

Name

Name assigned to this power supply.

Status

Health status of this FRU component. Valid values are:

  • OK
  • Uninstalled
  • Degraded
  • Disabled
  • Failed
  • Critical
  • Unknown

Type

Type of component.


SIM Health Details Page

The SAS Interface Module (SIM) is a hot-swappable board that contains two SAS outbound connectors, one SAS inbound connector, and one serial management port. The serial management port is reserved for Sun Service personnel only.

The following figure shows the SIM Health Details page.


TABLE 4-20 describes the fields on the SIM Health Details page.


TABLE 4-20 Fields on the SIM Health Details Page

Field

Indicates

Availability

The availability of this SIM. Valid values are:

  • Running/Full Power
  • Degraded
  • Not Installed
  • Unknown

Caption

The general name of this FRU type.

Controller Temperature 1

Temperature of the controller at location 1. If the temperature at this location is not within acceptable limits, an alarm is reported.

Controller Temperature 2

Temperature of the controller at location 2. If the temperature at this location is not within acceptable limits, an alarm is reported.

Controller Temperature 3

Temperature of the controller at location 3. If the temperature at this location is not within acceptable limits, an alarm is reported.

Element Status

The operational status of this FRU component. Valid values are:

  • Enabled
  • OK
  • Degraded
  • Error
  • Lost Communication

Enabled State

The physical state of this FRU component. Valid values are:

  • Enabled
  • Removed
  • Other
  • Unknown

Host Path

The path the operating system uses to access this controller, for example /dev/es/ses#.

ID

Unique ID assigned to this controller.

Model

The model number of the array.

Name

The name assigned to this controller.

Part Number

The part number assigned to this controller.

Physical ID

The physical ID associated with this controller.

Product Firmware Version

The version of the firmware loaded on the controller.

SAS Address

SAS address assigned to this controller.

SCSI Mode

The SCSI mode assigned to this controller.

SES Serial Number

Serial number assigned to the SIM’s enclosure.

SES Temperature 1

Temperature within the SES enclosure at location 1. If the temperature at this location is not within acceptable limits, an alarm is reported.

SES Temperature 2

Temperature within the SES enclosure at location 2. If the temperature at this location is not within acceptable limits, an alarm is reported.

Serial Number

Serial number assigned to the SIM.

Status

Health status of this FRU component. Valid values are:

  • OK
  • Uninstalled
  • Degraded
  • Disabled
  • Failed
  • Critical
  • Unknown

Voltage (1.2V)

The actual voltage of this 1.2 volt circuit. If the voltage is not within acceptable limits, an alarm is reported.

Voltage (12V)

The actual voltage of this 12 volt circuit. If the voltage is not within acceptable limits, an alarm is reported.

Voltage (3.3V)

The actual voltage of this 3.3 volt circuit. If the voltage is not within acceptable limits, an alarm is reported.

Voltage (5V)

The actual voltage of this 5 volt circuit. If the voltage is not within acceptable limits, an alarm is reported.
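
The limit checks described for the voltage fields in TABLE 4-20 can be pictured with a short sketch. The nominal values come from the field names in the table. The 5 percent tolerance, the function name, and the dictionary of readings are assumptions made for illustration only; this guide does not state the limits that the software or the array actually uses.

```python
# Illustrative only: the tolerance and data layout are assumptions,
# not values taken from the Common Array Manager software.
NOMINAL_VOLTS = {
    "Voltage (1.2V)": 1.2,
    "Voltage (3.3V)": 3.3,
    "Voltage (5V)": 5.0,
    "Voltage (12V)": 12.0,
}
ASSUMED_TOLERANCE = 0.05  # hypothetical +/- 5 percent window


def out_of_limit_fields(readings):
    """Return the names of voltage fields whose reading is outside the assumed window."""
    flagged = []
    for field, nominal in NOMINAL_VOLTS.items():
        actual = readings.get(field)
        if actual is None:
            continue  # field not reported; nothing to compare
        low = nominal * (1 - ASSUMED_TOLERANCE)
        high = nominal * (1 + ASSUMED_TOLERANCE)
        if not (low <= actual <= high):
            flagged.append(field)
    return flagged


# Made-up readings for illustration only.
sample = {
    "Voltage (1.2V)": 1.21,
    "Voltage (3.3V)": 3.0,
    "Voltage (5V)": 5.1,
    "Voltage (12V)": 12.9,
}
print(out_of_limit_fields(sample))
```

A field that appears in the returned list corresponds to a reading for which, under these assumed limits, an alarm would be expected.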


Storage Module Health Details Page

The storage module is available as part of the Sun Storage B6000 array. For information about the storage module, refer to the hardware documentation for your array.

TABLE 4-21 describes the buttons and fields on the Storage Module Health Details page.


TABLE 4-21 Fields and Buttons on the Storage Module Health Details Page

Field

Indicates

Availability

The availability of this storage module. Valid values are:

  • Running/Full Power
  • Degraded
  • Not Installed
  • Unknown

Caption

The general name of this FRU type.

Element Status

The status of this FRU component. Valid values are:

  • OK
  • Degraded
  • Error
  • Lost Communication

Enabled State

State of this FRU component. Valid values are:

  • Enabled
  • Removed
  • Other
  • Unknown

Expander 0 Host Path

The path the operating system uses to access this expander.

Expander 0 Name

The location of this expander.

Expander 0 Product Revision

Revision of the firmware on this expander.

Expander 0 Serial Number

The serial number assigned to this expander.

Expander 0 Status

The operating status of this expander. Valid values are OK or Failed.

Expander 1 Host Path

The path the operating system uses to access this expander.

Expander 1 Name

The location of this expander.

Expander 1 Product Revision

Revision of the firmware on this expander.

Expander 1 Serial Number

The serial number assigned to this expander.

Expander 1 Status

The operating status of this expander. Valid values are OK or Failed.

ID

Unique ID assigned to this storage module.

Name

The name assigned to this storage module.

Part Number

The part number assigned to this storage module.

Physical ID

The physical ID associated with this storage module.

Product Name

The model number of the array.

Product Firmware Version

The version of the firmware loaded on the storage module.

Serial Number

Serial number assigned to the storage module.

Status

Status of this FRU component. Valid values are:

  • OK
  • Uninstalled
  • Degraded
  • Disabled
  • Failed
  • Critical
  • Unknown

Temp Sensor Ambient Temp

One of two temperature sensors on the storage module. If the temperature at this location is not within acceptable limits, an alarm is reported.

Temp Sensor Exp Junct Temp

One of two temperature sensors on the storage module. If the temperature at this location is not within acceptable limits, an alarm is reported.

Voltage Sensor 12 V In

The actual voltage of this 12 volt circuit. If the voltage is not within acceptable limits, an alarm is reported.

Voltage Sensor 3.3V

The actual voltage of this 3.3 volt circuit. If the voltage is not within acceptable limits, an alarm is reported.

Voltage Sensor 5V In

The actual voltage of this 5 volt circuit. If the voltage is not within acceptable limits, an alarm is reported.


System Controller Health Details Page

The system controller is available as part of the Sun Storage J4500 array. The system controller is a hot-swappable board that contains four LSI SAS x36 expanders. These expanders provide a redundant set of independent SAS fabrics (two expanders per fabric), enabling two paths to the array’s disk drives. The serial management port is reserved for Sun Service personnel only.

For more information about the system controller, refer to the hardware documentation for your array.

The following figure shows the Component Summary for the System Controller page.


TABLE 4-22 describes the buttons and fields on the System Controller Health Details page.


TABLE 4-22 Fields and Buttons on the System Controller Health Details Page

Field

Indicates

Availability

The availability of this system controller. Valid values are:

  • Running/Full Power
  • Degraded
  • Not Installed
  • Unknown

Caption

The general name of this FRU type.

Element Status

The status of this FRU component. Valid values are:

  • OK
  • Degraded
  • Error
  • Lost Communication

Enabled State

State of this FRU component. Valid values are:

  • Enabled
  • Removed
  • Other
  • Unknown

Expander 0 Host Path

The path the operating system uses to access this expander.

Expander 0 Name

The location of this expander.

Expander 0 Product Revision

Revision of the firmware on this expander.

Expander 0 Serial Number

The serial number assigned to this expander.

Expander 0 Status

The operating status of this expander. Valid values are OK or Failed.

Expander 1 Host Path

The path the operating system uses to access this expander.

Expander 1 Name

The location of this expander.

Expander 1 Product Revision

Revision of the firmware on this expander.

Expander 1 Serial Number

The serial number assigned to this expander.

Expander 1 Status

The operating status of this expander. Valid values are OK or Failed.

Expander 2 Host Path

The path the operating system uses to access this expander.

Expander 2 Name

The location of this expander.

Expander 2 Product Revision

Revision of the firmware on this expander.

Expander 2 Serial Number

The serial number assigned to this expander.

Expander 2 Status

The operating status of this expander. Valid values are OK or Failed.

Expander 3 Host Path

The path the operating system uses to access this expander.

Expander 3 Name

The location of this expander.

Expander 3 Product Revision

Revision of the firmware on this expander.

Expander 3 Serial Number

The serial number assigned to this expander.

Expander 3 Status

The operating status of this expander. Valid values are OK or Failed.

ID

Unique ID assigned to this controller.

Name

The name assigned to this controller.

Part Number

The part number assigned to this controller.

Physical ID

The physical ID associated with this controller.

Product Name

The model number of the array.

Product Firmware Version

The version of the firmware loaded on the controller.

Serial Number

Serial number assigned to the system controller.

Status

Status of this FRU component. Valid values are:

  • OK
  • Uninstalled
  • Degraded
  • Disabled
  • Failed
  • Critical
  • Unknown

Temp Sensor Ambient Temp

One of two temperature sensors on the system controller board. If the temperature at this location is not within acceptable limits, an alarm is reported.

Temp Sensor LM75 Temp Sensor

One of two temperature sensors on the system controller board. If the temperature at this location is not within acceptable limits, an alarm is reported.

Voltage Sensor 12 V In

The actual voltage of this 12 volt circuit. If the voltage is not within acceptable limits, an alarm is reported.

Voltage Sensor 3.3V Main

The actual voltage of this main 3.3 volt circuit. If the voltage is not within acceptable limits, an alarm is reported.

Voltage Sensor 3.3V Stby

The actual voltage of this standby 3.3 volt circuit. If the voltage is not within acceptable limits, an alarm is reported.

Voltage Sensor 5V In

The actual voltage of this 5 volt circuit. If the voltage is not within acceptable limits, an alarm is reported.

Voltage Sensor AIN0

The actual voltage of this AIN0 circuit. If the voltage is not within acceptable limits, an alarm is reported.

Voltage Sensor VCCP

The actual voltage of this VCCP circuit. If the voltage is not within acceptable limits, an alarm is reported.
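
A script that post-processes the values shown on this page might scan the four Expander Status fields as sketched below. The field names follow TABLE 4-22. The dictionary input and the function name are hypothetical, because this guide does not describe a programmatic export format for the page.

```python
# Hypothetical sketch: assumes the page fields have been copied into a dict.
def failed_expanders(fields, count=4):
    """Return the indexes of expanders whose status is reported as Failed."""
    failed = []
    for index in range(count):
        status = fields.get(f"Expander {index} Status")
        if status == "Failed":
            failed.append(index)
    return failed


# Made-up field values for illustration only.
page_fields = {
    "Expander 0 Status": "OK",
    "Expander 1 Status": "Failed",
    "Expander 2 Status": "OK",
    "Expander 3 Status": "OK",
}
print(failed_expanders(page_fields))
```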



Viewing Activity on All Arrays

The activity log lists user-initiated actions performed for all registered arrays, in chronological order. These actions may have been initiated through either the Sun StorageTek Common Array Manager or the command-line interface (CLI).


To View the Activity Log

1. In the navigation pane, click General Configuration > Activity Log.

The Activity Log Summary page is displayed.


TABLE 4-23 describes the fields on the Activity Log Summary page.


TABLE 4-23 Fields on the Activity Log Summary Page

Field

Description

Time

The date and time when an operation occurred on the array.

Event

The type of operation that occurred, including the creation, deletion, or modification of an object type.

Details

Details about the operation performed, including the specific object affected and whether the operation was successful.
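
For readers who copy log entries into their own tooling, the three fields in TABLE 4-23 map naturally onto a small record type. The following is a minimal sketch. The class name, the sample entries, and the use of Python datetime values are assumptions, not a format that the software exports.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ActivityEntry:
    """One row of the activity log (field names follow TABLE 4-23)."""
    time: datetime   # Time: when the operation occurred
    event: str       # Event: type of operation
    details: str     # Details: object affected and outcome


def in_chronological_order(entries):
    """Return the entries sorted oldest first."""
    return sorted(entries, key=lambda entry: entry.time)


# Made-up sample rows for illustration only.
entries = [
    ActivityEntry(datetime(2009, 3, 2, 14, 5), "Modification", "Array renamed; successful"),
    ActivityEntry(datetime(2009, 3, 1, 9, 30), "Creation", "Array registered; successful"),
]
for entry in in_chronological_order(entries):
    print(entry.time.isoformat(), entry.event, entry.details)
```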



Monitoring Storage Utilization

The Common Array Manager provides a graphical summary of the total storage capacity of an array and the number of disk drives that provide that storage.
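
The per-type figures on this page amount to a simple grouping calculation. The sketch below assumes that each disk is described by a type, a capacity in gigabytes, and a state. The record layout, the function name, the "OK" state, and the sample values are illustrative assumptions. The non-optimal state names are taken from the Non Optimal field in TABLE 4-24 and might not be exhaustive.

```python
from collections import defaultdict

# State names from the Non Optimal field in TABLE 4-24 (possibly not exhaustive).
NON_OPTIMAL_STATES = {
    "Unknown", "Failed", "Replaced", "Bypassed",
    "Unresponsive", "Removed", "Predictive Failure",
}


def summarize(disks):
    """Group disks by type and total their count, capacity, and non-optimal count."""
    summary = defaultdict(lambda: {"drives": 0, "total_capacity_gb": 0, "non_optimal": 0})
    for disk_type, capacity_gb, state in disks:
        row = summary[disk_type]
        row["drives"] += 1
        # Capacity is summed for every discovered disk, including non-optimal ones.
        row["total_capacity_gb"] += capacity_gb
        if state in NON_OPTIMAL_STATES:
            row["non_optimal"] += 1
    return dict(summary)


# Made-up inventory for illustration only.
disks = [("SAS", 146, "OK"), ("SAS", 146, "Failed"), ("SATA", 500, "OK")]
print(summarize(disks))
```

Note that a failed disk still contributes to the total capacity for its type, which matches the description of the Total Capacity field.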


TABLE 4-24 describes the buttons and fields on the Storage Utilization page.


TABLE 4-24 Fields on the Storage Utilization Page

Field

Description

Key

A color-coded key that corresponds to the type of disk drive represented in the pie chart.

Type

The type of disk drive: FC, SATA, or SAS.

Drives

The number of disk drives of the specified type.

Total Capacity

The sum of the capacities of all discovered disks, including spares and disks whose status is not optimal.

Non Optimal

The number of disk drives that are in any of the following states:

  • Unknown
  • Failed
  • Replaced
  • Bypassed
  • Unresponsive
  • Removed
  • Predictive Failure