CHAPTER 4

Running a Hardware Diagnostic Suite Test Session

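This chapter describes how to configure, run, schedule, and review a Hardware Diagnostic Suite test session. The main topics are:

  • Preparing Devices for a Test Session
  • Selecting Devices for a Test Session
  • Starting a Test Session
  • Monitoring a Test Session
  • Suspending, Resuming, and Stopping a Test Session
  • Reviewing Test Results
  • Resetting the Hardware Diagnostic Suite Console
  • Scheduling a Test Session
  • Running the Hardware Diagnostic Suite in a DR Environment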

The procedures in this chapter assume that the Hardware Diagnostic Suite is already running as described in Chapter 3.

See Appendix B for descriptions of all Hardware Diagnostic Suite console panels, buttons, and menus.


Preparing Devices for a Test Session

Some tests require media to be installed in the device before the test is run. See the appropriate test description in Appendix A for details, and install the necessary media before starting the test.


Selecting Devices for a Test Session

When the Hardware Diagnostic Suite window is displayed for a host, the system configuration is probed to display devices that can be tested. Select the device that you want to test in the hierarchy view. Expand the hierarchy view if device listings are collapsed.


To Select a Device to Test

1. If necessary, expand the hierarchy view to show the devices on the host by clicking one of the hierarchy view buttons (FIGURE 4-1).



Note - For more details about the Collapse/Expand Hierarchical View panel buttons, refer to Hierarchical View Panel.



 FIGURE 4-1 Expanding the Hierarchical View

Screen shot showing the device hierarchy.

2. Click the device or device group that you want to test.

The device is highlighted as shown in FIGURE 4-1.

By default, if you select another device, the previous device is no longer selected.

You can select individual devices, entire groups of devices, or the top level device (host) for testing with a single click at the appropriate level.

If you click a device, the Device Description panel shows additional information about that device.



Note - To select more than one device, hold down the Control key while clicking each device, or hold down the Shift key while clicking to select a whole section of devices. The Device Description panel on the right describes the last device that was selected.




To Reprobe the System for Devices

The Hierarchical View panel displays only the devices that the Hardware Diagnostic Suite agent recognized when the application was first started. If you add hot-pluggable devices or perform a dynamic reconfiguration after starting Hardware Diagnostic Suite, use the Reprobe function to check the system and update the list of testable devices.



Note - When you add a device to your system, you must first perform the appropriate task (such as a reconfiguration boot) to enable the Solaris kernel to recognize the device. Once the device is recognized by the Solaris operating environment, use the Reprobe command.



1. Select Reprobe for Devices from the Options drop-down menu, just above the Hierarchical View panel.

The Hardware Diagnostic Suite agent rechecks the system for all testable devices and displays them in the Hierarchical View panel.


Starting a Test Session

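Before you start a test session, decide which devices to test, which test mode to use (Full Test or Quick Check), and whether to run the session now or schedule it for later.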



Note - All tests are designed so they will not interfere with the applications that are currently running on a system.




To Run a Full Test Session Now

  • After selecting devices to test, click the Full Test button.

Functional tests for each selected device are run sequentially until all tests are complete.

For information about viewing the progress of the test session, refer to Monitoring a Test Session.


To Run a Quick Check Test Now

  • After selecting devices to test, click the Quick Check button.

Quick connectivity tests for each selected device are run sequentially until all tests are complete.

For information about viewing the progress of the test session, refer to Monitoring a Test Session.


Monitoring a Test Session

As a test session runs, the Hardware Diagnostic Suite console displays information about each device under test, the progress of each test, and the result of each test.


To Monitor the Tests in Progress

1. View the progress of each test as it runs (FIGURE 4-2).

As each device is tested, device information is shown in the Device Description panel, and test information is displayed in the Progress panel.

FIGURE 4-2 Device Description Panel and Progress Panel

Screen shot showing the device information in the Device Description panel, and the test progress in the Progress panel.

The Progress panel (FIGURE 4-2) displays the following information:

  • The device under test, the subtest that is currently running, and test messages.
  • A bar that represents the progress of the current test.
  • The status (passed/failed) of the previous test.

2. View the status of all tested devices in the Hierarchical View panel.

When a test on a device passes or fails, the condition is immediately displayed in the Hierarchical View panel (FIGURE 4-3). TABLE 4-2 describes the test indicators.

 FIGURE 4-3 Pass and Fail Conditions in the Hierarchy View

Screen shot showing devices with different colored text and different symbols according to test status: failed, successful, or unknown.

 

TABLE 4-2 Hierarchical View Panel Indicators

Indicator               Condition             Description
Unknown test symbol     Unknown               The device is in an unknown state, usually because it has not been tested or the test has not completed. The device name is displayed in black text.
Successful test symbol  Successful test pass  When a test completes with no failures detected, the device is marked with a green check mark in the Hierarchical View panel. The device name is displayed in green text.
Failed test symbol      Failed test pass      As soon as a failure is detected, the device is marked with this indicator. The device name and the group(s) that the failing device belongs to are displayed in red text. The red text highlights the hierarchy of devices involved in the detected failure. The information and error log files are updated with the error condition information. In addition, if you double-click the device, a pop-up window displays the error message.


3. To view additional information about a device, click the device name in the hierarchy view.

If the device is in an unknown (untested) state or has a successful test pass indicator, additional information about that device is displayed in the Device Description panel.

If the device shows a failing test indicator, a pop-up window displays more information about the failure (FIGURE 4-4). This failure information is also recorded in the error log. See Reviewing Test Results.

 FIGURE 4-4 Error Message Pop-up

Screen shot showing an example of an error message dialog box.
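
The color rules in TABLE 4-2 can be sketched in a few lines of Python. This is an illustration only, not the console's actual implementation: the `Status` enum, the `display_colors` function, and the device names in the usage note are hypothetical. The sketch also assumes that a group with no failing member keeps black text, because TABLE 4-2 describes only failures as affecting the groups above a device.

```python
from enum import Enum


class Status(Enum):
    """Test conditions from TABLE 4-2, with the text color used for each."""
    UNKNOWN = "black"
    PASSED = "green"
    FAILED = "red"


def display_colors(parents, results):
    """Return the text color for every node in the device hierarchy.

    parents: maps each node name to its parent group (None for the host).
    results: maps tested device names to a Status value.
    """
    # Untested nodes, including groups, start in the unknown (black) state.
    colors = {node: results.get(node, Status.UNKNOWN).value for node in parents}
    for device, status in results.items():
        if status is Status.FAILED:
            # A failure also turns every group above the device red.
            node = parents.get(device)
            while node is not None:
                colors[node] = Status.FAILED.value
                node = parents.get(node)
    return colors
```

For example, with a hypothetical host that contains a disk group (`c0t0d0` passed, `c0t1d0` failed) and an untested `cpu0`, the failing disk, the disk group, and the host are red, the passing disk is green, and `cpu0` stays black.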


Suspending, Resuming, and Stopping a Test Session

You can suspend, resume, or stop a Hardware Diagnostic Suite test session as described in the following procedures.


To Suspend a Test Session

1. While a test session is running, select the Options button to access the Options menu.

2. Select the Suspend option.

The Hardware Diagnostic Suite test session is suspended until you resume it. The Progress panel displays the message "Testing Suspended."


To Resume a Test Session

1. While a test session is suspended, select the Options button to access the Options menu.

2. Select Resume.

The Hardware Diagnostic Suite test session that was suspended starts to run again.


To Stop a Test Session

  • While a test session is running, click the Stop Testing button.

All testing stops.


Reviewing Test Results

In addition to the test results displayed in the Hierarchical View panel, two log files contain information about every Hardware Diagnostic Suite test session:

  • Information Log--Contains informative messages, such as start and stop times and pass and failure information. The information messages are recorded in the
    /var/opt/SUNWhwdiag/logs/hwdiag.info file.
  • Error Log--Contains all Hardware Diagnostic Suite error messages that have occurred during the test sessions. The error messages are recorded in the
    /var/opt/SUNWhwdiag/logs/hwdiag.err file.
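
Because both logs are plain files at the paths listed above, they can also be read outside the console. The following Python sketch returns the most recent lines of either log. The function name `last_lines` and the default of 20 lines are illustrative choices, and the sketch assumes the logs are line-oriented text that the current user has permission to read.

```python
from collections import deque
from pathlib import Path

# Log locations as given in this chapter.
INFO_LOG = Path("/var/opt/SUNWhwdiag/logs/hwdiag.info")
ERROR_LOG = Path("/var/opt/SUNWhwdiag/logs/hwdiag.err")


def last_lines(log_path, count=20):
    """Return the last `count` lines of a log file, or [] if it does not exist."""
    path = Path(log_path)
    if not path.is_file():
        return []
    # Undecodable bytes are replaced rather than raising an error.
    with path.open(errors="replace") as handle:
        # A bounded deque keeps only the newest `count` lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]
```

Calling `last_lines(ERROR_LOG)` on a host where the suite has run returns the newest error messages; on a host without the log file it returns an empty list.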

To View the Hardware Diagnostic Suite Log files

1. Select the Logs button, just above the Hierarchical View panel, to access the Logs menu.

2. Select the log (Information or Errors) that you want to view.

A window that contains the Hardware Diagnostic Suite messages is displayed.

TABLE 4-3 describes the types of error messages.

TABLE 4-3 Error Message Categories

Message Category  Description
FATAL             Severe errors that indicate a serious hardware failure was detected while testing the device. The problem might be so severe that the test was unable to communicate with the device in any way. The Hardware Diagnostic Suite test might have detected a data compare or a hardware error. These errors are recorded in the Error log file.
ERROR             A hardware error was detected, such as missing media, a loose cable, or a disconnection. This type of error is usually less severe than a fatal error. These errors are recorded in the Error log file.
WARNING           Some occurrence was detected that is not a hardware error. These messages are recorded in the Information log file.
INFO              Informative, nonerror type events such as start and stop times. These messages are recorded in the Information log file.
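
The routing of message categories to log files in TABLE 4-3 can be written as a small lookup. The Python sketch below is illustrative only. The names `log_for` and `count_categories` are hypothetical, and the counting helper assumes that each logged message contains its category keyword as a separate word; this chapter does not document the exact line format of the logs, so verify that assumption against real log output before relying on it.

```python
import re
from collections import Counter

# Category-to-file routing as described in TABLE 4-3.
LOG_FOR_CATEGORY = {
    "FATAL": "/var/opt/SUNWhwdiag/logs/hwdiag.err",
    "ERROR": "/var/opt/SUNWhwdiag/logs/hwdiag.err",
    "WARNING": "/var/opt/SUNWhwdiag/logs/hwdiag.info",
    "INFO": "/var/opt/SUNWhwdiag/logs/hwdiag.info",
}

# ASSUMPTION: the category appears in the message as a standalone word.
CATEGORY_WORD = re.compile(r"\b(FATAL|ERROR|WARNING|INFO)\b")


def log_for(category):
    """Return the log file that records messages of the given category."""
    return LOG_FOR_CATEGORY[category.upper()]


def count_categories(lines):
    """Count messages per category, skipping lines with no category word."""
    counts = Counter()
    for line in lines:
        match = CATEGORY_WORD.search(line)
        if match:
            counts[match.group(1)] += 1
    return dict(counts)
```

A count of FATAL and ERROR messages gives a quick summary of a long session before you open the Error log in the console.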



Resetting the Hardware Diagnostic Suite Console

To clear the Hardware Diagnostic Suite console of previous test information, perform a reset as described below.


To Reset the Console

1. Select the Options button to access the Options menu.

2. Select the Reset option.

All previous test results are cleared from the console.



Note - The Hardware Diagnostic Suite log files are not cleared.




Scheduling a Test Session

The Hardware Diagnostic Suite scheduling function creates entries in the superuser's crontab file. When the start date and time criteria are met, the test session, as configured in the Scheduler, starts automatically. You do not need to start Sun Management Center software to run a scheduled test session.

To check the results of any prior test session, view the Hardware Diagnostic Suite log files as described in To View the Hardware Diagnostic Suite Log files.


To Schedule a Test Session

1. In the Hardware Diagnostic Suite console, click the Schedule button.

The Schedule panel with scheduling instructions is displayed (FIGURE 4-5).

 FIGURE 4-5 Schedule Panel

Screen shot showing the scheduling panel. Existing schedules are displayed. Buttons are New, Delete, Modify, and Close.

Note - A scheduled test session will not start if Hardware Diagnostic Suite is already running a test session at that time.



2. Select the New button.

The Schedule Form is displayed (FIGURE 4-6).

 FIGURE 4-6 Schedule Form

Screen shot of test scheduling form.

3. Enter a schedule name in the Name field.

You can use the name that is shown in the Name field (the Hardware Diagnostic Suite displays a unique name each time a schedule is created), or specify a different name. The following naming rules apply:

  • The name must be a unique schedule name.
  • The name must be between 1 and 20 alphanumeric characters.
  • The only permitted nonalphanumeric character is the _ (underscore).

4. Enter the start time for the test session that you are scheduling.

You can use the 24-hour clock settings in 15-minute intervals that are located in the pull-down list, or type in your own start time in the Start Time field.

5. Enter the date for the test session in the Run On field.

  • Choose the Periodic tab (FIGURE 4-6) to create a schedule that runs the Hardware Diagnostic Suite test session at regular intervals. Select the days of the week that you want testing to occur. This schedule remains in effect until you delete or modify it.
  • Choose the One Time tab (FIGURE 4-6) to create a schedule that runs only once. Specify the date using the mm/dd/yyyy format. After it runs, the schedule remains in the list of schedules so that you can modify it if you want it to run again. You must delete the schedule to remove it from the list.

6. Configure the test mode and the devices to test in the Configuration field.

There are two methods to do this:

  • Choose the Custom tab (FIGURE 4-6) to create a schedule that tests the devices selected in the Hierarchical View panel:

i. Select either Full Test or Quick Check for the test mode (see TABLE 4-1 for test mode descriptions).

ii. Select the devices to test in the Hierarchical View panel.

  • Choose the Pre-packaged tab (FIGURE 4-6) to create a schedule that runs a predefined Hardware Diagnostic Suite test session, and select one of the predefined tests as described in TABLE 4-4.

    TABLE 4-4 Predefined Tests

    Test Name           Description
    Connection Check    Sets up a schedule to run Quick Check tests on all available devices.
    Functional Check    Sets up a schedule to run Full Tests on all available devices.
    Processor(s) Check  Sets up a schedule that runs the Processor test (in Full Test mode) on all processors in the system.
    Hard Disk Check     Sets up a schedule that runs the Disk test (in Full Test mode) on all the disks in the system.
    Odd Disk Testing    Sets up a schedule that runs the Disk test (in Full Test mode) on every other disk in the system, beginning with the first disk (as seen in the Hierarchical View panel). This test is useful when there are many disks in the system.
    Even Disk Testing   Sets up a schedule that runs the Disk test (in Full Test mode) on every other disk in the system, beginning with the second disk (as seen in the Hierarchical View panel). This test is useful when there are many disks in the system.


7. Apply your test session schedule information by clicking the OK button.

Your schedule information is applied, the Schedule Form is closed, and the Schedule panel is displayed. Your new Hardware Diagnostic Suite test session schedule is listed in the Existing Schedules list (FIGURE 4-7).



Note - For descriptions of all Scheduling buttons, refer to Schedule Form Buttons.



 FIGURE 4-7 Existing Schedules List

Screen shot showing the existing schedules list. Buttons are New, Delete, Modify, and Close.

8. Exit the scheduling function by clicking the Close button.

The Schedule panel is closed.
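
The rules in this procedure (the schedule naming rules, the 15-minute start time choices, the Periodic and One Time options, and the Odd and Even Disk Testing split in TABLE 4-4) can be sketched in Python as follows. This is a hedged illustration, not the Scheduler's implementation. All function names are hypothetical, "alphanumeric" is assumed to mean ASCII letters and digits, and only the five time fields of a crontab entry are built, because this chapter does not document the command that the Scheduler writes. Note also that standard crontab time fields have no year, so the one-time fields alone would match the same date every year.

```python
import re
from datetime import datetime

# ASSUMPTION: 1 to 20 ASCII letters, digits, or underscores.
NAME_RULE = re.compile(r"[A-Za-z0-9_]{1,20}")
DAYS = {"sun": 0, "mon": 1, "tue": 2, "wed": 3, "thu": 4, "fri": 5, "sat": 6}


def is_valid_name(name, existing=()):
    """Check the naming rules: allowed characters, length, and uniqueness."""
    return NAME_RULE.fullmatch(name) is not None and name not in existing


def start_time_choices():
    """Return 24-hour start times in 15-minute steps, like the pull-down list."""
    return [f"{hour:02d}:{minute:02d}" for hour in range(24) for minute in (0, 15, 30, 45)]


def cron_fields(start_time, days=None, date=None):
    """Build crontab time fields: minute hour day-of-month month day-of-week.

    Pass `days` (for example ["Mon", "Fri"]) for a Periodic schedule, or
    `date` in mm/dd/yyyy format for a One Time schedule, but not both.
    """
    if (days is None) == (date is None):
        raise ValueError("give either days (Periodic) or date (One Time)")
    start = datetime.strptime(start_time, "%H:%M")
    if days is not None:
        if not days:
            raise ValueError("select at least one day of the week")
        numbers = sorted(DAYS[day.lower()[:3]] for day in days)
        weekdays = ",".join(str(number) for number in numbers)
        return f"{start.minute} {start.hour} * * {weekdays}"
    run_on = datetime.strptime(date, "%m/%d/%Y")
    return f"{start.minute} {start.hour} {run_on.day} {run_on.month} *"


def split_disks(disks):
    """Split disks as Odd Disk Testing (1st, 3rd, ...) and Even Disk Testing (2nd, 4th, ...)."""
    return disks[0::2], disks[1::2]
```

For example, a Periodic schedule at 02:30 on Monday and Friday yields the fields `30 2 * * 1,5`. To see the entries that the Scheduler actually created, list the superuser's crontab file.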


To Modify a Schedule

1. Select the Schedule button.

The Hardware Diagnostic Suite displays the Schedule panel with the list of schedules.

2. Select the schedule that you want to modify.

The schedule is highlighted.

3. Select the Modify button.

The Schedule Form is displayed (FIGURE 4-6).

4. Change the schedule entries as needed.



Note - If you change the name of the schedule, the Hardware Diagnostic Suite creates another schedule with the newly specified name. It does not modify the name of the original schedule.



5. Click the OK button to apply your changes.

6. Click the Close button to close the Schedule panel.


To Delete a Schedule

1. Select the Schedule button.

The Schedule panel with the list of schedules is displayed.

2. Click the schedule that you want to delete.

The schedule is highlighted.

3. Select the Delete button.

The selected schedule is deleted and removed from the list.

4. Select the Close button to close the Schedule panel.


Running the Hardware Diagnostic Suite in a DR Environment

The Hardware Diagnostic Suite agent is aware of dynamic reconfiguration (DR) operations that are performed when you use the cfgadm command (unconfigure or configure). When the Hardware Diagnostic Suite is running and a DR operation is performed, the console is replaced with a message indicating that a DR event is taking place. When the DR operation is finished, the Hardware Diagnostic Suite reprobes the system to determine and display all testable devices.



Note - The Hardware Diagnostic Suite does not automatically reprobe the devices after a DR power-on or power-off operation. To test the devices that were added after a power-on, perform a reprobe from the Options menu.