CHAPTER 4
Setting Up Server Blades and Performing Initial Diagnostics
This chapter tells you how to power on a server blade and access its console. It then tells you how to perform preliminary diagnostics using the available tools other than the Advanced Lights-out Management Software, which is described in the Sun Fire B1600 Blade System Chassis Software Setup Guide.
For general information about running diagnostics on Solaris systems, refer to the OpenBoot Command Reference Manual and the SunVTS User's Guide. These are available on the Software Supplement CD supplied with the Solaris Media Kit. You can also access them from:
http://www.sun.com/documentation
The chapter contains the following sections:
Note - Whenever you are at a blade console, type #. to return to the active System Controller's sc> prompt.
When you apply power to a SPARC Solaris B100s server blade that is in its factory default state, the blade boots automatically from an operating system stub on its local hard disk. It then searches for a Network Install Server from which to complete the Operating Environment installation process.
To set up a Network Install Server, follow the instructions in the Solaris Advanced Installation Guide (supplied with the Solaris 8 12/02 media kit).
For supplementary information about using Web Start Flash Archives to speed up the process of configuring a series of server blades in a system chassis, refer to Appendix C in this manual.
Before you can use a Linux or Solaris x86 blade, you need to configure it temporarily to boot from the network. This is to enable it to perform the PXE boot process by which it first receives its operating system.
To set up a PXE server, follow the instructions in the Sun Fire B100x and B200x Server Blade Installation and Setup Guide.
Type the following command at the System Controller's sc> prompt to cause the blade to boot from the network:
where n is the number of the slot containing the blade.
When you are ready, power on a server blade and boot it by following the instructions below:
where n is the number of the slot containing the server blade.
2. Log in to the console of the server blade to view (or take part in) the boot process.
Type the following at the sc> prompt to access the blade's console:
where n is the number of the slot containing the blade.
Your next action depends on which of the Solaris installation methods you have chosen from the Solaris Advanced Installation Guide.
3. For SPARC Solaris blades, you can interrupt the boot process if required, either to control it yourself or to run diagnostics.
To interrupt the boot process, type:
where n is the number of the slot containing the blade.
4. Follow the instructions in the remainder of this chapter if you want to perform initial diagnostics on a SPARC Solaris server blade.
For information about performing diagnostics on a Sun Fire B10n Content Load Balancing Blade, refer to the Sun Fire B10n Content Load Balancing Administration Guide.
Note - Whenever you are at a blade console, type #. to return to the active System Controller's sc> prompt.
This section tells you how to control the POST diagnostic process that (by default) takes place on a B100s (SPARC Solaris) blade during booting.
POST diagnostics offer three levels of diagnostic testing: max, min, and off.
Set the level you require by using the OpenBoot PROM variable diag-level. The default setting for diag-level is min. To set it, type:
where level is min, max, or off.
You can use the System Controller's bootmode command to override the diag-level and diag-switch? settings temporarily.
To cause the server blade to boot with diagnostics when it is not configured to do so:
a. Type #. to return to the System Controller's command-line interface.
where n is the number of the slot whose blade you are intending to configure.
The effect of this command is equivalent to setting diag-switch? to true and diag-level to min for a single boot only. (If diag-level on the blade is already set to max or min, the bootmode command does not alter its setting.)
To cause the server blade to boot without running diagnostics when it is configured to run diagnostics:
a. Type #. to return to the System Controller's command-line interface.
where n is the number of the slot whose blade you are configuring.
The effect of this command is equivalent to the effect of setting diag-switch? to false.
If the OpenBoot PROM (OBP) variable diag-switch? is set to true, then POST diagnostics will run automatically when you power on the server. However, the default setting for diag-switch? is false.
To initialize POST diagnostics, you need to set the diag-switch? variable to true and diag-level to max or min (and not off). When you have done this, you need to reset the server blade. Follow the instructions below:
1. From the ok prompt on the server blade, type:
2. Type #. to return to the System Controller's command-line interface.
where n is the slot number of the blade.
4. Within two to three seconds of powering on the blade (if possible), access the blade's console to view the diagnostics output.
5. When booting is complete, you can inspect the boot-time console output by typing #. to return to the System Controller's command-line interface and then typing:
If POST detects an error, it displays an error message describing the failure.
If POST detects a "fatal" error (for example, a hardware problem with the onboard memory or the CPU), it powers off the server blade and lights the blade's Fault LED.
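The interaction between diag-switch?, diag-level, and a temporary bootmode override described in this section can be summarized as a small decision function. The following Python sketch is illustrative only: the function name and the override labels "diag" and "skip_diag" are hypothetical names chosen for this example and are not necessarily the arguments the System Controller accepts. It assumes the behavior stated above, namely that POST runs only when diag-switch? is true and diag-level is not off.

```python
from typing import Optional, Tuple

VALID_LEVELS = ("min", "max", "off")


def effective_post_settings(diag_switch: bool,
                            diag_level: str,
                            override: Optional[str] = None) -> Tuple[bool, str]:
    """Return (post_runs, level) for the next boot.

    `override` models a one-time bootmode setting. The labels "diag" and
    "skip_diag" are hypothetical names used only in this sketch.
    """
    if diag_level not in VALID_LEVELS:
        raise ValueError(f"unknown diag-level: {diag_level!r}")

    if override == "diag":
        # Equivalent to diag-switch? = true for one boot. An existing
        # min or max level is left alone; only "off" is raised to "min".
        diag_switch = True
        if diag_level == "off":
            diag_level = "min"
    elif override == "skip_diag":
        # Equivalent to diag-switch? = false for one boot.
        diag_switch = False
    elif override is not None:
        raise ValueError(f"unknown override: {override!r}")

    # POST runs only if it is switched on and the level is not "off".
    post_runs = diag_switch and diag_level != "off"
    return post_runs, diag_level


if __name__ == "__main__":
    # Factory defaults described above: diag-switch? false, diag-level min.
    print(effective_post_settings(False, "min"))
    # Same blade with a one-time diagnostic override.
    print(effective_post_settings(False, "min", override="diag"))
```

The sketch models only the outcome for the next boot; it does not change any setting stored on the blade.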
To run OpenBoot Diagnostics, do the following:
This displays the OpenBoot Diagnostics menu:
The tests are described in TABLE 4-2. Note the number that corresponds to the test you want to perform, and use it with the test command. For example, to run a test on the primary Ethernet port, type:
3. When you have finished testing, exit OpenBoot Diagnostics and restore the value of auto-boot? to true.
The function of each test is shown below.
This section describes the OpenBoot PROM commands you can run and explains what each command does.
Use the OpenBoot PROM show-devs command to list the devices in the OBP device tree.
Use the OpenBoot PROM printenv command to display the OpenBoot PROM configuration variables stored in the system NVRAM. The display includes both the current and the default values of these variables. You can also name a variable to display the current value of that variable only. For example, typing printenv diag-level prints the current value of the diag-level variable.
The watch-clock command displays a number that increments once per second. During normal operation the seconds counter repeatedly increments from 0 to 59. The following shows an example snapshot of output from the watch-clock command.
ok watch-clock
Watching the `seconds' register of the real time clock chip.
It should be `ticking' once a second.
Type any key to stop.
4
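If you record the values shown by watch-clock, the expected behavior (a counter that advances by one each second and wraps from 59 to 0) can be checked mechanically. The following Python sketch is an illustration only; the function name is hypothetical, and it assumes that you have captured one reading per second as a list of integers.

```python
from typing import Sequence


def clock_is_ticking(samples: Sequence[int]) -> bool:
    """Return True if consecutive once-per-second readings advance by
    exactly one and wrap from 59 back to 0.

    Assumes `samples` holds one reading per second, in order.
    """
    if len(samples) < 2:
        # A single reading cannot show that the counter is advancing.
        return False
    if any(s < 0 or s > 59 for s in samples):
        return False
    return all(b == (a + 1) % 60 for a, b in zip(samples, samples[1:]))


if __name__ == "__main__":
    print(clock_is_ticking([57, 58, 59, 0, 1]))  # counter wraps correctly
    print(clock_is_ticking([4, 4, 5]))           # counter stalled for a second
```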
The watch-net and watch-net-all commands monitor Ethernet packets on the blade's Ethernet interfaces. Good packets received are indicated by a period (.). Errors such as framing errors and cyclic redundancy check (CRC) errors are indicated by an X and an associated error description.
The following examples show the output of the watch-net and watch-net-all commands.
ok watch-net
1000 Mbps FDX Link up
Link is -- up
Looking for Ethernet Packets.
`.' is a Good Packet. `X' is a Bad Packet.
Type any key to stop.
................................
ok
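The packet markers in this output can be tallied to give a quick measure of link quality. The following Python sketch is an illustration only; the function name is hypothetical. It assumes that you pass it just the marker characters (periods and X characters) copied from the console, without the banner text or any error descriptions.

```python
from typing import Tuple


def tally_packets(markers: str) -> Tuple[int, int]:
    """Count good ('.') and bad ('X') packet markers.

    Whitespace is ignored. Any other character raises ValueError, because
    this sketch assumes the input contains marker characters only.
    """
    good = bad = 0
    for ch in markers:
        if ch == ".":
            good += 1
        elif ch == "X":
            bad += 1
        elif ch.isspace():
            continue
        else:
            raise ValueError(f"unexpected character in marker text: {ch!r}")
    return good, bad


if __name__ == "__main__":
    good, bad = tally_packets("....X...\n..")
    print(f"good={good} bad={bad}")
```

A run consisting only of periods, as in the example output above, gives a bad-packet count of zero.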
The probe-ide command causes the IDE controller on the blade to send an enquiry to each of its four possible IDE devices (in fact there is only ever one device connected to the IDE controller). If you observe a not present response for the primary master device, this indicates a problem with the hard disk or with the connection to the hard disk from the IDE controller.
SunVTS, the Sun Validation and Test Suite, is an online diagnostics tool that you can use to verify the configuration and functionality of hardware controllers, devices, and platforms. SunVTS is available from the Software Supplement for the Solaris Operating Environment CD.
You need to run it from a Solaris prompt:
SunVTS software lets you view and control a testing session on a remotely connected server. Below is a list of example tests:
To check whether SunVTS is already installed on a server blade, type:
SunVTS is distributed on the Software Supplement for the Solaris Operating Environment CD. For information about installing it, refer to the Sun Hardware Platform Guide. The default directory to use when you install SunVTS software is /opt/SUNWvts.
To test a Sun Fire B100s server blade by running a SunVTS session from a workstation using the SunVTS graphical user interface, follow the procedure below:
1. Use the xhost command on the workstation to give the server blade access to the local display.
where remote_hostname is the host name of the server blade.
2. Remotely log in to the server blade as superuser or root.
where local_hostname is the name of the workstation you are using.
Note - The directory /opt/SUNWvts/bin is the default directory for SunVTS software. If you have the software installed in a different directory, use that path instead.
When you start SunVTS software, the SunVTS kernel probes the test system devices and displays the results on the Test Selection panel. There is an associated SunVTS test for each hardware device on your system.
You can fine-tune your testing session by selecting the appropriate check boxes for each of the tests you want to run.
Copyright © 2004, Sun Microsystems, Inc. All rights reserved.