CHAPTER 4

Setting Up Server Blades and Performing Initial Diagnostics

This chapter tells you how to power on a server blade and access its console. It then tells you how to perform preliminary diagnostics using the various tools (apart from the Advanced Lights-out Management Software described in the Sun Fire B1600 Blade System Chassis Software Setup Guide) that are available.

For general information about running diagnostics on Solaris systems, refer to the OpenBoot Command Reference Manual and the SunVTS User's Guide. These are available on the Software Supplement CD supplied with the Solaris Media Kit. You can also access them from:

http://www.sun.com/documentation

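The chapter contains the following sections:

Section 4.1, "Booting and Powering On Server Blades"
Section 4.2, "Using Power-on Self-test (POST) Diagnostics on B100s Blades"
Section 4.3, "Using OpenBoot Diagnostics (obdiag) on SPARC Solaris Blades"
Section 4.4, "Using Other OpenBoot PROM Commands on SPARC Solaris Blades"
Section 4.5, "Using SunVTS on SPARC Solaris Blades"
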

Note - Whenever you are at a blade console, type #. to return to the active System Controller's sc> prompt.




4.1 Booting and Powering On Server Blades

4.1.1 Booting SPARC Solaris B100s Blades

When you apply power to a SPARC Solaris B100s server blade that is in its factory default state, the blade boots automatically from an operating system stub on its local hard disk. It then searches for a Network Install Server from which to complete the Operating Environment installation process.

To set up a Network Install Server, follow the instructions in the Solaris Advanced Installation Guide (supplied with the Solaris 8 12/02 media kit).

For supplementary information about using Web Start Flash Archives to speed up the process of configuring a series of server blades in a system chassis, refer to Appendix C in this manual.

4.1.2 Booting Linux or Solaris x86 B100x or B200x Blades for the First Time

Before you can use a Linux or Solaris x86 blade, you need to configure it temporarily to boot from the network. This is to enable it to perform the PXE boot process by which it first receives its operating system.

To set up a PXE server, follow the instructions in the Sun Fire B100x and B200x Server Blade Installation and Setup Guide.

Type the following command at the System Controller's sc> prompt to cause the blade to boot from the network:

sc> bootmode bootscript="boot net" sn

where n is the number of the slot containing the blade.



Note - This command is effective for 10 minutes. After that, the BIOS reverts to its previous booting behavior. Therefore, to cause the blade to boot from the network, you must power it on within 10 minutes of running the bootmode command. If the blade was already powered on when you ran the bootmode command, you must instead reset the blade within 10 minutes by typing:
sc> reset sn
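
The ordering described in this note can be sketched as a small shell function. This is an illustration only: the function name pxe_boot_cmds and its arguments are hypothetical, and the function merely prints the sc> commands in the order they should be typed. It does not connect to the System Controller.

```shell
# Print the sc> commands that network-boot the blade in a given slot.
# $1: slot number; $2: current power state of the blade, "off" (default) or "on".
# pxe_boot_cmds is a hypothetical helper, not part of the System Controller software.
pxe_boot_cmds() {
    slot=${1:-}
    state=${2:-off}
    case $slot in
        ''|*[!0-9]*) echo "usage: pxe_boot_cmds slot [on|off]" >&2; return 1 ;;
    esac
    # The bootmode setting lasts only 10 minutes, so the second command
    # must be typed promptly after the first.
    printf 'bootmode bootscript="boot net" s%s\n' "$slot"
    if [ "$state" = on ]; then
        printf 'reset s%s\n' "$slot"      # blade already running: reset it
    else
        printf 'poweron s%s\n' "$slot"    # blade powered off: power it on
    fi
}
```

For example, pxe_boot_cmds 3 on prints the bootmode command for slot 3 followed by reset s3.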



4.1.3 Powering on the Blades

When you are ready, power on a server blade and boot it by following the instructions below:

1. Power on the blade.

Type:

sc> poweron sn

where n is the number of the slot containing the server blade.

2. Log into the console of the server blade to view (and/or participate in) the booting process.

Type the following at the sc> prompt to access the blade's console:

sc> console sn

where n is the number of the slot containing the blade.

Your next action depends on which of the Solaris installation methods you have chosen from the Solaris Advanced Installation Guide.

3. For SPARC Solaris blades, you can, if required, interrupt the boot process either to control it yourself or to run diagnostics.

To interrupt the boot process[1], type:

sc> break sn

where n is the number of the slot containing the blade.

4. Follow the instructions in the remainder of this chapter if you want to perform initial diagnostics on a SPARC Solaris server blade.

For information about performing diagnostics on a Sun Fire B10n Content Load Balancing Blade, refer to the Sun Fire B10n Content Load Balancing Administration Guide.
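
If you are bringing up several blades, the poweron command from step 1 can be generated for each slot in turn. The following is a minimal sketch under stated assumptions: poweron_cmds is a hypothetical helper name, and it only prints one poweron command per slot for you to type (or paste) at the sc> prompt.

```shell
# Print an sc> poweron command for each slot number given as an argument.
# poweron_cmds is a hypothetical helper; it prints text and does nothing else.
poweron_cmds() {
    for slot in "$@"; do
        case $slot in
            ''|*[!0-9]*) echo "not a slot number: $slot" >&2; return 1 ;;
        esac
        printf 'poweron s%s\n' "$slot"
    done
}
```

For example, poweron_cmds 0 1 2 prints the poweron commands for slots 0, 1, and 2, one per line.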



Note - Whenever you are at a blade console, type #. to return to the active System Controller's sc> prompt.




4.2 Using Power-on Self-test (POST) Diagnostics on B100s Blades

This section tells you how to control the POST diagnostic process that (by default) takes place on a B100s (SPARC Solaris) blade during booting.

4.2.1 Controlling the Amount of Diagnostic Testing

There are three levels of diagnostic testing available for POST diagnostics:

max (maximum level of testing)
min (minimum level of testing)
off (no testing)

Set the level you require by using the OpenBoot PROM variable diag-level. The default setting for diag-level is min. To set it, type:

ok setenv diag-level level

where level is min, max, or off.
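
As an illustration, the following sketch checks a requested level against the three permitted values before printing the command to type at the ok prompt. It assumes the variable is set with setenv, in the same way as diag-switch? later in this chapter. The function name diag_level_cmd is hypothetical.

```shell
# Print the OpenBoot PROM command that sets diag-level, rejecting
# any value other than min, max, or off.
# diag_level_cmd is a hypothetical helper; it only prints text.
diag_level_cmd() {
    case ${1:-} in
        min|max|off) printf 'setenv diag-level %s\n' "$1" ;;
        *) echo "diag-level must be min, max, or off" >&2; return 1 ;;
    esac
}
```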

4.2.2 Overriding the Blade's Diagnostic Settings From the System Controller

You can use the System Controller's bootmode command to override the diag-level and diag-switch? settings temporarily.

To cause the server blade to boot with diagnostics when it is not configured to do so:

a. Type #. to return to the System Controller's command-line interface.

b. Type:

sc> bootmode diag sn

where n is the number of the slot whose blade you are intending to configure.

The effect of this command is equivalent to the effect of setting diag-switch? to true and diag-level to min for a single booting only. (If diag-level on the blade is set to max or min, the bootmode command does not alter its setting.)

To cause the server blade to boot without running diagnostics when it is configured to run diagnostics:

a. Type #. to return to the System Controller's command-line interface.

b. Type:

sc> bootmode skip_diag sn

where n is the number of the slot whose blade you are configuring.

The effect of this command is equivalent to the effect of setting diag-switch? to false.
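
The way the two overrides combine with the blade's own settings can be modeled as follows. This sketch only restates the rules given in this section: it takes the blade's diag-switch? and diag-level values plus an optional override, and prints the settings that apply to the next boot. The function name effective_diag and its argument order are hypothetical.

```shell
# Print the diag-switch? and diag-level values in effect for the next boot.
# $1: diag-switch? on the blade (true or false)
# $2: diag-level on the blade (min, max, or off)
# $3: bootmode override from the System Controller: diag, skip_diag, or none (default)
# effective_diag is a hypothetical helper that models the rules in this section.
effective_diag() {
    switch=${1:-}
    level=${2:-}
    override=${3:-none}
    case $override in
        diag)
            switch=true
            # A level of max or min is left alone; only off is raised to min.
            if [ "$level" = off ]; then
                level=min
            fi
            ;;
        skip_diag)
            switch=false
            ;;
        none)
            ;;
        *)
            echo "unknown override: $override" >&2
            return 1
            ;;
    esac
    printf '%s %s\n' "$switch" "$level"
}
```

For example, effective_diag false max diag prints true max, because the override enables diagnostics but leaves the existing max level unchanged.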

4.2.3 Running POST Diagnostics

If the OpenBoot PROM (OBP) variable diag-switch? is set to true, then POST diagnostics will run automatically when you power on the server. However, the default setting for diag-switch? is false.

To initiate POST diagnostics, set the diag-switch? variable to true and diag-level to max or min (not off), and then reset the server blade. Follow the instructions below:

1. From the ok prompt on the server blade, type:

ok setenv diag-switch? true

2. Type #. to return to the System Controller's command-line interface.

3. Power cycle the blade:

Type:

sc> poweroff sn

where n is the slot number of the blade.

Then type:

sc> poweron sn

4. Within two to three seconds (if possible) of powering on the blade, access the blade's console to view the diagnostics output.

Type:

sc> console sn

5. When booting is complete, you can inspect the boot-time console output by typing #. to return to the System Controller's command-line interface and then typing:

sc> consolehistory boot sn

If POST detects an error, it displays an error message describing the failure.

If POST detects a "fatal" error (for example, a hardware problem with the onboard memory or the CPU), it powers off the server blade and lights the blade's Fault LED.
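
The System Controller commands used in steps 3 to 5 can be listed for a given slot with a sketch like the one below. The function name post_cycle_cmds is hypothetical, and the function only prints the commands. Remember that you must type #. at the blade console before the final consolehistory command can be entered at the sc> prompt.

```shell
# Print, in order, the sc> commands used to power cycle a blade,
# watch its POST output, and review the boot-time console log afterwards.
# post_cycle_cmds is a hypothetical helper; it only prints text.
post_cycle_cmds() {
    slot=${1:-}
    case $slot in
        ''|*[!0-9]*) echo "usage: post_cycle_cmds slot" >&2; return 1 ;;
    esac
    printf 'poweroff s%s\n' "$slot"
    printf 'poweron s%s\n' "$slot"
    printf 'console s%s\n' "$slot"               # attach within a few seconds of poweron
    printf 'consolehistory boot s%s\n' "$slot"   # typed after returning to sc> with #.
}
```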


4.3 Using OpenBoot Diagnostics (obdiag) on SPARC Solaris Blades

To run OpenBoot Diagnostics, do the following:

1. From the ok prompt, type:

ok setenv auto-boot? false
ok reset-all

2. Type:

ok obdiag

This displays the OpenBoot Diagnostics menu:

TABLE 4-1 The obdiag Menu

                          obdiag

 1 bscv@0,0          2 ide@d             3 network@a
 4 network@b         5 pmu@3             6 rtc@0,70
 7 serial@0,3f8

 Commands: test test-all except help what setenv exit

 diag-passes=1 diag-level=max test-args=


The tests are described in TABLE 4-2. Note the number that corresponds to the test you want to perform, and use it with the test command. For example, to run a test on the primary Ethernet port, type:

obdiag> test 3
Hit the spacebar to interrupt testing
Testing /pci@1f,0/network@a ...........................passed
Pass:1 (of 1) Errors:0 (of 0) Tests Failed:0 Elapsed Time: 0:0:0:2
 
Hit any key to return to the main menu.

3. When you have finished testing, exit OpenBoot Diagnostics and restore the value of auto-boot? to true.

To do this, type:

obdiag> exit
ok setenv auto-boot? true
auto-boot? = true
ok boot

The function of each test is shown below.

TABLE 4-2 OpenBoot Diagnostics Tests

1   bscv@0,0        Tests the Blade Support Chip
2   ide@d           Tests the IDE controller
3   network@a       Tests the primary Ethernet interface
4   network@b       Tests the secondary Ethernet interface
5   pmu@3           Tests the power management unit
6   rtc@0,70        Tests the real-time clock device
7   serial@0,3f8    Tests the serial interface to the System Controller
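
If you prefer to work from device names rather than menu numbers, the lookup can be sketched as below. The mapping is copied from the menu shown in this section and is an assumption for any blade whose obdiag menu lists the devices in a different order, so always check the numbers against the menu actually displayed. The function name obdiag_test_cmd is hypothetical.

```shell
# Print the obdiag command that tests a named device, using the
# numbering from the obdiag menu shown in this section.
# obdiag_test_cmd is a hypothetical helper; it only prints text.
obdiag_test_cmd() {
    case ${1:-} in
        bscv@0,0)     n=1 ;;
        ide@d)        n=2 ;;
        network@a)    n=3 ;;
        network@b)    n=4 ;;
        pmu@3)        n=5 ;;
        rtc@0,70)     n=6 ;;
        serial@0,3f8) n=7 ;;
        *) echo "unknown obdiag device: ${1:-}" >&2; return 1 ;;
    esac
    printf 'test %s\n' "$n"
}
```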



4.4 Using Other OpenBoot PROM Commands on SPARC Solaris Blades

This section describes the OpenBoot PROM commands you can run and explains what each command does.

The show-devs Command

Use the OpenBoot PROM show-devs command to list the devices in the OBP device tree.

The printenv Command

Use the OpenBoot PROM printenv command to display the OpenBoot PROM configuration variables stored in the system NVRAM. The display includes the current values for these variables as well as the default values. You can also specify a variable to display the current value for that variable only. For example, typing printenv diag-level will print the current value for the diag-level variable.

The watch-clock Command

The watch-clock command displays a number that increments once per second. During normal operation the seconds counter repeatedly increments from 0 to 59. The following shows an example snapshot of output from the watch-clock command.

ok watch-clock
Watching the `seconds' register of the real time clock chip.
It should be `ticking' once a second.
Type any key to stop.
4

The watch-net and watch-net-all Commands

The watch-net and watch-net-all commands monitor Ethernet packets on the blade's Ethernet interfaces. Good packets received are indicated by a period (.). Errors such as the framing error and the cyclic redundancy check (CRC) error are indicated with an X and an associated error description.

The following examples show output from the watch-net and watch-net-all commands.

ok watch-net
1000 Mbps FDX Link up
Link is -- up
Looking for Ethernet Packets.
`.' is a Good Packet. `X' is a Bad Packet.
Type any key to stop.
................................
ok

ok watch-net-all
/pci@1f,0/network@b
1000 Mbps FDX Link up
Link is -- up
Looking for Ethernet Packets.
`.' is a Good Packet. `X' is a Bad Packet.
Type any key to stop.
................................
/pci@1f,0/network@a
1000 Mbps FDX Link up
Link is -- up
Looking for Ethernet Packets.
`.' is a Good Packet. `X' is a Bad Packet.
Type any key to stop.
................................
ok
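
If you capture the console output of these commands to a file, the good and bad packet markers can be tallied with a sketch such as the following. It assumes that the markers appear on lines made up only of period and X characters, as in the examples above. Lines that mix a marker with an error description are not counted. The function name count_packets is hypothetical.

```shell
# Read captured watch-net output on standard input and print the number
# of good (.) and bad (X) packet markers.
# count_packets is a hypothetical helper. It counts only lines that
# consist entirely of '.' and 'X' characters.
count_packets() {
    marks=$(grep -E '^[.X]+$' | tr -d '\n')
    good=$(printf '%s' "$marks" | tr -cd '.' | wc -c)
    bad=$(printf '%s' "$marks" | tr -cd 'X' | wc -c)
    # The arithmetic expansion strips any padding that wc adds.
    echo "good=$(($good + 0)) bad=$(($bad + 0))"
}
```

For example, count_packets < capture.txt prints a single line of the form good=N bad=M, where capture.txt is a placeholder for your own log file.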

 
The probe-ide Command

The probe-ide command causes the IDE controller on the blade to send an enquiry to each of its four possible IDE devices (in fact there is only ever one device connected to the IDE controller). If you observe a not present response for the primary master device, this indicates a problem with the hard disk or with the connection to the hard disk from the IDE controller.

CODE EXAMPLE 4-1 probe-ide Output Message
ok probe-ide
 Device 0  ( Primary Master ) 
          ATA Model: TOSHIBA MK3019GAB
 
 Device 1  ( Primary Slave ) 
      Not Present
 
 Device 2  ( Secondary Master )
      Not Present
 
 Device 3  ( Secondary Slave )
      Not Present
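
The check described above (a Not Present response for the primary master) can be applied to captured probe-ide output with a sketch like this one. It assumes the output has the layout shown in CODE EXAMPLE 4-1, with each Device line followed by its response. The function name primary_master_present is hypothetical.

```shell
# Read captured probe-ide output on standard input. Succeed (exit 0) if
# Device 0, the primary master, is listed and is not reported as Not Present.
# primary_master_present is a hypothetical helper name.
primary_master_present() {
    awk '
        BEGIN { dev = -1 }                        # no device block seen yet
        $1 == "Device" { dev = $2 }               # remember the current device number
        dev == 0 { seen = 1 }                     # inside the primary master block
        dev == 0 && /Not Present/ { bad = 1 }     # primary master did not respond
        END { exit ((seen && !bad) ? 0 : 1) }
    '
}
```

For example, primary_master_present < probe.txt && echo "disk found" reports the disk only when the primary master responded, where probe.txt is a placeholder for your own capture.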
 


4.5 Using SunVTS on SPARC Solaris Blades

SunVTS, the Sun Validation and Test Suite, is an online diagnostics tool that you can use to verify the configuration and functionality of hardware controllers, devices, and platforms. SunVTS is available from the Software Supplement for the Solaris Operating Environment CD.

You need to run it from a Solaris prompt.

SunVTS software lets you view and control a testing session on a remotely connected server. Below is a list of example tests:

TABLE 4-3 SunVTS Tests

SunVTS Test   Description
disktest      Verifies local disk drives
fputest       Checks the floating-point unit
nettest       Checks the networking hardware on the system CPU board and on network adapters contained in the system
pmem          Tests the physical memory (read only)
vmem          Tests the virtual memory (a combination of the swap partition and the physical memory)
bsctest       Tests the Blade Support Chip on the server blade




Note - SunVTS is not currently available for B100x and B200x blades running Solaris x86. For information about tools for performing memory diagnostics on these blades, refer to the Sun Fire B100x and B200x Server Blade Installation and Setup Guide.



4.5.1 Finding Out If SunVTS Is Installed

To check whether SunVTS is already installed on a server blade, type:

# pkginfo -l SUNWvts
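
In a script, the same check can be made from the command's exit status. The sketch below assumes that pkginfo returns a nonzero exit status when the package is not found. PKGINFO is a hypothetical environment variable that lets you substitute another command for pkginfo when trying the function on a machine that does not have it. The function name sunvts_installed is also hypothetical.

```shell
# Report whether the SUNWvts package is installed, based on the exit
# status of "pkginfo -l SUNWvts".
# PKGINFO is a hypothetical override for the command name. If the command
# is missing altogether, the package is reported as not installed.
sunvts_installed() {
    if "${PKGINFO:-pkginfo}" -l SUNWvts >/dev/null 2>&1; then
        echo "SUNWvts is installed"
    else
        echo "SUNWvts is not installed"
        return 1
    fi
}
```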

4.5.2 Installing SunVTS

SunVTS is distributed on the Software Supplement for the Solaris Operating Environment CD. For information about installing it, refer to the Sun Hardware Platform Guide. The default directory to use when you install SunVTS software is /opt/SUNWvts.

4.5.3 Running SunVTS

To test a Sun Fire B100s server blade by running a SunVTS session from a workstation using the SunVTS graphical user interface, follow the procedure below:

1. Use the xhost command on the workstation to give the server blade access to the local display.

Type:

# /usr/openwin/bin/xhost +remote_hostname

where remote_hostname is the host name of the server blade.

2. Log in remotely to the server blade as superuser (root).

3. Type:

# cd /opt/SUNWvts/bin
# ./sunvts -display local_hostname:0

where local_hostname is the name of the workstation you are using.



Note - The directory /opt/SUNWvts/bin is the default directory for SunVTS software. If you have the software installed in a different directory, use that path instead.
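
The commands in step 3 can be assembled for a particular workstation and installation directory with a sketch such as the following. The function name sunvts_cmds is hypothetical. It only prints the two commands, using /opt/SUNWvts/bin unless you supply a different directory.

```shell
# Print the two commands that start SunVTS on the blade with its display
# directed to a workstation.
# $1: host name of the workstation; $2: SunVTS bin directory (optional).
# sunvts_cmds is a hypothetical helper; it only prints text.
sunvts_cmds() {
    host=${1:-}
    bindir=${2:-/opt/SUNWvts/bin}
    if [ -z "$host" ]; then
        echo "usage: sunvts_cmds local_hostname [bindir]" >&2
        return 1
    fi
    printf 'cd %s\n' "$bindir"
    printf './sunvts -display %s:0\n' "$host"
}
```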



When you start SunVTS software, the SunVTS kernel probes the test system devices and displays the results on the Test Selection panel. There is an associated SunVTS test for each hardware device on your system.

You can fine-tune your testing session by selecting the appropriate check boxes for each of the tests you want to run.


[1] For information about configuring a blade not to accept break commands, refer to the kbd(1) man page.