CHAPTER 11 |
Removing and Replacing Boards |
The Sun Fire 6800/4810/4800/3800 Systems Service Manual and the Sun Fire E6900/E4900 Systems Service Manual provide instructions for physically removing and replacing boards. However, board removal and replacement also involves firmware steps that must be performed before a board is removed from the system and after a new board replaces the old one. This chapter discusses the firmware steps involved with the removal and replacement of CPU/Memory boards and I/O assemblies, PCI and CompactPCI cards, Repeater boards, System Controller boards, and the ID board and centerplane.
This chapter also discusses how to unassign a board from a domain and disable the board.
To troubleshoot board and component failures, see Board and Component Failures. To remove and install the FrameManager, ID board, power supplies, and fan trays, refer to the Sun Fire 6800/4810/4800/3800 Systems Service Manual and the Sun Fire E6900/E4900 Systems Service Manual.
Before you begin, have the following books available: the service manual for your system, the Sun Hardware Platform Guide, and the Sun Fire Midrange Systems Dynamic Reconfiguration User Guide.
You will need these books for the Solaris operating environment steps and the hardware removal and installation steps. The Sun Hardware Platform Guide and the Sun Fire Midrange Systems Dynamic Reconfiguration User Guide are available with your Solaris operating environment release.
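The following procedures describe the software steps involved with removing and replacing a system board, unassigning a board from a domain or disabling it, and hot-swapping a CPU/Memory board or an I/O assembly.
Refer to the Sun Fire Midrange Systems Dynamic Reconfiguration User Guide for details on the dynamic reconfiguration (DR) operations that the hot-swap procedures use.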
To Remove and Replace a System Board |
This procedure does not involve dynamic reconfiguration commands.
1. Access the domain that contains the board or assembly to be removed by performing the following:
a. Connect to the domain console.
For details on accessing the domain console, see To Navigate Between The Platform Shell And a Domain and To Go From a Domain Shell To a Domain Console.
b. Halt the Solaris operating environment from the domain console as superuser.
c. Type the escape sequence to get to the domain shell prompt.
By default, the escape sequence is #. (a pound sign followed by a period).
The domain shell prompt is displayed.
2. Turn the domain keyswitch to the standby position with the setkeyswitch standby command, and then power off the board or assembly with the poweroff command.
    setkeyswitch standby
    poweroff board_name
where board_name is sb0 - sb5 or ib6 - ib9.
Verify that the green power LED is off.
3. Remove the board or assembly and replace with a new board or assembly.
Refer to the Sun Fire 6800/4810/4800/3800 Systems Service Manual or the Sun Fire E6900/E4900 Systems Service Manual.
4. Power on the board or assembly with the poweron command.
    poweron board_name
where board_name is sb0-sb5 or ib6-ib9.
5. Check the version of the firmware that is installed on the board by using the showboards command.
The firmware version of the new replacement board must be compatible with the system controller firmware.
6. If the firmware version of the replacement board or assembly is not compatible with the SC firmware, update the firmware on the board.
a. Use the flashupdate -c command to update the firmware from another board in the current domain.
For details on the flashupdate command syntax, refer to the command description in the Sun Fire Midrange System Controller Command Reference Manual.
b. If the board is in a Failed state (as indicated by showboards output) after you run the flashupdate command to update the board firmware to a compatible version, power off the board to clear the Failed state.
7. Before you bring an I/O assembly back to the Solaris operating environment, test the I/O assembly in a spare domain that contains at least one CPU/Memory board with a minimum of one CPU.
8. Turn the domain keyswitch to the on position with the setkeyswitch on command.
This command turns the domain on and boots the Solaris operating environment if the OpenBoot PROM parameters are set as follows:
If the Solaris operating environment did not boot automatically, continue with Step 9. If the appropriate OpenBoot PROM parameters are not set up to take you to the login: prompt, you will see the ok prompt. For more information on the OpenBoot PROM parameters, refer to the OpenBoot documentation included in the Sun Hardware Documentation Set.
9. At the ok prompt, type boot.
After the Solaris operating environment is booted, the login: prompt is displayed.
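The firmware side of this procedure can be summarized as an ordered command sequence. The following Python sketch is illustrative only: the function names are hypothetical, the code only builds the list of system controller commands named in the steps above (it does not connect to a system controller), and it assumes that the board is powered on with the poweron command. The optional flashupdate step is omitted because it applies only when the firmware versions are incompatible.

```python
import re
from typing import List, Tuple

# Board names accepted by this procedure: sb0-sb5 (CPU/Memory boards)
# and ib6-ib9 (I/O assemblies).
_BOARD_NAME = re.compile(r"(sb[0-5]|ib[6-9])")


def is_valid_board_name(name: str) -> bool:
    """Return True if name is one of sb0-sb5 or ib6-ib9 (case-insensitive)."""
    return _BOARD_NAME.fullmatch(name.lower()) is not None


def cold_swap_commands(board_name: str) -> Tuple[List[str], List[str]]:
    """Return (commands before the physical swap, commands after it).

    Hypothetical helper: it only assembles command strings for review.
    """
    if not is_valid_board_name(board_name):
        raise ValueError(f"unexpected board name: {board_name!r}")
    board = board_name.lower()
    before_swap = ["setkeyswitch standby", f"poweroff {board}"]
    after_swap = [f"poweron {board}", "showboards", "setkeyswitch on"]
    return before_swap, after_swap
```

For example, cold_swap_commands("sb2") returns the two commands to type before removing board sb2 and the three commands to type after the replacement board is seated.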
To Unassign a Board From a Domain or Disable a System Board |
If a CPU/Memory board or I/O assembly fails, perform one of the following tasks:
To Hot-Swap a CPU/Memory Board Using DR |
1. Use DR to unconfigure and disconnect the CPU/Memory board out of the domain.
Refer to the Sun Fire Midrange Systems Dynamic Reconfiguration User Guide.
2. Verify the state of the LEDs on the board.
Refer to the CPU/Memory board chapter of the Sun Fire 6800/4810/4800/3800 Systems Service Manual or the Sun Fire E6900/E4900 Systems Service Manual.
3. Remove and replace the board.
Refer to the CPU/Memory board chapter of the Sun Fire 6800/4810/4800/3800 Systems Service Manual or the Sun Fire E6900/E4900 Systems Service Manual.
4. Power on the board with the poweron command.
    poweron board_name
where board_name is sb0-sb5 or ib6-ib9.
5. Check the version of the firmware that is installed on the board by using the showboards command.
The firmware version of the new replacement board must be compatible with the system controller firmware.
6. If the firmware version of the replacement board or assembly is not compatible with the SC firmware, use the flashupdate -c command to update the firmware from another board in the current domain.
For a description of command syntax, refer to the flashupdate command in the Sun Fire Midrange System Controller Command Reference Manual.
7. Use DR to connect and configure the board back into the domain.
Refer to the Sun Fire Midrange Systems Dynamic Reconfiguration User Guide.
8. Verify the state of the LEDs on the board.
Refer to the CPU/Memory board chapter of the Sun Fire 6800/4810/4800/3800 Systems Service Manual or the Sun Fire E6900/E4900 Systems Service Manual.
To Hot-Swap an I/O Assembly Using DR |
The following procedure describes how to hot-swap an I/O assembly and test it in a spare domain that is not running the Solaris operating environment.
1. Use DR to unconfigure and disconnect the I/O assembly out of the domain.
Refer to the Sun Fire Midrange Systems Dynamic Reconfiguration User Guide.
2. Verify the state of the LEDs on the assembly.
Refer to the I/O assembly chapter of the Sun Fire 6800/4810/4800/3800 Systems Service Manual or the Sun Fire E6900/E4900 Systems Service Manual.
3. Remove and replace the assembly.
Refer to the I/O assembly chapter of the Sun Fire 6800/4810/4800/3800 Systems Service Manual or the Sun Fire E6900/E4900 Systems Service Manual.
4. Power on the assembly.
5. Check the version of the firmware that is installed on the assembly by using the showboards command.
The firmware version of the replacement assembly must be compatible with the system controller firmware.
6. If the firmware version of the replacement assembly is not compatible with the SC firmware, use the flashupdate -c command to update the firmware from another board in the current domain.
For details on the flashupdate command syntax, refer to the command description in the Sun Fire Midrange System Controller Command Reference Manual.
7. Before you bring the board back to the Solaris operating environment, test the I/O assembly in a spare domain that contains at least one CPU/Memory board with a minimum of one CPU.
For details, see Testing an I/O Assembly.
8. Use DR to connect and configure the assembly back into the domain running the Solaris operating environment.
Refer to the Sun Fire Midrange Systems Dynamic Reconfiguration User Guide.
If you need to remove and replace a CompactPCI or PCI card, use the procedures that follow. These procedures do not involve DR commands. For additional information on replacing CompactPCI and PCI cards, refer to the Sun Fire 6800/4810/4800/3800 Systems Service Manual or the Sun Fire E6900/E4900 Systems Service Manual.
To Remove and Replace a PCI Card |
1. Halt the Solaris operating environment in the domain, power off the I/O assembly, and remove it from the system.
Complete Step 1 and Step 2 in To Remove and Replace a System Board.
2. Remove and replace the card.
Refer to the Sun Fire 6800/4810/4800/3800 Systems Service Manual or the Sun Fire E6900/E4900 Systems Service Manual.
3. Replace the I/O assembly and power it on.
Complete Step 3 and Step 4 in To Remove and Replace a System Board.
4. Reconfigure booting of the Solaris operating environment in the domain.
At the ok prompt, type boot -r.
To Remove and Replace a CompactPCI Card |
1. Halt the Solaris operating environment in the domain, power off the I/O assembly, and remove it from the system.
Complete Step 1 and Step 2 in To Remove and Replace a System Board.
2. Remove the CompactPCI card from the I/O assembly and replace it.
For details, refer to the Sun Fire 6800/4810/4800/3800 Systems Service Manual or the Sun Fire E6900/E4900 Systems Service Manual.
3. Reconfigure booting of the Solaris operating environment in the domain.
At the ok prompt, type boot -r.
This section discusses the firmware steps necessary to remove and replace a Repeater board. Only the Sun Fire E6900/E4900/6800/4810/4800 systems have Repeater boards. The Sun Fire 3800 system has the equivalent of two Repeater boards on the active centerplane.
To Remove and Replace a Repeater Board |
1. Determine which domains are active by typing the system controller command showplatform -p status from the platform shell.
2. Determine which Repeater boards are connected to each domain (TABLE 11-1).
Sun Fire 3800 system: Equivalent of two Repeater boards integrated into the active centerplane. |
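3. Halt the Solaris operating environment in each domain that the Repeater board is connected to, and power off each of those domains.
Complete Step 1 through Step 3 in To Power Off the System.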
4. Power off the Repeater board with the poweroff command.
    poweroff board_name
where board_name is the name of the Repeater board (rp0, rp1, rp2, or rp3).
5. Verify that the green power LED is off.
Caution - Be sure you are properly grounded before you remove and replace the Repeater board. |
6. Remove and replace the Repeater board.
Refer to the Sun Fire 6800/4810/4800/3800 Systems Service Manual and the Sun Fire E6900/E4900 Systems Service Manual.
7. Boot each domain using the boot procedure described in To Power On the System.
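Steps 1 through 3 amount to intersecting the list of active domains with the domains that use the Repeater board being replaced. The Python sketch below illustrates that lookup under stated assumptions: the function name is hypothetical, and the domain-to-Repeater-board mapping is supplied by the caller from TABLE 11-1 for the system and partition mode in use. The mapping shown in the usage note is placeholder data, not the contents of that table.

```python
from typing import Dict, Iterable, List, Set

# Repeater board names accepted by the poweroff step of this procedure.
REPEATER_NAMES = {"rp0", "rp1", "rp2", "rp3"}


def domains_to_shut_down(
    board_name: str,
    active_domains: Iterable[str],
    domain_repeaters: Dict[str, Set[str]],
) -> List[str]:
    """Return the active domains that are connected to board_name.

    domain_repeaters maps a domain ID to the Repeater boards it uses;
    the caller fills it in from TABLE 11-1 (assumption of this sketch).
    """
    board = board_name.lower()
    if board not in REPEATER_NAMES:
        raise ValueError(f"unexpected Repeater board name: {board_name!r}")
    return sorted(
        domain
        for domain in active_domains
        if board in domain_repeaters.get(domain, set())
    )
```

With a placeholder mapping such as {"A": {"rp0", "rp2"}, "C": {"rp1", "rp3"}} and active domains A and C, replacing rp0 would require shutting down only domain A before the poweroff step.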
This section discusses how to remove and replace a System Controller board.
To Remove and Replace the System Controller Board in a Single SC Configuration |
Note - This procedure assumes that your system controller has failed and that there is no spare system controller. |
1. For each active domain, use an SSH or Telnet session to access the domain (see Chapter 2 for details), and halt the Solaris operating environment in the domain.
2. Turn off the system completely.
Caution - Be sure to power off the circuit breakers and the power supply switches for the Sun Fire 3800 system. Make sure you power off all the hardware components in the system. |
Refer to the "Powering Off and On" chapter in the Sun Fire 6800/4810/4800/3800 Systems Service Manual or the Sun Fire E6900/E4900 Systems Service Manual.
3. Remove the defective System Controller board and replace it with the new System Controller board.
Refer to the "System Controller Board" chapter in the Sun Fire 6800/4810/4800/3800 Systems Service Manual or the Sun Fire E6900/E4900 Systems Service Manual.
4. Check the firmware version of the replacement board by using the showsc command.
The firmware version of the new System Controller board must be compatible with other components in the system. If the firmware version is not compatible, use the flashupdate command to upgrade or downgrade the firmware on the new system controller board. Refer to the Install.info file for instructions on upgrading or downgrading system controller firmware.
5. Power on the redundant transfer units (RTUs), AC input boxes, and the power supply switches.
Refer to the "Powering Off and On" chapter in the Sun Fire 6800/4810/4800/3800 Systems Service Manual or the Sun Fire E6900/E4900 Systems Service Manual. When the specified hardware is powered on, the System Controller board will automatically power on.
6. Restore the platform and domain configurations with the restoreconfig command.
You can restore the latest platform and domain configurations with the restoreconfig command only if you previously saved them with the dumpconfig command. For command syntax and examples, see the restoreconfig command in the Sun Fire Midrange System Controller Command Reference Manual.
7. Check the date and time for the platform and each domain.
Type the showdate command in the platform shell and in each domain shell.
If you need to reset the date or time, go to Step 8. Otherwise, skip to Step 9.
8. Set the date and time for the platform and for each domain (if needed).
a. Set the date and time for the platform shell.
See the setdate command in the Sun Fire Midrange System Controller Command Reference Manual.
b. Set the date for each domain shell.
9. Check the configuration for the platform by typing showplatform at the platform shell. If necessary, run the setupplatform command to configure the platform.
See To Configure Platform Parameters.
10. Check the configuration for each domain by typing showdomain in each domain shell. If necessary, run the setupdomain command to configure each domain.
See To Configure Domain-Specific Parameters.
11. Boot the Solaris operating environment in each domain you want powered on.
12. Complete Step 4 and Step 5 in To Power On the System.
To Remove and Replace a System Controller Board in a Redundant SC Configuration |
1. Run the showsc or showfailover -v command to determine which system controller (SC) is the main.
2. If the working SC (the one that is not to be replaced) is not the main, perform a manual failover with the setfailover force command.
    setfailover force
The working system controller becomes the main SC.
3. Power off the system controller to be replaced with the poweroff command.
    poweroff component_name
where component_name is the name of the System Controller board to be replaced, either SSC0 or SSC1.
The System Controller board is powered off, and the hot-plug LED is illuminated. A message indicates when you can safely remove the system controller.
4. Remove the System Controller board to be replaced and insert the new System Controller board.
The new System Controller board powers on automatically.
5. Verify that the firmware on the new system controller matches the firmware on the working SC.
You can use the showsc command to check the firmware version (the ScApp version) running on the system controller. If the firmware versions do not match, use the flashupdate command to upgrade or downgrade the firmware on the new system controller so that it matches the firmware version of the other SC. Refer to the Install.info file for details.
6. Re-enable SC failover by running the setfailover on command on the main or spare SC.
    setfailover on
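The comparison in Step 5 decides whether the new system controller needs a firmware upgrade, a downgrade, or nothing. A minimal Python sketch of that decision follows. It assumes that the ScApp version reported by showsc can be read as a dotted sequence of integers (for example 5.17.0); that format and the function names are assumptions made for illustration, so adapt the parsing to the actual showsc output.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse a dotted version such as '5.17.0' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def flashupdate_action(new_sc_version: str, working_sc_version: str) -> str:
    """Return 'match', 'upgrade', or 'downgrade' for the new SC.

    The result describes what must happen to the new SC so that its
    firmware matches the firmware of the working SC.
    """
    new = parse_version(new_sc_version)
    ref = parse_version(working_sc_version)
    # Pad the shorter tuple with zeros so that 5.17 and 5.17.0 compare equal.
    width = max(len(new), len(ref))
    new += (0,) * (width - len(new))
    ref += (0,) * (width - len(ref))
    if new == ref:
        return "match"
    return "upgrade" if new < ref else "downgrade"
```

A result other than "match" means that the flashupdate step described in Step 5 applies before failover is re-enabled.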
This section explains how to remove and replace an ID board and centerplane.
To Remove and Replace an ID Board and Centerplane |
1. Before you begin, be sure to have a terminal connected to the serial port of the system controller and have the following information available (it will be used later in this procedure):
You can find information on labels affixed to the system. Refer to the Sun Fire 6800/4810/4800/3800 Systems Service Manual or the Sun Fire E6900/E4900 Systems Service Manual for more information on label placement.
In most cases, when only the ID board and centerplane are replaced, the original System Controller board will be used. The above information was already cached by the system controller and will be used to program the replacement ID board. You will be asked to confirm the above information.
2. Complete the steps to remove and replace the centerplane and ID board.
Refer to the "Centerplane and ID Boards" chapter of the Sun Fire 6800/4810/4800/3800 Systems Service Manual or the Sun Fire E6900/E4900 Systems Service Manual.
Note - The ID board can be written only once. Manage this replacement process carefully, because any error may require a new ID board. |
3. After removing and replacing the ID board, make every attempt to use the original System Controller board installed in slot ssc0 in this system.
Using the same System Controller board allows the system controller to automatically prompt with the correct information.
4. Power on the hardware components.
Refer to the "Power Off and On" chapter of the Sun Fire 6800/4810/4800/3800 Systems Service Manual or the Sun Fire E6900/E4900 Systems Service Manual.
The system controller boots automatically.
5. If you have a serial port connection, access the console for the system controller because the system will prompt you to confirm the board ID information (CODE EXAMPLE 11-1).
The prompting will not occur with a remote connection (SSH or telnet).
If you have a new System Controller board, skip Step 6 and go to Step 7.
6. Compare the information collected in Step 1 with the information displayed in Step 5.
7. If you answer no to the question in Step 6 or if you are replacing both the ID board and the System Controller board at the same time, you will be prompted to enter the ID information manually.
Note - Enter this information carefully, as you have only one opportunity to do so. Use the information collected in Step 1 to answer the prompts shown in CODE EXAMPLE 11-2. Be aware that you must specify the MAC address and Host ID of domain A (not the SC). |
8. Complete Step 3 and Step 4 in To Power On the System.
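Because the ID board can be written only once, it can help to compare the values recorded in Step 1 with the values shown at the prompt field by field before confirming them. The Python sketch below shows one way to do that comparison on paper or in a script. The function name and the dictionary keys are hypothetical, and the sample values in the usage note are made up; the sketch only normalizes whitespace and letter case before comparing.

```python
from typing import Dict, List


def mismatched_fields(recorded: Dict[str, str], prompted: Dict[str, str]) -> List[str]:
    """Return the sorted names of fields whose values differ.

    Values are compared after removing whitespace and ignoring case,
    so '08:00:20:AB:CD:EF' and '08:00:20:ab:cd:ef' are treated as equal.
    A field that is missing on one side counts as a mismatch.
    """

    def normalize(value: str) -> str:
        return "".join(value.split()).lower()

    names = sorted(set(recorded) | set(prompted))
    return [
        name
        for name in names
        if normalize(recorded.get(name, "")) != normalize(prompted.get(name, ""))
    ]
```

An empty result suggests that the prompted information agrees with what was recorded in Step 1. Any field name in the result (for example a hypothetical host_id key) should be rechecked against the system labels before you answer the confirmation prompt.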
Copyright © 2004, Sun Microsystems, Inc. All rights reserved.