SPARC M8 and SPARC M7 Servers Service Manual

Updated: January 2022
 
 

Component Fault Tolerances

This topic describes the level of fault tolerance for specific components and provides guidelines for replacing these components when a fault occurs.

  • CMIOUs. Remove a faulty CMIOU only when a replacement CMIOU is available. Install the new CMIOU as quickly as possible, ideally within 10 minutes. You must prepare a CMIOU before removing it from the server.

  • DIMMs. If a DIMM is diagnosed as faulty while the system is running, the memory dynamically switches from 16-way to 15-way interleave by redistributing the contents of the faulty DIMM across the other 15 DIMMs. For more information, refer to DIMM Sparing in SPARC M8 and SPARC M7 Servers Administration Guide.

  • Fan modules (CMIOU chassis). The CMIOU chassis has eight fan modules. The server will continue to operate at full capacity with seven fan modules installed in the CMIOU chassis. The server will not operate with fewer than seven fan modules. If the server is operating with seven fan modules and one or more of them fail, the server might power down to keep from overheating.

  • Fan modules (switch chassis). The switch chassis has 36 fan modules. Each switch unit has six dedicated fan modules. The server will continue to operate at full capacity with five of the six fans operating for each switch unit. If one of the five operating fans fails, the server might power down to keep from overheating.

  • PCIe cards. If the PCIe card is assigned to an I/O domain on a logical domain, you must prepare the card for removal to avoid an unsupported configuration.

  • Power supplies. The power supplies for the CMIOU and switch chassis are 2N redundant. If a power supply fails in either a switch or CMIOU chassis, the server can operate normally with only three power supplies in the switch chassis or five power supplies in the CMIOU chassis.

    There are no restrictions on which slots the power supplies occupy. You can install them in any of the power supply slots as long as all power supplies are installed.

  • SPs. Replace one SP in the server at a time. You must prepare an SP before removing it from the server.

  • Switch units. Switch units are configured to work together as a single unit. If a switch unit fails, the server will operate in degraded mode. At least five switch units must be functioning for the server to operate. You must prepare a switch unit before removing it from the server.
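
The minimum counts quoted in the list above can be collected into a small lookup, which might be convenient when planning a replacement. The following is only an illustrative sketch in Python, not an Oracle tool: the dictionary keys and the function name margin_status are hypothetical labels, the numbers are copied from the bullets above, and the code does not query Oracle ILOM or the server in any way.

```python
# Minimum working counts quoted in the list above.
# The keys and the function name are hypothetical labels for illustration.
MINIMUM_WORKING = {
    "cmiou_chassis_fan_modules": 7,      # of eight installed
    "switch_unit_fan_modules": 5,        # of six dedicated to each switch unit
    "cmiou_chassis_power_supplies": 5,
    "switch_chassis_power_supplies": 3,
    "switch_units": 5,
}


def margin_status(component: str, working: int) -> str:
    """Classify a working count against the documented minimum."""
    minimum = MINIMUM_WORKING[component]
    if working < minimum:
        return "below minimum"
    if working == minimum:
        # One more failure would drop the count below the documented minimum.
        return "at minimum"
    return "above minimum"


if __name__ == "__main__":
    for name, count in [
        ("cmiou_chassis_fan_modules", 8),
        ("cmiou_chassis_fan_modules", 7),
        ("switch_units", 4),
    ]:
        print(f"{name}: {count} working -> {margin_status(name, count)}")
```

A result of "at minimum" corresponds to the cases above where a further failure might cause the server to power down or stop operating, so a replacement should already be on hand at that point.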


Note -  If you remove any of these components while the server is powered on, wait 30 seconds before installing the replacement component. Doing so ensures that Oracle ILOM has enough time to detect the new component, which is required for the software to power it on.
