Sun Server X2-8 (formerly Sun Fire X4800 M2) Diagnostics Guide
How to Isolate and Correct Persistent DIMM Errors
This procedure shows how to isolate and replace faulty DIMMs using the Oracle ILOM.
All DIMMs are configured, identified, and replaced in pairs. When there is a fault in a DIMM, you must replace both the faulty DIMM and the other DIMM in the pair. To identify DIMM pairings, see DIMM Hardware.
Because of the server's architecture, when the server detects a faulty DIMM, it disables other DIMM pairs as well. When the Oracle ILOM displays the faulty DIMMs, it also displays the disabled DIMMs, which are clearly identified as disabled, not faulty. When the faulty DIMMs are repaired, the server automatically places the disabled DIMMs back into service.
Caution - Before handling components, attach an antistatic wrist strap to a chassis ground (any unpainted metal surface). The system’s printed circuit boards and hard disk drives contain components that are extremely sensitive to static electricity.
Before You Begin
This task requires access to the Oracle ILOM and the ability to remove a CMOD from the server. It also requires that you have a pair of DIMMs to replace the faulty pair.
From the CLI:
Enter the show faulty command.
If the system has detected a faulty DIMM pair, the output contains one or more entries similar to the following:
Target              | Property               | Value
--------------------+------------------------+---------------------------------
/SP/faultmgmt/0     | fru                    | /SYS/BL0/P0/D0
/SP/faultmgmt/0/    | class                  | fault.memory.intel.dimm_ue
 faults/0           |                        |
... (other information)
/SP/faultmgmt/1     | fru                    | /SYS/BL0/P0/D4
/SP/faultmgmt/1/    | class                  | fault.memory.intel.dimm_ue
This identifies the faulty pair as DIMM 0 and DIMM 4 on CPU 0 of CMOD 0 (BL0).
Note - This display might include other DIMM pairs that have been temporarily disabled. These are automatically returned to service when the faulty DIMM pair is repaired.
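When many entries are listed, it can help to pull the DIMM locations out of the captured CLI output programmatically. The following is a minimal sketch, not an Oracle tool: it assumes the output has been saved as text in the three-column layout shown above, and the names `SAMPLE`, `FRU_PATH`, and `faulty_dimms` are hypothetical.

```python
import re

# Sample text modeled on the `show faulty` output shown above.
SAMPLE = """\
Target              | Property               | Value
--------------------+------------------------+---------------------------------
/SP/faultmgmt/0     | fru                    | /SYS/BL0/P0/D0
/SP/faultmgmt/0/    | class                  | fault.memory.intel.dimm_ue
 faults/0           |                        |
/SP/faultmgmt/1     | fru                    | /SYS/BL0/P0/D4
/SP/faultmgmt/1/    | class                  | fault.memory.intel.dimm_ue
"""

# Assumed FRU path layout: /SYS/BL<cmod>/P<cpu>/D<dimm>
FRU_PATH = re.compile(r"^/SYS/BL(\d+)/P(\d+)/D(\d+)$")


def faulty_dimms(output):
    """Return (cmod, cpu, dimm) tuples for every 'fru' row that names a DIMM."""
    found = []
    for line in output.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Only rows with three columns and the 'fru' property are of interest.
        if len(cells) != 3 or cells[1] != "fru":
            continue
        match = FRU_PATH.match(cells[2])
        if match:
            found.append(tuple(int(group) for group in match.groups()))
    return found


if __name__ == "__main__":
    for cmod, cpu, dimm in faulty_dimms(SAMPLE):
        print(f"CMOD {cmod} (BL{cmod}), CPU {cpu}, DIMM {dimm}")
```

The sketch reads only `fru` rows and ignores `class` rows, so it does not distinguish faulty DIMMs from temporarily disabled ones; check the `class` value in the full output before replacing anything.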
From the Oracle ILOM web interface:
Click System Information > Fault Management.
The following display identifies the faulty pair as DIMM 0 and DIMM 4 on CPU 0 of CMOD 0 (BL0).
Note - This display might include other DIMM pairs that have been temporarily disabled. These are automatically returned to service when the faulty DIMM pair is repaired.
The LEDs on the board adjacent to the DIMM slots light to identify faulty DIMM pairs.
For the locations of DIMM fault LEDs, DIMM pairs, and the fault remind button, see DIMM Hardware.
Note - If the DIMM fault LEDs do not identify the same DIMM pair as the Oracle ILOM, contact Oracle Service.
If a DIMM is damaged, replace the DIMM pair.
If the DIMM slot is cracked or broken, contact Oracle Service.
If there is dirt or other contamination on the DIMM or on the slot, you might choose to clean it, reseat the DIMMs, and check the system again.
A pair is two DIMMs that belong to the same CPU, such as D0 and D4 in the example above. To identify DIMM pairings, see DIMM Hardware.
Refer to Removing and Installing DIMMs (CRU) in the Sun Server X2-8 (formerly Sun Fire X4800 M2) Service Manual for additional information.
-> set /SYS/MB/Px/Dy clear_fault_action=true
Are you sure you want to clear /SYS/MB/Px/Dy (y/n) y
Set 'clear_fault_action' to true
Where x is the node number and y is the DIMM number.
This command clears the DIMM of random information that might be interpreted as faulty data.
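To avoid typing mistakes when substituting the node and DIMM numbers, the command string can be generated. This is a minimal sketch that only fills in the template shown above; the function names are hypothetical, and issuing the command once for each DIMM in the replaced pair is an assumption of this sketch, not something the procedure states.

```python
def clear_fault_command(node, dimm):
    """Fill the node (x) and DIMM (y) numbers into the ILOM command template."""
    if node < 0 or dimm < 0:
        raise ValueError("node and DIMM numbers must be non-negative")
    return f"set /SYS/MB/P{node}/D{dimm} clear_fault_action=true"


def clear_pair_commands(node, pair):
    """Return one command per DIMM in the pair (assumption: each is cleared)."""
    return [clear_fault_command(node, dimm) for dimm in pair]


if __name__ == "__main__":
    # Example: the pair D0 and D4 on node 0.
    for command in clear_pair_commands(0, (0, 4)):
        print(command)
```

The generated strings are meant to be pasted at the Oracle ILOM CLI prompt; the sketch does not connect to the service processor or answer the confirmation prompt.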