Sun Server X2-8 Product Documentation     Sun Server X2-8 (formerly Sun Fire X4800 M2) Documentation Library

How to Detect DIMM Errors Using BIOS POST and the Oracle ILOM SEL

When the server is started or rebooted, the BIOS POST tests memory by performing a write/read test of every location using the pattern 55aa. The BIOS then polls the memory controllers for both correctable and uncorrectable memory errors, and logs those errors in the SP SEL.

BIOS does not perform this test if Quick Boot is enabled.

For more information about BIOS POST, see BIOS POST.
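The write/read test described above can be pictured with a small toy model. This is only a sketch of the general technique, not the BIOS implementation: the buffer stands in for physical memory, the byte order of the 55aa pattern is an assumption, and `pattern_test` and `StuckBitMemory` are hypothetical names introduced for illustration.

```python
def pattern_test(memory, pattern=b"\x55\xaa"):
    """Write the pattern to every location, read it back, and
    return the offsets whose contents do not match."""
    size = len(memory)
    # Write pass: tile the pattern across the whole buffer.
    for offset in range(size):
        memory[offset] = pattern[offset % len(pattern)]
    # Read pass: collect every offset that reads back differently.
    return [
        offset
        for offset in range(size)
        if memory[offset] != pattern[offset % len(pattern)]
    ]


class StuckBitMemory:
    """Toy memory in which one byte has bits stuck at 0 (a simulated fault)."""

    def __init__(self, size, bad_offset, stuck_mask):
        self._cells = bytearray(size)
        self._bad_offset = bad_offset
        self._stuck_mask = stuck_mask

    def __len__(self):
        return len(self._cells)

    def __getitem__(self, offset):
        return self._cells[offset]

    def __setitem__(self, offset, value):
        if offset == self._bad_offset:
            # The stuck bits can never be set, whatever is written.
            value &= ~self._stuck_mask & 0xFF
        self._cells[offset] = value


healthy = pattern_test(bytearray(16))
faulty = pattern_test(StuckBitMemory(16, bad_offset=5, stuck_mask=0x02))
print(healthy, faulty)
```

In this model a healthy buffer yields no mismatches, while the simulated stuck bit is reported at the offset where the written and read values differ.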

  1. Log in to Oracle ILOM and open the system event log using the web interface or the CLI.

    See Viewing the Oracle ILOM System Event Log.

  2. Identify the location of the DIMM error.

    See the following example:

    Event# | Date | Time | Memory | Uncorrectable Error | Asserted | OEM Data-2 0x12 OEM Data-3 0x9d

    • Data-2 contains two nibbles (0x12 is “hex one and hex two”, not “hex twelve”).

      Consider the data from the preceding sample (0x12). In binary, it is 0001 0010.

      • Bits 6-7 = 00. This identifies the error as an ECC memory error. It should not change.

      • Bits 4-5 = 01. This identifies the memory branch. This number is unused in this context.

      • Bits 0-3 = 0010. Converted to decimal, these identify CPU node 2.

        The following table shows the mapping of nodes to CMODs and CPUs. In the physical system, CMOD 0 is on the bottom, and CMOD 3 is on the top.

        Node | CMOD | CPU
        2    | 3    | 0
        3    | 3    | 1
        6    | 2    | 0
        7    | 2    | 1
        4    | 1    | 0
        5    | 1    | 1
        0    | 0    | 0
        1    | 0    | 1

        In this example, the value 2 identifies CMOD 3, CPU 0.

    • Data-3 contains two nibbles (0x9d is “hex nine and hex d”, not “hex nine d”). These numbers identify the DIMMs in the pair.

      Consider the data from the preceding sample (0x9d). Converting each nibble to decimal identifies DIMMs 9 and 13.
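The decoding steps above can be sketched in a few lines of Python. This is an illustrative helper, not an Oracle tool: the function names are hypothetical, the node table is transcribed from this section, and the regular expression assumes that event lines carry the OEM fields in the same layout as the sample.

```python
import re

# Node -> (CMOD, CPU), transcribed from the mapping table in this section.
NODE_TO_CMOD_CPU = {
    2: (3, 0), 3: (3, 1),
    6: (2, 0), 7: (2, 1),
    4: (1, 0), 5: (1, 1),
    0: (0, 0), 1: (0, 1),
}

# Assumption: the OEM fields appear exactly as in the sample event line.
_OEM_FIELDS = re.compile(
    r"OEM Data-2\s+0x([0-9a-fA-F]{2})\s+OEM Data-3\s+0x([0-9a-fA-F]{2})"
)


def decode_data2(value):
    """Split OEM Data-2 into its three bit fields."""
    error_type = (value >> 6) & 0b11  # bits 6-7: 00 means ECC memory error
    branch = (value >> 4) & 0b11      # bits 4-5: memory branch (unused here)
    node = value & 0b1111             # bits 0-3: CPU node number
    return error_type, branch, node


def decode_data3(value):
    """Split OEM Data-3 into two nibbles, one DIMM number each."""
    return (value >> 4) & 0xF, value & 0xF


def decode_sel_line(line):
    """Decode the OEM bytes of one event line into node, CMOD, CPU, and DIMMs."""
    match = _OEM_FIELDS.search(line)
    if match is None:
        raise ValueError("no OEM Data-2/Data-3 fields found")
    error_type, branch, node = decode_data2(int(match.group(1), 16))
    if node not in NODE_TO_CMOD_CPU:
        raise ValueError(f"node {node} is not in the mapping table")
    cmod, cpu = NODE_TO_CMOD_CPU[node]
    return {
        "error_type": error_type,
        "branch": branch,
        "node": node,
        "cmod": cmod,
        "cpu": cpu,
        "dimms": decode_data3(int(match.group(2), 16)),
    }


sample = ("Event# | Date | Time | Memory | Uncorrectable Error | Asserted | "
          "OEM Data-2 0x12 OEM Data-3 0x9d")
result = decode_sel_line(sample)
print(result)
```

For the sample event, the result matches the worked example: node 2, which maps to CMOD 3, CPU 0, and DIMMs 9 and 13.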