Sun Enterprise 10000 Dynamic Reconfiguration User Guide

Correctable Memory Errors

Correctable memory errors indicate that memory on a system board (one or more of its Dual Inline Memory Modules (DIMMs), or portions of the hardware interconnect) may be faulty and may need to be replaced. When the SSP detects correctable memory errors, it initiates a record-stop dump to save the diagnostic data. This dump can interfere with a DR detach operation. Therefore, when a record-stop occurs because of a correctable memory error, Sun Microsystems suggests that you allow the record-stop dump to finish before you initiate a DR detach operation.

If the faulty component causes repeated reporting of correctable memory errors, the SSP performs multiple record-stop dumps. If this happens, you should temporarily disable the dump-detection mechanism on the SSP, allow the current dump to finish, then initiate the DR detach operation. After the detach operation finishes, you should re-enable the dump detection.

To Disable and Re-Enable Dump Detection
  1. Log in to the SSP as the user ssp.

  2. Disable record-stop dump detection:

    SSP% edd_cmd -x stop

    This command suspends all event detection on all of the domains.

  3. Monitor the in-progress record-stop dump:

    SSP% ps -ef | grep hpost

    In the output, an hpost process running with the -D option indicates that a record-stop dump is in progress. Allow the dump to finish (no such process is listed any longer) before you continue.

  4. Perform the DR detach operation.

  5. Re-enable event detection:

    SSP% edd_cmd -x start
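
The steps above can also be sketched as a small Bourne-style shell script. This is a minimal sketch under stated assumptions, not an SSP utility. The function names dump_in_progress, wait_for_dump, and with_detection_suspended are hypothetical. The match assumes that -D appears as the start of its own hpost argument in the ps listing, and the 10-second polling interval is arbitrary. Only edd_cmd -x stop, edd_cmd -x start, and ps -ef come from the procedure. The sketch also assumes a grep that accepts -E; on older systems, egrep may be needed instead.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of the SSP software.

# Succeeds if the ps listing on stdin shows an hpost process whose
# arguments include one starting with -D (assumed to mark a dump).
dump_in_progress() {
    # Drop grep's own entry, then look for "hpost ... -D".
    grep -v grep | grep -E '(^|[ /])hpost( .*)? -D' >/dev/null
}

# Polls ps until no record-stop dump is listed.
# $1: seconds between checks (arbitrary default of 10).
wait_for_dump() {
    wfd_interval=${1:-10}
    while ps -ef | dump_in_progress; do
        sleep "$wfd_interval"
    done
}

# Runs the given command (the DR detach step, supplied by the caller)
# with event detection suspended, and restarts detection afterward
# even if that command fails.
with_detection_suspended() {
    wds_status=0
    edd_cmd -x stop || return 1      # step 2: suspend event detection
    wait_for_dump                    # step 3: let the current dump finish
    "$@" || wds_status=$?            # step 4: caller's detach command
    edd_cmd -x start                 # step 5: re-enable event detection
    return "$wds_status"
}
```

To use the sketch, call with_detection_suspended followed by whatever command performs the DR detach. Restarting event detection in the same function as the stop is a deliberate choice, so that a failed detach does not leave event detection suspended on all of the domains.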