Edit mcelog File if Faults Are Not Present in the Fault Management Database

If the entry raw = yes in the mcelog.conf file is commented out, the Oracle Linux Fault Management software cannot obtain the information it needs to create a fault case. If that happens, fault cases for machine check events processed by mcelog are not added to the fault management database.
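
As a quick check of the condition described above, the following minimal sketch tests whether an uncommented raw = yes entry is present. The helper name raw_enabled is hypothetical (it is not part of mcelog), and the sketch assumes the entry sits on a line of its own.

```shell
#!/bin/sh
# raw_enabled is a hypothetical helper, not part of mcelog.
# It succeeds (exit status 0) only if the file contains an active,
# uncommented "raw = yes" line.
raw_enabled() {
    conf="${1:-/etc/mcelog/mcelog.conf}"
    # Allow optional whitespace around the tokens; a leading "#" means the
    # entry is commented out, so such a line does not match.
    grep -Eq '^[[:space:]]*raw[[:space:]]*=[[:space:]]*yes[[:space:]]*$' "$conf"
}
```

For example, `raw_enabled && echo "raw format enabled"` prints the message only when the entry is active in the default configuration file.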

  1. Open /var/log/mcelog in a text editor and check whether the messages in the file are in the raw format.

    The following table shows an example of a default message and a "raw" message (required by Oracle Linux FMA).

    Default Format                                 Raw Format
    ---------------------------------------------  -------------------------
    Hardware event. This is not a software error.  CPU 0
    MCE 0                                          BANK 8
    CPU 0 BANK 8                                   TSC 0
    MISC 7 ADDR 102bfc0368                         RIP 00:0
    TIME 1383171020 Wed Oct 30 18:10:20 2013       MISC 0x85
    MCG status:EIPV MCIP                           ADDR 0x102bfc0368
    MCi status:                                    STATUS 0x9c00000000000090
    Corrected error                                MCGSTATUS 0x6
    Error enabled                                  PROCESSOR 0:0x306f1
    MCi_MISC register valid                        TIME 1383171020
    MCi_ADDR register valid                        SOCKETID 1
    MCA: MEMORY CONTROLLER RD_CHANNEL0_ERR         APICID 20
    Transaction: Memory read error                 MCGCAP 0x1000c14
    STATUS 9c00000000000090 MCGSTATUS 6
    MCGCAP 1000c14 APICID 20 SOCKETID 1
    CPUID Vendor Intel Family 6 Model 45
  2. If the messages in the mcelog file are in the default format, edit the /etc/mcelog/mcelog.conf file to uncomment the “raw = yes” entry.
  3. Delete the old mcelog file that was in the default format.

    rm /var/log/mcelog

  4. Restart the mcelog daemon:

    service mcelogd restart
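
The log check in step 1 and the configuration edit in step 2 can be sketched as two small shell helpers. This is an illustration under stated assumptions, not an Oracle-provided tool: the names log_format and enable_raw are hypothetical, the format detection is a heuristic based only on the example messages shown above, and the edit assumes the commented entry looks like #raw = yes and that GNU sed is available.

```shell
#!/bin/sh
# Hypothetical helpers; the names and heuristics are illustrative only.

# Guess the format of an mcelog log file from the example records above:
# default-format records start with "Hardware event.", while raw records
# carry hexadecimal fields such as "STATUS 0x...".
# A file that still contains old default-format records reports "default".
log_format() {
    log="${1:-/var/log/mcelog}"
    if grep -q '^[[:space:]]*Hardware event' "$log"; then
        echo default
    elif grep -Eq '^[[:space:]]*STATUS 0x[0-9a-fA-F]+' "$log"; then
        echo raw
    else
        echo unknown
    fi
}

# Uncomment the "raw = yes" entry in place. Assumes the entry appears as
# "#raw = yes", optionally with whitespace around the "#". Relies on the
# -i (in-place) option of GNU sed. All other lines are left unchanged.
enable_raw() {
    conf="${1:-/etc/mcelog/mcelog.conf}"
    sed -i -E 's/^[[:space:]]*#[[:space:]]*(raw[[:space:]]*=[[:space:]]*yes)[[:space:]]*$/\1/' "$conf"
}
```

A typical use, run as root, would be to call `log_format` first and, if it prints `default`, call `enable_raw` before continuing with steps 3 and 4. Making a backup copy of mcelog.conf before editing it in place is a reasonable precaution.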