This issue is fixed in Sun System Firmware 9.5.2.g.
If the primary domain is configured with insufficient resources (two or fewer SCCs) and correctable errors trigger an FMA retire action that affects both SCCs, the domain hangs when it is rebooted. Other domains are unaffected and continue to run normally as long as their own drives and network cards remain available. If an error triggers a domain retire, you can view the fault with the fmadm faulty command.
SUNW-MSG-ID: SPSUN4V-8001-YA, TYPE: Problem, VER: 1, SEVERITY: Major
EVENT-TIME: Tue Oct 6 18:50:50 EDT 2015
PLATFORM: SPARC T7-2, CSN: 12345678, HOSTNAME: bur-t72-303-sp
SOURCE: fdd, REV: 1.0
EVENT-ID: f78853a2-87cf-e147-efb3-ecc370ef147e
DESC: An event was received indicating a fault was diagnosed by another fault manager.
AUTO-RESPONSE: Refer to the document at http://support.oracle.com/msg/SPSUN4V-8001-YA.
IMPACT: Refer to the document at http://support.oracle.com/msg/SPSUN4V-8001-YA.
REC-ACTION: Use 'fmadm faulty' to provide a more detailed view of this event. Please refer to the associated reference document at http://support.oracle.com/msg/SPSUN4V-8001-YA for the latest service procedures and policies regarding this diagnosis.

-> fmadm faulty
Time                UUID                                 msgid          Severity
------------------- ------------------------------------ -------------- --------
2015-10-06/22:51:00 abea80bd-6d18-46a4-e9cc-fda7df765748 SPSUN4V-8001-YA Major

Problem Status         : open [injected]
Diag Engine            : fdd 1.0
System
   Manufacturer        : Oracle Corporation
   Name                : SPARC T7-2
   Part_Number         : 87654321
   Serial_Number       : 12345678

----------------------------------------
Suspect 1 of 1
   Fault class         : fault.cpu.generic-sparc.l2d-uc
   Certainty           : 100%
   Affects             : /SYS/MB/CM0/CMP/SCC3/L2D1
   Status              : faulted

   FRU
      Status           : faulty
      Location         : /SYS/MB
      Manufacturer     : Oracle Corporation
      Name             : ASY,MB,T7-2
      Part_Number      : 7093274
      Revision         : 02
      Serial_Number    : 465769T+1434NH00JJ
      Chassis
         Manufacturer  : Oracle Corporation
         Name          : SPARC T7-2
         Part_Number   : 87654321
         Serial_Number : 12345678

Description : A cpu has experienced an uncorrectable level 2 data cache error (UE).
Response    : Cpu cores associated with the cache will be deconfigured.
Impact      : Some services may be lost and performance may be impacted.
Action      : Use 'fmadm faulty' to provide a more detailed view of this event. Please refer to the associated reference document at http://support.oracle.com/msg/SPSUN4V-8001-YA for the latest service procedures and policies regarding this diagnosis.

------------------- ------------------------------------ -------------- --------
Time                UUID                                 msgid          Severity
------------------- ------------------------------------ -------------- --------
2015-10-06/22:50:50 f78853a2-87cf-e147-efb3-ecc370ef147e SPSUN4V-8001-YA Major

Problem Status         : open [injected]
Diag Engine            : fdd 1.0
System
   Manufacturer        : Oracle Corporation
   Name                : SPARC T7-2
   Part_Number         : 87654321
   Serial_Number       : 12345678

----------------------------------------
Suspect 1 of 1
   Fault class         : fault.cpu.generic-sparc.l2d-uc
   Certainty           : 100%
   Affects             : /SYS/MB/CM0/CMP/SCC3/L2D0
   Status              : faulted

   FRU
      Status           : faulty
      Location         : /SYS/MB
      Manufacturer     : Oracle Corporation
      Name             : ASY,MB,T7-2
      Part_Number      : 7093274
      Revision         : 02
      Serial_Number    : 465769T+1434NH00JJ
      Chassis
         Manufacturer  : Oracle Corporation
         Name          : SPARC T7-2
         Part_Number   : 87654321
         Serial_Number : 12345678

Description : A cpu has experienced an uncorrectable level 2 data cache error (UE).
Response    : Cpu cores associated with the cache will be deconfigured.
Impact      : Some services may be lost and performance may be impacted.
Action      : Use 'fmadm faulty' to provide a more detailed view of this event. Please refer to the associated reference document at http://support.oracle.com/msg/SPSUN4V-8001-YA for the latest service procedures and policies regarding this diagnosis.

This error is the root cause of a domain retire when the fault is reported on the same cores that run the primary domain; in that case, the primary domain hangs when it is rebooted.
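
As a rough illustration of the condition described here, the following Python sketch extracts the SCC paths from the Affects lines of captured fmadm faulty output and compares them with the SCCs that back the primary domain. The function names and the primary_sccs input are hypothetical: this note does not say how to list the SCCs assigned to a domain, so the sketch assumes the operator supplies that set by hand.

```python
from __future__ import annotations

import re

# Captures the SCC portion of an "Affects" line, for example
# "Affects : /SYS/MB/CM0/CMP/SCC3/L2D1" yields "/SYS/MB/CM0/CMP/SCC3".
AFFECTS_RE = re.compile(r"Affects\s*:\s*(/SYS/\S*?/SCC\d+)")


def faulted_sccs(fmadm_output: str) -> set[str]:
    """Return the distinct SCC paths named in 'Affects' lines."""
    return set(AFFECTS_RE.findall(fmadm_output))


def primary_at_risk(fmadm_output: str, primary_sccs: set[str]) -> bool:
    """Return True if every SCC backing the primary domain has a fault.

    primary_sccs is a hypothetical, operator-supplied set of SCC paths;
    an empty set is treated as "unknown" and reported as not at risk.
    """
    return bool(primary_sccs) and primary_sccs <= faulted_sccs(fmadm_output)


# Two "Affects" lines taken from the fault listing in this note.
sample = """
   Affects             : /SYS/MB/CM0/CMP/SCC3/L2D1
   Affects             : /SYS/MB/CM0/CMP/SCC3/L2D0
"""

if __name__ == "__main__":
    print(sorted(faulted_sccs(sample)))
    print(primary_at_risk(sample, {"/SYS/MB/CM0/CMP/SCC3"}))
```

Both faults in the listing fall on the same SCC, so the check reports a risk only when that SCC is the sole one backing the primary domain.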
Workaround: Ensure that the primary guest domain is assigned more than two SCCs (that is, a minimum of two SCCs plus some additional cores) on the same node.
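
The workaround can be approximated by a simple check, sketched below in Python under several assumptions. It reads "two SCCs and some additional cores" as cores drawn from at least three distinct SCCs, and it treats the CM component of a path such as /SYS/MB/CM0/CMP/SCC3 as the "node"; neither reading is confirmed by this note. The function name and its input list are hypothetical, and the check is necessary rather than sufficient, because it does not verify that two of the SCCs are fully assigned.

```python
import re

# Splits an SCC path such as "/SYS/MB/CM0/CMP/SCC3" into its node part
# ("/SYS/MB/CM0") and SCC number ("3"). Treating the CM component as the
# node is an assumption made for this sketch.
SCC_RE = re.compile(r"^(?P<node>/SYS/\S+?)/CMP/SCC(?P<scc>\d+)$")


def workaround_satisfied(primary_sccs):
    """Check a hypothetical list of SCC paths backing the primary domain.

    Returns True when the paths name more than two distinct SCCs and all
    of them sit on a single node.
    """
    nodes = set()
    sccs = set()
    for path in primary_sccs:
        match = SCC_RE.match(path)
        if match is None:
            raise ValueError(f"unrecognized SCC path: {path!r}")
        nodes.add(match.group("node"))
        sccs.add(path)
    return len(nodes) == 1 and len(sccs) > 2


if __name__ == "__main__":
    layout = [
        "/SYS/MB/CM0/CMP/SCC0",
        "/SYS/MB/CM0/CMP/SCC1",
        "/SYS/MB/CM0/CMP/SCC2",
    ]
    print(workaround_satisfied(layout))
```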
Recovery: Force a reset of the domain (reset -f /HOST) to regain access. On reboot, the server cannot access the most recently saved SPM configuration and instead reverts to the factory default configuration.