This issue is fixed in Sun System Firmware 9.5.2.g.
If the primary domain is configured with too few resources (two or fewer SCCs) and correctable errors trigger an FMA retire action that affects both SCCs, the domain hangs when it is rebooted. The other domains are unaffected and continue to operate normally as long as their own drives and network cards remain available. If an error triggers a domain retire, you can view the fault with the fmadm faulty command.
SUNW-MSG-ID: SPSUN4V-8001-YA, TYPE: Problem, VER: 1, SEVERITY: Major
EVENT-TIME: Tue Oct 6 18:50:50 EDT 2015
PLATFORM: SPARC T7-2, CSN: 12345678, HOSTNAME: bur-t72-303-sp
SOURCE: fdd, REV: 1.0
EVENT-ID: f78853a2-87cf-e147-efb3-ecc370ef147e
DESC: An event was received indicating a fault was diagnosed by another fault manager.
AUTO-RESPONSE: Refer to the document at http://support.oracle.com/msg/SPSUN4V-8001-YA.
IMPACT: Refer to the document at http://support.oracle.com/msg/SPSUN4V-8001-YA.
REC-ACTION: Use 'fmadm faulty' to provide a more detailed view of this event. Please refer to the associated reference document at http://support.oracle.com/msg/SPSUN4V-8001-YA for the latest service procedures and policies regarding this diagnosis.
-> fmadm faulty
Time UUID msgid Severity
------------------- ------------------------------------ -------------- --------
2015-10-06/22:51:00 abea80bd-6d18-46a4-e9cc-fda7df765748 SPSUN4V-8001-YA Major
Problem Status : open [injected]
Diag Engine : fdd 1.0
System
Manufacturer : Oracle Corporation
Name : SPARC T7-2
Part_Number : 87654321
Serial_Number : 12345678
----------------------------------------
Suspect 1 of 1
Fault class : fault.cpu.generic-sparc.l2d-uc
Certainty : 100%
Affects : /SYS/MB/CM0/CMP/SCC3/L2D1
Status : faulted
FRU
Status : faulty
Location : /SYS/MB
Manufacturer : Oracle Corporation
Name : ASY,MB,T7-2
Part_Number : 7093274
Revision : 02
Serial_Number : 465769T+1434NH00JJ
Chassis
Manufacturer : Oracle Corporation
Name : SPARC T7-2
Part_Number : 87654321
Serial_Number : 12345678
Description : A cpu has experienced an uncorrectable level 2 data cache
error (UE).
Response : Cpu cores associated with the cache will be deconfigured.
Impact : Some services may be lost and performance may be impacted.
Action : Use 'fmadm faulty' to provide a more detailed view of this
event. Please refer to the associated reference document at
http://support.oracle.com/msg/SPSUN4V-8001-YA for the latest
service procedures and policies regarding this diagnosis.
------------------- ------------------------------------ -------------- --------
Time UUID msgid Severity
------------------- ------------------------------------ -------------- --------
2015-10-06/22:50:50 f78853a2-87cf-e147-efb3-ecc370ef147e SPSUN4V-8001-YA Major
Problem Status : open [injected]
Diag Engine : fdd 1.0
System
Manufacturer : Oracle Corporation
Name : SPARC T7-2
Part_Number : 87654321
Serial_Number : 12345678
----------------------------------------
Suspect 1 of 1
Fault class : fault.cpu.generic-sparc.l2d-uc
Certainty : 100%
Affects : /SYS/MB/CM0/CMP/SCC3/L2D0
Status : faulted
FRU
Status : faulty
Location : /SYS/MB
Manufacturer : Oracle Corporation
Name : ASY,MB,T7-2
Part_Number : 7093274
Revision : 02
Serial_Number : 465769T+1434NH00JJ
Chassis
Manufacturer : Oracle Corporation
Name : SPARC T7-2
Part_Number : 87654321
Serial_Number : 12345678
Description : A cpu has experienced an uncorrectable level 2 data cache
error (UE).
Response : Cpu cores associated with the cache will be deconfigured.
Impact : Some services may be lost and performance may be impacted.
Action : Use 'fmadm faulty' to provide a more detailed view of this
event. Please refer to the associated reference document at
http://support.oracle.com/msg/SPSUN4V-8001-YA for the latest
service procedures and policies regarding this diagnosis.
This error is the root cause of a domain retire if the fault is reported on the same cores that run the primary domain, and the primary domain hangs when it is rebooted.
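When several fault records are listed, it can help to see which of them fall on the same SCC. The following is a minimal Python sketch, not an Oracle tool: it assumes the fmadm faulty output has been captured as text, and the helper name faulted_components_by_scc is hypothetical. It only reads the "Affects" lines in the form shown above and groups them by the SCC component of the path.

```python
import re
from collections import defaultdict

# Matches the "Affects" line of each suspect, for example:
#   Affects : /SYS/MB/CM0/CMP/SCC3/L2D1
AFFECTS_RE = re.compile(r"^\s*Affects\s*:\s*(\S+)", re.MULTILINE)
# Captures the path up to and including the SCC component.
SCC_RE = re.compile(r"^(.*/SCC\d+)(?:/|$)")


def faulted_components_by_scc(fmadm_output):
    """Group faulted component paths by the SCC that contains them.

    Hypothetical helper: it assumes the output layout shown in this
    section and ignores any path without an SCC component.
    """
    by_scc = defaultdict(list)
    for path in AFFECTS_RE.findall(fmadm_output):
        match = SCC_RE.match(path)
        if match:
            by_scc[match.group(1)].append(path)
    return dict(by_scc)


# Excerpt of the two records above, reduced to the relevant lines.
sample = """\
Affects : /SYS/MB/CM0/CMP/SCC3/L2D1
Status : faulted
Affects : /SYS/MB/CM0/CMP/SCC3/L2D0
Status : faulted
"""

for scc, parts in faulted_components_by_scc(sample).items():
    print(scc, "->", ", ".join(parts))
```

In the sample, both records resolve to the same SCC (SCC3 on CM0). Whether that SCC hosts the cores of the primary domain still has to be checked against the domain configuration.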
Workaround: Ensure that the primary guest domain is assigned more than two SCCs (that is, at least two SCCs plus some additional cores) on the same node.
Recovery: Force a reset of the domain (reset -f /HOST) to regain access. On reboot, the server cannot access the most recently saved SPM configuration and instead reverts to the factory-default configuration.