SPARC T4-4 Server

Product Notes


Hardware Issues

This section describes issues related to SPARC T4-4 server components.

Maximizing Memory Bandwidth

To maximize processor module memory bandwidth, Oracle recommends that only fully populated memory configurations (as opposed to half-populated configurations) be considered for performance-critical applications.

For specific memory installation and upgrade instructions, see the SPARC T4-4 Server Service Manual.

Direct I/O Support

Only certain PCIe cards can be used as direct I/O endpoint devices on an I/O domain. You can still use other cards in your Oracle VM Server for SPARC environment, but these other cards cannot be used with the Direct I/O feature. Instead, other PCIe cards can be used for service domains and for I/O domains that have entire root complexes assigned to them.

For the most up-to-date list of supported PCIe cards, refer to https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&doctype=REFERENCE&id=1325454.1


Note - Not all cards listed on the Direct I/O web page are supported in the SPARC T4-4 server. Check the server hardware compatibility list before installing any PCIe cards.


Use Links Labeled SPARC T3 to Download sas2ircu Software for SPARC T4 Servers

To download sas2ircu software and documentation for the SPARC T4-4 server from the current LSI web site, you must use links labeled SPARC T3-1 and T3-2. The software and documentation are the same for both sets of servers.

This is the web site for downloading sas2ircu software from LSI:

http://www.lsi.com/sep/Pages/oracle/index.aspx

This is the web site for downloading sas2ircu documentation from LSI:

http://www.lsi.com/sep/Pages/oracle/sparc_t3_series.aspx

Sun Type 6 Keyboards Are Not Supported by SPARC T4 Series Servers

Sun Type 6 keyboards cannot be used with SPARC T4 series servers.

Hardware RAID 1E Not Supported

Although hardware RAID 0 and 1 are supported on the SPARC T4-4 server, hardware RAID 1E is not supported. Other RAID formats are available through software RAID.

Server Panics When Booting From a USB Thumbdrive Attached to the Front USB Ports (Bug ID 15667682)


Note - This issue was originally listed as CR 6983185.


When attempting to boot from a USB thumbdrive inserted in either front USB port (USB2 or USB3), the server might panic.

Workaround: Use the server's rear USB ports (USB0 or USB1) whenever booting from an external USB device.

Performance Limitations Occur When Performing a Hot-Plug Installation of a x8 Card Into a Slot Previously Occupied With a x4 Card (Bug ID 15671185)


Note - This issue was originally listed as CR 6987359.


If you hot-plug a Dual 10GbE SFP+ PCIe2.0 EM Network Interface Card (NIC) (part number 1110A-Z) into a PCI Express Module slot that had previously held a 4-Port (Cu) PCIe (x4) ExpressModule (part number (X)7284A-Z-N), the expected performance benefit of the Dual 10GbE SFP+ PCIe2.0 NIC might not occur.

This problem does not occur if the slot was previously unoccupied, or if it had been occupied by any other option card. In addition, this problem does not occur if the card is present when the system is powered on.

Workaround: Hot-plug the Dual 10GbE SFP+ PCIe2.0 EM card a second time, using one of the following methods.


Note - You do not need to physically remove and reinsert the card as part of the second hot-plug operation.


Unrecoverable USB Hardware Errors Occur In Some Circumstances (Bug ID 15677875, Bug ID 15765407)


Note - This issue was originally listed as CR 6995634.


In some rare instances, unrecoverable USB hardware errors occur, such as the following:

usba: WARNING: /pci@400/pci@1/pci@0/pci@8/pci@0/usb@0,2 (ehci0): Unrecoverable USB Hardware Error
usba: WARNING: /pci@400/pci@1/pci@0/pci@8/pci@0/usb@0,1/hub@1/hub@3 (hubd5): Connecting device on port 2 failed

Workaround: Reboot the system. Contact your service representative if these error messages persist.

PSH Might Not Clear a Retired Cache Line on a Replaced CPU Module (Bug ID 15705327, Bug ID 15713018)


Note - This issue was originally listed as CR 7031216.



Note - This issue was fixed in Oracle Solaris 11.1.


When a CPU module is replaced to repair a faulty CPU, PSH might not clear retired cache lines on the replacement FRU. In such cases, the cache line remains disabled.

Workaround: Manually clear the disabled cache line by running the following command:

# fmadm repaired fmri | label

For example:

# fmdump -av
TIME                 UUID                                 SUNW-MSG-ID
Nov 03 10:34:56.6192 e1ee44ed-72f7-c32b-855b-e9f4b03144af SUN4V-8002-V3
  100%  fault.cpu.generic-sparc.cacheline

        Problem in: hc://:product-id=ORCL,SPARC-T4-4:product-sn=xxxxyyyxxx:server-id=xxxxx:chassis-id=xxxxyyyxxx/chassis=0/cpuboard=0/chip=0/l3cache=0/cacheindex=256/cacheway=7
           Affects: hc://:product-id=ORCL,SPARC-T4-4:product-sn=xxxxyyyxxx:server-id=xxxxx:chassis-id=xxxxyyyxxx/chassis=0/cpuboard=0/chip=0/l3cache=0/cacheindex=256/cacheway=7
               FRU: hc://:product-id=ORCL,SPARC-T4-4:product-sn=xxxxyyyxxx:server-id=xxxxx:chassis-id=xxxxyyyxxx:serial=465769T+1115H50061:part=7013822:revision=01/chassis=0/cpuboard=0
          Location: /SYS/PM0

# fmadm repaired hc://:product-id=ORCL,SPARC-T4-4:product-sn=xxxxyyyxxx:server-id=xxxxx:chassis-id=xxxxyyyxxx/chassis=0/cpuboard=0/chip=0/l3cache=0/cacheindex=256/cacheway=7
fmadm: recorded repair to of hc://:product-id=ORCL,SPARC-T4-4:product-sn=xxxxyyyxxx:server-id=xxxxx:chassis-id=xxxxyyyxxx/chassis=0/cpuboard=0/chip=0/l3cache=0/cacheindex=256/cacheway=7

# fmdump -a
TIME                 UUID                                 SUNW-MSG-ID
Nov 03 10:34:56.6192 e1ee44ed-72f7-c32b-855b-e9f4b03144af SUN4V-8002-V3
Nov 03 10:37:40.3545 e1ee44ed-72f7-c32b-855b-e9f4b03144af FMD-8000-4M Repaired
Nov 03 10:37:40.3610 e1ee44ed-72f7-c32b-855b-e9f4b03144af FMD-8000-6U Resolved
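
The FMRI passed to fmadm repaired in the example is the value shown in the Problem in: field of the fmdump -av output. The following is a minimal sketch of pulling that value out of saved output. It assumes that each field appears on its own line, and the function name problem_fmris is hypothetical, not part of any Oracle tool.

```shell
#!/bin/sh
# Hypothetical helper (not an Oracle utility): print the FMRI from every
# "Problem in:" line found in fmdump -av output.
# Reads the named files, or standard input when no file is given.
problem_fmris() {
    # Drop the leading whitespace and the field label, print what remains.
    sed -n 's/^[[:space:]]*Problem in:[[:space:]]*//p' "$@"
}
```

One possible use is to save the fmdump -av output to a file, run problem_fmris on that file, and review each printed FMRI before passing it to fmadm repaired.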
 
 
 
 
 
 

PCIe Correctable Errors Might Be Reported (Bug ID 15720000, 15722832)


Note - This issue was originally listed as CR 7051331.



Note - This issue was fixed in Oracle Solaris 11.


In rare cases, PCI Express Gen2 or low-profile PCIe devices in the server might report I/O errors that are identified and reported by PSH. For example:

--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Aug 10 13:03:23 a7d43aeb-61ca-626a-f47b-c05635f2cf5a  PCIEX-8000-KP  Major
 
Host        : dt214-154
Platform    : ORCL,SPARC-T4-4  Chassis_id  :
Product_sn  :
 
Fault class : fault.io.pciex.device-interr-corr 67%
              fault.io.pciex.bus-linkerr-corr 33%
Affects     : dev:////pci@400/pci@1/pci@0/pci@c
              dev:////pci@400/pci@1/pci@0/pci@c/pci@0
                  faulted but still in service
FRU         : "/SYS/MB" (hc://:product-id=ORCL,SPARC-T4-4:product-sn=xxxx:server-id=xxxx:chassis-id=0000000-0000000000:serial=xxxx:part=541-424304:revision=50/chassis=0/motherboard=0) 67%
              "FEM0" (hc://:product-id=ORCL,SPARCT4-4:product-sn=xxxxx:server-id=xxxxx:chassis-id=0000000-0000000000/chassis=0/motherboard=0/hostbridge=0/pciexrc=0/pciexbus=1/pciexdev=0/pciexfn=0/pciexbus=2/pciexdev=12/pciexfn=0/pciexbus=62/pciexdev=0) 33%
                  faulty
 
Description : Too many recovered bus errors have been detected, which indicates
              a problem with the specified bus or with the specified
              transmitting device. This may degrade into an unrecoverable
              fault.
              ...
 
Response    : One or more device instances may be disabled
 
Impact      : Loss of services provided by the device instances associated with
              this fault
 
Action      : If a plug-in card is involved check for badly-seated cards or
              bent pins. Otherwise schedule a repair procedure to replace the
              affected device.  Use fmadm faulty to identify the device or
              contact Sun for support.

These errors might indicate a faulty or incorrectly seated device, or they might be erroneous.

Workaround: Ensure that the device is properly seated and functioning. If the errors continue, apply patch 147705-01 or higher.
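
To judge whether an installed patch meets the "147705-01 or higher" requirement, the revision suffix can be compared numerically. The following is a minimal sketch. The function name patch_at_least is hypothetical, and the installed patch ID is assumed to be supplied by the administrator (for example, copied from a patch listing such as the output of showrev -p on Oracle Solaris 10).

```shell
#!/bin/sh
# Hypothetical helper: succeed when an installed patch ID (base-revision)
# has the same base number as the required patch and an equal or higher
# revision. Example IDs: 147705-03 (installed), 147705-01 (required).
patch_at_least() {
    installed=$1
    required=$2
    # The base numbers (the part before the dash) must match.
    [ "${installed%-*}" = "${required%-*}" ] || return 1
    # Strip leading zeros so that the revisions compare as plain decimals.
    inst_rev=$(echo "${installed#*-}" | sed 's/^0*//')
    req_rev=$(echo "${required#*-}" | sed 's/^0*//')
    [ "${inst_rev:-0}" -ge "${req_rev:-0}" ]
}
```

For example, patch_at_least 147705-03 147705-01 succeeds, while the same check with a different base patch number fails.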

L2 Cache Uncorrectable Errors Might Lead to an Entire Processor Being Faulted (Bug ID 15727651, Bug ID 15732875, Bug ID 15732876, Bug ID 15733117)


Note - This issue was originally listed as CR 7065563.



Note - This issue was fixed in System Firmware 8.1.4.


An L2 cache uncorrectable error might lead to an entire processor being faulted when only specific core strands should be faulted.

Workaround: Schedule a service call with your authorized Oracle service provider to replace the processor module containing the faulty core. Until it is replaced, you can return the strands related to the functioning cores to service using the following procedure. This restores as much system functionality as the active cores provide.

  1. Identify the faulty core:

    # fmdump -eV -c ereport.cpu.generic-sparc.l2tagctl-uc

    The detector portion of the fmdump output is displayed as follows.


    Note - Key elements in the example are highlighted for emphasis. They would not be highlighted in the actual output.


         detector = (embedded nvlist)
              nvlist version: 0
                      version = 0x0
                      scheme = hc
                      hc-root =
                      hc-list-sz = 4
                      hc-list = (array of embedded nvlists)
                      (start hc-list[0])
                      nvlist version: 0
                              hc-name = chassis
                              hc-id = 0
                      (end hc-list[0])
                      (start hc-list[1])
                      nvlist version: 0
                              hc-name = cpuboard
                              hc-id = 1
                      (end hc-list[1])
                      (start hc-list[2])
                      nvlist version: 0
                              hc-name = chip
                              hc-id = 2
                      (end hc-list[2])
                      (start hc-list[3])
                      nvlist version: 0
                              hc-name = core
                              hc-id = 19
                      (end hc-list[3])

              (end detector)


    In this example, the faulted chip is indicated by the following FMRI values:

    • Chassis = 0

    • CPU Board = 1

    • Chip = 2

    • Core = 19

    The following table includes additional examples with corresponding Nomenclature Architecture Council (NAC) names.

    Sample fmdump Output          Corresponding NAC Name
    cpuboard=0/chip=0/core=0      /SYS/PM0/CMP0/CORE0
    cpuboard=1/chip=2/core=16     /SYS/PM1/CMP0/CORE0
    cpuboard=1/chip=2/core=19     /SYS/PM1/CMP0/CORE3

    For example, given an FMRI of chassis=0/cpuboard=x/chip=y/core=z, the corresponding NAC name /SYS/PMa/CMPb/COREc can be derived as follows:

    • a = x

    • b = (y mod 2)

    • c = (z mod 8)

  2. Halt the Oracle Solaris OS, and power off the server.

  3. Disable the faulty core. From the Oracle ILOM CLI:

    -> cd /SYS/PM1/CMP0/CORE0
    /SYS/PM1/CMP0/CORE0
    -> show
     /SYS/PM1/CMP0/CORE0
        Targets:
            P0
            P1
            P2
            P3
            P4
            P5
            P6
            P7
            L2CACHE
            L1CACHE

        Properties:
            type = CPU Core
            component_state = Enabled

        Commands:
            cd
            set
            show

    -> set component_state=disabled

  4. Power on the server, and restart the Oracle Solaris OS.

    Refer to the SPARC T4 Series Servers Administration Guide for information on powering on the server from the Oracle ILOM prompt.

  5. Override the FMA diagnosis manually.

    The faulty component's UUID value is provided in the first line of the fmdump output.

     # fmadm repair uuid-of-fault
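
The FMRI-to-NAC derivation described in step 1 (a = x, b = y mod 2, c = z mod 8) can be sketched as a small shell function. This is an illustration only. The function name fmri_to_nac is hypothetical, and the input is assumed to be the cpuboard=x/chip=y/core=z portion of the FMRI, typed in or copied from the fmdump output.

```shell
#!/bin/sh
# Hypothetical helper: convert the cpuboard/chip/core part of an FMRI into
# a NAC name of the form /SYS/PMa/CMPb/COREc, where
#   a = cpuboard ID, b = chip ID mod 2, c = core ID mod 8.
fmri_to_nac() {
    fmri=$1
    # Extract the three numeric IDs from the FMRI fragment.
    x=$(printf '%s\n' "$fmri" | sed -n 's/.*cpuboard=\([0-9][0-9]*\).*/\1/p')
    y=$(printf '%s\n' "$fmri" | sed -n 's/.*chip=\([0-9][0-9]*\).*/\1/p')
    z=$(printf '%s\n' "$fmri" | sed -n 's/.*core=\([0-9][0-9]*\).*/\1/p')
    if [ -z "$x" ] || [ -z "$y" ] || [ -z "$z" ]; then
        echo "usage: fmri_to_nac cpuboard=X/chip=Y/core=Z" >&2
        return 1
    fi
    echo "/SYS/PM${x}/CMP$((y % 2))/CORE$((z % 8))"
}
```

For the fmdump example in step 1, fmri_to_nac cpuboard=1/chip=2/core=19 prints /SYS/PM1/CMP0/CORE3, which matches the table of sample NAC names.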

L2 Cache UEs Are Sometimes Reported as Core Faults Without Any Cache Line Retirements (Bug ID 15731176)


Note - This issue was originally listed as CR 7071237.


When a processor cache line encounters an uncorrectable error (UE), the fault manager is supposed to attempt to retire the cache line involved in the error. Because of this defect, the fault manager might not retire the faulty cache line and instead report the entire chip as faulted.

Workaround: Schedule a replacement of the FRU containing the faulty component. For additional information about UEs in processor cache lines, search for message ID SUN4V-8002-WY on the Oracle support site, http://support.oracle.com.

Upon a Reboot After an Unrecoverable Hardware Error, CPUs Might Not Start (Bug ID 15733431)


Note - This issue was originally listed as CR 7075336.


In rare cases, if the server or server module experiences a serious problem that results in a panic, a number of CPUs might not start when the server is rebooted, even though the CPUs are not faulty.

Example of the type of error displayed:

rebooting...
Resetting...
 
ERROR: 63 CPUs in MD did not start

Workaround: Power cycle the server.

-> stop /SYS
Are you sure you want to stop /SYS (y/n)? y
Stopping /SYS
-> start /SYS
Are you sure you want to start /SYS (y/n) ? y
Starting /SYS

Intermittent Power Supply Faults Occur During Power On (Bug ID 15727974)


Note - This issue was originally listed as CR 7066165.


In rare instances, the system FRU power-up probing routine might fail to list all installed system power supplies. The power supplies themselves are not faulted, but commands listing system FRUs do not show the presence of the non-probed power supply.

The fault sets the system fault LED, but no power supply fault LED is illuminated. To find the fault, use the fmadm utility from the ILOM fault management shell.

Start the fmadm utility from the ILOM CLI:

-> start /SP/faultmgmt/shell
Are you sure you want to start /SP/faultmgmt/shell (y/n)? y
faultmgmtsp> 

To view the fault, type the following:

faultmgmtsp> fmadm faulty
------------------- ------------------------------------ -------------- ------
Time                UUID                                 msgid          Severity
------------------- ------------------------------------ -------------- ------
2011-09-21/13:59:35 f13524d6-9970-4002-c2e6-de5d750f4088 ILOM-8000-2V   Major
 
Fault class : fault.fruid.corrupt
 
FRU         : /SYS/PS0
              (Part Number: 300-2159)
              (Serial Number: 476856F+1115CC0001)
 
Description : A Field Replaceable Unit (FRU) has a corrupt FRUID SEEPROM
 
Response    : The service-required LED may be illuminated on the affected
              FRU and chassis.
 
Impact      : The system may not be able to use one or more components on
              the affected FRU.  This may prevent the system from powering
              on.
 
Action      : The administrator should review the ILOM event log for
              additional information pertaining to this diagnosis.  Please
              refer to the Details section of the Knowledge Article for
              additional information.

Workaround: From the fault management shell prompt, clear the fault, exit the fault management shell, and reset the SP. For example:

-> start /SP/faultmgmt/shell
Are you sure you want to start /SP/faultmgmt/shell (y/n)? y
faultmgmtsp> fmadm repair /SYS/PS0
faultmgmtsp> exit
 
-> reset /SP
Are you sure you want to reset /SP (y/n)? y

After the SP has reset, verify that all installed power supplies appear in the list of system devices:

-> ls /SYS

If the problem occurs again after applying this workaround, contact your authorized Oracle Service Provider for further assistance.

Non-Critical Power Supply Threshold Messages Occur Under Heavy Load (Bug ID 15728319)


Note - This issue was originally listed as CR 7066726.


In some instances under heavy load, power supply threshold messages similar to the following appear in the /var/adm/messages file:

SC Alert: [ID 579591 daemon.notice] Sensor | minor: Power Unit : /SYS/VPS : Upper Non-critical going high : reading 2140 >= threshold 2140 Watts
SC Alert: [ID 807701 daemon.notice] Sensor | minor: Power Unit : /SYS/VPS : Upper Non-critical going low  : reading 2100 <= threshold 2140 Watts

Workaround: From the fault management shell prompt, clear the fault, exit the fault management shell, and reset the SP. For example:

-> start /SP/faultmgmt/shell
Are you sure you want to start /SP/faultmgmt/shell (y/n)? y
faultmgmtsp> fmadm repair /SYS/PS0
faultmgmtsp> exit
 
-> reset /SP
Are you sure you want to reset /SP (y/n)? y
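
To gauge how often the threshold messages appear, the entries for /SYS/VPS can be counted in a messages file. The following is a minimal sketch that assumes the message wording shown in the example above. The function name vps_threshold_summary is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count "Upper Non-critical" crossings reported for
# /SYS/VPS. Reads the named files (for example, /var/adm/messages) or
# standard input when no file is given.
vps_threshold_summary() {
    awk '
        /\/SYS\/VPS/ && /Upper Non-critical going high/ { high++ }
        /\/SYS\/VPS/ && /Upper Non-critical going low/  { low++ }
        END { printf "going high: %d, going low: %d\n", high, low }
    ' "$@"
}
```

For example, vps_threshold_summary /var/adm/messages prints one summary line with both counts.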

Spurious Power Supply Errors Might Be Reported (Bug ID 15800916)


Note - This issue was originally listed as CR 7180259.


In some cases, the Oracle ILOM firmware identifies and reports spurious power supply errors. For example:

ereport.chassis.voltage-lnc-glo@/sys/rio /SYS/RIO/VDD_+1V0
ereport.chassis.voltage-lnc-glo@/sys/rio /SYS/RIO/VDD_+1V8
ereport.chassis.voltage-lnc-glo@/sys/rio /SYS/RIO/VDD_+3V3
ereport.chassis.voltage-lnc-glo@/sys/rio /SYS/RIO/VDD_+5V0
fault.chassis.power.missing

Workaround: Update the server to System Firmware 8.2.0.f. If these errors persist, they indicate a power supply fault. Refer to the SPARC T4-4 Server Service Manual for service instructions.

Servers Equipped With a Single Processor Module Are Limited to 12 Express Modules (CR 14850481)

Dual-processor servers (those equipped with just one processor module) are currently restricted to twelve or fewer PCIe Express Module adaptors. In addition, Slots 6, 7, 14, and 15 should not be populated with PCIe Express Module adaptors.

Quad-processor servers (servers equipped with two processor modules) do not have these restrictions.
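
The restrictions for servers with a single processor module can be expressed as a simple check on a planned list of populated slots. The following is a minimal sketch. The function name check_em_slots is hypothetical, and the slot numbers are assumed to be supplied by the administrator as plain integers.

```shell
#!/bin/sh
# Hypothetical helper: validate a planned list of populated Express Module
# slots for a server with one processor module. Fails if slot 6, 7, 14, or
# 15 is listed, or if more than twelve slots are listed.
check_em_slots() {
    count=0
    for slot in "$@"; do
        case $slot in
            6|7|14|15)
                echo "slot $slot should not be populated" >&2
                return 1
                ;;
        esac
        count=$((count + 1))
    done
    if [ "$count" -gt 12 ]; then
        echo "too many Express Modules: $count (limit is 12)" >&2
        return 1
    fi
    return 0
}
```

For example, check_em_slots 0 1 2 3 succeeds, while check_em_slots 0 6 fails because slot 6 is listed.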