SPARC T7 Series Servers Product Notes

Updated: July 2019
 
 

Virtual_TTE_invalid Error On Assigned IOV Device (22138210)

When the primary or root domain is rebooted, the I/O domain is notified to suspend the virtual functions assigned from the domain being rebooted, and to resume them when the reboot is complete. However, in some cases the resume notification might be issued prematurely, causing the I/O domain to fail to resume one or more of its assigned virtual functions. This issue occurs only on M7/T7 platforms when the LDOMS failure policy on the I/O domain is not set, or is set to ignore.

Two symptoms indicate this failure:

  • A warning on the console in the I/O domain (also logged in /var/adm/messages)

  • An FMA fault in the primary or root domain to which the physical function is assigned

The warning from the I/O domain console is as follows:

WARNING: pxsoft_msi_resume: retry limit exceeded.
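
As a minimal sketch, the check for this warning in the system log can be scripted. The function name count_resume_warnings is hypothetical and not part of any Oracle tool; the log path is passed as an argument so the same check also works on a saved copy of the log.

```shell
# Hypothetical helper: count logged occurrences of the resume warning.
# $1 is the log file to search, for example /var/adm/messages.
count_resume_warnings() {
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'pxsoft_msi_resume: retry limit exceeded' "$1"
}
```

In the I/O domain, running count_resume_warnings /var/adm/messages prints the number of matching lines. A nonzero count suggests that at least one resume attempt hit the retry limit.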

The FMA fault in the primary or root domain is fault.io.pciex.device-invreq (PCIEX-8000-8R). It includes an ereport.io.pciex.rc.epkt error report with the following string:

event_name = Virtual_TTE_invalid

To list any logged FMA error reports, type:

# fmdump -e

For a verbose listing, which might include the event_name = Virtual_TTE_invalid string, type:

# fmdump -eV
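
The verbose listing can be long, so it may help to filter it for the string of interest. The following is a sketch only: has_virtual_tte_invalid is a hypothetical helper name, and the pattern assumes the string appears in the output as shown above, allowing for variable spacing around the equals sign.

```shell
# Hypothetical helper: read verbose error-report text on standard input and
# succeed (exit status 0) if it contains the Virtual_TTE_invalid event name.
has_virtual_tte_invalid() {
    # -q suppresses output; only the exit status is used.
    grep -q 'event_name *= *Virtual_TTE_invalid'
}

# Example use in the primary or root domain (left as a comment so that
# loading this snippet does not run fmdump):
#   fmdump -eV | has_virtual_tte_invalid && echo "Virtual_TTE_invalid reported"
```

Reading from standard input keeps the helper independent of how the listing was obtained, so it can also be run against a saved copy of the fmdump -eV output.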

If this issue occurs, one or more assigned virtual functions will no longer work properly in the I/O domain following a primary or root domain reboot. The device drivers for the affected virtual functions cannot process any interrupt signals from the underlying hardware devices.

Recovery

To recover from this issue and regain the affected virtual functions, reboot the affected I/O domain.

To make the I/O domain more resilient against this issue, configure the following setting in its /etc/system file:

set pxsoft:pxsoft_resume_max_retries=1024

This setting affects only the resume operations of virtual functions in the I/O domain. You must reboot the I/O domain for the new setting to take effect.
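
If the setting is applied by script rather than by hand, it is safer to avoid adding the line twice and to keep a backup of the file. The following is a minimal sketch under those assumptions. The function name ensure_pxsoft_setting and the .bak backup suffix are hypothetical choices, and the function must be run with sufficient privileges to modify the target file.

```shell
# The setting line, exactly as given in this note.
SETTING='set pxsoft:pxsoft_resume_max_retries=1024'

# Hypothetical helper: append the setting to the file named in $1
# (normally /etc/system) unless a pxsoft_resume_max_retries line already exists.
ensure_pxsoft_setting() {
    file=$1
    if grep -q '^set pxsoft:pxsoft_resume_max_retries=' "$file"; then
        # Leave an existing value alone so it can be reviewed manually.
        echo "pxsoft_resume_max_retries is already set in $file"
    else
        # Keep a backup copy (assumed suffix .bak), then append the setting.
        cp -p "$file" "$file.bak" && printf '%s\n' "$SETTING" >> "$file"
    fi
}
```

For example, ensure_pxsoft_setting /etc/system adds the line once. Running it again leaves the file unchanged. The I/O domain still needs to be rebooted afterward.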

Mitigation

This issue occurs if an I/O domain is assigned multiple virtual functions that come from multiple physical functions on the same PCIe bus. You can avoid this issue by assigning virtual functions to the I/O domain from only a single physical function on that PCIe bus.