Oracle ILOM Command Force Stop of PCIe Slot Power Can Cause Server PCIe Bus Error

Bug ID: 334503411

Issue: In SW3.3.3 Enterprise (Oracle ILOM build02: 5.1.0.23_r146986), Enhancement 34371396 implements the CLI command stop /SYS/MB/PCIEn. Using the stop /SYS/MB/PCIEn command can cause some Smart NIC PCIe cards to stop operating, and the system might report a PCIe bus error. Systems with this firmware enhancement should not need this command unless a Cavium LiquidIO III 100 Gb Network Interface Card (NIC) is installed in a PCIe slot. Do not use the stop /SYS/MB/PCIEn command on PCIe slots and cards for any purpose other than resetting a Smart NIC Add-In Card (AIC) PCIe slot, unless instructed by Oracle Service personnel.

  1. When host power is off, you can start any PCIe slot power with the start /SYS/MB/PCIEn command and stop any PCIe slot power with the stop /SYS/MB/PCIEn command without PCIe bus errors.

    • -> stop /SYS/MB/PCIE#
    • -> start /SYS/MB/PCIE#
  2. Note: When host power is on, you can use the -force option to force stop or force start PCIe slot power. However, doing so risks causing system and PCIe bus errors.

    Before using the -force option to force stop or force start PCIe slot power, ensure that the following preconditions are met.

    1. Verify that the UEFI BIOS has already enabled the PCIe slot hotplug feature.

    2. Verify that the OS is Linux UEK4 or UEK5, and the Cavium LiquidIO III 100 Gb Network Interface Card (NIC) remote console is idle.

    3. Shut down PCIe communication and data traffic for the slot with the installed Cavium LiquidIO III 100 Gb Network Interface Card (NIC).

  3. Use these commands only on a PCIe slot with an installed Cavium LiquidIO III 100 Gb Network Interface Card (NIC) while host power is on.

    • -> stop -force /SYS/MB/PCIE#
    • -> start -force /SYS/MB/PCIE#
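
The rules in the steps above can be sketched as a small pre-check that decides which command string to issue. This is only an illustrative sketch, not an Oracle tool: the SlotContext class, its field names, and the slot_power_command function are hypothetical, and the operator must still verify each precondition manually. The sketch only builds the command text and does not connect to Oracle ILOM.

```python
from dataclasses import dataclass


@dataclass
class SlotContext:
    """Operator-supplied facts about the host and the slot (hypothetical names)."""
    host_power_on: bool
    is_liquidio_nic: bool = False       # slot holds a Cavium LiquidIO III 100 Gb NIC
    hotplug_enabled: bool = False       # UEFI BIOS PCIe slot hotplug is enabled
    os_is_uek4_or_uek5: bool = False    # host OS is Linux UEK4 or UEK5
    nic_console_idle: bool = False      # NIC remote console is idle
    slot_traffic_stopped: bool = False  # PCIe traffic for the slot is shut down


def slot_power_command(action: str, slot: int, ctx: SlotContext) -> str:
    """Return the ILOM CLI command text for a slot power action.

    Raises RuntimeError if host power is on and a precondition is unmet.
    """
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    if not isinstance(slot, int) or slot < 1:
        raise ValueError("slot must be a positive integer")

    target = f"/SYS/MB/PCIE{slot}"

    # Host power off: the plain command is allowed for any slot.
    if not ctx.host_power_on:
        return f"{action} {target}"

    # Host power on: only the force form works, and only under the preconditions.
    checks = {
        "slot contains a Cavium LiquidIO III NIC": ctx.is_liquidio_nic,
        "UEFI BIOS PCIe slot hotplug enabled": ctx.hotplug_enabled,
        "OS is Linux UEK4 or UEK5": ctx.os_is_uek4_or_uek5,
        "NIC remote console is idle": ctx.nic_console_idle,
        "PCIe traffic for the slot is shut down": ctx.slot_traffic_stopped,
    }
    unmet = [name for name, ok in checks.items() if not ok]
    if unmet:
        raise RuntimeError("force not advised; unmet: " + "; ".join(unmet))
    return f"{action} -force {target}"


if __name__ == "__main__":
    print(slot_power_command("stop", 8, SlotContext(host_power_on=False)))
```

With host power off, the sketch returns the plain command. With host power on, it returns the -force form only when every precondition flag is set, and otherwise lists what is still missing.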

A Cavium LiquidIO III 100 Gb Network Interface Card (NIC) might require start/stop power cycles while server host main power is on. Using the -force option to start or stop power to its PCIe slot performs the power cycle without affecting other AICs. For all other AICs installed in system configurations, exercise caution when using the -force option to start or stop PCIe slot power. Using force stop on Oracle Flash Accelerator F640 PCIe Cards and Oracle Quad Port 10GBase-T Adapters installed in PCIe slots might result in PCIe bus issues. Refer to Enhancement 34371396 for details.

Example of a force stop command that might cause a reset:

-> stop /SYS/MB/PCIE8
Are you sure you want to stop /SYS/MB/PCIE8 (y/n)? y
stop: Operation is not allowed when Host power is on.
-> stop -f /SYS/MB/PCIE8
Are you sure you want to immediately stop /SYS/MB/PCIE8 (y/n)? y
Stopping /SYS/MB/PCIE8 immediately
-> ls
 /SYS/MB/PCIE8
    Targets:
        POWER
        PRSNT
        SERVICE

    Properties:
        type = PCIE Module
        fault_state = OK
        clear_fault_action = (none)
        power_state = Off

    Commands:
        cd
        reset
        set
        show
        start
        stop

-> show faulty
Target                       | Property                          | Value
-----------------------------+-----------------------------------+------------
-> show /sp/logs/event/list/
Event
ID     Date/Time                 Class     Type      Severity
-----  ------------------------  --------  --------  --------
324    Wed Aug 10 06:41:56 2022  Chassis   State     minor   
       /SYS/MB/PCIE8 power is disabled
323    Wed Aug 10 06:41:51 2022  Power     Off       major   
       Power to /SYS/MB/PCIE8 has been turned off by: Shell session, 
Username:
322    Wed Aug 10 05:23:51 2022  IPMI      Log       minor   
       ID =  131 : 08/10/2022 : 05:23:51 : System Firmware Progress : BIOS : 
System boot initiated : Asserted
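
To confirm from a saved copy of the event list that a slot was powered off, the entries can be filtered by slot target. The following is a minimal sketch that assumes the column layout seen in the sample output above (ID, date/time, class, type, and severity on one line, with the message on the indented or wrapped lines that follow). The function names are hypothetical, and the layout is inferred from this one sample rather than from a documented format.

```python
import re

# First line of an event record: ID, timestamp, class, type, severity.
EVENT_HEADER = re.compile(
    r"^(?P<id>\d+)\s+"
    r"(?P<date>\w{3} \w{3} [ \d]\d \d{2}:\d{2}:\d{2} \d{4})\s+"
    r"(?P<cls>\S+)\s+(?P<type>\S+)\s+(?P<severity>\S+)\s*$"
)


def parse_events(text):
    """Group event-list text into records; continuation lines become the message."""
    events = []
    for raw in text.splitlines():
        line = raw.strip()
        match = EVENT_HEADER.match(line)
        if match:
            events.append({**match.groupdict(), "message": ""})
        elif events and line:
            # Message text, possibly wrapped over several lines.
            events[-1]["message"] = (events[-1]["message"] + " " + line).strip()
    return events


def slot_events(text, target):
    """Return the events whose message mentions the given slot target."""
    return [e for e in parse_events(text) if target in e["message"]]


# Sample copied from the output above.
SAMPLE_LOG = """\
Event
ID     Date/Time                 Class     Type      Severity
-----  ------------------------  --------  --------  --------
324    Wed Aug 10 06:41:56 2022  Chassis   State     minor
       /SYS/MB/PCIE8 power is disabled
323    Wed Aug 10 06:41:51 2022  Power     Off       major
       Power to /SYS/MB/PCIE8 has been turned off by: Shell session,
Username:
322    Wed Aug 10 05:23:51 2022  IPMI      Log       minor
       ID =  131 : 08/10/2022 : 05:23:51 : System Firmware Progress : BIOS :
System boot initiated : Asserted
"""

if __name__ == "__main__":
    for event in slot_events(SAMPLE_LOG, "/SYS/MB/PCIE8"):
        print(event["id"], event["severity"], event["message"])
```

The match is a plain substring test on the message, so a target such as /SYS/MB/PCIE1 would also match /SYS/MB/PCIE10. Tighten the comparison if the system has slots numbered 10 or higher.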
      

Affected Hardware: Oracle Communications Server E6-2L, Oracle Server X7-2, Oracle Server X7-2L

Affected Software:

x86 server software release SW3.3.3 Enterprise or later (Oracle ILOM build02: 5.1.0.23_r146986, BIOS: 42120100)

For details, refer to Oracle Integrated Lights Out Manager (ILOM) documentation at Systems Management Documentation.

Workaround: Use the Oracle ILOM CLI command stop /SYS/MB/PCIEn only on a slot that contains a Cavium LiquidIO III 100 Gb Network Interface Card (NIC) requiring start/stop power cycles. Do not use the stop /SYS/MB/PCIEn command for any other purpose unless instructed by Oracle Service personnel.

Oracle Service personnel can find more information about x86 servers at My Oracle Support. Refer to Oracle ILOM update Enh 34371396 - Added --force to start/stop PCIe slot power.