8.7 Upgrading and Downgrading RoCE Network Fabric Switch Firmware

This topic describes procedures to upgrade and downgrade the firmware on the RoCE Network Fabric switches.

Note the following when updating the RoCE Network Fabric switch firmware:

  • Use the patchmgr utility to upgrade and downgrade the RoCE Network Fabric switch firmware.
  • Starting with Oracle Exadata System Software release 19.3.0, there are separate patchmgr distributions for the servers and RDMA Network Fabric switches. Ensure that you use the patchmgr distribution included in the patch ZIP file for the RoCE Network Fabric switches.
  • Download the appropriate patch ZIP file to any machine with access to the RoCE Network Fabric switches. Refer to My Oracle Support note 888828.1 for the patch information.
  • patchmgr configures passwordless SSH access to each switch, which requires you to provide the password of the admin user on the switch.
  • You must perform the patching as a non-root user, and you must include the -log_dir option when you run patchmgr.
  • Switch firmware is always upgraded in a rolling manner (one switch at a time).
  • Perform storage server updates separately from RDMA Network Fabric switch updates. Do not update storage servers and RDMA Network Fabric switches concurrently. RDMA Network Fabric connections must be stable during some critical stages of storage server updates, but the RDMA Network Fabric switch firmware upgrade requires a switch reboot, which disrupts some connections on the RDMA Network Fabric.
  • Starting with the August 2022 patchmgr release, patchmgr performs an additional series of checks on the RoCE Network Fabric. The checks run immediately before any firmware upgrade or downgrade, and also during prerequisite checking with the --roceswitch-precheck option. These checks reduce the risk of failures caused by unexpected problems in the RoCE Network Fabric. For example, if one of the RoCE Network Fabric ports on a storage server is down, the storage server becomes unavailable when the switch connected to its only operational port is taken offline for an upgrade. If any check fails, patchmgr reports the problem and ends immediately. In this case, you must correct the problem with the RoCE Network Fabric before you can perform the upgrade or downgrade.
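
The following shell function is a minimal sketch of how the requirements above might be combined into a single prerequisite-check command. It only prints the command line for review and does not run patchmgr. The function name roce_precheck_cmd, the switch list file, and the log directory are hypothetical. Of the options shown, only --roceswitch-precheck and -log_dir are named in this topic. The --roceswitches, --upgrade, and --downgrade options are assumptions that you should verify against the README in the patch ZIP file.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) a patchmgr prerequisite-check
# command line for the RoCE Network Fabric switches.
# ASSUMPTION: --roceswitches, --upgrade, and --downgrade are not named in
# this topic; confirm them in the patch README before use.

roce_precheck_cmd() {
    action=$1            # "upgrade" or "downgrade"
    switch_list=$2       # file listing the switches (hypothetical name and format)
    log_dir=$3           # directory passed to the mandatory -log_dir option
    uid=${4:-$(id -u)}   # effective user ID; the 4th argument overrides it for testing

    # Patching must be performed as a non-root user.
    if [ "$uid" -eq 0 ]; then
        echo "error: run patchmgr as a non-root user" >&2
        return 1
    fi

    # Only the two actions described in this topic are accepted.
    case $action in
        upgrade|downgrade) ;;
        *) echo "error: action must be upgrade or downgrade" >&2; return 2 ;;
    esac

    # The switch list and the log directory are both required.
    if [ -z "$switch_list" ] || [ -z "$log_dir" ]; then
        echo "error: switch list and log directory are required" >&2
        return 2
    fi

    # Print the command so that it can be reviewed before it is run.
    printf './patchmgr --roceswitches %s --%s --roceswitch-precheck -log_dir %s\n' \
        "$switch_list" "$action" "$log_dir"
}
```

For example, after sourcing the function as a non-root user, running roce_precheck_cmd upgrade switches.lst /tmp/log prints a candidate command line that you can compare with the procedure for your release before running it from the directory that contains the switch patchmgr distribution.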

Use the following procedures to upgrade and downgrade the firmware on the RoCE Network Fabric switches: