8.7.1 Preparing for RoCE Network Fabric Switch Firmware Upgrades or Downgrades

You must follow a specific order when upgrading the RoCE Network Fabric switches.

  1. Log in to a server that has access to the RoCE Network Fabric switches.
  2. Download the appropriate patch file to the server.

    Starting with Oracle Exadata System Software release 19.3.0, the updates for the switches are in a separate patch. Refer to My Oracle Support note 888828.1 for patch information.

  3. Unzip the patch file.

    The files are unzipped to the patch_switch_release directory.

  4. Change to the directory that contains the patchmgr utility.

    For example:

    # cd patch_switch_19.3.0.0.0.190915
  5. Create a switch list file to drive the update of the RoCE Network Fabric switches.
    1. Create a file that contains the host name or IP address of the switches that you want to upgrade. Place each switch on a separate line.
      For example, create a file named switches.lst, which contains the host name of each switch on separate lines. On a single rack system, the file might contain:
      switch123-rocea0
      switch123-roceb0
    2. Tag each line to specify the configuration type for each switch.

      To specify the configuration type for each switch, append a colon (:) and tag to each switch host name or IP address in the switch list file. The following tags are supported:

      • leaf - Identifies a leaf switch in a single rack system. This configuration type is assumed if no tag is specified.
      • mspine - Identifies a spine switch. Note that one spine switch configuration supports all spine switches on single and multi-rack systems, with and without Exadata Secure RDMA Fabric Isolation.
      • mleaf - Identifies a leaf switch in a multi-rack X8M system.
      • sfleaf - Identifies a leaf switch in a single rack system that is enabled to support Exadata Secure RDMA Fabric Isolation.
      • msfleaf - Identifies a leaf switch in a multi-rack X8M system that is enabled to support Exadata Secure RDMA Fabric Isolation.
      • leaf23 - Identifies a leaf switch in a single rack system that is configured with 23 host ports. This configuration is required only for 8-socket systems (X8M-8 and later) with 3 database servers and 11 storage servers.
      • mleaf23 - Identifies a leaf switch in a multi-rack system that is configured with 23 host ports. This configuration is required only for 8-socket X8M-8 systems with 3 database servers and 11 storage servers.
      • mleaf_u14 - Identifies a leaf switch in a multi-rack system that is configured with 14 inter-switch links. This is the typical multi-rack leaf switch configuration for X9M and later model systems.
      • msfleaf_u14 - Identifies a leaf switch in a multi-rack system that is enabled to support Exadata Secure RDMA Fabric Isolation and is configured with 14 inter-switch links. This configuration is required for X9M and later model systems with Secure Fabric enabled.
      • mleaf23_u13 - Identifies a leaf switch in a multi-rack system that is configured with 23 host ports and 13 inter-switch links. This configuration is required only for 8-socket X9M-8 systems with 3 database servers and 11 storage servers.
      For example:
      switch123-rocea0:leaf
      switch123-roceb0:leaf
    3. For multi-rack configurations only, specify a unique loopback octet for each switch.

      The loopback octet is the last octet of the switch loopback address, which uniquely identifies a switch.

      To specify the loopback octet for each switch, append a period (.) and numeric loopback octet value to each tagged switch entry in the switch list file. The range of valid loopback octet values is:

      • 101-118 for leaf switches
      • 201-208 for spine switches
      For example, the switch list file for a 2-rack system might contain:
      rack1sw-rocea0:mleaf.101
      rack1sw-roceb0:mleaf.102
      rack1sw-roces0:mspine.201
      rack2sw-rocea0:mleaf.103
      rack2sw-roceb0:mleaf.104
      rack2sw-roces0:mspine.202
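The tag and loopback octet rules in step 5 can be checked mechanically before the file is passed to patchmgr. The following is a minimal sketch only: check_entry and check_switch_list are hypothetical helper names and are not part of patchmgr. The sketch assumes that entries use host names or IPv4 addresses, and it treats an untagged entry as leaf, as described above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of patchmgr) that check switch list entries
# against the tags and loopback octet ranges listed in step 5.

SUPPORTED_TAGS="leaf mspine mleaf sfleaf msfleaf leaf23 mleaf23 mleaf_u14 msfleaf_u14 mleaf23_u13"

# check_entry ENTRY: return 0 if ENTRY is well formed, 1 otherwise.
check_entry() {
    local entry="$1"
    local host="${entry%%:*}"
    local tag="leaf" octet="" rest has_octet=0 min=101 max=118

    if [ -z "$host" ]; then
        echo "missing host name or IP address: '$entry'" >&2
        return 1
    fi

    # Split "host:tag.octet"; an entry without a colon keeps the default tag.
    if [ "$host" != "$entry" ]; then
        rest="${entry#*:}"
        tag="${rest%%.*}"
        if [ "$tag" != "$rest" ]; then
            has_octet=1
            octet="${rest#*.}"
        fi
    fi

    case " $SUPPORTED_TAGS " in
        *" $tag "*) ;;
        *) echo "unsupported tag '$tag' in '$entry'" >&2; return 1 ;;
    esac

    if [ "$has_octet" -eq 1 ]; then
        case "$octet" in
            ""|*[!0-9]*) echo "invalid loopback octet in '$entry'" >&2; return 1 ;;
        esac
        # Spine switches use 201-208; all leaf variants use 101-118.
        if [ "$tag" = "mspine" ]; then min=201; max=208; fi
        if [ "$octet" -lt "$min" ] || [ "$octet" -gt "$max" ]; then
            echo "loopback octet $octet outside $min-$max in '$entry'" >&2
            return 1
        fi
    fi
    return 0
}

# check_switch_list FILE: check every line and reject repeated loopback octets.
check_switch_list() {
    local file="$1" line rest octet seen=" " status=0
    while IFS= read -r line || [ -n "$line" ]; do
        [ -z "$line" ] && continue
        check_entry "$line" || status=1
        case "$line" in
            *:*.*)
                rest="${line#*:}"
                octet="${rest#*.}"
                case "$seen" in
                    *" $octet "*) echo "duplicate loopback octet $octet" >&2; status=1 ;;
                esac
                seen="$seen$octet "
                ;;
        esac
    done < "$file"
    return "$status"
}

# Demo using the 2-rack example from step 5.
demo_list="$(mktemp)"
printf '%s\n' \
    "rack1sw-rocea0:mleaf.101" "rack1sw-roceb0:mleaf.102" "rack1sw-roces0:mspine.201" \
    "rack2sw-rocea0:mleaf.103" "rack2sw-roceb0:mleaf.104" "rack2sw-roces0:mspine.202" \
    > "$demo_list"
if check_switch_list "$demo_list"; then
    echo "switch list OK"
fi
rm -f "$demo_list"
```

A check like this only catches formatting mistakes. It cannot confirm that a tag matches the actual configuration of a switch, so the patchmgr prerequisite check remains the authoritative validation.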
  6. Run the prerequisite check prior to either upgrading or downgrading the firmware.
    # ./patchmgr --roceswitches switches.lst {--upgrade | --downgrade} --roceswitch-precheck [--force]
      [-log_dir {absolute_path_to_log_directory | AUTO}]

    In the patchmgr command:

    • --roceswitch-precheck instructs patchmgr to perform a firmware upgrade or downgrade simulation on the switch.

    • --force is optional. It instructs patchmgr to proceed with the operation even if the switch is already on the target firmware version or the RoCE Network Fabric is experiencing non-critical failures.

    • -log_dir specifies the absolute path to the log directory. Alternatively, specify AUTO to have patchmgr create the log directory automatically. This option is required when running patchmgr as a non-root user.

    Note:

    Prior to Oracle Exadata System Software release 19.3.9, you must run patchmgr as a non-root user for patching RoCE Network Fabric switches.

    Note:

    The current user is expected to have SSH equivalency configured before running patchmgr. If it is not configured, then patchmgr gives you the option to set up keys and perform the key exchange for SSH equivalency.

    If the output from the command shows that the overall status is SUCCESS, then proceed with the upgrade or downgrade. If the output shows that the overall status is FAIL, then review the error summary in the output to determine which checks failed, and correct the errors. After correcting all of the errors, rerun the prerequisite check, and repeat until it succeeds.

    Note:

    You may see a Config validation failed error because the golden configuration settings have not been applied to the switch. In this case, apply the golden configuration settings from the current patch and then repeat the prerequisite check. See Applying Golden Configuration Settings on Cisco Nexus 9336C-FX2 RoCE Network Fabric Switches.
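
Because the prerequisite check in step 6 is typically rerun several times, it can help to assemble the command line in one place. The following is a minimal sketch only: build_precheck_cmd is a hypothetical helper and is not part of patchmgr. It prints the command for review and does not run it. Only the options shown in step 6 are used, and --force is intentionally left out.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of patchmgr): prints the precheck command
# from step 6 for review. It does not run patchmgr.

# build_precheck_cmd LIST_FILE MODE [LOG_DIR]
#   MODE     upgrade or downgrade
#   LOG_DIR  absolute path or AUTO (optional)
build_precheck_cmd() {
    local list="$1" mode="$2" log_dir="${3:-}"
    local cmd

    case "$mode" in
        upgrade|downgrade) ;;
        *) echo "mode must be 'upgrade' or 'downgrade'" >&2; return 1 ;;
    esac

    cmd="./patchmgr --roceswitches $list --$mode --roceswitch-precheck"

    if [ -n "$log_dir" ]; then
        case "$log_dir" in
            AUTO|/*) cmd="$cmd -log_dir $log_dir" ;;
            *) echo "log_dir must be an absolute path or AUTO" >&2; return 1 ;;
        esac
    fi

    printf '%s\n' "$cmd"
}

build_precheck_cmd switches.lst upgrade AUTO
# prints: ./patchmgr --roceswitches switches.lst --upgrade --roceswitch-precheck -log_dir AUTO
```

Printing the command instead of running it keeps the sketch safe to try on any host. Run the resulting command from the directory that contains the patchmgr utility.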