5 Upgrading a Compute Node

Caution:

Ensure that all preparation steps for the system upgrade have been completed. For instructions, see Preparing the Upgrade Environment.

When upgrading to appliance software version 3.0.2-b1081557 or later, the ZFS Storage Appliance firmware must be upgraded before all other components. For more information, see Checking Upgrade Plan Status and Progress.

The compute node upgrade ensures that the latest Oracle Linux kernel and user space packages are installed, as well as the ovm-agent package with appliance-specific optimizations. Compute nodes must be locked and upgraded one at a time; concurrent upgrades are not supported. After a successful upgrade, once the compute node has rebooted, the administrator must manually release the locks so that the node can return to normal operation.

Note:

If the ILOM also needs to be upgraded, you can integrate it into this procedure by executing the optional steps. The combined procedure eliminates the need to evacuate and reboot the same node twice.

Note:

In software versions 3.0.2-b892153 and later, the Upgrader service uses the upgrade plan, generated during the pre-upgrade process, to determine whether a component needs to be upgraded. If a component is already at the required version, the upgrade command still starts an upgrade job, but the job completes immediately because the upgrade plan indicates there is nothing to do.

Practically speaking, when a component is already at the required version, the upgrade procedure is skipped. However, a same-version upgrade can be forced using the Service Web UI or Service CLI command option, if necessary. For example: upgradeCN hostIp=100.96.2.64 force=True.
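The version check implied by this note can be sketched in Python. This is a minimal illustration and not part of the appliance software: the helper names parse_version, uses_upgrade_plan, and upgrade_cn_command are hypothetical, and ordering builds by their numeric build number is an assumption made for this example. Comparing the version strings as plain text would wrongly rank 3.0.2-b1081557 below 3.0.2-b892153, which is why the build number is converted to an integer first.

```python
import re
from typing import Tuple

# First build in which the Upgrader service relies on the upgrade plan
# (taken from the note above).
PLAN_BASED_BUILD = "3.0.2-b892153"


def parse_version(version: str) -> Tuple[int, ...]:
    """Split a string such as '3.0.2-b892153' into (3, 0, 2, 892153)."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)-b(\d+)", version)
    if match is None:
        raise ValueError(f"unexpected version format: {version!r}")
    return tuple(int(part) for part in match.groups())


def uses_upgrade_plan(version: str) -> bool:
    """Assumption: builds are ordered by release numbers, then build number."""
    return parse_version(version) >= parse_version(PLAN_BASED_BUILD)


def upgrade_cn_command(host_ip: str, force: bool = False) -> str:
    """Build the upgradeCN command line, optionally forcing a same-version upgrade."""
    command = f"upgradeCN hostIp={host_ip}"
    if force:
        command += " force=True"
    return command
```

For example, uses_upgrade_plan("3.0.2-b1081557") evaluates to True, and upgrade_cn_command("100.96.2.64", force=True) reproduces the forced command shown in the note.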

Obtaining a Host IP Address

From the Service CLI, compute nodes are upgraded one at a time, using each one's internal IP address as a command parameter. However, the locking commands use the compute node ID instead. To run all commands for a compute node upgrade, you need both identifiers.

To obtain the host IP address and ID, as well as other information relevant to the upgrade procedure, use the Service CLI command provided in the following example. You can run the command as often as needed to check and confirm status as you proceed through the upgrade of all compute nodes.

PCA-ADMIN> list computeNode fields hostname,ipAddress,ilomIp,state,firmwareVersion,provisioningLocked,maintenanceLocked orderby hostname ASCENDING
Data:
  id                                     Hostname   Ip Address    ILOM Ip Address   State   Firmware Version           Provisioning Locked   Maintenance Locked
  --                                     --------   ----------    ---------------   -----   ----------------           -------------------   ------------------
  cf488903-fef8-4a51-8a41-c6990e4755c5   pcacn001   100.96.2.64   100.96.0.64       On      PCA Hypervisor:3.0.2-681   false                 false             
  42a7594d-1173-4dbd-4755-07810cc2d527   pcacn002   100.96.2.65   100.96.0.65       On      PCA Hypervisor:3.0.2-681   false                 false             
  bc0f37d5-ba77-423e-bc11-017704b47e59   pcacn003   100.96.2.66   100.96.0.66       On      PCA Hypervisor:3.0.2-681   false                 false             
  2e5ac527-01f5-4230-ae41-0522fcb57c9a   pcacn004   100.96.2.67   100.96.0.67       On      PCA Hypervisor:3.0.2-681   false                 false             
  5a6b61cf-7e99-4df2-87e4-b37c5fb0bfb8   pcacn005   100.96.2.68   100.96.0.68       On      PCA Hypervisor:3.0.2-681   false                 false             
  885f2aa4-f017-41e8-b2bc-e588cc0c6162   pcacn006   100.96.2.69   100.96.0.69       On      PCA Hypervisor:3.0.2-681   false                 false             
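Because the ID and the IP address of a node appear in the same row, a short script can pair them up per host name. The following Python sketch assumes that you have saved the command output as text and that columns are separated by at least two spaces, as in the example above. The function name parse_compute_nodes is hypothetical and not part of the Service CLI.

```python
import re

# Two rows copied from the example output above.
SAMPLE_OUTPUT = """\
Data:
  id                                     Hostname   Ip Address    ILOM Ip Address   State   Firmware Version           Provisioning Locked   Maintenance Locked
  --                                     --------   ----------    ---------------   -----   ----------------           -------------------   ------------------
  cf488903-fef8-4a51-8a41-c6990e4755c5   pcacn001   100.96.2.64   100.96.0.64       On      PCA Hypervisor:3.0.2-681   false                 false
  42a7594d-1173-4dbd-4755-07810cc2d527   pcacn002   100.96.2.65   100.96.0.65       On      PCA Hypervisor:3.0.2-681   false                 false
"""


def parse_compute_nodes(output):
    """Return {hostname: {column name: value}} from 'list computeNode' output."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # Ignore the command echo and the "Data:" label if they were pasted too.
    lines = [
        line for line in lines
        if line != "Data:" and not line.startswith("PCA-ADMIN>")
    ]
    # Columns are separated by two or more spaces; single spaces occur
    # inside values such as "PCA Hypervisor:3.0.2-681".
    header = re.split(r"\s{2,}", lines[0])
    nodes = {}
    for line in lines[1:]:
        if set(line) <= {"-", " "}:
            continue  # skip the row of dashes under the header
        fields = dict(zip(header, re.split(r"\s{2,}", line)))
        nodes[fields["Hostname"]] = fields
    return nodes


nodes = parse_compute_nodes(SAMPLE_OUTPUT)
node = nodes["pcacn001"]
print(node["id"], node["Ip Address"], node["ILOM Ip Address"])
```

The resulting dictionary gives you the ID for the locking commands and the IP addresses for the upgrade commands in one lookup.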

Using the Service Web UI

  1. Set the provisioning and maintenance locks for the compute node you are about to upgrade. Ensure that no active compute instances are present on the node.

    Caution:

    Depending on the high-availability configuration of the Compute service, automatic instance migrations can prevent you from successfully locking a compute node. For more information, refer to the Hardware Administration chapter of the Oracle Private Cloud Appliance Administrator Guide.

    1. In the navigation menu, click Rack Units. In the Rack Units table, click the name of the compute node you want to upgrade to display its detail page.

    2. In the top-right corner of the compute node detail page, click Controls and select the Provisioning Lock command.

    3. When the provisioning lock has been set, click Controls again and select the Migrate All Vms command. The Compute service evacuates the compute node, meaning it migrates the running instances to other compute nodes.

    4. When compute node evacuation is complete, click Controls again and select the Maintenance Lock command. This command might fail if instance migrations are still in progress; in that case, wait a few minutes and retry.

  2. In the navigation menu, click Upgrade & Patching.

  3. Optionally, upgrade the server ILOM first.

    1. In the top-right corner of the Upgrade Jobs page, click Create Upgrade or Patch. The Create Request window appears.

    2. Choose Upgrade as the Request Type. Select the appropriate upgrade request type: Upgrade ILOM.

      Enter the server's assigned IP address in the ILOM network. This is an IP address in the internal 100.96.0.0/23 range.

    3. Click Create Request. The new upgrade request appears in the Upgrade Jobs table.

    4. Wait 5 minutes to allow the ILOM upgrade job to complete. Then proceed to the host upgrade.

  4. In the top-right corner of the Upgrade Jobs page, click Create Upgrade or Patch.

    The Create Request window appears. Choose Upgrade as the Request Type.

  5. Select the appropriate upgrade request type: Upgrade CN.

  6. Fill out the upgrade request parameters:

    • Host IP: Enter the compute node's assigned IP address in the internal administration network. This is an IP address in the internal 100.96.2.0/23 range.

    • Image Location: Enter the path to the location where the ISO image is stored. This parameter is deprecated in software version 3.0.2-b892153 and later.

    • ISO Checksum: Enter the checksum to verify the ISO image. It is stored alongside the ISO file. This parameter is deprecated in software version 3.0.2-b892153 and later.

    • Log Level: Optionally, select a specific log level for the upgrade log file. The default log level is "Information". For maximum detail, select "Debug".

    • Advanced Options JSON: Optionally, add a JSON string to provide additional command parameters.

  7. Click Create Request.

    The new upgrade request appears in the Upgrade Jobs table.

  8. When the compute node has been upgraded successfully, release the provisioning and maintenance locks.

    For more information, refer to the section "Performing Compute Node Operations" in the Hardware Administration chapter of the Oracle Private Cloud Appliance Administrator Guide.

    1. Open the compute node detail page.

    2. In the top-right corner of the compute node detail page, click Controls and select the Maintenance Unlock command.

    3. When the maintenance lock has been released, click Controls again and select the Provisioning Unlock command.
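Before you create the upgrade requests in this procedure, it can help to confirm that each address belongs to the expected internal range. The following is a minimal Python sketch using the standard ipaddress module. The function name check_request_ips is hypothetical; the two ranges are the ones stated in the procedure above.

```python
import ipaddress

# Internal ranges named in the procedure above.
ILOM_NETWORK = ipaddress.ip_network("100.96.0.0/23")
ADMIN_NETWORK = ipaddress.ip_network("100.96.2.0/23")


def check_request_ips(host_ip, ilom_ip=None):
    """Return a list of problems; an empty list means the addresses look right."""
    problems = []
    if ipaddress.ip_address(host_ip) not in ADMIN_NETWORK:
        problems.append(f"Host IP {host_ip} is outside {ADMIN_NETWORK}")
    if ilom_ip is not None and ipaddress.ip_address(ilom_ip) not in ILOM_NETWORK:
        problems.append(f"ILOM IP {ilom_ip} is outside {ILOM_NETWORK}")
    return problems


# Values for pcacn001 from the compute node list example.
print(check_request_ips("100.96.2.64", ilom_ip="100.96.0.64"))
```

A check like this catches the easy mistake of entering the ILOM address in the Host IP field, or the reverse.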

Using the Service CLI

  1. From the output you obtained with the compute node list command earlier, get the ID and the IP address of the compute node you intend to upgrade.

  2. Set the provisioning and maintenance locks for the compute node you are about to upgrade.

    Caution:

    Depending on the high-availability configuration of the Compute service, automatic instance migrations can prevent you from successfully locking a compute node. For more information, refer to the Hardware Administration chapter of the Oracle Private Cloud Appliance Administrator Guide.

    1. Disable provisioning for the compute node.

      PCA-ADMIN> provisioningLock id=cf488903-fef8-4a51-8a41-c6990e4755c5
      Status: Success
      JobId: 6ee78c8a-e227-4d31-a770-9b9c96085f3f

    2. Evacuate the compute node. Wait for the migration job to finish before proceeding to the next step.

      PCA-ADMIN> migrateVm id=cf488903-fef8-4a51-8a41-c6990e4755c5 force=true
      Status: Running
      JobId: 6f1e94bc-7d5b-4002-ada9-7d4b504a2599
      
      PCA-ADMIN> show Job id=6f1e94bc-7d5b-4002-ada9-7d4b504a2599
        Run State = Succeeded

    3. Lock the compute node for maintenance.

      PCA-ADMIN> maintenanceLock id=cf488903-fef8-4a51-8a41-c6990e4755c5
      Status: Success
      JobId: e46f6603-2af2-4df4-a0db-b15156491f88

    4. Optionally, rerun the compute node list command to confirm lock status. For example:

      PCA-ADMIN> list computeNode fields hostname,ipAddress,ilomIp,state,firmwareVersion,provisioningLocked,maintenanceLocked orderby hostname ASCENDING
      Data:
        id                                     Hostname   Ip Address    ILOM Ip Address   State   Firmware Version           Provisioning Locked   Maintenance Locked
        --                                     --------   ----------    ---------------   -----   ----------------           -------------------   ------------------
        cf488903-fef8-4a51-8a41-c6990e4755c5   pcacn001   100.96.2.64   100.96.0.64       On      PCA Hypervisor:3.0.2-681   true                  true              
        42a7594d-1173-4dbd-4755-07810cc2d527   pcacn002   100.96.2.65   100.96.0.65       On      PCA Hypervisor:3.0.2-681   false                 false             
        bc0f37d5-ba77-423e-bc11-017704b47e59   pcacn003   100.96.2.66   100.96.0.66       On      PCA Hypervisor:3.0.2-681   false                 false             
        2e5ac527-01f5-4230-ae41-0522fcb57c9a   pcacn004   100.96.2.67   100.96.0.67       On      PCA Hypervisor:3.0.2-681   false                 false             
        5a6b61cf-7e99-4df2-87e4-b37c5fb0bfb8   pcacn005   100.96.2.68   100.96.0.68       On      PCA Hypervisor:3.0.2-681   false                 false             
        885f2aa4-f017-41e8-b2bc-e588cc0c6162   pcacn006   100.96.2.69   100.96.0.69       On      PCA Hypervisor:3.0.2-681   false                 false

  3. Optionally, upgrade the server ILOM first.

    1. Enter the ILOM upgrade command.

      Syntax (entered on a single line):

      upgradeIlom
      hostIp=<ilom-ip>

      Example:

      PCA-ADMIN> upgradeIlom hostIp=100.96.0.64
      Data:
        Service request has been submitted. Upgrade Job Id = 1620921089806-ilom-21480 Upgrade Request Id = UWS-732d6fce-9f06-4329-b972-d093bee40010
      
      PCA-ADMIN> getUpgradeJob upgradeJobId=1620921089806-ilom-21480

    2. Wait 5 minutes to allow the ILOM upgrade job to complete. Then proceed to the host upgrade.

  4. Enter the compute node upgrade command.

    Syntax (entered on a single line):

    upgradeCN 
    hostIp=<compute-node-ip>
    [optional] imageLocation=<path-to-iso>
    [optional] isoChecksum=<iso-file-checksum>

    The parameters marked optional are deprecated in software version 3.0.2-b892153 and later. For earlier versions, include the ISO image parameters with the command.

    Example:

    PCA-ADMIN> upgradeCN hostIp=100.96.2.64 \
    imageLocation="http://host.example.com/pca-<version>-<build>.iso" \
    isoChecksum=240420cfb9478f6fd026f0a5fa0e998e086275fc45e207fb5631e2e99732e192e8e9d1b4c7f29026f0a5f58dadc4d792d0cfb0279962838e95a0f0a5fa31dca7
    Status: Success
    Data:
      Service request has been submitted. Upgrade Job Id = 1630938939109-compute-7545 Upgrade Request Id = UWS-61736806-7e5a-4648-9259-07c54c39cacb

  5. Use the request ID and the job ID to check the status of the upgrade process.

    PCA-ADMIN> getUpgradeJobs
      id                               upgradeRequestId                           commandName   result
      --                               ----------------                           -----------   ------
      1630938939109-compute-7545       UWS-61736806-7e5a-4648-9259-07c54c39cacb   compute       Passed
      1632850650836-platform-68465     UWS-26dba234-9b52-426d-836c-ac11f37e717f   platform      Passed
      1632849609034-kubernetes-35545   UWS-edfa3b32-c32a-4b67-8df5-2357096052bf   kubernetes    Passed
    
    PCA-ADMIN> getUpgradeJob upgradeJobId=1630938939109-compute-7545
    Data:
      Upgrade Request Id = UWS-61736806-7e5a-4648-9259-07c54c39cacb
      Name = compute
      Start Time = 2021-09-26T06:35:39
      End Time = 2021-09-26T06:45:55
      Pid = 7545
      Host = pcamn02
      Log File = /nfs/shared_storage/pca_upgrader/log/pca-upgrader_compute_2021_09_26-06.35.39.log
      Arguments = {"verify_only":false,"upgrade":false,"diagnostics":false,"host_ip":"100.96.2.64","result_override":null,"log_level":null,"switch_type":null,"precheck_status":false,"task_time":0,"fail_halt":false,"fail_upgrade":null,"component_names":null,"upgrade_to":null,"image_location":null,"epld_image_location":null,"expected_iso_checksum":null,"checksum":null,"composition_id":null,"request_id":"UWS-61736806-7e5a-4648-9259-07c54c39cacb","display_task_plan":false,"dry_run_tasks":false}
      Status = Passed
      Execution Time(sec) = 616
      Tasks 1 - Name = Copy Scripts
      Tasks 1 - Description = Copy scripts to shared storage
      Tasks 1 - Time = 2021-09-26T06:35:39
    [...]

  6. When the compute node upgrade has completed successfully and the node has rebooted, release the locks.

    For more information, refer to the section "Performing Compute Node Operations" in the Hardware Administration chapter of the Oracle Private Cloud Appliance Administrator Guide.

    1. Release the maintenance lock.

      PCA-ADMIN> maintenanceUnlock id=cf488903-fef8-4a51-8a41-c6990e4755c5
      Status: Success
      JobId: 625af20e-4b49-4201-879f-41d4405314c7

    2. Release the provisioning lock.

      PCA-ADMIN> provisioningUnlock id=cf488903-fef8-4a51-8a41-c6990e4755c5
      Status: Success
      JobId: 523892e8-c2d4-403c-9620-2f3e94015b46

  7. Proceed to the next compute node and repeat this procedure.

    The output from the compute node list command indicates the current status. For example:

    PCA-ADMIN> list computeNode fields hostname,ipAddress,ilomIp,state,firmwareVersion,provisioningLocked,maintenanceLocked orderby hostname ASCENDING
    Data:
      id                                     Hostname   Ip Address    ILOM Ip Address   State   Firmware Version           Provisioning Locked   Maintenance Locked
      --                                     --------   ----------    ---------------   -----   ----------------           -------------------   ------------------
      cf488903-fef8-4a51-8a41-c6990e4755c5   pcacn001   100.96.2.64   100.96.0.64       On      PCA Hypervisor:3.0.2-696   false                 false             
      42a7594d-1173-4dbd-4755-07810cc2d527   pcacn002   100.96.2.65   100.96.0.65       On      PCA Hypervisor:3.0.2-696   false                 false             
      bc0f37d5-ba77-423e-bc11-017704b47e59   pcacn003   100.96.2.66   100.96.0.66       On      PCA Hypervisor:3.0.2-696   false                 false             
      2e5ac527-01f5-4230-ae41-0522fcb57c9a   pcacn004   100.96.2.67   100.96.0.67       On      PCA Hypervisor:3.0.2-696   false                 false             
      5a6b61cf-7e99-4df2-87e4-b37c5fb0bfb8   pcacn005   100.96.2.68   100.96.0.68       On      PCA Hypervisor:3.0.2-681   false                 false             
      885f2aa4-f017-41e8-b2bc-e588cc0c6162   pcacn006   100.96.2.69   100.96.0.69       On      PCA Hypervisor:3.0.2-681   false                 false
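The Service CLI procedure for a single compute node can be summarized as an ordered list of commands. The Python sketch below only assembles the command strings shown in this section; it does not connect to the appliance, and the function name node_upgrade_commands is hypothetical. Each command starts a job that must finish before you enter the next one. The ISO image parameters required by versions earlier than 3.0.2-b892153 are left out.

```python
def node_upgrade_commands(node_id, host_ip, ilom_ip=None):
    """Return the Service CLI commands for one compute node, in order.

    Pass ilom_ip to include the optional ILOM upgrade before the host upgrade.
    """
    commands = [
        f"provisioningLock id={node_id}",
        f"migrateVm id={node_id} force=true",  # wait for the job to succeed
        f"maintenanceLock id={node_id}",
    ]
    if ilom_ip is not None:
        commands.append(f"upgradeIlom hostIp={ilom_ip}")
    commands += [
        f"upgradeCN hostIp={host_ip}",
        # Release the locks only after the node has been upgraded and rebooted.
        f"maintenanceUnlock id={node_id}",
        f"provisioningUnlock id={node_id}",
    ]
    return commands


# Values for pcacn001 from the compute node list example.
for command in node_upgrade_commands(
    "cf488903-fef8-4a51-8a41-c6990e4755c5",
    host_ip="100.96.2.64",
    ilom_ip="100.96.0.64",
):
    print(command)
```

Note that the locks are released in the reverse order in which they were set: the maintenance lock first, then the provisioning lock.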