Ensure the System Is In Ready State

Upgrades can be performed with limited impact on the system. No downtime is required, and user workloads continue to run while the underlying infrastructure is upgraded in stages. However, as a good practice, create backups of the system and the resources in your environment before you begin.

Fault Log

When preparing for an upgrade, check the system fault log. Issues that can affect upgrade and patching are flagged as upgrade faults and prevent any upgrade or patch command from running. The Service CLI provides filtering options for these faults, as shown in these examples:

PCA-ADMIN> list fault fields upgradeFault,Status,Severity
Data:
  id                                     Upgrade Fault   Status    Severity
  --                                     -------------   ------    --------
  37f6cefe-f7d5-49a8-adff-76c1a020bcc8   False           Cleared   Critical
  2be57600-4dbd-40f0-a0f8-e4ffbb2c8468   True            Active    Critical
  297c770e-16e1-11ef-9e78-a8698c107234   True            Cleared   Major
  d27e0895-eb87-4e66-bd4b-f6500153cf64   True            Cleared   Critical
  77752a35-4cf7-49ed-88df-06158d846358   True            Active    Critical
  9ff4fe22-8ed6-463f-95b5-458d6a76d185   False           Active    Critical
  0a09204c-953e-4df3-912b-a2a175ce8c1a   True            Cleared   Critical
[...]

PCA-ADMIN> list fault where upgradeFault EQ True
Data:
  id                                     Name                                        Status    Severity
  --                                     ----                                        ------    --------
  2be57600-4dbd-40f0-a0f8-e4ffbb2c8468   pcamn01--PCA-8000-44--asrclient             Active    Critical
  297c770e-16e1-11ef-9e78-a8698c107234   ilom-pcacn001--PCA-8000-EA--ilom-pcacn001   Cleared   Major
  0a09204c-953e-4df3-912b-a2a175ce8c1a   pcamn01--PCA-8000-AH--mysql_cluster         Cleared   Critical
  f7da7c82-03bc-4383-95cf-542cefbb5d39   pcamn01--PCA-8000-CD--mysql_cluster         Cleared   Critical
  74fd23ac-3006-4ab6-9c50-544869ba78f9   pcamn02--PCA-8000-6C--registry              Cleared   Critical
  dc77b088-939d-4725-b45f-1072024b02ba   pcamn02--PCA-8000-0E--etcd                  Cleared   Critical
  d27e0895-eb87-4e66-bd4b-f6500153cf64   pcamn01--PCA-8000-22--vault                 Cleared   Critical
  77752a35-4cf7-49ed-88df-06158d846358   pcamn01--PCA-8000-93--mysql_cluster         Active    Critical
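If you capture the CLI output to a file, you can post-process it to flag the faults that still need attention before the upgrade. The following is a minimal sketch that assumes the tabular layout shown above (an id column followed by name, status, and severity, with no spaces inside names); the parser is illustrative only and is not part of the Service CLI.

```python
# Sketch: flag Active upgrade faults in saved 'list fault' output.
# Assumes the fixed column order shown above and names without
# embedded spaces; illustrative only, not part of the Service CLI.

def active_upgrade_faults(cli_output: str) -> list[str]:
    """Return the IDs of faults whose Status column is 'Active'."""
    ids = []
    for line in cli_output.splitlines():
        parts = line.split()
        # Data rows have at least id/name/status/severity columns,
        # with the Status value in the next-to-last position.
        if len(parts) >= 4 and parts[-2] == "Active":
            ids.append(parts[0])
    return ids

sample = """\
Data:
  id                                     Name                              Status    Severity
  --                                     ----                              ------    --------
  2be57600-4dbd-40f0-a0f8-e4ffbb2c8468   pcamn01--PCA-8000-44--asrclient   Active    Critical
  0a09204c-953e-4df3-912b-a2a175ce8c1a   pcamn01--PCA-8000-AH--mysql       Cleared   Critical
"""

print(active_upgrade_faults(sample))  # ['2be57600-4dbd-40f0-a0f8-e4ffbb2c8468']
```

Cleared faults are ignored because only active upgrade faults block an upgrade or patch command.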

Pre-Checks

Every upgrade operation is preceded by a set of pre-checks. These are built into the upgrade code and report an error if the system is not in the required state for the upgrade. The upgrade begins only if all pre-checks pass.

You can use the pre-checks to test in advance for any system health issues that would prevent a successful upgrade. After preparing the upgrade environment, run any or all of the upgrade commands with the verify-only option.

Caution:

Oracle strongly recommends testing that the Private Cloud Appliance is ready for upgrade by executing the full rack upgrade command in verify-only mode. The output provides a readiness report that you can use to plan corrective actions as well as the upgrade itself.

This verification can take a long time to complete. Run it far enough in advance that the actual upgrade can be completed within the scheduled maintenance window.

In the Service Web UI the verify-only option is activated with a check box when you create the upgrade request; in the Service CLI you use the optional upgrade command parameter shown in this example:

PCA-ADMIN> upgradeRack type=ISO action=APPLY component=KUBERNETES verifyOnly=True
JobId: d14646fb-b2a9-445d-9702-b99ac605a993
Data: Service request has been submitted. Upgrade Job Id = 1744286305509-kubernetes_verify-2265363 Upgrade Request Id = UWS-7a4a52a4-dc7c-439c-9e32-9e61358d4e56

PCA-ADMIN> getUpgradeJobs
Data:
  id                                        Upgrade Request Id                                Command Name        Result
  --                                        ------------------                                ------------        ------
  1744286305509-kubernetes_verify-2265363   UWS-7a4a52a4-dc7c-439c-9e32-9e61358d4e56          kubernetes_verify   Passed
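Before scheduling the actual upgrade, you can confirm from saved getUpgradeJobs output that every verify job passed. This sketch assumes the column layout shown above, with Result as the last column; it is an illustrative helper, not part of the Service CLI.

```python
# Sketch: confirm that every job in saved 'getUpgradeJobs' output
# has Result 'Passed'. Assumes the column layout shown above;
# illustrative only, not part of the Service CLI.

def all_jobs_passed(cli_output: str) -> bool:
    """Return True when at least one job row exists and all results are 'Passed'."""
    results = []
    for line in cli_output.splitlines():
        parts = line.split()
        # Job rows start with an id like 1744286305509-kubernetes_verify-2265363.
        if parts and parts[0][0].isdigit() and "-" in parts[0]:
            results.append(parts[-1])  # last column is Result
    return bool(results) and all(r == "Passed" for r in results)

sample = """\
Data:
  id                                        Upgrade Request Id                         Command Name        Result
  --                                        ------------------                         ------------        ------
  1744286305509-kubernetes_verify-2265363   UWS-7a4a52a4-dc7c-439c-9e32-9e61358d4e56   kubernetes_verify   Passed
"""

print(all_jobs_passed(sample))  # True
```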

Rack-Wide Health Check

The full rack upgrade workflow provides a command option to run a suite of health checks across the entire appliance. The latest checks are available after the setup phase of upgrade preparation (preUpgrade) is completed. The rack-wide health check is designed to run before an upgrade, but you can use it to scan for potential issues at any time, as long as no upgrade job is in progress.

Using the Service Web UI
  1. In the navigation menu, go to the Maintenance section and click Upgrade & Patching.

  2. In the top-right corner of the Upgrade Jobs page, click Create Upgrade or Patch.

    The Create Request window appears. Choose Upgrade as the Request Type.

  3. Select Upgrade Rack and enter these upgrade request parameters:

    • Type: For upgrade, select ISO. The ULN option applies to patching.

    • Action: Select Health Check.

  4. Click Create Request.

    The new upgrade request appears in the Upgrade Jobs table.

  5. When the health check has completed, click the upgrade job in the Upgrade Jobs table to display its details, including the status and result of the individual checks.

Using the Service CLI
  1. Start the rack-wide health check workflow.

    PCA-ADMIN> upgradeRack action=HEALTHCHECK
    JobId: de9eca17-e357-42c4-8aab-c73c311f787f
    Data: Service request has been submitted. Upgrade Request Id = UWS-ab2e72d9-d937-4afb-b6eb-112c87988f56
  2. Use the request ID to check the status of the health check workflow.

    PCA-ADMIN> getUpgradeStatus requestId=UWS-ab2e72d9-d937-4afb-b6eb-112c87988f56
    Data: 
      Request id = UWS-ab2e72d9-d937-4afb-b6eb-112c87988f56
      Status = Running
      Composition ID = fullrack
      Type = ISO
      Previous Build = 3.0.2-b1300130
      Target Build = 3.0.2-b1300385
      Jobs 1 = ...
      Jobs 2 = ...
      Jobs 3 = ...
      Completed Components 1 = ...
      Completed Components 2 = ...
      Completed Components 3 = ...
      Pending Components 1 = ...
      Pending Components 2 = ...
      Pending Components 3 = ...
  3. Optionally, check the status of an individual upgrade job.

    Because the rack-wide health check is a multi-component process, there are multiple upgrade jobs associated with the upgrade request. You can filter for those jobs based on the request ID. Using the job ID, you can drill down into the details of each upgrade job.

    PCA-ADMIN> getUpgradeJobs requestId=UWS-ab2e72d9-d937-4afb-b6eb-112c87988f56
    Data:
      id                                Upgrade Request Id                         Command Name   Result
      --                                ------------------                         ------------   ------
      1111111111111-component-000000    UWS-ab2e72d9-d937-4afb-b6eb-112c87988f56   component      None
      1111111111111-component-000000    UWS-ab2e72d9-d937-4afb-b6eb-112c87988f56   component      Passed
      1111111111111-component-000000    UWS-ab2e72d9-d937-4afb-b6eb-112c87988f56   component      Passed
    
    PCA-ADMIN> getUpgradeJob upgradeJobId=1111111111111-component-000000
    Data:
      Upgrade Request Id = UWS-ab2e72d9-d937-4afb-b6eb-112c87988f56
      Composition Id = fullrack
      Name = component
      Pid = 000000
      Host = pcamn01
      Log File = /nfs/shared_storage/pca_upgrader/log/pca-upgrader_host_os_pcamn02_2025_01_21-21.22.11.log
      Arguments = {"component_names":null,"diagnostics":false,"display_task_plan":false,"dry_run_tasks":false,"expected_iso_checksum":null,"fail_halt":false,"fail_upgrade":null,"image_location":null,"online_upgrade":null,"precheck_status":false,"repo_config_override":null,"result_override":null,"task_time":0,"test_run":false,"upgrade":false,"upgrade_to":null,"user_uln_base_url":null,"verify_only":false,"host_ip":"100.96.2.34","log_level":null,"switch_type":null,"epld_image_location":null,"checksum":null,"composition_id":"fullrack","request_id":"UWS-ab2e72d9-d937-4afb-b6eb-112c87988f56","uln":null,"patch":"false"}
      Status = Passed
      Execution Time(sec) = 1736
    [...]

    The output of the getUpgradeJob command provides detailed information about the tasks performed during the health check procedure. It displays descriptions, time stamps, duration, and success or failure. Whenever a task fails, the command output indicates which check has failed. For in-depth troubleshooting you can search the log file at the location provided near the start of the command output.

  4. When all jobs associated with the request have completed, confirm that each job result is Passed. Resolve any issues reported by the health checks before you start the upgrade.
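For in-depth troubleshooting, the log file named in the getUpgradeJob output can be scanned for failed tasks. The sketch below assumes that failure markers such as ERROR or FAILED appear in the log lines; the exact log format is not a documented contract, so treat the pattern as a starting point.

```python
# Sketch: scan an upgrader log for lines that suggest a failed task.
# The log location comes from the 'Log File' field of getUpgradeJob;
# the ERROR/FAILED/Traceback markers are assumptions about the log
# format, not a documented contract.
import re

FAILURE_PATTERN = re.compile(r"\b(ERROR|FAILED|Traceback)\b")

def failed_lines(log_text: str) -> list[str]:
    """Return log lines that match a failure marker."""
    return [ln for ln in log_text.splitlines() if FAILURE_PATTERN.search(ln)]

sample_log = """\
2025-01-21 21:22:11 INFO  Task 'Check etcd cluster health' started
2025-01-21 21:22:14 INFO  Task 'Check etcd cluster health' PASSED
2025-01-21 21:22:15 ERROR Task 'Check mysql cluster health' FAILED
"""

for line in failed_lines(sample_log):
    print(line)
```

In practice you would read the file from the shared-storage path shown in the job details instead of the inline sample.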

If issues are detected, either from the fault log or the health checks, resolve them before the planned upgrade window to keep the actual system upgrade as smooth and short as possible.

Note that concurrent upgrade operations are not supported: an upgrade job must complete before a new one can be started.
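Because jobs must run serially, any script that drives a sequence of upgrade operations has to wait for the current request to finish before submitting the next one. The following is a minimal polling sketch; run_cli() is a hypothetical stand-in for however you reach the Service CLI (for example over SSH), not a real API, and the Status line it inspects follows the getUpgradeStatus output shown earlier.

```python
# Sketch: wait for an upgrade request to leave the 'Running' state
# before submitting the next job. run_cli() is a hypothetical
# stand-in for however you reach the Service CLI; it is not a real API.
import time

def wait_for_request(run_cli, request_id: str,
                     poll_seconds: int = 60, timeout_seconds: int = 14400) -> str:
    """Poll getUpgradeStatus until Status is no longer Running; return the final status."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        output = run_cli(f"getUpgradeStatus requestId={request_id}")
        for line in output.splitlines():
            if line.strip().startswith("Status ="):
                status = line.split("=", 1)[1].strip()
                if status != "Running":
                    return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"request {request_id} still running after {timeout_seconds}s")

# Demonstration with a stubbed CLI that reports Running once, then Passed.
_responses = iter(["  Status = Running", "  Status = Passed"])
result = wait_for_request(lambda cmd: next(_responses), "UWS-example", poll_seconds=0)
print(result)  # Passed
```

The timeout guards against a hung request; choose values that fit your maintenance window.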