Upgrade
Upgrading components of Private Cloud Appliance is the responsibility of the appliance administrator. The system provides a framework to verify the state of the appliance prior to an upgrade, and to execute an upgrade procedure as a workflow that initiates each individual task and tracks its progress until it completes. Thanks to built-in redundancy at all system levels, the appliance components can be upgraded without service interruptions to the operational environment.
The source content for upgrades – packages, archives, deployment charts and so on – is delivered through ISO images. During the preparation of the upgrade environment, the ISO images are downloaded to shared storage, and the upgrader itself is upgraded to the latest version included in the ISO content. At this stage the appliance is ready for upgrades to be applied.
The administrator can perform an upgrade either through the Service Web UI or the Service CLI. In general, a full rack upgrade is performed because it conveniently integrates all steps into a single workflow. In certain situations it might be advisable to upgrade in phases, work through components in groups, or even target an individual component. Those options are also available.
Upgrade Plan and History
During the preparation stages of the upgrade environment, the metadata file with the target component versions is compared to the current installation, and the result of this comparison is recorded in a detailed upgrade plan. For each item, the plan contains the current and target versions and builds, and indicates whether an upgrade is required. If an item needs upgrading, the plan indicates which infrastructure components are impacted, whether a reboot is required, and how much time the upgrade is expected to take.
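The comparison that populates the plan can be pictured with a minimal sketch. This is not the Upgrader's actual code or data model: the names PlanItem and build_plan, the component keys, and the version strings are all hypothetical, and the sketch leaves out builds, impacted components, the reboot flag, and time estimates.

```python
from dataclasses import dataclass


@dataclass
class PlanItem:
    # Hypothetical, simplified plan entry (not the appliance's real schema).
    component: str
    current_version: str
    target_version: str
    upgrade_required: bool


def build_plan(installed, target):
    """Compare target versions (from the metadata file) with installed versions."""
    plan = []
    for component, target_version in target.items():
        # Assumption: a component missing from the installation needs an upgrade.
        current_version = installed.get(component, "not installed")
        plan.append(
            PlanItem(
                component=component,
                current_version=current_version,
                target_version=target_version,
                upgrade_required=current_version != target_version,
            )
        )
    return plan


# Hypothetical version data, for illustration only.
installed = {"host_os": "1.0", "kubernetes": "2.0"}
target = {"host_os": "1.1", "kubernetes": "2.0"}

for item in build_plan(installed, target):
    print(item.component, item.current_version, "->", item.target_version,
          "upgrade required:", item.upgrade_required)
```

In this sketch only host_os is flagged, because its installed and target versions differ.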
Every reboot can affect the performance and availability of the appliance. For minimum risk and optimal use of time, the logic to populate the upgrade plan includes detailed evaluation of new packages. Many packages can be upgraded without a reboot or only require a service restart. The reboot flag in the upgrade plan is set only if a component absolutely must be rebooted.
The upgrade plan serves as a checklist that drives all upgrade procedures in a prescribed order. An upgrade job is started for every upgrade command, but if the plan indicates that a component is already at the required version, no further action is taken and the upgrade is skipped. However, a same-version upgrade can be forced if necessary.
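The skip-unless-forced behavior can be illustrated as follows. This is a sketch under stated assumptions, not the appliance's implementation: run_plan, the tuple layout of a plan entry, and the force parameter are hypothetical names chosen for this example.

```python
def run_plan(plan, upgrade, force=False):
    """Walk the plan in order; skip items already at the target version unless forced."""
    outcomes = []
    for component, current_version, target_version in plan:
        if current_version == target_version and not force:
            # Already at the required version: no further action is taken.
            outcomes.append((component, "skipped"))
            continue
        upgrade(component, target_version)
        outcomes.append((component, "upgraded"))
    return outcomes


# Hypothetical plan entries: (component, current version, target version).
plan = [("host_os", "1.0", "1.1"), ("kubernetes", "2.0", "2.0")]
performed = []

print(run_plan(plan, lambda name, version: performed.append(name)))
print(run_plan(plan, lambda name, version: performed.append(name), force=True))
```

The first call skips kubernetes because its versions match. The second call, with force=True, runs the upgrade callback for both entries.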
As the upgrade operations progress, the upgrade plan is updated to reflect the current status, so that an administrator can check at any time which components have already been upgraded or still need to be upgraded. When current and target versions are identical for each item in the upgrade plan, the entire system is up to date.
At any time during or after an upgrade, the administrator can use the job framework to verify the status of requests and jobs. An upgrade command corresponds to an upgrade request, which triggers a workflow that might contain several upgrade jobs. Jobs, in turn, consist of a series of tasks. Details about all these elements can be retrieved using either the Service Web UI or the Service CLI. While the upgrade plan shows the status at a point in time, the job framework lets an administrator view status and history by drilling down into the details of Upgrader activity, which also makes troubleshooting easier.
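The request, job, and task hierarchy can be modeled as nested records whose status is derived from their children. The sketch below is illustrative only: the class names, the status values ("pending", "in_progress", "succeeded", "failed"), and the roll-up rule are assumptions, not the job framework's actual schema.

```python
from dataclasses import dataclass, field


def roll_up(statuses):
    """Summarize child statuses into one parent status (assumed rule)."""
    statuses = list(statuses)
    if "failed" in statuses:
        return "failed"
    if statuses and all(s == "succeeded" for s in statuses):
        return "succeeded"
    if all(s == "pending" for s in statuses):
        return "pending"
    return "in_progress"


@dataclass
class Task:
    name: str
    status: str = "pending"


@dataclass
class Job:
    name: str
    tasks: list = field(default_factory=list)

    @property
    def status(self):
        return roll_up(task.status for task in self.tasks)


@dataclass
class Request:
    command: str
    jobs: list = field(default_factory=list)

    @property
    def status(self):
        return roll_up(job.status for job in self.jobs)


# Hypothetical request with two jobs, for illustration only.
request = Request("example upgrade command", [
    Job("job-1", [Task("stage", "succeeded"), Task("apply", "succeeded")]),
    Job("job-2", [Task("stage", "succeeded"), Task("apply", "in_progress")]),
])

print(request.command, request.status)
for job in request.jobs:
    print(" ", job.name, job.status, [(t.name, t.status) for t in job.tasks])
```

Drilling down from the request to its jobs and tasks mirrors how an administrator would narrow a problem to the task that failed.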
For the purpose of long-term tracking, metadata files and upgrade plans are saved, while detailed information about all component upgrades and patches is captured in upgrade jobs. The upgrade history feature allows an administrator to drill down into the upgrade and patching activity on the appliance. The upgrade history presents all upgrade and patch information in a categorized way so you can see which version upgrades have been performed, which jobs have been run for each of those upgrades, and from which source (ISO upgrade or ULN patch). Details include build versions, component versions before and after, job completion, success or failure, time stamps, and duration.
Pre-Checks
All upgrade operations are preceded by a verification process to ensure that system components are in the correct state to be upgraded. For these pre-checks, the upgrade mechanism relies on the platform-level health checks. Even though health checks are executed continually for monitoring purposes, they must be run specifically before an upgrade operation. The administrator is not required to run the pre-checks manually; they are executed by the upgrade code when an upgrade command is entered. All checks must pass for an upgrade to be allowed to start, even if a single-component upgrade is selected.
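The "all checks must pass" gate can be sketched as below. The check names and the PrecheckError type are hypothetical. The platform's real health checks are far more extensive, and the upgrade code runs them automatically.

```python
class PrecheckError(Exception):
    """Raised when one or more pre-checks fail (hypothetical error type)."""


def upgrade_with_prechecks(checks, upgrade):
    """Run every check; call `upgrade` only if all of them pass."""
    failed = [name for name, check in checks.items() if not check()]
    if failed:
        raise PrecheckError("pre-checks failed: " + ", ".join(failed))
    return upgrade()


# Hypothetical check names; real health checks are platform-specific.
checks = {
    "storage_healthy": lambda: True,
    "nodes_reachable": lambda: False,
}

try:
    upgrade_with_prechecks(checks, lambda: "upgrade started")
except PrecheckError as error:
    print(error)
```

Because one check fails, the upgrade callback is never invoked. In this sketch that holds regardless of whether the callback targets the full rack or a single component.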
Prerequisite software versions are enforced in releases from January 2024 onward. During the upgrade or patch preparations, the Upgrader service validates the currently installed appliance software version against the new target version. If the appliance is not running at least the minimum required version, the workflow rolls back the environment to its previous state. You must first install the minimum required version before proceeding with the intended target version.
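A minimum-version check of this kind comes down to comparing versions numerically rather than as strings. The sketch below assumes purely dotted numeric versions. The function names and version numbers are hypothetical, and real appliance versions might carry build identifiers that need additional parsing.

```python
def parse_version(version):
    """Turn a dotted numeric string such as '1.2.10' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def check_prerequisite(installed, minimum_required):
    """Return True if the installed version satisfies the minimum required version."""
    return parse_version(installed) >= parse_version(minimum_required)


# Hypothetical version numbers, for illustration only.
print(check_prerequisite("1.2.10", "1.2.9"))
print(check_prerequisite("1.2.2", "1.2.9"))
```

The first call returns True, even though "1.2.10" sorts before "1.2.9" as a string, which is why the versions are parsed into integer tuples. The second call returns False, the case in which the minimum required version would have to be installed first.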
Certain upgrade procedures require that the administrator first sets a provisioning lock and maintenance lock. While the locks are active, no provisioning operations or other conflicting activity can occur, meaning the upgrade process is protected against potential disruptions. After the upgrade has completed, the maintenance and provisioning locks must be released so the system returns to full operational mode.
In large composite workflows, like the full rack upgrade and the upgrade of all compute nodes, locks are applied and released programmatically. Administrators do not need to lock and unlock components manually, but they must monitor the upgrade workflow for potential interruptions, and take corrective action if issues occur with node locking, compute instance migrations, and so on.
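Applying and releasing locks programmatically resembles a lock/unlock pair wrapped around the upgrade step, with the release guaranteed even if the step fails. The sketch below only records events in a list. It does not call any appliance API, and the function name, node name, and event tuples are hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def upgrade_locks(node, events):
    """Hold a provisioning lock and a maintenance lock for the duration of the block."""
    events.append(("lock", "provisioning", node))
    events.append(("lock", "maintenance", node))
    try:
        yield
    finally:
        # Always release the locks, even if the upgrade step raises an exception.
        events.append(("unlock", "maintenance", node))
        events.append(("unlock", "provisioning", node))


# Hypothetical node name, for illustration only.
events = []
with upgrade_locks("node-1", events):
    events.append(("upgrade", "node-1"))

for event in events:
    print(event)
```

The try/finally block reflects the requirement that locks are released after the upgrade so the system returns to full operational mode. In a real workflow, a failure would still call for the administrator's attention rather than a silent unlock.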
Full Rack Upgrade
The Upgrader Workflow Service (UWS) adds an orchestration layer to the upgrade process. It brings the modular component upgrades together in an integrated workflow, allowing the administrator to start an upgrade of the entire appliance with a single command. The upgrade logic, meaning component selection, order, and timing, is determined by the upgrade plan. The workflow manages the operations required to complete the upgrade plan by generating requests and monitoring the results of the associated jobs.
The workflow-based approach is introduced in the appliance software version that migrates the full management cluster to Oracle Linux 8. The full rack upgrade is the preferred way to upgrade Private Cloud Appliance with all releases from April 2025 onward. In case of unforeseen orchestration problems or specific component requirements, Oracle will provide guidance on how to upgrade components in groups or individually.
Component order is determined by the upgrade plan for the specific release. The rack upgrade workflow includes these components:
-
ZFS Storage Appliance firmware (including ILOMs of storage controllers)
-
Compute nodes
-
Management nodes
-
Host operating system of the three management nodes
-
MySQL cluster database
-
Secret service (including Etcd and Vault)
-
Kubernetes container orchestration packages (platform layer)
-
Containerized microservices
-
Oracle Cloud Infrastructure images
-
ILOM firmware (all nodes)
-
Switch firmware (all switches)
Component Upgrades
Private Cloud Appliance upgrades are designed to be modular, enabling individual components to be upgraded without service interruptions in the operational environment. Oracle strongly recommends using the full rack upgrade workflow, but procedures for subsets and individual components are available for particular scenarios.
-
ZFS Storage Appliance firmware
Use this option to upgrade the operating software on the ZFS Storage Appliance. Both controllers, which operate in an active-active cluster configuration, are upgraded as part of the same process. If a new ILOM firmware is available, it is also upgraded on both controllers in the process.
-
Compute node
Use this option to install the latest Oracle Linux kernel and user space packages, as well as appliance-specific tools and optimizations. You can upgrade a single node by providing its host IP address, or run a workflow that upgrades all compute nodes installed in the appliance. There can be no concurrent upgrades; the nodes are locked and upgraded one by one.
-
Management cluster
Use this option to upgrade the host OS and the services running on all three management nodes, including the clustered MySQL database and Secret service (etcd/Vault). The full management node cluster upgrade integrates multiple component upgrades into a global workflow, so all jobs are performed in a predefined order. With a single command, all three management nodes in the cluster are upgraded sequentially and component by component. This means an upgrade of a given component is performed on each of the three management nodes before the global workflow moves to the next component upgrade.
-
Host operating system
Use this option to upgrade the Oracle Linux operating system on a management node. You can upgrade the host OS of a single node by providing its host IP address, or run a workflow that upgrades the host OS sequentially on all three management nodes.
-
Kubernetes cluster
Use this option to upgrade the Kubernetes cluster, which is the container orchestration environment where services are deployed. The Kubernetes cluster runs on all the management nodes and compute nodes; its upgrade involves three major operations:
-
Upgrading the Kubernetes packages and all dependencies: kubeadm, kubelet, kubectl and so on.
-
Upgrading the Kubernetes container images: kube-apiserver, kube-controller-manager, kube-proxy and so on.
-
Updating any deprecated Kubernetes APIs and services YAML manifest files.
-
Platform services
Use this option to upgrade the containerized services running within the Kubernetes cluster on the management nodes. The service upgrade mechanism is based on Helm, the Kubernetes equivalent of a package manager. For services that need to be upgraded, new container images and Helm deployment charts are delivered through an ISO image and uploaded to the internal registry. None of the operations up to this point have an effect on the operational environment.
At the platform level, an upgrade is triggered by restarting the pods that run the services. The new deployment charts are detected, causing the pods to retrieve the new container image when they restart. If a problem is found, a service can be rolled back to the previous working version of the image.
Note:
In specific circumstances it is possible to upgrade certain platform services individually, by adding an optional JSON string to the command. This option should not be used unless Oracle provides explicit instructions to do so.
-
ILOM firmware
Use this option to upgrade the Oracle Integrated Lights Out Manager (ILOM) firmware of a component installed in the appliance. You can upgrade a single ILOM by providing its IP address, or run a workflow that upgrades all ILOMs of all appliance components. After the firmware is upgraded successfully, the ILOM is automatically rebooted. However, the administrator must manually restart the server for all changes to take effect.
-
Switch firmware
Use this option to upgrade the operating software of the switches. You can specify a switch category to upgrade: the leaf switches, the spine switches, or the management switch. You can also run a workflow that upgrades all switches installed in the appliance.
-
Oracle-provided images
Use this option to add the latest available Oracle Cloud Infrastructure Images for Private Cloud Appliance. Existing images that are no longer needed must be removed manually.
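The management cluster description above states that a given component is upgraded on each of the three management nodes before the workflow moves to the next component. That ordering can be sketched as a nested iteration. The function name, component names, and node names below are hypothetical, and the real order is determined by the upgrade plan.

```python
def management_upgrade_order(components, nodes):
    """Return (component, node) steps: finish one component on all nodes before the next."""
    return [(component, node) for component in components for node in nodes]


# Hypothetical component and node names, for illustration only.
components = ["host_os", "mysql_cluster", "secret_service"]
nodes = ["mn1", "mn2", "mn3"]

for step in management_upgrade_order(components, nodes):
    print(step)
```

The outer loop runs over components and the inner loop over nodes, so all three host_os steps are listed before any mysql_cluster step.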