Upgrade

Upgrading components of Private Cloud Appliance is the responsibility of the appliance administrator. The system provides a framework to verify the state of the appliance prior to an upgrade, and to execute an upgrade procedure as a workflow that initiates each individual task and tracks its progress until it completes. Thanks to built-in redundancy at all system levels, the appliance components can be upgraded without service interruptions to the operational environment.

The source content for upgrades – packages, archives, deployment charts and so on – is delivered through an ISO image. During the preparation of the upgrade environment, the ISO image is downloaded to shared storage, and the upgrader itself is upgraded to the latest version included in the ISO. At this stage the appliance is ready for upgrades to be applied.

The administrator can perform an upgrade either through the Service Web UI or the Service CLI, and must select one of two available options: individual component upgrade or full management node cluster upgrade.

Upgrade Plan and History

During the preparation stages of the upgrade environment, the metadata file with the target component versions is compared to the current installation, and the result of this comparison is recorded in a detailed upgrade plan. For each item, the plan lists the current and target versions and builds, and whether an upgrade is required. If an item needs upgrading, the plan also indicates which infrastructure components are impacted, whether a reboot is required, and how long the upgrade takes.

Every reboot can affect the performance and availability of the appliance. For minimum risk and optimal use of time, the logic that populates the upgrade plan includes a detailed evaluation of the new packages. Many packages can be upgraded without a reboot, or require only a service restart. The reboot flag in the upgrade plan is set only if a component absolutely must be rebooted.
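
The sketch below illustrates how such a plan entry could be populated. It is a minimal illustration, not the appliance's actual code; all names and fields are hypothetical.

    from dataclasses import dataclass

    @dataclass
    class PlanEntry:
        component: str
        current_version: str
        target_version: str
        requires_upgrade: bool
        requires_reboot: bool    # set only when a reboot is unavoidable
        estimated_minutes: int

    def build_plan_entry(component, installed, target):
        # compare the installed state against the target metadata for one item
        needs_upgrade = installed["version"] != target["version"]
        return PlanEntry(
            component=component,
            current_version=installed["version"],
            target_version=target["version"],
            requires_upgrade=needs_upgrade,
            # stays False when the new packages need no reboot at all,
            # or only a service restart
            requires_reboot=needs_upgrade and target.get("reboot", False),
            estimated_minutes=target.get("minutes", 0) if needs_upgrade else 0,
        )

    entry = build_plan_entry("kubernetes",
                             {"version": "1.25.3"},
                             {"version": "1.26.1", "reboot": False, "minutes": 90})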

The upgrade plan serves as a checklist that drives all upgrade procedures in a prescribed order. An upgrade job is started for every upgrade command, but if the plan indicates that a component is already at the required version, no further action is taken and the upgrade is skipped. However, a same-version upgrade can be forced if necessary.
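
Continuing the sketch above, the gate that every upgrade command passes through could look like this: a job is always recorded, but work is only done when the plan calls for it or a same-version upgrade is forced.

    def run_upgrade(entry, force=False):
        job = {"component": entry.component, "status": "started"}
        if not entry.requires_upgrade and not force:
            job["status"] = "skipped: already at target version"
            return job
        # ... execute the actual upgrade steps here ...
        job["status"] = "completed"
        return job

    print(run_upgrade(entry))  # reuses the entry built in the previous sketch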

As the upgrade operations progress, the upgrade plan is updated to reflect the current status, so that an administrator can check at any time which components have already been upgraded or still need to be upgraded. When current and target versions are identical for each item in the upgrade plan, the entire system is up-to-date.

For the purpose of long-term tracking, metadata files and upgrade plans are saved, while detailed information about all component upgrades and patches is captured in upgrade jobs. The upgrade history feature allows an administrator to drill down into the upgrade and patching activity on the appliance. The upgrade history presents all upgrade and patch information in a categorized way so you can see which version upgrades have been performed, which jobs have been run for each of those upgrades, and from which source (ISO upgrade or ULN patch). Details include build versions, component versions before and after, job completion, success or failure, time stamps, and duration.
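
As a rough illustration of how such a history could be organized (the field names are assumptions, not the actual record format):

    from collections import defaultdict

    def upgrade_history(jobs):
        # group saved jobs by target version and source (ISO upgrade or ULN patch)
        history = defaultdict(list)
        for job in jobs:
            history[(job["target_version"], job["source"])].append(job)
        return history

    jobs = [{"target_version": "3.0.2", "source": "ISO", "component": "kubernetes",
             "result": "success", "started": "2024-01-10T02:00Z", "duration_min": 90}]
    for (version, source), entries in upgrade_history(jobs).items():
        print(version, source, [(j["component"], j["result"]) for j in entries])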

Pre-Checks

All upgrade operations are preceded by a verification process to ensure that system components are in the correct state to be upgraded. For these pre-checks, the upgrade mechanism relies on the platform-level health checks. Even though health checks are executed continually for monitoring purposes, they must be run specifically before an upgrade operation. The administrator is not required to run the pre-checks manually; they are executed by the upgrade code when an upgrade command is entered. All checks must pass for an upgrade to be allowed to start, even if a single-component upgrade is selected.
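
Conceptually, the gate works like the following sketch, with stand-in checks in place of the real platform health checks:

    def pre_checks_pass(health_checks):
        # run every check; a single failure blocks the upgrade from starting
        results = {check.__name__: check() for check in health_checks}
        failed = [name for name, ok in results.items() if not ok]
        if failed:
            raise RuntimeError(f"upgrade blocked, failing pre-checks: {failed}")
        return True

    # stand-ins for the real platform-level health checks
    def storage_healthy(): return True
    def cluster_has_quorum(): return True

    pre_checks_pass([storage_healthy, cluster_has_quorum])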

Prerequisite software versions are enforced in releases from January 2024 onward. During the upgrade or patch preparations, the Upgrader service validates the currently installed appliance software version against the new target version. If the appliance is not running at least the minimum required version, the workflow rolls back the environment to its previous state. You must first install the minimum required version before proceeding with the intended target version.
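
A minimal sketch of that validation, assuming simple dotted version strings (the real comparison logic is internal to the Upgrader service):

    def meets_minimum(installed: str, minimum: str) -> bool:
        as_tuple = lambda v: tuple(int(part) for part in v.split("."))
        return as_tuple(installed) >= as_tuple(minimum)

    if not meets_minimum("3.0.1", "3.0.2"):
        # the workflow undoes its preparations rather than attempt
        # an unsupported version jump
        print("below minimum required version: rolling back, install 3.0.2 first")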

Certain upgrade procedures require that the administrator first set a provisioning lock and a maintenance lock. While the locks are active, no provisioning operations or other conflicting activity can occur, which protects the upgrade process against potential disruptions. Once the upgrade has completed, the maintenance and provisioning locks must be released so the system returns to full operational mode.
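
The acquire-and-release discipline resembles a context manager, as in this hypothetical sketch:

    from contextlib import contextmanager

    @contextmanager
    def maintenance_window(locks):
        for lock in locks:
            lock["held"] = True      # blocks provisioning and conflicting activity
        try:
            yield
        finally:
            for lock in locks:
                lock["held"] = False  # return the system to full operational mode

    locks = [{"name": "provisioning", "held": False},
             {"name": "maintenance", "held": False}]
    with maintenance_window(locks):
        pass  # run the upgrade procedure here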

Single Component Upgrade

Private Cloud Appliance upgrades are designed to be modular, allowing individual components to be upgraded rather than the entire system at once. With single component upgrade, the following component options are available:

  • ILOM firmware

    Use this option to upgrade the Oracle Integrated Lights Out Manager (ILOM) firmware of a specific server within the appliance. After the firmware is upgraded successfully, the ILOM is automatically rebooted. However, the administrator must manually restart the server for all changes to take effect.

  • Switch firmware

    Use this option to upgrade the operating software of the switches. You must specify which switch category to upgrade: the leaf switches, the spine switches, or the management switch.

  • ZFS Storage Appliance firmware

    Use this option to upgrade the operating software on the ZFS Storage Appliance. Both controllers, which operate in an active-active cluster configuration, are upgraded as part of the same process.

  • Host operating system

    Use this option to upgrade the Oracle Linux operating system on a management node. It triggers a yum upgrade on the selected management node, and is configured to use a yum repository populated through the ISO image.

  • Clustered MySQL database

    Use this option to upgrade the MySQL database on all management nodes. The database installation is rpm-based and thus relies on the yum repository that is populated through the ISO image. The packages for the database are deliberately kept out of the host operating system upgrade, because the timing of the database upgrade is critical. The database upgrade workflow manages the backup operations and the cluster state, and stops and restarts the relevant services. It ensures all the steps are performed in the correct order on each management node.
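
    As a simplified sketch of that ordering (node names and steps are illustrative, not the actual workflow code):

      class Node:
          def __init__(self, name): self.name = name
          def run(self, step): print(f"{self.name}: {step}")

      def upgrade_mysql_node(node):
          # the order of these steps is fixed on every node
          for step in ("back up database", "stop database services",
                       "yum upgrade of MySQL packages (ISO-populated repository)",
                       "restart database services", "verify cluster state"):
              node.run(step)

      for node in (Node("mn1"), Node("mn2"), Node("mn3")):  # one node at a time
          upgrade_mysql_node(node)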

  • Kubernetes cluster

    Use this option to upgrade the Kubernetes cluster, which is the container orchestration environment where services are deployed. The Kubernetes cluster runs on all the management nodes and compute nodes; its upgrade involves three major operations:

    • Upgrading the Kubernetes packages and all dependencies: kubeadm, kubelet, kubectl and so on.

    • Upgrading the Kubernetes container images: kube-apiserver, kube-controller-manager, kube-proxy and so on.

    • Updating any service YAML manifest files that use deprecated Kubernetes APIs.
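
    A condensed sketch of the three phases (node names are placeholders; the real workflow is internal):

      def upgrade_kubernetes(nodes):
          # phases 1 and 2 touch every node; phase 3 is a cluster-wide pass
          for node in nodes:
              print(f"{node}: upgrade kubeadm, kubelet, kubectl and dependencies")
          for node in nodes:
              print(f"{node}: pull kube-apiserver, kube-controller-manager, "
                    "kube-proxy images")
          print("cluster: update manifests that use deprecated APIs")

      upgrade_kubernetes(["mn1", "mn2", "mn3", "cn1", "cn2"])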

  • Secret service

    The process to upgrade the secret service on all management nodes consists of two parts: a rolling upgrade of the etcd key value store, followed by a rolling upgrade of the Vault secrets manager. Using the new image files made available in the podman registry, the two components are upgraded independently of each other, but in a mandatory order: etcd first, then Vault.
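
    The ordering constraint could be expressed as in this sketch (instance names are placeholders):

      def rolling_upgrade(service, instances):
          for instance in instances:  # one instance at a time preserves quorum
              print(f"{service} on {instance}: pull new image from podman "
                    "registry, restart")

      for service in ("etcd", "vault"):  # mandatory order: etcd first, then Vault
          rolling_upgrade(service, ("mn1", "mn2", "mn3"))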

  • Platform services

    Use this option to upgrade the containerized services running within the Kubernetes cluster on the management nodes. The service upgrade mechanism is based on Helm, the Kubernetes equivalent of a package manager. For services that need to be upgraded, new container images and Helm deployment charts are delivered through an ISO image and uploaded to the internal registry. None of the operations up to this point have an effect on the operational environment.

    At the platform level, an upgrade is triggered by restarting the pods that run the services. The new deployment charts are detected, causing the pods to retrieve the new container image when they restart. If a problem is found, a service can be rolled back to the previous working version of the image.

    Note:

    In specific circumstances it is possible to upgrade certain platform services individually, by adding an optional JSON string to the command. This option should not be used unless Oracle provides explicit instructions to do so.
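
    A conceptual sketch of the restart-driven rollout and rollback described above (the dictionaries stand in for Helm releases; this is not the actual mechanism's code):

      def restart_with_new_chart(service):
          # the restarted pods detect the new chart and pull the new image
          service["previous"] = (service["revision"], service["image"])
          service["revision"] += 1
          service["image"] = service["target_image"]

      def rollback(service):
          # restore the previous working revision if a problem is found
          service["revision"], service["image"] = service["previous"]

      svc = {"name": "api", "revision": 3, "image": "api:1.0",
             "target_image": "api:1.1"}
      restart_with_new_chart(svc)
      rollback(svc)   # only when health checks fail after the restart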

  • Compute node

    Use this option to perform a yum upgrade of the Oracle Linux operating system on a compute node. Upgrades include the ovm-agent package, which contains appliance-specific code to optimize virtual machine operations and hypervisor functionality. You must upgrade the compute nodes one by one; there can be no concurrent upgrade operations.
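
    The one-at-a-time constraint behaves like an exclusive lock, as in this illustrative sketch:

      import threading

      _upgrade_lock = threading.Lock()  # stand-in for the framework's job serialization

      def upgrade_compute_node(name):
          if not _upgrade_lock.acquire(blocking=False):
              raise RuntimeError("another compute node upgrade is already running")
          try:
              print(f"{name}: yum upgrade of Oracle Linux, including ovm-agent")
          finally:
              _upgrade_lock.release()

      for cn in ("cn1", "cn2", "cn3"):  # strictly sequential
          upgrade_compute_node(cn)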

Full Management Node Cluster Upgrade

Upgrades of individual components are largely self-contained. The full management node cluster upgrade integrates a number of those component upgrades into a global workflow that executes the component upgrades in a predefined order. With a single command, all three management nodes in the cluster are upgraded sequentially and component by component. This means an upgrade of a given component is executed on each of the three management nodes before the global workflow moves to the next component upgrade.

The order in which components are upgraded is predefined because of dependencies, and must not be changed. During the full management node cluster upgrade, the following components are upgraded:

  1. Management node host operating system

  2. Clustered MySQL database

  3. Secret service (etcd and Vault)

  4. Kubernetes container orchestration packages

  5. Platform layer containerized microservices

  6. Oracle Cloud Infrastructure images
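
The global workflow amounts to a nested loop over components and nodes, as in this simplified sketch:

    COMPONENTS = (
        "host operating system",
        "clustered MySQL database",
        "secret service (etcd and Vault)",
        "Kubernetes packages",
        "platform microservices",
        "Oracle Cloud Infrastructure images",
    )

    def full_cluster_upgrade(nodes=("mn1", "mn2", "mn3")):
        for component in COMPONENTS:   # fixed order, dictated by dependencies
            for node in nodes:         # all three nodes before the next component
                print(f"{node}: upgrade {component}")

    full_cluster_upgrade()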