3 Upgrading CNE
Note:
Upgrading from 23.3.x to 23.4.x causes additional reboots of the CNE nodes, as CNE 23.4.0 uses Oracle Linux 9 while CNE 23.3.x runs on Oracle Linux 8.
3.1 Supported Upgrade Paths
The following table lists the supported upgrade paths for CNE:
Table 3-1 Supported Upgrade Paths
| Source Release | Target Release | 
|---|---|
| 23.4.x (where x = 1,4) | 23.4.6 | 
| 23.3.x | 23.4.6 | 
Note:
CNE supports performing an OS update on an existing system that runs version 23.4.1 or above on Oracle Linux 9.
3.2 Prerequisites
- The user's central repository is updated with the latest versions of RPMs, binaries, and CNE Images for 23.4.x. For more information on how to update RPMs, binaries, and CNE images, see Artifact Acquisition and Hosting.
 - All Network Functions (NFs) are upgraded before performing a CNE upgrade. For more information about NF upgrade procedure, see Oracle Communications Cloud Native Core, Cloud Native Environment User Guide.
 - The CNE instance that is upgraded has at least the minimum recommended node counts for Kubernetes (that is, three master nodes and six worker nodes).
 
Note:
Currently, CNE doesn't support rollback in any instance, such as:
- encountering an error after initiating an upgrade
 - after a successful upgrade
 
Caution:
User, computer, application, and character encoding settings may cause issues when copying and pasting commands or any content from the PDF. The PDF reader version also affects the copy-paste functionality. It is recommended to verify the pasted content, especially when hyphens or other special characters are part of the copied content.
3.3 Common Services Release Information
The Kubernetes and common services release information is provided in the following files in the /var/occne/cluster/${OCCNE_CLUSTER}/artifacts directory:
- Kubernetes Release File: K8S_container_images.txt
 - Common Services Release File: CFG_container_images.txt
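To review the release contents, you can print these files from the Bastion Host. For example:
$ cat /var/occne/cluster/${OCCNE_CLUSTER}/artifacts/K8S_container_images.txt
$ cat /var/occne/cluster/${OCCNE_CLUSTER}/artifacts/CFG_container_images.txt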
3.4 Preupgrade Tasks
Before upgrading CNE, perform the tasks described in this section.
3.4.1 Saving CNE Customizations
Before upgrading a CNE instance, you must save all the customizations applied to the CNE instance so that you can reapply them after the upgrade is complete.
3.4.1.1 Preserving Grafana Dashboards
- Log in to Grafana GUI.
 - Select the dashboard to save.
 - Click Share Dashboard to save the dashboard.
                                 
Figure 3-1 Grafana Dashboard

 - Navigate to the Export tab and click Save to file to save the file in the local repository.
                                 
Figure 3-2 Saving the Dashboard in Local Repository

 - Repeat steps 1 to 4 until you save all the required customer-specific dashboards.
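Alternatively, dashboards can be saved in bulk through the Grafana HTTP API. The following is a minimal sketch, assuming placeholder values for the Grafana URL and an API token with viewer access, and the jq utility on the host where you run it:
$ GRAFANA_URL=<grafana_url>
$ GRAFANA_TOKEN=<grafana_api_token>
$ mkdir -p ~/grafana-dashboards
$ curl -s -H "Authorization: Bearer ${GRAFANA_TOKEN}" "${GRAFANA_URL}/api/search?type=dash-db" | jq -r '.[].uid' | \
  while read -r uid; do
    curl -s -H "Authorization: Bearer ${GRAFANA_TOKEN}" "${GRAFANA_URL}/api/dashboards/uid/${uid}" | jq '.dashboard' > ~/grafana-dashboards/${uid}.json
  done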
 
3.4.2 Performing Preupgrade Health Checks
Perform the following steps to ensure that the cluster is in a healthy state.
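As a quick overall check before the specific items below, confirm that all nodes are Ready and that no pods are in a failed state. The following is a minimal sketch; your site may require additional checks:
$ kubectl get nodes
$ kubectl get pods --all-namespaces | grep -Ev 'Running|Completed'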
Check drive space on Bastion Host
Before upgrading, ensure that there's sufficient drive space in the home directory of the user (usually admusr for Bare Metal and cloud-user for vCNE) where the upgrade runs.
Run the df -h command to verify that the /home directory has at least 3.5 GB and the /var directory has at least 10 GB of free space for temporary file gathering and for running the CNE containers during the upgrade procedure.
                        
If there is insufficient space, then free up some space. One common location to reclaim space is the podman image storage for local images. You can find the local images using the podman image ls command, and remove them by using the podman image rm -f <image> command. You can reclaim additional podman space by using the podman system prune -fa command to remove any unreferenced image layers.
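For reference, the space check and podman cleanup described above can be run as follows (replace <image> with the image to remove):
$ df -h /home /var
$ podman image ls
$ podman image rm -f <image>
$ podman system prune -fa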
                        
Check OpenSearch pods disk space
For OpenSearch pods that are using PVCs, check the available disk space and confirm that there is at least 1 GB of disk space available before running the upgrade:
kubectl -n occne-infra exec occne-opensearch-cluster-data-0 -c opensearch -- df -h /usr/share/opensearch/data
kubectl -n occne-infra exec occne-opensearch-cluster-data-1 -c opensearch -- df -h /usr/share/opensearch/data
kubectl -n occne-infra exec occne-opensearch-cluster-data-2 -c opensearch -- df -h /usr/share/opensearch/data
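Equivalently, you can check all three data pods in a loop (a minimal sketch using the same command):
$ for i in 0 1 2; do
    kubectl -n occne-infra exec occne-opensearch-cluster-data-${i} -c opensearch -- df -h /usr/share/opensearch/data
  done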
Sample output:
$ kubectl -n occne-infra exec occne-opensearch-cluster-data-0 -c opensearch -- df -h /usr/share/opensearch/data
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb        9.8G  5.3G  4.1G  57% /usr/share/opensearch/data
3.4.3 Checking Preupgrade Config Files
Check manual updates on the pod resources: Ensure that any manual updates made to the Kubernetes cluster configuration (such as deployments and daemonsets) after the initial deployment are configured in the proper occne.ini (vCNE) or hosts.ini (Bare Metal) file. For more information, see Preinstallation Tasks.
3.5 Standard Upgrade
This section describes the procedure to perform a full upgrade, OS update, or both on a given CNE deployment (Bare Metal or vCNE).
Note:
- This upgrade is only used to upgrade from release 23.3.x to release 23.4.x.
 - Ensure that you complete all the preupgrade procedures before performing the upgrade.
 - It is suggested to use a terminal emulator (such as tmux) when running this procedure so that the Bastion Host bash shell continues to run even in case of shell and VPN disconnections.
 - It is suggested to use a session capture program (such as script) on the Bastion Host to capture all input and output for diagnosing issues. This program must be rerun for each login.
 - Initiate the upgrade.sh script from the active Bastion Host. However, during most of the upgrade, there is no designated active Bastion Host as the system changes continuously. Therefore, ensure that you run the upgrade.sh script from the same Bastion Host that was used initially.
 - The upgrade procedure can take hours to complete, and the total time depends on the configuration of the cluster.
 - Before performing an upgrade or OS update, verify the health of the cluster and the services related to CNE.
 
WARNING:
Refrain from performing a controlled abort (Ctrl+C) on the upgrade while it is in progress. Allow the upgrade to exit gracefully from an error condition or after a successful completion.
Log Files for Debugging Upgrade
The system generates many log files during the upgrade or OS update process. All the log files are suffixed with a date and timestamp. These files are maintained in the /var/occne/cluster/<cluster_name>/upgrade/logs directory and can be removed after the upgrade or OS update completes successfully. For any issues encountered during the upgrade, these files must be collected into a tar file and made available to the next level of support for debugging.
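For example, the logs can be collected for support as follows (a sketch; replace <cluster_name> with your cluster name):
$ tar -czf ~/upgrade_logs_$(date +%Y%m%d_%H%M%S).tgz -C /var/occne/cluster/<cluster_name>/upgrade logs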
                     
3.5.1 Performing an Upgrade or OS Update
This section describes the procedure to perform an upgrade, an OS update, or both.
Note:
- The upgrade and OS update cause the current Bastion Host to reboot multiple times. Each time the upgrade.sh script terminates without indicating an error condition, rerun the upgrade.sh script using this procedure on the same Bastion Host after it reboots.
 - Currently, any procedure that applies to VMware and uses Terraform doesn't operate successfully and cannot be used. These procedures include the following:
      - Replacing a Failed vCNE LoadBalancer
      - Replacing a Failed Kubernetes Worker Node
      - Replacing a Failed Kubernetes Controller Node
 - OpenStack maintenance procedures that utilize Terraform in any step to create a new VM will not work properly if the cluster.tfvars image field is not updated to "ol9u2" after the upgrade completes. For more information about downloading Oracle Linux, see Downloading Oracle Linux.
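For reference, the image field can be updated after the upgrade as follows. This is a sketch; the cluster.tfvars path shown is an assumption based on the VMware steps later in this chapter, so verify the location and the exact field name in your deployment:
$ vi /var/occne/cluster/${OCCNE_CLUSTER}/${OCCNE_CLUSTER}/cluster.tfvars
image = "ol9u2"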
- Use SSH to log in to the active Bastion Host. You can determine the active Bastion Host by running the following command:
$ is_active_bastion
Sample output:
IS active-bastion
If you are rerunning the upgrade script after an error or termination, log in to the same Bastion Host that was used during the initial run.
 - Ensure that the naming format of the existing OLX .repo file in the /var/occne/yum.repos.d/ directory is <CENTRAL_REPO>-olx.repo, where x is the version number (for example, <CENTRAL_REPO>-ol8.repo).
 - Ensure that the new central repository OLX yum .repo file is present in the /var/occne/yum.repos.d/ directory and that the format of the file name is <CENTRAL_REPO>-olx.repo, where x is the version number (for example, <CENTRAL_REPO>-ol9.repo).
For example:
$ curl http://${CENTRAL_REPO}/<path_to_file>/${CENTRAL_REPO}-ol9.repo -o /var/occne/yum.repos.d/${CENTRAL_REPO}-ol9.repo
Note:
Ensure that the content of the new central repository OLX yum .repo file is correct.
 - Perform one of the following steps to initiate or resume the upgrade or the OS update:
                              
Note:
The upgrade runs an initial cluster test based on the current OCCNE_VERSION. If the initial upgrade cluster test fails, the upgrade.sh script terminates. However, at this point, the upgrade is not started. Therefore, after correcting the issues discovered, you can restart the upgrade using step a. This is not applicable to the usual expected exits in the upgrade.sh script.
- Run the following command to launch the upgrade script to perform an upgrade to the new version:
$ OCCNE_NEW_VERSION=<new_version_tag> upgrade.sh
 - Run the following command to launch the upgrade script to perform an OS update and for subsequent runs of both upgrade and OS update:
$ upgrade.sh
- When the upgrade process initiates a reboot of the hosting Bastion, the upgrade.sh script terminates gracefully with the following output, and the current Bastion is rebooted after a short period.
Sample output:
The current Bastion: ${OCCNE_HOSTNAME} will be going into reboot after configuration changes. Wait until the reboot completes and follow the documentation to continue the upgrade. This may take a number of minutes, longer on a major OS upgrade.
Once the Bastion recovers from the reboot, rerun upgrade.sh by running the command in Step 4b.
Note:
- Once the upgrade begins on each node (starting with the active Bastion Host), the login Banner in Shell is updated to reflect the following message. This is restored back to the original message when the upgrade completes. This Banner is not set in the current active LBVMs for vCNE when the Banner is set in the other nodes. This is because the current active LBVMs are not upgraded until the end of the procedure.
****************************************************************************
|                                                                          |
| Date/Time: 2024-01-16 14:56:31.612728                                    |
| OCCNE UPGRADE TO VERSION: 23.4.1 IN PROGRESS                             |
| Please discontinue login if not assisting in this maintenance activity.  |
|                                                                          |
****************************************************************************
 - In some cases, you may see an "Ansible FAILED!" assertion message similar to the following. This is an expected behavior where the system tries to return the control to Shell when CNE detects that a reboot will interrupt processing.
TASK [staged_reboot : Halt ansible for os_upgrade reboot on current bastion (or its kvm host). After reboot, reconnect to same bastion, and relaunch upgrade.sh] ***
fatal: [my-cluster-name-bastion-1]: FAILED! => {
    "assertion": false,
    "changed": false,
    "evaluated_to": false,
    "msg": "NOT AN ERROR: This is an EXPECTED assertion to flag self-reboot, and return shell control."
}

PLAY RECAP *********************************************************************
my-cluster-name-bastion-1 : ok=70 changed=24 unreachable=0 failed=1 skipped=194 rescued=1 ignored=0
 - By default, during an upgrade and OS update, the upgrade.sh script exits before rebooting the ACTIVE LBVMs and displays the following message:
Skipping active LBVMs reboot since OCCNE_REBOOT_ACTIVE_LB is not set. The active LBVMs must be manually rebooted and the upgrade.sh script be run again.
You must manually reboot or switch the activity of each ACTIVE LBVM such that it becomes the STANDBY LBVM. For the procedure to perform a manual switchover of LBVMs, see the Performing Manual Switchover of LBVMs During Upgrade section. When the switchover of the LBVMs completes successfully, rerun the upgrade.sh script.
- When the upgrade or OS update is complete, the system displays the following message:
Message format for CNE upgrade:
<date/time> - *********** Upgrade Complete ***********
For example:
12/12/2023 - December Tuesday 01:09:57 - *********** Upgrade Complete ***********
Message format for OS update:
<date/time> - *********** OS Update Complete ***********
3.6 Postupgrade Tasks
This section describes the postupgrade tasks for CNE.
3.6.1 Restoring CNE Customizations
This section provides information about restoring CNE customizations. Ensure that you restore all the customizations applied to the CNE instance after completing the upgrade process.
3.6.1.1 Restoring Grafana Dashboards
Perform the following steps to restore the Grafana dashboard:
- Load the previously installed Grafana dashboard.
 - Click the + icon on the left panel and select Import.
                              
Figure 3-3 Load Grafana Dashboard

 - Once in the new panel, click Upload JSON file and choose the locally saved dashboard file.
                              
Figure 3-4 Uploading the Dashboard

 - Repeat the same steps for all the dashboards saved from the older version.
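Alternatively, if the dashboards were saved as JSON files (for example, with the API sketch in the preupgrade section), they can be restored in bulk through the Grafana HTTP API. The following is a minimal sketch, assuming the same placeholder GRAFANA_URL and GRAFANA_TOKEN values and the jq utility:
$ for f in ~/grafana-dashboards/*.json; do
    jq '{dashboard: (. + {id: null}), overwrite: true}' "${f}" | \
      curl -s -X POST -H "Authorization: Bearer ${GRAFANA_TOKEN}" -H "Content-Type: application/json" -d @- "${GRAFANA_URL}/api/dashboards/db"
  done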
 
3.6.2 Verifying Terraform Files in VMware Deployments
This section provides details about verifying the content of the compute/main.tf and compute-lbvm/main.tf files in a VMware deployment after performing an upgrade.
- Update the Linux image template from OL8 to OL9 by changing the template_name variable in the /var/occne/cluster/${OCCNE_CLUSTER}/${OCCNE_CLUSTER}/cluster.tfvars file:
$ vi /var/occne/cluster/${OCCNE_CLUSTER}/${OCCNE_CLUSTER}/cluster.tfvars
The following example shows the template_name variable that must be updated:
template_name = "<name of the OL9 template>"
 - Verify the content of the compute/main.tf and compute-lbvm/main.tf files:
    - Run the following command to verify the content of the compute/main.tf file:
$ cat /var/occne/cluster/${OCCNE_CLUSTER}/modules/compute/main.tf | grep 'ignore_changes\|override_template_disk' -C 2
Ensure that the content of the file exactly matches the following content:
  }
  override_template_disk {
    bus_type   = "paravirtual"
    size_in_mb = var.disk
--
  lifecycle {
    ignore_changes = [
      vapp_template_id,
      template_name,
      catalog_name,
      override_template_disk
    ]
  }
--
  }
  override_template_disk {
    bus_type   = "paravirtual"
    size_in_mb = var.disk
--
  lifecycle {
    ignore_changes = [
      vapp_template_id,
      template_name,
      catalog_name,
      override_template_disk
    ]
  }
    - Run the following command to verify the content of the compute-lbvm/main.tf file:
$ cat /var/occne/cluster/${OCCNE_CLUSTER}/modules/compute-lbvm/main.tf | grep 'ignore_changes\|override_template_disk' -C 2
Ensure that the content of the file exactly matches the following content:
  }
  override_template_disk {
    bus_type   = "paravirtual"
    size_in_mb = var.disk
--
  lifecycle {
    ignore_changes = [
      vapp_template_id,
      template_name,
      catalog_name,
      override_template_disk
    ]
  }
--
  }
  override_template_disk {
    bus_type   = "paravirtual"
    size_in_mb = var.disk
--
  lifecycle {
    ignore_changes = [
      vapp_template_id,
      template_name,
      catalog_name,
      override_template_disk
    ]
  }
- If the files don't contain the ignore_changes argument, then edit the files and add the argument to each of the "vcd_vapp_vm" resources:
    - Run the following command to edit the compute/main.tf file:
$ vi /var/occne/cluster/${OCCNE_CLUSTER}/modules/compute/main.tf
    - Add the following content between each override_template_disk code block and the metadata = var.metadata line for each "vcd_vapp_vm" resource:
lifecycle {
  ignore_changes = [
    vapp_template_id,
    template_name,
    catalog_name,
    override_template_disk
  ]
}
    - Save the compute/main.tf file.
    - Run the following command to edit the compute-lbvm/main.tf file:
$ vi /var/occne/cluster/${OCCNE_CLUSTER}/modules/compute-lbvm/main.tf
    - Add the following content between each override_template_disk code block and the metadata = var.metadata line for each "vcd_vapp_vm" resource:
lifecycle {
  ignore_changes = [
    vapp_template_id,
    template_name,
    catalog_name,
    override_template_disk
  ]
}
    - Save the compute-lbvm/main.tf file.
    - Repeat step 2 to ensure that the content of the files matches the content provided in that step.
 
3.6.3 Activating Optional Features
This section provides information about activating optional features, such as Velero and Local DNS post upgrade.
3.6.3.1 Activating Velero Post Upgrade
This section provides information about the Velero activation procedure.
Velero is used for performing on-demand backups and restores of CNE cluster data. Velero is an optional feature and has an extra set of hardware and networking requirements. You can activate Velero after a CNE installation or upgrade. For more information about activating Velero, see Activating Velero.
3.6.3.2 Activating Local DNS
This section provides information about activating Local DNS post upgrade.
The Local DNS feature is a reconfiguration of core DNS (CoreDNS) to support external hostname resolution. When Local DNS is enabled, CNE routes the connection to external hosts through core DNS rather than the nameservers on the Bastion Hosts. For information about activating this feature post upgrade, see the "Activating Local DNS" section in Oracle Communications Cloud Native Core, Cloud Native Environment User Guide.
To stop DNS forwarding to the Bastion DNS, you must define the DNS details through 'A' records and SRV records. A records and SRV records are added to the CNE cluster using Local DNS API calls. For more information about adding and deleting DNS records, see the "Adding and Removing DNS Records" section in Oracle Communications Cloud Native Core, Cloud Native Environment User Guide.
3.6.4 Updating Port Name for servicemonitors and podmonitors
The metric port name on which Prometheus extracts metrics from 5G-CNC applications must be updated to "cnc-metrics".
- Run the following command to get the servicemonitor details:
$ kubectl get servicemonitor -n occne-infra
Sample output:
NAME                          AGE
occne-nf-cnc-servicemonitor   60m
 - Run the following command to update the port name for the servicemonitor:
$ kubectl edit servicemonitor occne-nf-cnc-servicemonitor -n occne-infra
Edit the servicemonitor and update the following port name by removing the "http-" prefix:
existing port name - port: http-cnc-metrics
updated port name  - port: cnc-metrics
 - Save the changes for the servicemonitor.
- Run the following command to get the podmonitor details:
$ kubectl get podmonitor -n occne-infra
Sample output:
NAME                      AGE
occne-nf-cnc-podmonitor   60m
 - Run the following command to update the port name for the podmonitor:
$ kubectl edit podmonitor occne-nf-cnc-podmonitor -n occne-infra
Edit the podmonitor and update the following port name by removing the "http-" prefix:
existing port name - port: http-cnc-metrics
updated port name  - port: cnc-metrics
 - Save the changes for the podmonitor.