11 Upgrading the UIM Cloud Native Environment
This chapter describes the tasks you perform in order to apply a change or upgrade to a component in the cloud native environment.
Creating a detailed upgrade plan can be a complex process. It is useful to start by mapping your use case to an upgrade path. These upgrade paths identify a set of sequenced activities that align to a CD stage. Once you know the activity sequence, you can then look for the detailed steps involved in each to come up with the comprehensive set of steps to be performed.
Upgrade paths consist of activities that fall into the following two main categories:
- Operational Procedures
- Component Upgrade Procedures
Operational Procedures
There are many different operational procedures and all of these affect the operating state of UIM. UIM cloud native provides the mechanism to change the operational state as described in "Running Operational Procedures".
The flowcharts in this chapter use the following image to depict an operational procedure:
Component Upgrade Procedures
These are the actual set of steps to perform a component upgrade and can be one of the following types:
- UIM Cloud Native Procedures: UIM cloud native owns the component and therefore the upgrade procedure for that component. UIM cloud native provides the mechanism to perform the upgrade via the scripts that are bundled with the UIM cloud native toolkit.
An example of this is a change to a value in a UIM cloud native specification file (shape, project, and instance).
The flowcharts in this chapter use the following image to depict a UIM cloud native owned procedure.
- External Procedures: These procedures are for components that are part of the UIM cloud native operating environment, but are out of the control of UIM cloud native. UIM cloud native does not determine how to apply the upgrade, but provides recommendations on the operational state of UIM accompanying the upgrade.
An example would be updating the operating system on a worker node.
The flowcharts in this chapter use the following image to depict an external upgrade procedure.
- Miscellaneous upgrade procedures: There are some procedures that require special handling and are not captured in any of the upgrade paths. These are described in "Miscellaneous Upgrade Procedures".
Rolling Restart
Occasionally, you may need to restart UIM managed servers in a rolling fashion, one at a time. This does not result in downtime, only in reduced capacity for a limited period. You trigger a rolling restart by invoking the restart-instance.sh script. This script can restart the whole instance in a rolling fashion, only the admin server, or all the managed servers in a rolling fashion. Some operations automatically trigger a rolling restart, including image updates and tuning parameter changes pushed via the upgrade-instance.sh script.
Identifying Your Upgrade Path
In order to prepare your detailed plan for an upgrade, you need to be able to map your upgrade use case to an upgrade path. Some common use cases are detailed in the following charts. If your use case is not listed, see "Upgrade Path Flow Chart", which guides you through the decision making process to prepare a specific upgrade path.
Table 11-1 Common Upgrade Paths
Upgrade Type | Component | Upgrade Path | Requires Changing Image? |
---|---|---|---|
Cartridge Management | Deploy new cartridge version with Ruleset code (where the ruleset refers to Java files) | Online change, application upgrade, cartridge deployment | Yes |
Cartridge Management | Redeploy a cartridge against an existing cartridge version with Ruleset code (where the ruleset refers to Java files) | Online change, application upgrade, cartridge deployment | Yes |
Cartridge Management | Deploy new cartridge version without Ruleset code | Online change, online cartridge deployment | No |
Cartridge Management | Redeploy a cartridge against an existing cartridge version without Ruleset code | Online change, online cartridge deployment | No |
Configuration and Tuning | UIM cluster size (scaling up or down) | Online change, application upgrade | Not applicable |
Configuration and Tuning | Java parameters (memory, GC, and so on) | Online change, application upgrade | Not applicable |
Configuration and Tuning | WebLogic domain configuration (WDT such as JMS Queue configuration) | Online change, application upgrade | No |
Configuration and Tuning | UIM configuration parameters (custom-extensions.properties) | Online change, application upgrade | No |
Database Storage Management | DB Purges | Offline change, PDB upgrade | No |
Security parameters | WebLogic Password change (poms cache coordination) | Miscellaneous upgrade procedures | No |
Security parameters | UIM Schema Password Change | Miscellaneous upgrade procedures | No |
Software Upgrade and Patching | UIM release or patch upgrade with Database change | Offline change, PDB upgrade, application upgrade | Yes |
Software Upgrade and Patching | Fusion Middleware upgrade | Online change, application upgrade (some exceptions needing offline change) | Yes |
Software Upgrade and Patching | UIM patch upgrade without Database change | Online change, application upgrade (some exceptions needing offline change) | Yes |
Software Upgrade and Patching | Fusion Middleware overlay patches (for example, PSU or one-off patch) | Online change, application upgrade (some exceptions needing offline change) | Yes |
Software Upgrade and Patching | Java upgrade | Online change, application upgrade | Yes |
Software Upgrade and Patching | Linux | Online change, application upgrade | Yes |
Software Upgrade and Patching | Custom code or third-party tool (custom image) | Online change, application upgrade (some exceptions needing offline change) | Yes |
Software Upgrade and Patching | UIM cloud native toolkit | The release dictates the constraints. | Not applicable |
Shared infrastructure | Operating system or hardware on worker node | Online change, external procedure | No |
Shared infrastructure | Docker | Online change, external procedure | No |
Shared infrastructure | WebLogic Operator minor upgrade (backward compatible) | Online change, external procedure | No |
Shared infrastructure | WebLogic Operator major upgrade (non-backward compatible) | Online change, external procedure | No |
Once you understand the activities in your upgrade path, you can begin to map out the sequence of activities that you need to perform.
Offline Change Upgrade Paths
Offline changes are defined as those requiring UIM to be shut down before the change can be applied.
All offline upgrades must start with a Scale Down procedure and end with a Scale Up procedure. You can find the explicit steps to perform these activities in Running Operational Procedures.
Once the cluster has been scaled down, you will need to perform either an external procedure (referencing documentation for the component) or follow a UIM cloud native owned procedure. See "UIM Cloud Native Upgrade Procedures" for details.
As an example, if your use case is to perform DB purges, then the upgrade path listed in Table 11-1 is "Offline change, PDB upgrade". The actual steps involve the following:
- Scale Down
  - Edit the instance specification file to set cluster size to 0.
  - Run upgrade-instance.sh.
- PDB Upgrade
  - Edit the instance specification file to include purge command.
  - Run install-uimdb.sh with the command appropriate for the purge use case.
- Scale Up
  - Edit the instance specification file to return cluster size to original (1-18).
  - Run upgrade-instance.sh.
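The sequence above can be sketched as a dry-run helper that only prints the commands in order, so the plan can be reviewed before anything is run. This is a minimal sketch under stated assumptions: the function name plan_offline_purge and the fallback paths are hypothetical, the specification edits are still made by hand between steps, and the exact install-uimdb.sh options for a purge are not shown here because they depend on the use case (the script's -h option lists them).

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the offline purge sequence instead of running it.
# Assumptions: UIM_CNTK points at the toolkit; /path/to/... values are placeholders.

plan_offline_purge() {
  local project="$1" instance="$2" spec_path="$3"
  local cntk="${UIM_CNTK:-/path/to/uim-cntk}"   # hypothetical fallback location

  echo "# 1. Scale down: set the cluster size to 0 in the instance specification, then:"
  echo "$cntk/scripts/upgrade-instance.sh -p $project -i $instance -s $spec_path"
  echo "# 2. PDB upgrade: add the purge command to the instance specification, then"
  echo "#    run install-uimdb.sh with the option for your purge use case. To list options:"
  echo "$cntk/scripts/install-uimdb.sh -h"
  echo "# 3. Scale up: restore the original cluster size, then:"
  echo "$cntk/scripts/upgrade-instance.sh -p $project -i $instance -s $spec_path"
}

# Print the plan for placeholder names.
plan_offline_purge "${1:-project}" "${2:-instance}" "${SPEC_PATH:-/path/to/specs}"
```

Printing the plan rather than executing it keeps the manual specification edits visible as explicit checkpoints between the script invocations.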
Online Change Upgrade
Online changes are changes for which UIM can remain running while the component upgrade is performed. There is, therefore, no operational procedure at the start of the flow, but some paths include a rolling restart after the upgrade procedure is performed.
The component upgrade will either be an external procedure (referencing documentation for the component) or follow a UIM cloud native owned procedure described in "UIM Cloud Native Upgrade Procedures".
If explicit post-upgrade operational activities are required, you can find details in "Running Operational Procedures".
The following flowchart illustrates online change upgrade paths.
Exceptions and Unsupported Tasks
Exceptions
The following require shutdown:
- Some UIM patches
- Some Oracle Fusion Middleware overlay patches
- Oracle Fusion Middleware version upgrades
Unsupported Tasks
Adding, modifying, and deleting users or groups from embedded LDAP are not supported through an upgrade procedure. To make changes to users and groups, the instance must be deleted and re-created.
UIM Cloud Native Upgrade Procedures
The UIM cloud native owned upgrade procedures are:
- PDB upgrade
- UIM application upgrade
- Online cartridge deployment
Change or upgrade procedures that are dictated by UIM cloud native are applied using the scripts and the configuration provided in the toolkit.
PDB Upgrade Procedure
Changes impacting the PDB can be found in any of the specification files - project, instance or shape.
Examples include updating the UIM DB Installer image.
To perform a PDB upgrade procedure:
- Make the necessary modifications in your specification files.
- Invoke $UIM_CNTK/scripts/install-uimdb.sh with the command appropriate for your use case.
  To see a list of options, invoke the script with -h.
UIM Application Upgrade
Changes impacting the UIM application can be found in any of the specification files - project, instance or shape.
Examples include changing an existing value, changing the UIM image or supplying something new such as a secret or a new WDT extension.
To perform UIM application upgrade:
- Make the necessary modifications in your specification files.
- Invoke $UIM_CNTK/scripts/upgrade-instance.sh to push out the changes you just made to the running instance. This also triggers introspection for upgrade paths where introspection is required.
- In upgrade paths where a manual restart is required, restart the instance. See "Restarting the Instance" for details.
Updating the Default Settings for Coherence Cluster
After you upgrade the UIM application, update the default settings for coherence cluster in the WebLogic console.
To update the default settings for coherence cluster:
- Open the WebLogic console.
- Under the Domain Structure section, expand Environment and select Coherence Clusters.
  The Settings for defaultCoherenceCluster page appears.
- Under the Members tab:
  - Under the Servers section, deselect AdminServer.
  - Under the Cluster section, select the required clusters.
- Click Save.
  The default settings for coherence cluster are updated.
Online Cartridge Deployment
The Online deployment mode supports deployment of new cartridges and depends on the type of the cartridge. The cartridges are classified as follows:
- Simple cartridge (such as entity specifications, Groovy, or Drools code)
- Custom Extension cartridge (Java code, configuration files, images, custom applications, Java libraries, Aspects, and localization)
For Simple Cartridges, deployment can be performed without any upgrade path.
For Custom Extension Cartridges, perform the deployment as follows:
- Build customized image.
- Make the necessary modifications in your project specification to modify the image name.
- Upgrade the instance.
- Deploy cartridges.
Upgrades to Infrastructure
From the point of view of UIM instances, upgrades to the cloud infrastructure fall into two categories: rolling upgrades and one-time upgrades.
Note:
All infrastructure upgrades must continue to meet the supported types and versions listed in the UIM documentation's certification statement.
Rolling upgrades are those where, with proper high-availability planning (such as anti-affinity rules), the instance as a whole remains available while parts of it undergo temporary outages. Examples are Kubernetes worker node OS upgrades, Kubernetes version upgrades, and Docker version upgrades.
One-time upgrades affect a given instance all at once. The instance as a whole suffers either an operational outage or a control outage. Examples are a WebLogic Operator upgrade and, depending on its deployment, an Ingress Controller upgrade.
Kubernetes and Docker Infrastructure Upgrades
Follow standard Kubernetes and Docker practices to upgrade these components. The impact at any point should be limited to one node: a master node (Kubernetes and OS) or a worker node (Kubernetes, OS, and Docker). Before upgrading a worker node, cordon and drain it so that all of its pods move to other worker nodes. This assumes that your cluster has the capacity to absorb them; you may have to temporarily add a worker node or two. For UIM instances, any pods on the cordoned worker suffer an outage until they come up on other workers. However, their messages and orders are redistributed to the remaining managed server pods, and processing continues at a reduced capacity until the affected pods relocate and initialize. As each worker undergoes this process in turn, pods continue to terminate and start up elsewhere, but as long as the instance has pods on both affected and unaffected nodes, it continues to process orders.
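As an illustration of taking one worker node out of service, the following sketch wraps the standard kubectl cordon, drain, and uncordon commands. It is not part of the UIM cloud native toolkit: the function names, the DRY_RUN switch, and the node name are hypothetical, and the drain flags shown are common choices that may need adjusting for your cluster and kubectl version.

```shell
#!/usr/bin/env bash
# Sketch: cordon and drain a worker before its upgrade, uncordon it afterwards.
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to execute.
KUBECTL="${KUBECTL:-kubectl}"
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

drain_worker() {
  local node="$1"
  # Stop new pods from being scheduled on the node.
  run "$KUBECTL" cordon "$node"
  # Evict running pods; DaemonSet pods are skipped and emptyDir data is discarded.
  # (--delete-emptydir-data is the flag name in kubectl 1.20 and later.)
  run "$KUBECTL" drain "$node" --ignore-daemonsets --delete-emptydir-data
}

restore_worker() {
  local node="$1"
  # Allow scheduling on the node again once its upgrade is complete.
  run "$KUBECTL" uncordon "$node"
}
```

For example, drain_worker worker-1 prints the two commands for a node named worker-1. Before moving on to the next worker, it is worth confirming that the evicted managed server pods are running again elsewhere.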
WebLogic Operator Upgrade
To upgrade the WebLogic Operator, follow the Operator documentation. As long as the target version can co-exist in a Kubernetes cluster with the current version, a phased cutover can be performed. In this, you will perform a fresh install of the new version of the Operator into a new namespace. RBAC will be arranged here, identical to your existing Operator namespace. Once the new Operator is functioning, for each UIM cloud native project, un-register it from the old Operator and register it with the new Operator. This can be done at your convenience on a per-project basis. When all projects have been switched to the new Operator, the old Operator can be safely deleted.
export WLSKO_NS=old-namespace
$UIM_CNTK/scripts/unregister-namespace -p project -t wlsko
export WLSKO_NS=new-namespace
$UIM_CNTK/scripts/register-namespace -p project -t wlsko
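Because the cutover is done per project, the same pair of commands is repeated for each project. The following sketch prints that pair for a list of projects so the sequence can be reviewed or pasted one project at a time. The function name switch_operator, the fallback toolkit path, and the project names are hypothetical; the script names and options are the ones shown above.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the unregister/register commands for each project.
switch_operator() {
  local old_ns="$1" new_ns="$2"
  shift 2
  local cntk="${UIM_CNTK:-/path/to/uim-cntk}"   # hypothetical fallback location
  local project
  for project in "$@"; do
    echo "export WLSKO_NS=$old_ns"
    echo "$cntk/scripts/unregister-namespace -p $project -t wlsko"
    echo "export WLSKO_NS=$new_ns"
    echo "$cntk/scripts/register-namespace -p $project -t wlsko"
  done
}

# Example with placeholder namespaces and project names.
switch_operator old-namespace new-namespace project-a project-b
```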
All instances with the transitioned project are impacted by this operation. However, there is no order processing outage during the transition. There is a control outage - where no changes can be pushed to the instances (upgrade-instance.sh or delete-instance.sh). Also, during the control outage, the termination of a pod does not immediately trigger healing. However, once the transition of the project is complete, the new Operator will react to any changed state (whether in the cluster, like pod termination, or in pushed changes, like instance upgrades) and run the required actions.
Ingress Controller Upgrade
Follow the documentation of your chosen Ingress Controller to perform an upgrade. Depending on the Ingress Controller used and its deployment in your Kubernetes environment, the UIM instances it serves may see a wide set of impacts, ranging from no impact at all (if the Ingress Controller supports a clustered approach and can be upgraded that way) to a complete outage.
As an illustration, consider Traefik, which the UIM cloud native toolkit uses as its sample Ingress Controller. An approach identical to that of the WebLogic Operator upgrade can be followed: install the new Traefik into a new namespace and then, one by one, unregister projects from the old Traefik and register them with the new Traefik.
export TRAEFIK_NS=old-namespace
$UIM_CNTK/scripts/unregister-namespace -p project -t traefik
export TRAEFIK_NS=new-namespace
$UIM_CNTK/scripts/register-namespace -p project -t traefik
During this transition, there will be an outage in terms of the outside world interacting with UIM. Any data that flows through the ingress will be blocked until the new Traefik takes over. This includes GUI traffic, order injection, API queries, and SAF responses from external systems. This outage will affect all the instances in the project being transitioned.
Miscellaneous Upgrade Procedures
This section describes miscellaneous upgrade scenarios.
Network File System (NFS)
If an instance is created successfully, but a change to the NFS configuration is required, then the change cannot be made to a running UIM instance. In this case, the procedure is as follows:
- Perform a fast delete. See "Running Operational Procedures" for details.
- Update the nfs details in the instance specification.
- Start the instance.
Security Parameters
To set the security parameters:
- Perform a fast delete. See "Running Operational Procedures" for details.
- Update the secrets for WebLogic, PDB credentials, or UIM Schema credentials.
- Start the instance.
Running Operational Procedures
This section describes the tasks you perform on the UIM server in response to a planned upgrade to the UIM cloud native environment. You must consider if the change in the environment fundamentally affects UIM processing to the extent that UIM should not run when the upgrade is applied or UIM can run during the upgrade but must be restarted to properly process the change.
The operational procedures are performed using the UIM cloud native specification files and scripts. The procedures are:
- Triggering introspection
- Scaling down the cluster
- Scaling up the cluster
- Restarting the cluster
- Fast delete
- Shutting down the cluster
- Starting up the cluster
Triggering Introspection
When any of the specification files have changed, invoke the upgrade-instance.sh script to trigger the operator's introspector to examine the change and apply it to the running instance.
Scaling Down the Cluster
The scaling down procedure described here is only in the context of the upgrade flow diagram. Hence, scaling down is down to 0 managed servers. A generalized scaling can change the cluster size down to a value between 0 and 18 (both inclusive) in any desired increment or decrement.
To scale down the cluster, edit the instance specification and change the clusterSize parameter to 0. This terminates all the managed server pods, but leaves the admin server up and running.
$UIM_CNTK/scripts/upgrade-instance.sh -p project -i instance -s $SPEC_PATH
Scaling Up the Cluster
The scaling up procedure described here is only in the context of the upgrade flow diagram. Hence, scaling up is up to the initial cluster size. A generalized scaling can change the cluster size up to a value between 0 and 18 (both inclusive) in any desired increment or decrement.
To scale up the cluster, edit the instance specification and change the value of the clusterSize parameter to its original value to return the cluster to its previous operational state.
$UIM_CNTK/scripts/upgrade-instance.sh -p project -i instance -s $SPEC_PATH
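The clusterSize edit for scaling down and scaling up can be scripted. The following is a minimal sketch, assuming the instance specification is a YAML file containing a single clusterSize: <n> line; the function name set_cluster_size and the file name in the usage note are hypothetical. It rejects values outside the 0 to 18 range described above and changes only the value, leaving indentation intact.

```shell
#!/usr/bin/env bash
# Sketch: set clusterSize in an instance specification file (0 to 18 inclusive).
set_cluster_size() {
  local spec_file="$1" size="$2"
  # Accept whole numbers only.
  case "$size" in
    ''|*[!0-9]*) echo "clusterSize must be a whole number" >&2; return 1 ;;
  esac
  if [ "$size" -gt 18 ]; then
    echo "clusterSize must be between 0 and 18" >&2
    return 1
  fi
  local tmp
  tmp="$(mktemp)" || return 1
  # Replace only the numeric value; keep the key and its indentation.
  sed -E "s/^([[:space:]]*clusterSize:[[:space:]]*)[0-9]+/\1${size}/" "$spec_file" > "$tmp" \
    && cat "$tmp" > "$spec_file"
  local rc=$?
  rm -f "$tmp"
  return "$rc"
}
```

For example, set_cluster_size project-instance.yaml 0 followed by the upgrade-instance.sh command scales down, and the same call with the original size scales back up. Record the original size before scaling down, since the helper does not remember it.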
Restarting the Instance
The UIM cloud native toolkit provides a script (restart-instance.sh) that you can use to perform different flavors of restarts on a running instance of UIM cloud native.
The usage of the restart-instance.sh script is as follows:
restart-instance.sh parameters
-p projectName : mandatory
-i instanceName : mandatory
-s specPath : mandatory; locations of specification files
-m customExtPath : optional; locations of custom extension files
-r restartType : mandatory; what kind of restart is requested
# specPath and customExtPath take a colon(:) delimited list of directories
# restartType can take the following values:
* full: Restarts the whole instance (rolling restart)
* admin: Restarts the WebLogic Admin Server only
* ms: Restarts all the Managed Servers (rolling restart)
# or just -h for help
$UIM_CNTK/scripts/restart-instance.sh -p project -i instance -s $SPEC_PATH -r full
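When the restart is driven from automation, it can help to validate the restart type and assemble the colon-delimited specification path before calling the script. The following sketch does both and prints the resulting command. The function names restart_cmd and join_spec_path and the fallback toolkit path are hypothetical; the options and restart types are those listed in the usage above.

```shell
#!/usr/bin/env bash
# Sketch: build a restart-instance.sh command line with basic validation.

# Join directories into the colon-delimited list that -s expects.
join_spec_path() {
  local IFS=:
  echo "$*"
}

# Print the restart command, rejecting unknown restart types.
restart_cmd() {
  local project="$1" instance="$2" spec_path="$3" restart_type="$4"
  case "$restart_type" in
    full|admin|ms) ;;
    *) echo "restartType must be full, admin, or ms" >&2; return 1 ;;
  esac
  local cntk="${UIM_CNTK:-/path/to/uim-cntk}"   # hypothetical fallback location
  echo "$cntk/scripts/restart-instance.sh -p $project -i $instance -s $spec_path -r $restart_type"
}
```

For example, restart_cmd project instance "$(join_spec_path /specs/common /specs/project)" ms prints a command that restarts only the managed servers; the two directory names are placeholders.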
Fast Delete
When the entire domain, including the admin server, needs to be taken offline, use the full shutdown and full startup procedures that follow. Together they perform a "fast delete" or "dehydration" of the domain, as an alternative to a full delete-instance operation where you may have to be concerned about the secrets and other prerequisites being deleted. To quickly restore the domain, simply perform the startup procedure.
Shutting Down the Cluster
To shut down the cluster, edit the instance specification and change the serverStartPolicy parameter to NEVER. This terminates all the pods.
# Operational control parameters
# scope - domain or cluster
serverStartPolicy: NEVER
$UIM_CNTK/scripts/upgrade-instance.sh -p project -i instance -s $SPEC_PATH
Starting Up the Cluster
To start up the cluster, edit the instance specification and change the serverStartPolicy parameter to IF_NEEDED. This starts up all the pods.
# Operational control parameters
# scope - domain or cluster
serverStartPolicy: IF_NEEDED
$UIM_CNTK/scripts/upgrade-instance.sh -p project -i instance -s $SPEC_PATH
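The shutdown and startup edits differ only in the value of serverStartPolicy, so a fast delete and its reversal can share one helper. The following is a minimal sketch, assuming the instance specification contains a single serverStartPolicy: <value> line as in the fragments above; the function name set_start_policy is hypothetical, and only the two values used in this section are accepted.

```shell
#!/usr/bin/env bash
# Sketch: switch serverStartPolicy between NEVER (shut down) and IF_NEEDED (start up).
set_start_policy() {
  local spec_file="$1" policy="$2"
  case "$policy" in
    NEVER|IF_NEEDED) ;;
    *) echo "policy must be NEVER or IF_NEEDED" >&2; return 1 ;;
  esac
  local tmp
  tmp="$(mktemp)" || return 1
  # Replace only the value; keep the key and its indentation.
  sed -E "s/^([[:space:]]*serverStartPolicy:[[:space:]]*)[A-Z_]+/\1${policy}/" "$spec_file" > "$tmp" \
    && cat "$tmp" > "$spec_file"
  local rc=$?
  rm -f "$tmp"
  return "$rc"
}
```

After each change, run the upgrade-instance.sh command shown above to push the new policy to the instance.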
Upgrade Path Flow Chart
When comparing and contrasting the different flows, identifying common steps or divergences, it can be useful to have a combined view of the flowcharts along with the main decision points. This can be useful when trying to automate parts of the process.
The first decision to make is whether UIM can be running when you apply the change. Typically, UIM needs to be shut down for PDB-impacting scenarios and the exceptions listed in the "Exceptions and Unsupported Tasks" section.
The following flowchart illustrates the flow for offline upgrades and various scenarios.
Figure 11-3 Upgrade Path Flow for Offline Changes
The following flowchart illustrates the flow for online upgrades and various scenarios.
Figure 11-4 Upgrade Path Flow for Online Changes
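When automating parts of the process, the first decision point (offline or online) can be encoded as a simple lookup. The following sketch maps a few change types from Table 11-1 to that decision. The keywords are hypothetical labels invented for this example, the mapping covers only some rows, and the "online-with-exceptions" result means that the specific patch or image must still be checked against the exceptions that require shutdown.

```shell
#!/usr/bin/env bash
# Sketch: first decision of the upgrade flow, based on selected rows of Table 11-1.
# The change-type keywords are hypothetical labels, not toolkit terms.
upgrade_mode() {
  case "$1" in
    db-purge|uim-upgrade-with-db-change)
      echo "offline" ;;
    cartridge|cluster-size|java-parameters|wdt-config|uim-config|java-upgrade|linux)
      echo "online" ;;
    uim-patch|fmw-overlay-patch|custom-image)
      echo "online-with-exceptions" ;;
    *)
      echo "unknown"
      return 1 ;;
  esac
}
```

A calling script can branch on the result, for example running the scale-down procedure first only when the mode is offline.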