Upgrade the WebLogic Kubernetes Operator
Learn how to upgrade the WebLogic Kubernetes Operator version for an Oracle WebLogic Server for OKE domain when the domain uses an older operator version, or when your Kubernetes cluster version is not compatible with the installed WebLogic Server Kubernetes Operator version.
You can use Cloud Shell or the administration host to perform the upgrade. However, some of the upgrade steps must be performed on the administration host only.
If the subnet for the administration host does not have a NAT gateway configured, you must use Cloud Shell to perform the upgrade.
- Access the Oracle Cloud Infrastructure console and open Cloud Shell.
- Connect to the Administration Server node as the opc user. The SSH command format is:
ssh -i path_to_private_key opc@admin_ip
To connect using SSH from a Windows system, see To connect to a Linux instance from a Windows system using PuTTY in Connecting to Your Linux Instance Using SSH in the Oracle Cloud Infrastructure documentation.
Upgrade the WebLogic Kubernetes Operator to 3.4.4
Perform the following steps to upgrade the operator on the administration host:
Upgrade the WebLogic Kubernetes Operator to 4.0.5
If you upgrade the cluster and node pools to 1.25.x using the Oracle Cloud Infrastructure console, you must upgrade the WebLogic Server Kubernetes Operator version to 4.0.5; otherwise, domain-related operations fail.
Perform the following steps to upgrade the operator on the administration host:
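The product's detailed steps are environment-specific, but at its core the operator upgrade is a Helm upgrade of the existing operator release. The sketch below is illustrative only, not the documented procedure: the release name, namespace, and chart repository URL are assumptions (the URL is the operator project's public chart repo); the commands are printed rather than executed so the sketch is safe to run anywhere.

```shell
# Illustrative sketch only -- not the product's documented steps.
# Assumptions: the operator was installed as Helm release "weblogic-operator"
# in namespace "weblogic-operator-ns"; adjust both to your stack.
OPERATOR_NS="weblogic-operator-ns"
RELEASE="weblogic-operator"

# Refresh the operator chart repository, then upgrade the release in place.
# --reuse-values keeps the current operator configuration; --wait blocks
# until the upgraded operator pod is ready.
repo_cmd="helm repo add weblogic-operator https://oracle.github.io/weblogic-kubernetes-operator/charts --force-update"
upgrade_cmd="helm upgrade ${RELEASE} weblogic-operator/weblogic-operator \
  --namespace ${OPERATOR_NS} --version 4.0.5 --reuse-values --wait"

echo "${repo_cmd}"
echo "${upgrade_cmd}"
```

Running the two printed commands against the cluster performs the in-place upgrade; always confirm the release name and namespace with helm list first.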
Log File and Script Files
This section lists the log file and the script files that need to be replaced on the administration host after you upgrade the WebLogic Server Kubernetes Operator to 4.0.5.
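Before overwriting the three files listed below, it is prudent to keep a copy of the current versions. A minimal sketch, using the paths from this section; the backup directory is an assumption:

```shell
# Back up the files this section replaces before overwriting them.
# The backup location is an assumption; choose any writable directory.
backup_dir="/tmp/wlsoke-backup-$(date +%Y%m%d)"
mkdir -p "${backup_dir}"

for f in /u01/shared/scripts/pipeline/clogging/log_messages.json \
         /u01/shared/scripts/pipeline/common/domain_builder_utils.py \
         /u01/shared/scripts/pipeline/common/pipeline_utils.sh; do
  # A file may be absent outside a real administration host; skip quietly.
  cp "${f}" "${backup_dir}/" 2>/dev/null || true
done

echo "${backup_dir}"
```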
log_messages.json
Copy the following content into log_messages.json, located in the /u01/shared/scripts/pipeline/clogging directory.
{
"WLSOKE-VM-CRITICAL": {
},
"WLSOKE-VM-ERROR": {
"0001": "Error - executing check_versions.sh: [%s]",
"0002": "TF scripts version does not match the scripts on the vm",
"0003": "Unable to login to custom OCIR [%s]",
"0004": "Docker unable to download [%s] image from custom repository. Exit Code= [%s]",
"0005": "Error in docker initialization. Exit code[%s].",
"0006": "Cluster info file is not found.",
"0007": "OKE Cluster info file is missing.",
"0008": "OKE Cluster worker nodes are not yet available.",
"0009": "Error executing status check for OKE worker nodes.",
"0010": "Error creating namespace [%s]. Error [%s]",
"0011": "Error creating weblogic operator service account. Exit code[%s]",
"0012": "Error decrypting the value[%s]",
"0013": "Error installing weblogic operator. Exit code[%s]",
"0014": "Weblogic Operator values yaml is not available. Unable to install operator.",
"0015": "Error mounting FSS on admin host at mount point [%s]. [exit code = %s]",
"0016": "Error creating ocir secrets [ %s ] in oke [namespace=%s]. Error [%s]",
"0017": "Pod [%s] condition [%s]",
"0018": "Error installing public LB for weblogic cluster. [ exit_code =%s ]",
"0019": "Usage: /u01/scripts/bootstrap/install_wls_operator.sh <operator_ns> <wls_domain_ns>",
"0020": "Error executing %s. Exit code [%s]",
"0021": "Error create weblogic credential secrets. Exit code [%s]",
"0022": "Usage: /u01/scripts/domain/create_domain_image.sh <wls_domain> <wls_domain_ns>",
"0023": "Error uploading image [%s] to the OCIR repo. Exit code [%s]",
"0024": "Error in kubectl apply for domain: %s",
"0025": "Error: Failed to retrieve secret using secret ocid: %s ",
"0026": "Error: Weblogic domain is not ready after waiting for [%s] seconds.",
"0027": "Error executing command [%s] exit code [%s] and output [%s]",
"0028": "Exception executing command [%s] is [%s]",
"0029": "Error creating domain in home image. exit code[%s]",
"0030": "Error writing load balancer details in file[%s]. [%s]",
"0031": "Failed to provision domain [%s].[Exception :%s]",
"0032": "Failed to execute command [%s]. exit_cod[%s]",
"0033": "Process handled not saved for asynchronous polling",
"0034": "Timed out waiting for the process [%s] to complete.",
"0035": "Error: Credential file path not set.",
"0036": "Error: timed out waiting for the credential file.",
"0037": "Error: executing provisioner scripts. [exit_code=%s]",
"0038": "Error: unable to push the domain container image to OCIR. exit code [%s]",
"0039": "Failed to save to verified markers file",
"0040": "Failed to read verified marker file.",
"0041": "Error running wait-for-all-markers check. [%s]",
"0042": "Error creating domain in home image after retry. exit code[%s]",
"0043": "Provisioning failed marker found. Aborting provisioning check.",
"0044": "Failed to get attribute [%s]: [%s]",
"0045": "Execute error: [cmd=%s] [exit_code=%s] [error=%s] [output=%s]",
"0046": "Error deploying persistent volume claim for OKE cluster.",
"0047": "Error: Persistent volume claim is not bound.[bound = %s]",
"0048": "Error: Failed to change permission on persistent volume claim.",
"0049": "Error mounting shared FSS on admin host.[exit code = %s]",
"0050": "Error loading all attributes: [%s]",
"0051": "Usage: /u01/scripts/bootstrap/install_jenkins.sh <path_to_inputs_file>",
"0052": "Error installing jenkins charts. Exit code[%s]",
"0053": "Error installing fss charts for jenkins. Exit code[%s]",
"0054": "Error deploying persistent volume claim for OKE cluster in jenkins pod.",
"0055": "Error: Persistent volume claim is not bound in jenkins pod.[bound = %s]",
"0056": "Error: Docker images metadata file does not exists.",
"0057": "Exception: [ %s ]",
"0058": "Error installing ingress controller with Helm. Exit code [%s]",
"0059": "Error generating metadata file exception=[%s]",
"0060": "Error writing provisioning metadata file exception=[%s]",
"0061": "Error updating the dynamic group for admin.",
"0063": "Error updating kube config at [%s] with exit code [%s].",
"0064": "Error updating jenkins configuration. [%s]",
"0065": "Error Response code for adding request header to load balancer: [%s]",
"0066": "Error Response code for getting load balancer details: [%s]",
"0067": "Error : [%s]. Exception [%s]",
"0070": "Usage: /u01/scripts/bootstrap/install_ingress_controller.sh <path_to_inputs_file>",
"0071": "Error: Failed to install WebLogic Deploy Tool - [%s] Admin Mount Path not available.",
"0072": "Error: Failed to install WebLogic Deploy Tool - [%s] WebLogic Deploy Tool zip not found.",
"0073": "Error: Failed to install WebLogic Image Tool - [%s] WebLogic Image Tool zip not found.",
"0074": "Error: Failed vulnerability scan of image: [%s]",
"0075": "Error: [%s]. Exit code [%s]",
"0076": "Error loading all attributes with v2 endpoint: [%s]",
"0077": "Unable to create kubeconfig file for cluster: [%s]",
"0078": "Unable to create kubeconfig file for cluster",
"0079": "Unable to download atp wallet: [%s]",
"0080": "ATP download response code: [%s]",
"0081": "Error while creating IDCS applications in host %s for tenant %s:[%s]",
"0082": "Creation of IDCS applications and app gateway failed. Please check the logs",
"0083": "Usage: /u01/scripts/bootstrap/install_idcs_services.sh <path_to_inputs_file>",
"0084": "Error creating ConfigMap weblogic_conf [%s]",
"0085": "Error creating ConfigMap for cwallet.sso [%s]",
"0086": "File [%s] not found after generation.",
"0087": "Failed to create IDCS Deployment. Please check the logs.",
"0088": "Status code [%s] while getting cluster info",
"0089": "Error getting cluster info [%s]",
"0090": "OKE cluster belongs to different compartment/vcn [%s / %s] than stack compartment/vcn [%s / %s]",
"0091": "Error updating the dynamic group for os management.",
"0092": "Failed to patch domain",
"0093": "Introspector job is in failed state in weblogic domain namespace [%s]. [exit_code : %s]. Please check the introspector logs at location /u01/shared/logs.",
"0094": "Failed to create repository [%s] in compartment [%s]. Reason: [%s]",
"0095": "Error installing patching tool. [%s]",
"0096": "Error pushing Verrazzano images to OCIR repo. Exit code: [%s]",
"0097": "Failed to create Docker registry secret for Verrazzano container images. Exit code: [%s]",
"0098": "Failed to install Verrazzano operator. Exit code: [%s]",
"0099": "Failed to install Verrazzano. Exit code: [%s]",
"0100": "Failed to install ingress for Jenkins. Exit code [%s]",
"0101": "Failed to make the repository [%s] public. Please make the repo public manually.",
"0102": "Failed to get node pools [%s]",
"0103": "Failed to get node pool ID [%s]",
"0104": "Failed to get node pool details [%s]",
"0105": "Unknown DNS type for Jenkins ingress: [%s]",
"0106": "Failed to create secret [%s] in namespace [%s] for OCI DNS in Verrazzano. Exit code: [%s]. Output: [%s]",
"0107": "Could not retrieve OCID of DNS resolver for VCN [%s]",
"0108": "Could not retrieve OCID of DNS view for DNS zone [%s]",
"0109": "Failed to update DNS resolver [%s] with DNS view [%s]. Exception: [%s]",
"0110": "Failed to create secret [%s] in namespace [%s] for Custom CA in Verrazzano. Exit code: [%s]. Output: [%s]",
"0111": "Failed to configure prerequisites for certificates in Verrazzano. Exit code: [%s]. Output: [%s]",
"0112": "Failed to configure prerequisites for DNS in Verrazzano. Exit code: [%s]. Output: [%s]",
"0113": "Failed to configure prerequisites for Verrazzano. Exit code: [%s]. Output: [%s]"
},
"WLSOKE-VM-WARNING": {
"0001": "Failed to delete RCU schemas for prefix = [%s]",
"0002": "Retrying create domain as earlier attempt failed.",
"0003": "Warning while trying to format message %s with provided arguments %s. [%s]",
"0004": "Warning found failure marker created. Exiting provisioning flow.",
"0005": "Retrying create namespace as earlier attempt failed.",
"0006": "Error running provisioning status check. Please check the provisioning logs for status of provisioning",
"0007": "Warning missing life cycle management scripts",
"0008": "The node [%s] does not have IP addresses",
"0009": "Verrazzano install did not finish in 30 minutes. Login to the admin host and check if the install has succeeded."
},
"WLSOKE-VM-INFO": {
"0001": "Executing check_versions script",
"0002": "VM scripts version: [%s]",
"0003": "TF scripts version: [%s]",
"0004": "Executed check_versions script with exit code [%s]",
"0005": "Executing docker_init script",
"0006": "Docker login into OCIR [%s]",
"0007": "Docker downloading image [%s] from ocir",
"0008": "OKE Cluster information [%s]",
"0009": "Found [%s] node pools in OKE Cluster",
"0010": "OKE Node pools statuses : [%s]",
"0011": "Waiting for the workers nodes to be Active. Retrying...",
"0012": "Installing weblogic operator",
"0013": "Created operator namespace [%s]",
"0014": "Creating operator service account",
"0015": "Successfully created operator service account [%s]",
"0016": "Creating RBAC policy for service account.",
"0017": "Checking pod [ %s ] type [%s] status[ %s ]",
"0018": "Creating weblogic operator values yaml file",
"0019": "Successfully created weblogic operator values yaml file",
"0020": "Installing weblogic operator in namespace [%s]",
"0021": "Successfully installed weblogic operator.",
"0022": "Writing operator parameter file",
"0023": "Operator Parameters: %s",
"0024": "Domain attributes: %s",
"0025": "Creating domain yaml file: %s",
"0026": "Creating namespace: %s",
"0027": "Creating ocir secrets in oke: %s",
"0028": "Executing create weblogic domain script...",
"0029": "Create weblogic credential secrets...",
"0030": "Create weblogic domain inputs yaml file...",
"0031": "Uploading container image [%s]",
"0032": "Docker initialised for provisioning",
"0033": "Retrieving secret content for OCID %s",
"0034": "Executing create domain scripts",
"0035": "Applying domain yaml to OKE cluster",
"0036": "Waiting pods in domain [%s] to be running",
"0037": "Successfully created weblogic OKE cluster",
"0038": "Applying Load balancer yaml file",
"0039": "Waiting for load balancer service details",
"0040": "Successfully created load balancer",
"0041": "Successfully created model in image",
"0042": "Successfully executed command [%s]",
"0043": "Successfully applied domain yaml[%s]",
"0044": "Executing provisioner scripts",
"0045": "Successfully completed provisiong [ %s ]",
"0047": "Returning status code for marker verification [%s]",
"0048": "Successfully verified all the markers. WebLogic for OKE provisioning is successful.",
"0049": "Successfully created the namespace [%s].",
"0050": "Mounting share FSS on the admin host.",
"0051": "Successfully mounted share FSS on the admin host",
"0052": "Executing the mount fss script on admin.",
"0053": "Successfully executed the mount fss script on admin.",
"0054": "Executing helm install for fss on OKE cluster.",
"0055": "Updating the dynamic group for the Admin instance",
"0056": "Installing jenkins %s",
"0057": "Successfully installed jenkins in namespace [ %s ]",
"0058": "Installing ingress controller charts for jenkins [ %s ]",
"0059": "Successfully installed ingress controller",
"0060": "Successfully downloaded image [%s] from ocir",
"0061": "Successfully written file to FSS",
"0062": "Successfully updated the dynamic group for the Admin",
"0064": "Executing docker pull on oke nodes",
"0065": "Executing docker pull on node [ %s ]",
"0066": "Executed docker pull on node [ %s ]",
"0067": "Successfully executed docker pull on all nodes",
"0068": "Running post provisioning clean up scripts",
"0069": "Executing clean up scripts",
"0070": "Unzipping lcm and pipeline scripts to FSS.",
"0071": "Executed clean up scripts",
"0072": "Oke node init status [exit_code= %s], [output=%s]",
"0073": "Executing script command [ %s ]",
"0074": "Executing ssh key update on oke node[%s]",
"0075": "Successfully executed ssh key update on oke node[%s]",
"0076": "Creating provisioning service account [%s]",
"0077": "Creating clusterrolebinding with cluster administration permissions",
"0078": "Getting provisioning service account token name",
"0079": "Updating kubeconfig with provisioning service account token value",
"0080": "Setting the provisioning service account in the kubeconfig file for the current context",
"0081": "Successfully updated kubeconfig with provisioning service account token.",
"0082": "Updating jenkins configuration.",
"0083": "Found [%s] out [%s] pods in namespace [%s].",
"0084": "Waiting for command [ %s ] to finish",
"0087": "Creating ssl certificate secret in weblogic domain and jenkins namespaces",
"0088": "Successfully created certificate secret [ %s:%s ] in namespace [ %s ]",
"0089": "Successfully added header for Weblogic SSL termination [%s]",
"0090": "Configuring [%s] load balancer [%s]",
"0091": "Updating Jenkins job files in shared directory [%s]",
"0092": "Successfully configured [%s] load balancer for SSL [%s]",
"0093": "Creating File [%s]",
"0095": "Successfully updated the Jenkins job files in shared directory [%s]",
"0096": "Successfully installed WebLogic Deploy Tool [%s]",
"0097": "Successfully installed WebLogic Image Tool [%s]",
"0098": "Creating configmap [%s]",
"0099": "Successfully created kubeconfig file for cluster",
"0100": "ATP Wallet downloaded",
"0101": "Creating confidential IDCS application %s in host %s for tenant %s",
"0102": "Writing confidential IDCS application details to %s...",
"0103": "Created confidential IDCS application [%s] in host [%s] for tenant [%s]",
"0104": "Creating enterprise IDCS application %s in host %s for tenant %s",
"0105": "Writing enterprise IDCS application details to %s...",
"0106": "Created enterprise IDCS application [%s] in host [%s] for tenant [%s]",
"0107": "Creating IDCS Application Gateway %s in host %s for tenant %s",
"0108": "Writing IDCS Application Gateway details to %s...",
"0109": "Creating IDCS Application Gateway server %s in host %s for tenant %s",
"0110": "Creating IDCS Application Gateway mapping with description %s",
"0111": "Activating IDCS Application Gateway with id %s",
"0112": "Created IDCS Application Gateway [%s] in host [%s] for tenant [%s]",
"0113": "Creating IDCS applications and app gateway",
"0114": "Deactivating App Gateway [%s] with id [%s]",
"0115": "Deleting App Gateway [%s] with id [%s]",
"0116": "The IDCS Application Gateway %s already exists. Deleting it...",
"0117": "Deactivating IDCS Application Gateway %s.",
"0118": "Deleting IDCS Application Gateway %s.",
"0119": "The IDCS Application Gateway server %s already exists. Deleting it...",
"0120": "Deleting IDCS Application Gateway server %s.",
"0121": "The IDCS application %s already exists. Deleting it...",
"0122": "Deactivating IDCS application %s.",
"0123": "Deleting IDCS application %s.",
"0124": "Obtaining information for IDCS app role [%s]",
"0125": "Adding app role [%s] to application with app id [%s]",
"0126": "Creating IDCS app gateway config files",
"0127": "Skipping IDCS Applications creation as IDCS is not selected",
"0128": "Installing IDCS Service charts for [ %s ]",
"0129": "Successfully installed IDCS Services",
"0130": "Starting cwallet generation ...",
"0131": "Removing client id and secret from [%s]",
"0132": "The cwallet file generated in [%s]",
"0133": "OKE cluster passed compartment and vcn validation",
"0134": "Successfully updated the dynamic group for the os management",
"0135": "Successfully patched domain",
"0136": "Successfully created domain yaml",
"0137": "Successfully updated domain yaml",
"0138": "Successfully completed status check for Introspector job in weblogic domain namespace [%s]. [return_code : %s]",
"0139": "Successfully created OCIR repo %s in compartment %s",
"0140": "Successfully installed patching tool [%s]",
"0141": "Loading Verrazzano images into OCIR",
"0142": "Finished loading Verrazzano images into ocir",
"0143": "Created docker registry secret to allow access to required images",
"0144": "Successfully installed Verrazzano operator",
"0145": "Successfully installed Verrazzano",
"0146": "Successfully installed Jenkins ingress on Verrazzano ingress controller",
"0147": "Making image [%s] public",
"0148": "Submitted Verrazzano installation task",
"0149": "Node pool created successfully. Node pool id [%s]",
"0150": "Using DNS type [%s] for Verrazzano",
"0151": "Creating secret [%s] in namespace [%s] for OCI DNS in Verrazzano",
"0152": "Updated DNS resolver [%s] with DNS view [%s]",
"0153": "Creating the secret for provisioning service account"
},
"WLSOKE-VM-DEBUG": {
"0001": "OKE nodes : [%s]",
"0002": "Successfully executed command [%s]",
"0003": "Executing %s: [%s]",
"0004": "Pod:%s is not ready. Waiting for %s before retrying",
"0005": "Pods in namespace [%s] is not ready. Waiting for %s before retrying",
"0006": "Waiting for service [%s] external IP. Waiting for %s before retrying",
"0007": "Found following markers created during provisioning: %s",
"0008": "Saved verified markers to status file [markers=%s]",
"0009": "Provisioning status check found pending markers[%s].",
"0010": "Verified markers list: %s",
"0011": "Error getting metadata attribute [%s]: [%s]",
"0012": "Failed to get metadata from v2 endpoint [attribute=%s] [%s] ",
"0013": "Error getting attribute [%s]: [%s]",
"0014": "Failed to get attributes from v2 endpoint [attribute=%s] [%s] ",
"0015": "Wallet file configmap in namespace [%s] is not available. Waiting for [%s] before retrying"
}
}
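The file above maps a severity band (WLSOKE-VM-ERROR, WLSOKE-VM-INFO, and so on) and a four-digit code to a printf-style message template. A minimal sketch of how such a catalog can be consumed; the loader below is illustrative, not the stack's actual clogging implementation, and uses a small inline excerpt instead of reading the file:

```python
import json

# Small excerpt of the catalog above; a real consumer would instead do
# json.load(open("/u01/shared/scripts/pipeline/clogging/log_messages.json")).
catalog = {
    "WLSOKE-VM-ERROR": {
        "0013": "Error installing weblogic operator. Exit code[%s]",
    },
    "WLSOKE-VM-INFO": {
        "0021": "Successfully installed weblogic operator.",
    },
}

def format_message(band, code, *args):
    """Look up a template by severity band and code, then fill in the
    %s placeholders with the supplied arguments (if any)."""
    template = catalog[band][code]
    return template % args if args else template

print(format_message("WLSOKE-VM-ERROR", "0013", 1))
# prints: Error installing weblogic operator. Exit code[1]
```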
domain_builder_utils.py
Copy the following content into domain_builder_utils.py, located in the /u01/shared/scripts/pipeline/common directory.
"""
Copyright (c) 2020, 2021, Oracle Corporation and/or its affiliates.
Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl.
"""
import yaml
import sys
import json
import os


def usage(args):
    """
    Prints usage.
    :param args:
    :return:
    """
    print('Args passed: ' + str(args))
    print("""
    Usage: python3 domain_builder_utils.py <operation>
    where,
        operation = create-running-domain-yaml
            args = <running_domain_yaml> <new_domain_image> <secrets ...>
        operation = check-pods-ready
            args = sys.stdin
        operation = get-replica-count
            args = <domain_yaml_file>
    """)
    sys.exit(1)


def get_replica_count(domain_yaml_file):
    """
    Get replica count from domain yaml
    :param domain_yaml_file:
    :return:
    """
    replica_count = 0
    try:
        with open(domain_yaml_file) as f:
            domain_yaml = yaml.full_load(f)
        replica_count = domain_yaml["status"]["clusters"][0]["replicas"]
    except Exception as ex:
        print("Error in parsing json file [%s]: %s" % (domain_yaml_file, str(ex)))
    print(str(replica_count))


def create_running_domain_yaml(running_domain_yaml_file, new_domain_img, secrets):
    """
    Create the updated running domain YAML file.
    :param running_domain_yaml_file: YAML for currently running domain
    :param new_domain_img: New image to be updated
    :param secrets: List of user defined secrets
    :return:
    """
    existing_secrets = []
    with open(running_domain_yaml_file) as f:
        running_domain_yaml = yaml.full_load(f)
    running_domain_yaml["spec"]["image"] = new_domain_img
    existing_secrets.extend([running_domain_yaml["spec"]["configuration"]["model"]["runtimeEncryptionSecret"],
                             running_domain_yaml["spec"]["webLogicCredentialsSecret"]["name"]])
    if running_domain_yaml["spec"]["configuration"]["model"]["domainType"] == "JRF":
        existing_secrets.extend([running_domain_yaml["spec"]["configuration"]["opss"]["walletPasswordSecret"],
                                 running_domain_yaml["spec"]["configuration"]["opss"]["walletFileSecret"]])
    new_secrets = [name for name in secrets if name not in existing_secrets]
    if new_secrets:
        running_domain_yaml["spec"]["configuration"]["secrets"] = new_secrets
    with open(running_domain_yaml_file, 'w') as f:
        yaml.dump(running_domain_yaml, f)
    print("Successfully created running domain yaml [%s]" % running_domain_yaml_file)


def get_model_secrets(running_domain_yaml_file):
    """
    Get the list of secrets present in the running domain yaml
    :param running_domain_yaml_file: Running domain yaml file
    :return:
    """
    existing_secrets = []
    with open(running_domain_yaml_file) as f:
        running_domain_yaml = yaml.full_load(f)
    existing_secrets.extend([running_domain_yaml["spec"]["configuration"]["model"]["runtimeEncryptionSecret"],
                             running_domain_yaml["spec"]["webLogicCredentialsSecret"]["name"]])
    if running_domain_yaml["spec"]["configuration"]["model"]["domainType"] == "JRF":
        existing_secrets.extend([running_domain_yaml["spec"]["configuration"]["opss"]["walletPasswordSecret"],
                                 running_domain_yaml["spec"]["configuration"]["opss"]["walletFileSecret"]])
    if "secrets" in running_domain_yaml["spec"]["configuration"]:
        existing_secrets.extend(running_domain_yaml["spec"]["configuration"]["secrets"])
    secrets = ''
    for name in existing_secrets:
        secrets += str(name) + ' '
    print(secrets)


def check_pods_ready(file):
    """
    Check if pods are ready and print the count of pods that are in ready state
    :param file: stdin file descriptor
    :return:
    """
    count = 0
    try:
        a = json.load(file)
        for i in a['items']:
            for j in i['status']['conditions']:
                if j['status'] == "True" and j['type'] == "Ready" and i['status']['phase'] == 'Running':
                    # print(i['metadata']['name'])
                    count = count + 1
    except Exception:
        print("The data from stdin doesn't appear to be valid json. Fix this!")
        sys.exit(1)
    print(count)


def get_ocir_user(ocir_url, file):
    """
    Get OCIR user from the input ocirsecrets auths json.
    :param ocir_url: e.g. phx.ocir.io
    :param file: stdin from kubectl command to read ocirsecrets json.
    :return:
    """
    try:
        a = json.load(file)
        if 'Username' in a['auths'][ocir_url]:
            print(a['auths'][ocir_url]['Username'])
        else:
            print(a['auths'][ocir_url]['username'])
    except Exception:
        print("The data from stdin doesn't appear to be valid json. Fix this!")
        sys.exit(1)


def get_ocir_auth_token(ocir_url, file):
    """
    Get OCIR auth token from the input ocirsecrets auths json.
    :param ocir_url: e.g. phx.ocir.io
    :param file: stdin from kubectl command to read ocirsecrets json.
    :return:
    """
    try:
        a = json.load(file)
        if 'Password' in a['auths'][ocir_url]:
            print(a['auths'][ocir_url]['Password'])
        else:
            print(a['auths'][ocir_url]['password'])
    except Exception:
        print("The data from stdin doesn't appear to be valid json. Fix this!")
        sys.exit(1)


def get_metadata_attribute(attr):
    """
    Get Metadata attribute.
    Assumes that the metadata attributes are loaded from configmap and exposed in the pod as environment variables.
    :param attr: Attribute to look for.
    :return:
    """
    return os.environ[attr]


def main():
    if len(sys.argv) < 2:
        usage(sys.argv)
    try:
        operation = sys.argv[1]
        if operation == 'create-running-domain-yaml':
            if len(sys.argv) < 5:
                usage(sys.argv)
            running_domain_yaml_file = sys.argv[2]
            new_domain_img = sys.argv[3]
            secrets = sys.argv[4:]
            create_running_domain_yaml(running_domain_yaml_file, new_domain_img, secrets)
        elif operation == 'check-pods-ready':
            check_pods_ready(sys.stdin)
        elif operation == 'get-ocir-user':
            ocir_url = sys.argv[2]
            get_ocir_user(ocir_url, sys.stdin)
        elif operation == 'get-ocir-auth-token':
            ocir_url = sys.argv[2]
            get_ocir_auth_token(ocir_url, sys.stdin)
        elif operation == 'get-replica-count':
            if len(sys.argv) < 3:
                usage(sys.argv)
            domain_yaml_file = sys.argv[2]
            get_replica_count(domain_yaml_file)
        elif operation == 'get-secrets-list':
            running_domain_yaml_file = sys.argv[2]
            get_model_secrets(running_domain_yaml_file)
    except Exception as ex:
        print("Error: " + str(ex))
        sys.exit(1)


if __name__ == "__main__":
    main()
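The script's check-pods-ready operation reads kubectl get pods -o json from stdin and prints how many pods are Running with a Ready condition of "True". The same counting logic on a hand-built sample payload (the pod data below is illustrative, not real cluster output):

```python
import json

# Minimal pod-list payload in the shape produced by `kubectl get pods -o json`.
sample = json.dumps({
    "items": [
        {"status": {"phase": "Running",
                    "conditions": [{"type": "Ready", "status": "True"}]}},
        {"status": {"phase": "Pending",
                    "conditions": [{"type": "Ready", "status": "False"}]}},
    ]
})

def count_ready(payload):
    """Count pods that are in phase Running and report a Ready=True condition."""
    count = 0
    for pod in json.loads(payload)["items"]:
        for cond in pod["status"]["conditions"]:
            if cond["type"] == "Ready" and cond["status"] == "True" \
                    and pod["status"]["phase"] == "Running":
                count += 1
    return count

print(count_ready(sample))  # prints 1
```

In the pipeline this count is compared against the expected replica count to decide when a domain rollout has finished.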
pipeline_utils.sh
Copy the following content into pipeline_utils.sh, located in the /u01/shared/scripts/pipeline/common directory.
#!/usr/bin/env bash
# Copyright (c) 2020, 2022, Oracle Corporation and/or its affiliates.
# Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl.
# This script defines functions that are called across different pipeline stages.
script="${BASH_SOURCE[0]}"
scriptDir="$( cd "$( dirname "${script}" )" && pwd )"
fileName=$(basename $BASH_SOURCE)
source ${scriptDir}/pipeline_constants.sh
# Function: Get metadata attribute.
# metadata_file: path to metadata file
# attribute: metadata attribute
get_metadata_attribute() {
val=$(printenv $1)
if [[ -z $val ]]; then
val=$(kubectl get cm wlsoke-metadata-configmap -n jenkins-ns -o jsonpath="{.data.$1}")
fi
echo $val
}
# Function: Get OCIR Username
# ocir_url: OCIR url e.g. phx.ocir.io
# domain_ns: domain namespace
# ocirsecret_name: Name of imagePullSecrets[0] defined in domain.yaml
get_ocir_user() {
local ocir_url=$(get_metadata_attribute 'ocir_url')
local ocirsecret_name=$1
local domain_ns=$(get_metadata_attribute 'wls_domain_namespace')
local auths_json=$(kubectl get secret ${ocirsecret_name} -n ${domain_ns} -o jsonpath="{.data.\.dockerconfigjson}" | base64 -d)
result=$(echo ${auths_json} | python3 ${scriptDir}/domain_builder_utils.py 'get-ocir-user' ${ocir_url})
echo ${result}
}
# Function: Get OCIR Auth Token
# ocir_url: OCIR url e.g. phx.ocir.io
# domain_ns: domain namespace
# ocirsecret_name: Name of imagePullSecrets[0] defined in domain.yaml
get_ocir_auth_token() {
local ocir_url=$(get_metadata_attribute 'ocir_url')
local ocirsecret_name=$1
local domain_ns=$(get_metadata_attribute 'wls_domain_namespace')
local auths_json=$(kubectl get secret ${ocirsecret_name} -n ${domain_ns} -o jsonpath="{.data.\.dockerconfigjson}" | base64 -d)
result=$(echo ${auths_json} | python3 ${scriptDir}/domain_builder_utils.py 'get-ocir-auth-token' ${ocir_url})
echo ${result}
}
# Function: Generate updated domain image tag for WLS-OKE.
# metadata_file: metadata file
# timestamp: build timestamp to be used for image tagging
generate_domain_img_tag() {
local timestamp=$1
local tag=""
# Generate a tag for new domain image
if [[ -n ${DOMAIN_NAME} ]]; then
#BASE_IMAGE: <ocir-url>/<tenancy>/okestack/wls-base-image/14110:14.1.1.0.230117-230117
local ocir_domain_image_repo=$(get_domain_property "${DOMAIN_NAME}" "BASE_IMAGE")
# repo_path=<ocir-url>/<tenancy>/mytest/wls-base-image/12214
local repo_main_path=$(echo ${ocir_domain_image_repo} | cut -d ":" -f 1)
# repo_path=<ocir-url>/mytenancy/myserv/wls-base-image
repo_path=$(dirname ${repo_main_path})
# wls_base_version = 12.2.1.4.191220-200203
local wls_base_version=$2
if [[ -z $wls_base_version ]]; then
wls_base_version=$( echo ${ocir_domain_image_repo} | cut -d ":" -f 2)
fi
# tag = <ocirurl>/mytenancy/myserv/wls-base-image/mydomain/12214:12.2.1.4.191220-200203-<timestamp>
wls_version=$(echo $(echo ${ocir_domain_image_repo} | cut -d ":" -f 1) | cut -d "/" -f5-)
tag="$repo_path/$DOMAIN_NAME/${wls_version}:${wls_base_version}-${timestamp}"
else
wls_base_version=$2
#iad.ocir.io/idivyfxzwa6h/mystack1411/wls-base-image
local ocir_domain_image_repo=$(get_metadata_attribute 'ocir_domain_repo')
# Create Domain Base Image job. This does not act on a given domain.
wls_version=$(echo "${wls_base_version}" | sed 's/\.//g')
#wls_version=14.1.1.0.0, need to exclude the last zero for the base image wls version
tag=$ocir_domain_image_repo"/"${wls_version::5}":"${wls_base_version}-${timestamp}
fi
echo ${tag}
}
# Function: Generate updated domain image tag for Verrazzano.
# metadata_file: metadata file
# timestamp: build timestamp to be used for image tagging
vz_generate_domain_img_tag() {
local timestamp=$1
if [[ $# == 2 ]]; then
DOMAIN_NAME=$2
fi
# Generate a tag for new domain image for Verrazzano
#<ocir-url>/<tenancy>/mytest/wls-base-image:12.2.1.4.210420-210502
local ocir_domain_image_repo=$(get_domain_property "${DOMAIN_NAME}" "BASE_IMAGE")
# repo_path=<ocir-url>/<tenancy>/mytest/wls-base-image/12214
local repo_main_path=$(echo ${ocir_domain_image_repo} | cut -d ":" -f 1)
# repo_path=<ocir-url>/mytenancy/myserv/wls-base-image
repo_path=$(dirname ${repo_main_path})
# wls_base_version = 12.2.1.4.191220-200203
wls_base_version=$( echo ${ocir_domain_image_repo} | cut -d ":" -f 2)
local tag=""
if [[ -n ${DOMAIN_NAME} ]]
then
# tag = <ocirurl>/mytenancy/myserv/wls-base-image/mydomain/12214:12.2.1.4.191220-200203-<timestamp>
wls_version=$(echo $(echo ${ocir_domain_image_repo} | cut -d ":" -f 1) | cut -d "/" -f5-)
tag="$repo_path/$DOMAIN_NAME/$wls_version:${DOMAIN_NAME}-${timestamp}"
else
# Create Domain Base Image job. This does not act on a given domain.
# tag = <ocir-url>/mytenancy/myserv/mydomain/wls-domain-base:12.2.1.4.191220-200203-<timestamp>
tag=$repo_main_path":"${wls_base_version}-${timestamp}
fi
echo ${tag}
}
# Function: Update running domain yaml file.
# metadata_file:
# running_domain_yaml: path to the running domain.yaml file to be updated
# build_timestamp: timestamp used for the new configmap
update_running_domain_yaml() {
local running_domain_yaml=$1
local build_timestamp=$2
local secrets=()
local i=0
for name in `kubectl get secrets -n ${DOMAIN_NS} |grep Opaque|awk '{print $1}'` ;
do
secrets[$i]=$name
i=`expr $i + 1`
done
python3 ${scriptDir}/domain_builder_utils.py 'create-running-domain-yaml' ${running_domain_yaml} ${build_timestamp} "${secrets[@]}"
}
#Function: Check introspector pod status
#domain_ns: domain namespace
#domain_uid: domain UID
#max_wait_time: timeout duration for pods in the domain to come to ready state
check_introspector_status(){
set -x
local domain_ns=$1
local domain_uid=$2
local max_wait_time=$3
local interval=30
local status=2
local consistent_result_count=0
local max_consistency_count=6
local count=0
let max_retry=${max_wait_time}/${interval}
mkdir -p /tmp/intro
cd /tmp/intro
$log Info "0116" $fileName
# Checking the domain status and admin server image is updated with the latest image.
sleep ${interval}s
status_type=$(kubectl get domain ${domain_uid} -n ${domain_ns} -o jsonpath="{..status.conditions[0].type}")
get_admin_server_image=$(kubectl get po ${domain_uid}-${domain_uid}-adminserver -n ${domain_ns} -o jsonpath="{.spec.containers[0].image}")
message=$(kubectl get domain ${domain_uid} -n ${domain_ns} -o jsonpath="{..status.conditions[0].message}")
#Checking and throwing error either status message contains Failed/failed string or status type is Failed.
if [[ ${message} =~ "ailed" || "${status_type}" == "Failed" ]]; then
$log Error "0092" $fileName
((status=-1))
else
# Waiting for Admin pod to come up with the new image.
while [[ "${NEW_DOMAIN_IMAGE}" != "$get_admin_server_image" && ${consistent_result_count} -ne ${max_consistency_count} ]] ; do
((consistent_result_count++))
$log Info "0117" $fileName
sleep ${interval}s
status_type=$(kubectl get domain ${domain_uid} -n ${domain_ns} -o jsonpath="{..status.conditions[0].type}")
get_admin_server_image=$(kubectl get po ${domain_uid}-${domain_uid}-adminserver -n ${domain_ns} -o jsonpath="{.spec.containers[0].image}")
message=$(kubectl get domain ${domain_uid} -n ${domain_ns} -o jsonpath="{..status.conditions[0].message}")
if [[ -n ${message} ]] && [[ "${status_type}" == "Failed" ]]; then
$log Error "0092" $fileName
$log Error "0093" $fileName "${message}"
((status=-1))
break
elif [[ "${status_type}" == "Available" ]] && [[ -z ${message} ]]; then
continue
fi
done
#Waiting for all pods to be updated with the new image.
if [[ $status -eq 2 ]]; then
while [[ $count -lt ${max_retry} ]] ; do
$log Info "0119" $fileName
sleep ${interval}s
# Get the latest status of the servers.
get_admin_server_image=$(kubectl get po ${domain_uid}-${domain_uid}-adminserver -n ${domain_ns} -o jsonpath="{.spec.containers[0].image}")
admin_pod_status=$(kubectl get po ${domain_uid}-${domain_uid}-adminserver -n ${domain_ns} -o jsonpath="{..status.phase}")
get_managed_server_image=$(kubectl get po ${domain_uid}-${domain_uid}-managed-server1 -n ${domain_ns} -o jsonpath="{.spec.containers[0].image}")
managed_server_pod_status=$(kubectl get po ${domain_uid}-${domain_uid}-managed-server1 -n ${domain_ns} -o jsonpath="{..status.phase}")
# Check whether the Admin and Managed Servers are running with the new domain image.
if [[ "${NEW_DOMAIN_IMAGE}" == "$get_admin_server_image" && "$admin_pod_status" == "Running" ]]; then
if [[ "${NEW_DOMAIN_IMAGE}" == "$get_managed_server_image" && "$managed_server_pod_status" == "Running" ]]; then
$log Info "0118" $fileName
status=0
break;
fi
else
((count++))
continue;
fi
done
fi
fi
if [[ ${status} -eq 0 ]]
then
return 0
else
return 1
fi
}
# Function: Count the pods that are in ready state (reads pod JSON from stdin)
check_pods_ready() {
result=$(python3 ${scriptDir}/domain_builder_utils.py 'check-pods-ready')
echo ${result}
}
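# Example usage (pod JSON is piped in from kubectl; namespace is hypothetical):
#   ready_count=$(kubectl get pods -n sample-domain-ns -o json | check_pods_ready)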
# Function: Wait for pods to be ready
# domain_yaml: domain yaml
# max_wait_time: timeout duration for pods in the domain to come to ready state
# domain_ns: domain namespace
# num_pods_to_run: number of pods expected to be running in the domain
wait_for_pods() {
set +x
local domain_yaml=$1
local max_wait_time=$2
local domain_ns=$3
local num_pods_to_run=$4
local count=0
local interval=30
let max_retry=${max_wait_time}/${interval}
# Ensure the check_pods_ready count remains consistent over 4*30secs = 2mins
# This is needed for the case when we are doing rolling restart of the domain (default case)
#
local consistent_result_count=0
local max_consistency_count=4
echo "Waiting for domain server pods [$num_pods_to_run] to be ready (max_retries: $max_retry at interval: $interval seconds) ..."
local START=$(date +%s)
count_pods_ready=$(kubectl get pods -n ${domain_ns} -o json | check_pods_ready)
while [[ ( ${count_pods_ready} -ne ${num_pods_to_run} || ${consistent_result_count} -ne ${max_consistency_count} ) && $count -lt ${max_retry} ]] ; do
sleep ${interval}s
count_pods_ready=$(kubectl get pods -n ${domain_ns} -o json | check_pods_ready)
# If all pods are ready, confirm the result stays consistent for some time
while [[ ${consistent_result_count} -ne ${max_consistency_count} ]] && [[ ${count_pods_ready} -eq ${num_pods_to_run} ]] ; do
let consistent_result_count=consistent_result_count+1
echo "Consistent result count: ${consistent_result_count}"
echo "[$count_pods_ready of $num_pods_to_run] are ready"
sleep ${interval}s
count_pods_ready=$(kubectl get pods -n ${domain_ns} -o json | check_pods_ready)
done
if [[ ${consistent_result_count} -eq ${max_consistency_count} ]] && [[ ${count_pods_ready} -eq ${num_pods_to_run} ]] ; then
break
else
let consistent_result_count=0
fi
let count=count+1
done
echo "Exiting wait_for_pods: [$count_pods_ready of $num_pods_to_run] are ready"
echo "consistent_result_count: [$consistent_result_count] of [$max_consistency_count]"
echo "retries: [$count] of [$max_retry]"
local END=$(date +%s)
echo "Domain startup took:"
echo $((END-START)) | awk '{print int($1/60)"m:"int($1%60)"s"}'
# Restore xtrace before returning (the original trailing set -x was unreachable after return)
set -x
if [[ ${count_pods_ready} -eq ${num_pods_to_run} ]]
then
return 0
else
return 1
fi
}
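# Example usage (hypothetical values: 2-hour timeout, 3 expected pods):
#   wait_for_pods /u01/shared/weblogic-domains/sample-domain/domain.yaml 7200 sample-domain-ns 3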
# Function: Login to OCIR.
# ocir_user: OCIR username
# ocir_auth_token: OCIR user auth token
# Note: the OCIR URL is read from the provisioning metadata ('ocir_url' attribute), not passed as a parameter.
ocir_login() {
set +x
local ocir_url=$(get_metadata_attribute 'ocir_url')
local ocir_user=$1
local ocir_auth_token=$2
$log Info "0120" $fileName "$ocir_url" "$ocir_user"
echo ${ocir_auth_token} | docker login ${ocir_url} --username ${ocir_user} --password-stdin
exit_code=$?
if [[ $exit_code -ne 0 ]]; then
$log Error "0094" $fileName "$ocir_url"
exit 1
fi
}
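# Example usage (hypothetical values; pass the auth token via a variable, never hardcode it):
#   ocir_login 'mytenancy/user@example.com' "${OCIR_AUTH_TOKEN}"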
# Function: Select the Oracle Linux OS base image matching the requested version
# linux_version: version string; the version is taken from the second '-'-separated field (e.g. "oraclelinux-7.9" -> "7.9")
os_base_image() {
set +x
local version=$(echo ${1} | cut -d '-' -f2)
local result
# Word-split `docker images` output; an "oraclelinux" repository name is immediately followed by its tag.
images=($(docker images | tail -n +2))
for ((x=0;x<${#images[@]};x++)); do
if [[ ${images[$x]} =~ "oraclelinux" ]]; then
name="${images[$x]}"
((x++))
if [[ ${images[$x]} =~ $version ]]; then
result=$name:"${images[$x]}"
break
fi
fi
done
echo ${result}
}
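# Example usage (hypothetical value; echoes e.g. "oraclelinux:7.9" when such an image exists locally):
#   base_image=$(os_base_image "oraclelinux-7.9")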
#
# Uploads image to ocir repo
# param: ocir_image_tag
# param: exit code to return for error
function ocir_image_upload() {
set -x
ocir_image_tag=$1
return_exit_code=$2
if [[ -z ${return_exit_code} ]]; then
return_exit_code=1
fi
python3 /u01/shared/scripts/pipeline/clogging/shellLogging.py Info "0016" $fileName $ocir_image_tag
# Push the Docker image to a compartment-level repo instead of the root compartment
repo_path=$(echo "${ocir_image_tag}" | cut -d ":" -f 1 | cut -d "/" -f3-)
# Create the repo at compartment level to push the pipeline images
compartment_id=$(kubectl get cm wlsoke-metadata-configmap -n jenkins-ns -o jsonpath="{.data.oke_cluster_compartment_id}")
python3 ${pipeline_common}/create_repo.py "$compartment_id" "$repo_path"
exit_code=$?
if [[ $exit_code -ne 0 ]]; then
$log Error "0150" $fileName "$repo_path" "$compartment_id" $exit_code
exit 2
fi
cmd_output=$(docker push $ocir_image_tag 2>&1)
exit_code=$?
echo "${cmd_output}"
if [[ ${exit_code} -ne 0 ]]; then
docker images
python3 /u01/shared/scripts/pipeline/clogging/shellLogging.py Error "0021" $fileName $ocir_image_tag $exit_code
exit $return_exit_code
fi
}
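# Example usage (hypothetical image tag; exit code 5 is returned to the caller on push failure):
#   ocir_image_upload "phx.ocir.io/mytenancy/sample-domain:23-01-01_00-00-00" 5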
# Function: Validate the Domain is running and server PODs are in RUNNING state.
validate_running_domain() {
set -x
local running_domain_yaml=/tmp/running-domain-${BUILD_TIMESTAMP}.yaml
local is_idcs_selected=$(get_domain_property $DOMAIN_NAME IS_IDCS_SELECTED)
# Get running domain yaml
kubectl get domain ${DOMAIN_NAME} -n ${DOMAIN_NS} -o yaml > ${running_domain_yaml}
exit_code=$?
if [[ $exit_code -ne 0 ]]; then
$log Error "0089" $fileName "${DOMAIN_NAME}" "${DOMAIN_NS}"
exit_with_cleanup 7
fi
# Get replica count in domain yaml
local replica_count=$(python3 /u01/shared/scripts/pipeline/common/domain_builder_utils.py 'get-replica-count' ${running_domain_yaml})
if [[ ${is_idcs_selected} == "YES" ]]; then
let num_pods_to_run=replica_count+2
else
let num_pods_to_run=replica_count+1
fi
# Max wait time 120 mins for pods to be ready
let max_wait_time=120*60
wait_for_pods ${running_domain_yaml} ${max_wait_time} ${DOMAIN_NS} ${num_pods_to_run}
if [[ $? -ne 0 ]]; then
$log Error "0090" $fileName
exit_with_cleanup 7
else
$log Info "0111" $fileName
fi
set +x
}
# Function: Deploy updated domain image to the running domain.
deploy_domain_img() {
set -x
local tag=${NEW_DOMAIN_IMAGE}
# Apply the domain image to running domain if publish is selected
$log Info "0108" $fileName "$tag"
local running_domain_yaml=/tmp/running-domain-${BUILD_TIMESTAMP}.yaml
# Get running domain yaml
kubectl get domain ${DOMAIN_UID} -n ${DOMAIN_NS} -o yaml > ${running_domain_yaml}
exit_code=$?
if [[ $exit_code -ne 0 ]]; then
$log Error "0085" $fileName "${DOMAIN_UID}" "${DOMAIN_NS}"
exit_with_cleanup 3
fi
# Back it up
mkdir -p /u01/shared/weblogic-domains/${DOMAIN_UID}/backups/${BUILD_TIMESTAMP}
cp ${running_domain_yaml} /u01/shared/weblogic-domains/${DOMAIN_UID}/backups/${BUILD_TIMESTAMP}/prev-domain.yaml
# Update domain.yaml with the new image and secrets.
update_running_domain_yaml ${running_domain_yaml} ${tag}
# Apply the changes
kubectl apply -f ${running_domain_yaml}
exit_code=$?
if [[ $exit_code -ne 0 ]]
then
exit_with_cleanup 4
fi
# Replace the domain.yaml file in shared filesystem
cp ${running_domain_yaml} /u01/shared/weblogic-domains/${DOMAIN_UID}/domain.yaml
# Back up the new domain.yaml in backup directory
cp ${running_domain_yaml} /u01/shared/weblogic-domains/${DOMAIN_UID}/backups/${BUILD_TIMESTAMP}/domain.yaml
set +x
}
# Function: Validate if the introspector has completed successfully
validate_introspector() {
set -x
let max_wait_time=5*60
local is_apply_jrf=$(get_domain_property $DOMAIN_NAME IS_APPLY_JRF)
if [[ ${is_apply_jrf} == "true" ]]
then
let max_wait_time=15*60
fi
check_introspector_status ${DOMAIN_NS} ${DOMAIN_NAME} ${max_wait_time}
if [[ $? -ne 0 ]]; then
$log Error "0088" $fileName "${DOMAIN_NAME}"
cp /u01/shared/weblogic-domains/${DOMAIN_NAME}/backups/${BUILD_TIMESTAMP}/prev-domain.yaml /u01/shared/weblogic-domains/${DOMAIN_NAME}/domain.yaml
exit_with_cleanup 6
else
$log Info "0110" $fileName
fi
set +x
}
# Function: Rollback domain
rollback_domain() {
set -x
local running_domain_yaml=/tmp/running-domain-${BUILD_TIMESTAMP}.yaml
local prev_domain_yaml=/u01/shared/weblogic-domains/${DOMAIN_NAME}/backups/${BUILD_TIMESTAMP}/prev-domain.yaml
if [[ -f ${prev_domain_yaml} ]]; then
#Get running domain yaml
kubectl get domain ${DOMAIN_NAME} -n ${DOMAIN_NS} -o yaml > ${running_domain_yaml}
exit_code=$?
if [[ $exit_code -eq 0 ]]; then
old_mii_image=$(sed -n '/image:/p' ${prev_domain_yaml} | cut -d':' -f2- | sed 's/^ *//;s/ *$//')
cp ${running_domain_yaml} ${prev_domain_yaml}
# Update domain.yaml with the old mii image (the capture group already ends in a space)
sed -i -e "s|\(image: \).*|\1\"${old_mii_image}\"|g" ${prev_domain_yaml}
$log Info "0112" $fileName "${prev_domain_yaml}" "${old_mii_image}"
cat ${prev_domain_yaml}
# Apply the domain yaml with old domain image
kubectl apply -f ${prev_domain_yaml}
else
# If no running domain exists (which should not happen), apply the file that was backed up when the new domain image was applied
$log Error "0091" $fileName "${DOMAIN_UID}" "${DOMAIN_NS}" "${prev_domain_yaml}"
# Apply the domain yaml backed up when the image was applied.
kubectl apply --force -f ${prev_domain_yaml}
fi
validate_running_domain
# Replace current domain.yaml with most current domain
kubectl get domain ${DOMAIN_UID} -n ${DOMAIN_NS} -o yaml > /u01/shared/weblogic-domains/${DOMAIN_UID}/domain.yaml
else
$log Info "0113" $fileName
fi
# Cleanup the earlier image from OCIR repo. TODO: automate this step
$log Info "0114" $fileName "${NEW_DOMAIN_IMAGE}"
set +x
}
# Function: Rollback to specified domain image.
# Param: rollback_to_image - domain image to rollback to.
rollback_domain_to_image() {
set -x
local rollback_to_image=$1
timestamp=$(date +"%y-%m-%d_%H-%M-%S")
local running_domain_yaml=/tmp/running-domain-${timestamp}.yaml
# Get running domain yaml
kubectl get domain ${DOMAIN_UID} -n ${DOMAIN_NS} -o yaml > ${running_domain_yaml}
exit_code=$?
if [[ $exit_code -ne 0 ]]; then
$log Error "0089" $fileName "${DOMAIN_UID}" "${DOMAIN_NS}"
exit_with_cleanup 2
fi
# Update domain.yaml with the old domain image (the capture group already ends in a space)
sed -i -e "s|\(image: \).*|\1\"${rollback_to_image}\"|g" ${running_domain_yaml}
$log Info "0112" $fileName "${running_domain_yaml}" "${rollback_to_image}"
cat ${running_domain_yaml}
# Apply the domain yaml with old domain image
kubectl apply -f ${running_domain_yaml}
exit_code=$?
if [[ $exit_code -ne 0 ]]
then
exit_with_cleanup 3
fi
validate_running_domain
# Remove the temp running domain yaml
rm -f ${running_domain_yaml}
set +x
}
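# Example usage (hypothetical image tag):
#   rollback_domain_to_image "phx.ocir.io/mytenancy/sample-domain:previous-build"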
# Copy this function here because this might be called from auto-patching flow where pipeline_common.sh is not sourced.
exit_with_cleanup() {
set -x
local exit_code=$1
# Move out of the build context directory so it can be deleted (cd "$(pwd)" was a no-op)
cd /tmp
# Remove the temp domain image build directory
rm -rf /tmp/deploy-apps
# Temp domain yaml files created
rm -f /tmp/test-domain-${BUILD_TIMESTAMP}.yaml
rm -f /tmp/running-domain-${BUILD_TIMESTAMP}.yaml
# Remove patching files created
rm -f /tmp/apply_opatches.log
rm -f /tmp/finalbuild.txt
rm -f /tmp/oraInst.loc
rm -f /tmp/opatch_updated_tag.txt
# Scan image cleanup
# docker stop ${BUILD_TIMESTAMP}-clair
# docker stop ${BUILD_TIMESTAMP}-db
# rm -f /tmp/clair-${BUILD_TIMESTAMP}.log
set +x
exit ${exit_code}
}