12 Troubleshooting Your BRM Cloud Native Deployment

Learn how to solve problems that may occur in your Oracle Communications Billing and Revenue Management (BRM) cloud native system.

Topics in this document:

  • Problems with the Helm Installation

  • Helm Installation Fails with Time-Out Error

  • BRM Cloud Native Deployment Out of Memory Errors

  • PDC Messages Stuck in Rating Engine Queues

  • PDC Interceptor Pod Starts But Goes into an Error State

  • Earlier Cloud Native Releases with WebLogic Kubernetes Operator 3.0.0

Problems with the Helm Installation

If a Helm installation encounters errors, such as an incorrect namespace, follow these steps to return your system to a state where you can fix the issue and perform a new installation.

Note:

For more information about Kubernetes commands, see "kubectl Cheat Sheet" in the Kubernetes documentation.

  1. Check the state of the deployment:

    kubectl get pods -o wide -n NameSpace

    To see information about a specific pod:

    kubectl describe pod PodName -n NameSpace
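
    You can also inspect a failing pod's logs, including logs from a previous restart of the container:

    kubectl logs PodName -n NameSpace
    kubectl logs PodName -n NameSpace --previous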
  2. Use the helm rollback command to go back to a previous revision of the chart, or use the helm uninstall command to uninstall the chart. See "Rolling Back A Release To A Previous Revision" in BRM Cloud Native System Administrator’s Guide, or see "Helm Uninstall" in the Helm documentation.
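
    For example (the revision number here is a placeholder; run helm history to list the actual revisions of your release):

    helm history BrmReleaseName -n BrmNameSpace
    helm rollback BrmReleaseName 1 -n BrmNameSpace
    helm uninstall BrmReleaseName -n BrmNameSpace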

  3. If neither rolling back nor uninstalling the chart is successful, do the following to identify the Kubernetes resources that did not install correctly and then delete them:

    • Check for stateful set components remaining in the cluster and delete them:

      kubectl get sts

      If you identify a stateful set that you want to delete, scale the number of replicas:

      kubectl scale statefulsets StatefulSetName --replicas=n

      where StatefulSetName is the name of a stateful set, and n is the number of replicas you are scaling to. For more information, see "Scale a StatefulSet" in the Kubernetes documentation.

      Then, delete the stateful set:

      kubectl delete sts StatefulSetName

      You can run kubectl get sts again to verify the deletions.
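
      For example, to remove a hypothetical stateful set named brm-cm (your stateful set names will differ):

      kubectl scale statefulsets brm-cm --replicas=0
      kubectl delete sts brm-cm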

    • If you need to clean up Apache Kafka and Apache ZooKeeper, scale to 0 and then delete:

      kubectl scale sts/kafka_pod --replicas=0
      kubectl scale sts/zookeeper_pod --replicas=0
      kubectl get pods
      kubectl get sts
      kubectl delete sts kafka_pod zookeeper_pod
    • If necessary, check for any PVC, Secret, ConfigMap, or Service that was created by the deployment. If the output from any of the following commands shows something that you want to clean up, use kubectl delete to remove it.

      For example:

      kubectl get pvc --all-namespaces
      kubectl delete pvc PVCName
      
      kubectl get secrets --all-namespaces
      kubectl delete secret SecretName
      
      kubectl get configmap --all-namespaces
      kubectl delete configmap ConfigMapName
      
      kubectl get svc --all-namespaces
      kubectl delete svc SVC1 SVC2 
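
      Alternatively, if the chart labels its resources with the common Helm label app.kubernetes.io/instance (an assumption; verify with kubectl get pvc --show-labels), you can remove several resource types in one pass:

      kubectl delete pvc,secret,configmap,svc -l app.kubernetes.io/instance=BrmReleaseName -n BrmNameSpace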

Helm Installation Fails with Time-Out Error

After you deploy a Helm chart, you may receive the following error message indicating that the Helm chart installation failed:

Error: failed post-install: timed out waiting for the condition

This occurs because a post-installation job took longer than Helm's default timeout of five minutes to complete.
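
To identify the post-install job that timed out, you can list the jobs in the namespace and inspect the slow one; the job name here is a placeholder:

kubectl get jobs -n BrmNameSpace
kubectl describe job JobName -n BrmNameSpace
kubectl logs job/JobName -n BrmNameSpace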

To resolve the issue:

  1. Uninstall your Helm release:

    helm uninstall BrmReleaseName -n BrmNameSpace

    This removes all resources associated with the release.

  2. Run the Helm install command again.

If that does not fix the problem, increase the amount of time Helm waits for a command to complete by including the --timeout Duration argument with the helm install command. For example, to set the timeout duration to 10 minutes, you would enter this command:

helm install BrmReleaseName oc-cn-helm-chart --namespace BrmNameSpace --timeout 10m --values OverrideValuesFile

BRM Cloud Native Deployment Out of Memory Errors

After you deploy BRM cloud native, you may receive an error message similar to the following:

ERROR: cm_cache_heap_malloc: name="fm_bparams_cache" - out of memory, size requested=2216,high val=960
cm_cache_flist: PIN_ERR_NO_MEM:requested=2216, used=121456, allocated=122880, chunk=30, cache name="fm_bparams_cache"

To resolve the issue:

  1. In your oc-cn-helm-chart directory, open your CM ConfigMap file (configmap_pin_conf_cm.yaml).

  2. Add the following fm_bparams_cache entry to the file:

    - cm_cache fm_bparams_cache 40,245760,23
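
    The three values in a cm_cache entry set the number of cache entries, the cache size in bytes, and the hash size. In this example, the 245760-byte size is double the 122880 bytes that the error message reports as already allocated.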
  3. Save and close the file.

  4. Run the helm upgrade command for oc-cn-helm-chart:

    helm upgrade BrmReleaseName oc-cn-helm-chart --values OverrideValuesFile -n BrmNameSpace
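
To confirm that the new cache size took effect, you can watch the CM pod restart and then check its log for the error; the pod name varies by deployment:

kubectl get pods -n BrmNameSpace
kubectl logs CmPodName -n BrmNameSpace | grep fm_bparams_cache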

PDC Messages Stuck in Rating Engine Queues

Occasionally, PDC messages and changesets may become stuck in the rating engine queues.

To resolve the issue, delete both the real-time rating engine (RRE) and batch rating engine (BRE) pods. For each pod, run the following command:

kubectl -n BrmNameSpace delete pod PdcPodName

where:

  • BrmNameSpace is the namespace in which the BRM Kubernetes objects reside.

  • PdcPodName is the name of the pod.
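
For example, assuming hypothetical pod names pdc-bre-0 and pdc-rre-0 (run kubectl get pods to find the actual names in your deployment):

kubectl get pods -n BrmNameSpace
kubectl -n BrmNameSpace delete pod pdc-bre-0 pdc-rre-0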

Kubernetes automatically restarts the deleted pods, which restarts the rating engines. Messages should start flowing again.

PDC Interceptor Pod Starts But Goes into an Error State

After you deploy PDC, the Interceptor pod may start but immediately transition to an error state.

This may occur because the RCU prefix is configured incorrectly. To find out if this is the case, run the following command:

kubectl describe domain DomainName -n NameSpace

If the issue is related to the RCU prefix, you will see something similar to the following:

WLSDPLY-12409: createDomain failed to create the domain: Failed to get FMW infrastructure database defaults from the service table : Got exception when auto configuring the schema component(s) with data obtained from shadow table:
Failed to build JDBC Connection object:

To resolve this issue:

  1. Make sure that the RCU prefix was configured successfully as part of oc-cn-op-job-helm-chart. To check, run the following command:

    kubectl get pod -n BrmNameSpace

    If the prefix is configured correctly, the pdc-configure-rcu-xxxxx pod shows a Completed status.

  2. Make sure that the RCU prefix and password configured in the override-values.yaml files for oc-cn-helm-chart and oc-cn-op-job-helm-chart match, and that a valid host name, port, and service name are configured in the values.yaml file.

If the values are not configured properly, do the following:

  1. Uninstall PDC by setting the ocpdc.isEnabled key to false in your override-values.yaml file for oc-cn-helm-chart.

  2. Run the Helm upgrade command for oc-cn-helm-chart.

    Wait until the PDC pods have stopped.

  3. In the override-values.yaml file for oc-cn-helm-chart and oc-cn-op-job-helm-chart, configure the ocpdc.configEnv.rcuPrefix key and set the ocpdc.isEnabled key to true.

  4. Run the Helm upgrade command for oc-cn-helm-chart and oc-cn-op-job-helm-chart.

    Wait until the PDC pods are in Running status.
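
For reference, the relevant keys in the override-values.yaml file look something like this, where DEV1 is a hypothetical RCU prefix:

ocpdc:
  isEnabled: true
  configEnv:
    rcuPrefix: DEV1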

For more information about troubleshooting pod errors, see "Troubleshooting" in Oracle WebLogic Kubernetes Operator Samples.

Earlier Cloud Native Releases with WebLogic Kubernetes Operator 3.0.0

If you are installing a version of BRM cloud native released prior to BRM 12.0 Patch Set 3 with Interim Patch 31848465, follow these steps to use WebLogic Kubernetes Operator 3.0.0 with your release.

To use WebLogic Kubernetes Operator 3.0.0 with an earlier version of BRM cloud native:

  1. On the server on which you are installing BRM cloud native, open an SSH session.

  2. Go to the oc-cn-helm-chart directory.

  3. Find all YAML files that set the apiVersion key to weblogic.oracle/v6:

    [USERNAME@HOSTNAME oc-cn-helm-chart]# find . -type f -exec grep -l "weblogic.oracle\/v6" {} \;
    ./templates/domain_boc.yaml
    ./templates/domain_brm_wsm.yaml
    ./templates/domain_bcws.yaml
    ./templates/domain_billingcare.yaml
  4. In each YAML file, replace instances of "weblogic.oracle/v6" with "weblogic.oracle/v8". For example:

    apiVersion: "weblogic.oracle/v8"
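
    If you prefer to script the replacement, you can combine find with sed; back up the files first, because this shortcut is a convenience rather than part of the documented procedure:

    find . -type f -name "*.yaml" -exec sed -i 's/weblogic.oracle\/v6/weblogic.oracle\/v8/g' {} \;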