Troubleshooting Policy

4 Troubleshooting Policy

This chapter provides information to troubleshoot the common errors which can be encountered during the preinstall, installation, upgrade, and rollback procedures of Policy.

Note:

The performance and capacity of the Policy system may vary based on the Call model, Feature/Interface configuration, underlying CNE and hardware environment, including but not limited to the complexity of deployed policies, policy table size , object expression and custom json usage in policy design.

4.1 Deployment Related Issues

This section describes the most common deployment related issues and their resolution steps. It is recommended to perform the resolution steps provided in this guide. If the issue still persists, then contact Oracle Support.

4.1.1 Helm Install Failure

If helm install command Fails

This section covers the reasons and troubleshooting procedures if the helm install command fails.

Reasons for helm install failure:

Chart syntax issue [This issue could be shown in the few seconds]
Please resolve the chart specific things and rerun the helm install command, because in this case, no hooks should have begun.
Most possible reason [TIMEOUT]
If any job stuck in a pending/error state and unable to run, it will result in the timeout after 5 minutes. As default timeout for helm command is "5 minutes". In this case, we have to follow the below steps to troubleshoot.
helm install command failed in case of duplicated chart
```
helm install /home/cloud-user/pcf_1.6.1/sprint3.1/ocpcf-1.6.1-sprint.3.1.tgz --name ocpcf2 --namespace ocpcf2 -f <custom-value-file>
```
Error: release ocpcf2 failed: configmaps "perfinfo-config-ocpcf2" already exists
Here, configmap 'perfinfo-config-ocpcf2' exists multiple times, while creating Kubernetes objects after pre-upgrade hooks, this will be failed. In this case also please go through the below troubleshooting steps.
Troubleshooting steps:
1. Check from describe/logs of failure pods and fix them accordingly. You need to verify what went wrong on the installation of the Policy by checking the below points:
  For the PODs which were not started, run the following command to check the failed pods:
```
kubectl describe pod <pod-name>  -n <release-namespace>
```
  For the PODs which were started but failed to come into "READY"state, run the following command to check the failed pods:
```
kubectl describe logs <pod-name>  -n <release-namespace>
```
2. Run the below command to get kubernetes objects:
```
kubectl get all -n <release_namespace>
```
  This gives a detailed overview of which objects are stuck or in a failed state.
3. Run the below command to delete all kubernetes objects:
```
kubectl delete all --all -n <release_namespace>
```
4. Run the below command to delete all current configmaps:
```
kubectl delete cm --all -n <release-namespace>
```
5. Run the below command to cleanup the databases created by the helm install command and create the database again:
```
DROP DATABASE IF EXISTS occnp_audit_service;
DROP DATABASE IF EXISTS occnp_config_server;
DROP DATABASE IF EXISTS occnp_pcf_am;
DROP DATABASE IF EXISTS occnp_pcf_sm;
DROP DATABASE IF EXISTS occnp_pcrf_core;
DROP DATABASE IF EXISTS occnp_release;
DROP DATABASE IF EXISTS occnp_binding;
DROP DATABASE IF EXISTS occnp_policyds;
DROP DATABASE IF EXISTS occnp_pcf_ue;
DROP DATABASE IF EXISTS occnp_commonconfig;
CREATE DATABASE IF NOT EXISTS occnp_audit_service;
CREATE DATABASE IF NOT EXISTS occnp_config_server;
CREATE DATABASE IF NOT EXISTS occnp_pcf_am;
CREATE DATABASE IF NOT EXISTS occnp_pcf_sm;
CREATE DATABASE IF NOT EXISTS occnp_pcrf_core;
CREATE DATABASE IF NOT EXISTS occnp_release;
CREATE DATABASE IF NOT EXISTS occnp_binding;
CREATE DATABASE IF NOT EXISTS occnp_policyds;
CREATE DATABASE IF NOT EXISTS occnp_pcf_ue;
CREATE DATABASE IF NOT EXISTS occnp_commonconfig; 
```
  In addition, clean up the entries in "mysql.ndb_replication" table by running the following command:
```
DROP TABLE IF EXISTS mysql.ndb_replication;
```
6. Run the following command :
  - For Helm2:
```
helm ls --all
```
  - For Helm3:
```
helm3 ls -n <release-namespace>
```
  If this is in a failed state, please purge the namespace using the following command:
```
helm delete --purge <release_namespace>
```
  Once the purge command is succeeded, press "ctrl+c" to stop the above script.
  Note:
  If the command is taking more time, run the following command in another session to clear all the delete jobs.
```
while true; do kubectl delete jobs --all -n <release_namespace>; sleep 5;done
```
7. After the database cleanup and creation of the database again, run the helm install command.
Policy upgrade fails due to Helm upgrade failure during post-upgrade job for nrf-client-nfdiscovery

Helm upgrade can fail due to an exception in deleting the older release entry from common_configuration table for nrf-client-nfdiscovery service.
Workaround:
- Retry the upgrade, which will delete the older version's configuration enabling upgrade to go through.
- If the retry fails, manually delete the older version entries from common_configuration table and retry the upgrade. This can bring up the services with newer version's configuration data.

If helm install command fails due to atomic and timeout options

The helm install command fails as the external-ip allocation (Loadbalancer) fails for Diameter Gateway, Ingress Gateway, and Configuration Management service as they are of the type loadbalancer.

Reason: The primary reason for this problem is availability of limited infrastructure due to which floating IPs may not be available. It may also happen due to the sytem taking more time to assign floating IPs, as a result of which charts purge.

Solution: To resolve this issue, user may either skip --atomic keyword from the helm install command or set a higher timeout value.

4.1.2 Configuration Issue where mysql-username had an Extra Line

Symptom

No suitable driver found for jdbc

Problem

Secret files contain the user id and password for the MySQL. User ID and password inside the secret file shall be base64 encoded. During base64 encoding, if a new line is present in the user id and password – the line is also encoded and may cause issues when they are decoded back.

Resolution Steps

To resolve this issue, perform the following steps:

Get the secret file created by customer.
Fetch the encoded MySQL username and password.
Go to https://www.base64decode.org/.
Give the username and password and click decode.
Verify if the extra line is present in the username and password. If present, remove the extra line.
Decode it again.

4.1.3 App Info Worker Time Out

Problem

PCF appinfo pod is stuck in restarting with the following log:

[CRITICAL] WORKER TIMEOUT

The appinfo process has a HTTP server (gunicorn) and a few worker processes. The request comes to the gunicorn process, then the worker processes handle the request. If the worker does not return in 30 seconds, then gunicorn prints "WORKER TIMEOUT" error, and kills the worker. From the log, it appears that the worker processes are stuck somewhere.

Troubleshooting steps:

Change the appinfo deployment, increase the liveness threshold value from 3 to a higher value. By doing so, appinfo is not impacted by readiness check.
Watch the log of appinfo to check whether the problem still exists.
If the problem still exists, then we need to find out why the worker process is stuck. Run the following command to get into appinfo pod:
```
kubectl -n <pcf namespace> exec -it <pod name> /bin/bash
```

Create a temporary python file:

cat  > xxx_test.py

import pdb
import appinfo

pdb.set_trace()
appinfo.app.run(port=9999)

Run the following command to run this temporary python file
```
python3 xxx_test.py
```
It launches a python debugger, type "continue" to run the app.
Open another terminal, run the following command:
```
kubectl -n <pcf namespace> exec -it <pod name> /bin/bash
```
Then, run the following command to check whether this temporary service can return immediately:
```
curl localhost:9999/v1/readiness
```
If curl gets stuck, then we have reproduced the problem. Now in the python debugger, type "ctrl+C", and you should be able to get the stack trace that indicates the problem.

4.1.4 Startup Probes

To increase the application's reliability and availability, startup probes are introduced in Policy. Consider a scenario where the configuration is not loaded or partially loaded but the service goes into a ready state. This may result in different pods showing different behaviour for the same service. With the introduction of startup probe, the readiness and liveness checks for a pod are not initiated until the configuration is loaded completely and startup probe is successful. However, if the startup probe fails, the container restarts.

To check the status of startup probe or investigate the reason of failing, perform the following steps:

kubectl exec -it podname -n namespace -- bash
curl -kv http://localhost:<monitoring-port>/<startup-probe-url>

Example:

kubectl exec -it test-pcrf-core-797cf5997-2zlgf -- curl -kv http://localhost:9000/actuator/health/startup

The sample output can be as follow:

[cloud-user@bastion-1 ~]$ 
*   Trying ::1...
* TCP_NODELAY set
* connect to ::1 port 9000 failed: Connection refused
*   Trying 127.0.0.1...
* TCP_NODELAY set
* connect to 127.0.0.1 port 9000 failed: Connection refused
* Failed to connect to localhost port 9000: Connection refused
* Closing connection 0
curl: (7) Failed to connect to localhost port 9000: Connection refused
command terminated with exit code 7
[cloud-user@bastion-1 ~]$ k exec -it test-pcrf-core-797cf5997-2zlgf -- curl -kv http://localhost:9000/actuator/health/startup
*   Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 9000 (#0)
> GET /actuator/health/startup HTTP/1.1
> Host: localhost:9000
> User-Agent: curl/7.61.1
> Accept: */*
>
< HTTP/1.1 503 Service Unavailable
< Date: Thu, 21 Apr 2022 11:18:03 GMT
< Content-Type: application/json;charset=utf-8
< Transfer-Encoding: chunked
< Server: Jetty(9.4.43.v20210629)
<
* Connection #0 to host localhost left intact
{"status":"DOWN"}[cloud-user@bastion-1 ~]$ k exec -it test-pcrf-core-797cf5997-2zlgf -- curl -kv http://localhost:9000/actuator/health/startup
*   Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 9000 (#0)
> GET /actuator/health/startup HTTP/1.1
> Host: localhost:9000
> User-Agent: curl/7.61.1
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Thu, 21 Apr 2022 11:18:04 GMT
< Content-Type: application/json;charset=utf-8
< Transfer-Encoding: chunked
< Server: Jetty(9.4.43.v20210629)
<
* Connection #0 to host localhost left intact
{"status":"UP"}[cloud-user@bastion-1 ~]$

To check why the startup probe failed, describe the output:
```
Describe output:

 Warning  Unhealthy  <invalid> (x10 over 2m45s)  kubelet            Startup probe failed: Get "http://10.233.81.231:9000/actuator/health/startup": dial tcp 10.233.81.231:9000: connect: connection refused
```
The following could be the possible reasons for startup probe failure:
- Network connectivity issue
- Database connection issue due to which server is not coming up
- Due to any other exception
If the reason for startup probe failure is not clear, check the logs to determine if it is due to an issue with config-server connection or any issue with fetching configurations from the config-server.

4.1.5 Monitoring of Diameter Gateway worker nodes failure

Symptom

When Diameter Gateway node fails, new replicas are not created in a different worker node.

Problem

On the Diameter Gateway, if the worker node is being shutdown, it is set to "Terminating" state. The diameter gatway pods are statefulsets, due to which new pods are not created until the original pod dies. While in similar scenario new worker nodes are spun for replicasets. The pod has to be forced killed using the --force option.

Resolution

For Diameter Gateway, set terminationGracePeriodSeconds to 0s. This is done by configuring the occnp-custom-values.yaml file.

Example:


diam-gateway:
  # Graceful Termination
  gracefulShutdown: 
    gracePeriod:0s

Create an alert that gets triggered when a node is down. Do modify the oid and name as per customer deployment if needed.

Example:


name: NODE_UNAVAILABLE
expr: kube_node_status_condition{condition="Ready",status="true"}== 0
for: 30s
labels:
oid: XXXXXX
severity: critical
annotations:
description: Kubernetes node {{ $labels.node }} is not in Ready state
summary: Kubernetes node {{ $labels.node }} is unavailable {code}

4.2 Database Related Issues

This section describes the most common database related issues and their resolution steps. It is recommended to perform the resolution steps provided in this guide. If the issue still persists, then contact Oracle Support.

4.2.1 Policy MySQL DB Access

Problem

Keyword - wait-for-db

Tags - "config-server" "database" "readiness" "init" "SQLException" "access denied"

Because of database accessibility issues from the Policy service, pods will stay in the init state.

For some pods, if they come up, they will be kept on getting the exception : " Cannot connect to database server java.sql.SQLException"

Reasons:

MySQL host IP address OR MySQL-service name[in case of occne-infra] is not correctly given.
Few MySQL nodes are probably down.
Username/Password given in the secrets are not created in the database OR not having proper grant/access to service databases.
Databases are not created correctly with the same name mentioned in the custom_value file while installing Policy. - MOST LIKELY

Resolution Steps

To resolve this issue, perform the following steps:

Check if the database IP is proper and pingable from worker nodes of the Kubernetes cluster. Update the database IP and service accordingly. If required, you can use floating IP as well. If the database connectivity issue is there, then please update the proper IP address.
In the case of the CNE infrastructure, instead of mentioning IP address for MySQL connection, please use FQDN for mysql-connectivity-service to connect to the database.
Manually log in to MySQL via the same database IP mentioned in a custom-value file. In case of MySQL service name, describe the service by command :
```
kubectl describe svc <mysql-servicename> -n <namespace> 
```
and login to the MySQL database with all sets of IPs described in the MySQL service, If any SQL node is down, it will lead to an intermittent DB query failure issue. So make sure that you can log in to MySQL from all the Nodes mentioned in the IP list of MySQL-service describe command.
Make sure that all the MySQL nodes are up and running before installing the Policy.
Check the existing user list into the database using SQL query: "select user from mysql.user;"
Check if all the mentioned users in the custom-value of Policy installation are present in the database.

Note:
Create the user with proper password as mentioned in the secret file of the Policy.
Check the grants of all the users mentioned into the custom_value file by SQL query: "show grants for <username>;"
If username/password issue is there, then please correctly create the user with the required password and provide grants as per the installation guide.
Check the databases are created with the same name mentioned in the custom_value file for the services.

Note:
Create the database as per the custom_value file.
Check if problematic pods are getting created on any one unique worker node. If yes, then may be the cause of the error can be the worker node. Try draining the problematic worker node and allow pods to move to another node.

4.3 Service Related Issues

This section describes the most common service related issues and their resolution steps. It is recommended to perform the resolution steps provided in this guide. If the issue still persists, then contact Oracle Support.

4.3.1 SM Service Issues

This section describes the most common SM service issues and their resolution steps. It is recommended to for users to attempt the resolution steps provided in this guide before contacting Oracle Support.

Failed BSF register or deregister binding

Symptom

On sending BSF register or deregister binding request, the SM service receives 406 NOT_ACCEPTABLE binding reply message from BSF.

Problem

When the SM service initiates a request to register with or deregister from BSF a session, BSF sends 4xx in the response code. It is assumed that bindingSvcenabled parameter is set to true while deploying the Policy instance.

Sample Error Logs

The following is a sample for error logs that the user may see for this issue:

{"instant":{"epochSecond":1621844941,"nanoOfSecond":854958774},"thread":"boundedElastic-1","level":"INFO","loggerName":"ocpm.pcf.service.sm.serviceconnector.BsfConnector","message":"Sent Binding Request to BSF Service: https://bsf.apigateway:8001/nbsf-management/v2/pcfBindings, {\"supi\":\"imsi-10000000002\",\"contextId\":\"afa7e0cb-87f3-4e6c-a867-166705acfcfe\",\"gpsi\":\"msisdn-10000000001\",\"ipv4Addr\":\"192.168.10.10\",\"ipv6Prefix\":\"2800:a00:cc01::/64\",\"ipDomain\":\"ora.com\",\"dnn\":\"dnn1\",\"pcfFqdn\":\"pcf-smservice.pcf\",\"pcfDiamHost\":\"pcf-smservice\",\"pcfDiamRealm\":\"pcf-smservice.svc\",\"snssai\":{\"sst\":11,\"sd\":\"abc123\"}}","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":37,"threadPriority":5,"messageTimestamp":"2021-05-24T08:29:01.854+0000"}
{"instant":{"epochSecond":1621844941,"nanoOfSecond":945868600},"thread":"boundedElastic-1","level":"INFO","loggerName":"ocpm.pcf.service.sm.serviceconnector.BsfConnector","message":"Receive Binding Reply from BSF: 406 NOT_ACCEPTABLE","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":37,"threadPriority":5,"messageTimestamp":"2021-05-24T08:29:01.945+0000"}
{"instant":{"epochSecond":1621844941,"nanoOfSecond":946569461},"thread":"boundedElastic-1","level":"DEBUG","loggerName":"ocpm.pcf.service.sm.domain.component.metrics.SmMetrics","message":"Pegging binding response metric. Dnn :dnn1, snssai : 11-abc123, operationType : create ,mode : synchronous ,responseCode : 4xx","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":37,"threadPriority":5,"messageTimestamp":"2021-05-24T08:29:01.946+0000"}

Resolution Steps

Policy not evaluated, and instead default policy got applied

Symptom

On sending POST request to binding service, SM service receives failed to call Binding service error.

Problem

When the SM service initiates a POST request towards BSF such as http://my-cnpolicy-occnp-binding:8000/binding/v1/contextBinding/context-owner/PCF-SM, an error occurs and a message is received at SM service stating that the system failed to call Binding service. User may search for a response similar to the following:

logMsg=Failed to call policy service: {}

Sample Error Logs

The following is a sample for error logs that the user may see for this issue:

{
  "instant": {
    "epochSecond": 1623052588,
    "nanoOfSecond": 652897378
  },
  "thread": "boundedElastic-7",
  "level": "ERROR",
  "loggerName": "ocpm.pcf.service.common.domain.serviceconnector.PolicyServiceException",
  "message": "Max Attempts Reached for PRE connections",
  "endOfBatch": false,
  "loggerFqcn": "org.apache.logging.slf4j.Log4jLogger",
  "threadId": 125,
  "threadPriority": 5,
  "messageTimestamp": "2021-06-07T07:56:28.652+0000"
}
{
  "instant": {
    "epochSecond": 1623052588,
    "nanoOfSecond": 653125047
  },
  "thread": "boundedElastic-7",
  "level": "ERROR",
  "loggerName": "ocpm.pcf.service.common.domain.serviceconnector.PolicyServiceConnector",
  "message": "Failed to call policy service: {} ",
  "thrown": {
    "commonElementCount": 0,
    "name": "ocpm.pcf.service.common.domain.serviceconnector.PolicyServiceException",
    "extendedStackTrace": "ocpm.pcf.service.common.domain.serviceconnector.PolicyServiceException: null\n\tat ocpm.pcf.service.common.domain.serviceconnector.PolicyServiceConnector.lambda$processObject$4(PolicyServiceConnector.java:106) ~[classes!/:?]\n\tat reactor.util.retry.RetryBackoffSpec.lambda$generateCompanion$4(RetryBackoffSpec.java:557) ~[reactor-core-3.4.3.jar!/:3.4.3]\n\tat reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.drain(FluxConcatMap.java:374) [reactor-core-3.4.3.jar!/:3.4.3]\n\tat reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.onNext(FluxConcatMap.java:250) [reactor-core-3.4.3.jar!/:3.4.3]\n\tat reactor.core.publisher.EmitterProcessor.drain(EmitterProcessor.java:491) [reactor-core-3.4.3.jar!/:3.4.3]\n\tat reactor.core.publisher.EmitterProcessor.tryEmitNext(EmitterProcessor.java:299) [reactor-core-3.4.3.jar!/:3.4.3]\n\tat reactor.core.publisher.SinkManySerialized.tryEmitNext(SinkManySerialized.java:97) [reactor-core-3.4.3.jar!/:3.4.3]\n\tat reactor.core.publisher.InternalManySink.emitNext(InternalManySink.java:27) [reactor-core-3.4.3.jar!/:3.4.3]\n\tat reactor.core.publisher.FluxRetryWhen$RetryWhenMainSubscriber.onError(FluxRetryWhen.java:189) [reactor-core-3.4.3.jar!/:3.4.3]\n\tat reactor.core.publisher.MonoPublishOn$PublishOnSubscriber.run(MonoPublishOn.java:187) [reactor-core-3.4.3.jar!/:3.4.3]\n\tat reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:68) [reactor-core-3.4.3.jar!/:3.4.3]\n\tat reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:28) [reactor-core-3.4.3.jar!/:3.4.3]\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]\n\tat java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]\n\tat java.lang.Thread.run(Thread.java:832) [?:?]\n"
  },
  "endOfBatch": false,
  "loggerFqcn": "org.apache.logging.slf4j.Log4jLogger",
  "threadId": 125,
  "threadPriority": 5,
  "messageTimestamp": "2021-06-07T07:56:28.653+0000"
}
{
  "instant": {
    "epochSecond": 1623052588,
    "nanoOfSecond": 653405431
  },
  "thread": "boundedElastic-7",
  "level": "DEBUG",
  "loggerName": "ocpm.pcf.service.common.domain.component.policy.PolicyManager",
  "message": "process PolicyReply",
  "endOfBatch": false,
  "loggerFqcn": "org.apache.logging.slf4j.Log4jLogger",
  "threadId": 125,
  "threadPriority": 5,
  "messageTimestamp": "2021-06-07T07:56:28.653+0000"
}

Resolution Steps

To resolve this issue, perform the following steps:

Check whether all the pods are running or not.
Verify that correct URLs have been mentioned in the deployment file.
Update any incorrect or missing information in the deployment file.
Run the policy again.

Inter-microservice communication failures

Symptom

SM service receives failed to call Policy service error.

Problem

When the microservices are unable to establish communication with the each other, an error occurs and a message is received at SM service stating that the system failed to call Policy service. Search for an error message similar to the following:

logMsg=Failed to call Binding service for
    a8f8cf48-b889-44ee-95e6-a9b82fdeeef3


Error has been observed at the following site(s):
|_ checkpoint ? Request to POST http://my-cnpolicy-occnp-binding:8000/binding/v1/contextBinding/context-owner/PCF-SM [DefaultWebClient]

Sample Error Logs

The following is a sample for error logs that the user may see for this issue:

{"instant":{"epochSecond":1622547969,"nanoOfSecond":166956262},"thread":"boundedElastic-14","level":"ERROR","loggerName":"ocpm.pcf.service.sm.serviceconnector.BindingServiceConnector","message":"Failed to call Binding service for
      a8f8cf48-b889-44ee-95e6-a9b82fdeeef3 :
      org.springframework.web.reactive.function.client.WebClientRequestException: Connection
      refused; nested exception is java.net.ConnectException: Connection refused
      ","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":23640,"threadPriority":5,"messageTimestamp":"2021-06-01T11:46:09.166+0000"}org.springframework.web.reactive.function.client.WebClientRequestException: Connection
      refused; nested exception is java.net.ConnectException: Connection refusedat
      org.springframework.web.reactive.function.client.ExchangeFunctions$DefaultExchangeFunction.lambda$wrapException$9(ExchangeFunctions.java:137)Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException:Error has been observed at the following site(s):|_ checkpoint ? Request to POST http://my-cnpolicy-occnp-binding:8000/binding/v1/contextBinding/context-owner/PCF-SM [DefaultWebClient]Stack trace:at
      org.springframework.web.reactive.function.client.ExchangeFunctions$DefaultExchangeFunction.lambda$wrapException$9(ExchangeFunctions.java:137)at reactor.core.publisher.MonoErrorSupplied.subscribe(MonoErrorSupplied.java:70)at reactor.core.publisher.Mono.subscribe(Mono.java:4046)at
      reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:103)at
      reactor.core.publisher.FluxPeekFuseable$PeekFuseableSubscriber.onError(FluxPeekFuseable.java:234)at
      reactor.core.publisher.FluxPeekFuseable$PeekFuseableSubscriber.onError(FluxPeekFuseable.java:234)at reactor.core.publisher.Operators$MonoSubscriber.onError(Operators.java:1862)at
      reactor.core.publisher.MonoIgnoreThen$ThenAcceptInner.onError(MonoIgnoreThen.java:315)at
      org.eclipse.jetty.reactive.client.internal.AbstractSingleProcessor.onError(AbstractSingleProcessor.java:119)at
      org.eclipse.jetty.reactive.client.internal.ResponseListenerProcessor.onComplete(ResponseListenerProcessor.java:140)at
      org.eclipse.jetty.client.ResponseNotifier.notifyComplete(ResponseNotifier.java:218)at
      org.eclipse.jetty.client.ResponseNotifier.notifyComplete(ResponseNotifier.java:210)at
      org.eclipse.jetty.client.HttpExchange.notifyFailureComplete(HttpExchange.java:269)at org.eclipse.jetty.client.HttpExchange.abort(HttpExchange.java:240)at org.eclipse.jetty.client.HttpConversation.abort(HttpConversation.java:149)at org.eclipse.jetty.client.HttpRequest.abort(HttpRequest.java:818)at org.eclipse.jetty.client.HttpDestination.abort(HttpDestination.java:506)at org.eclipse.jetty.client.HttpDestination.failed(HttpDestination.java:253)at
      org.eclipse.jetty.client.AbstractConnectionPool$FutureConnection.failed(AbstractConnectionPool.java:551)at org.eclipse.jetty.util.Promise$Wrapper.failed(Promise.java:136)at org.eclipse.jetty.client.HttpClient$1$1.failed(HttpClient.java:633)at
      org.eclipse.jetty.http2.client.http.HttpClientTransportOverHTTP2$SessionListenerPromise.failConnectionPromise(HttpClientTransportOverHTTP2.java:261)at
      org.eclipse.jetty.http2.client.http.HttpClientTransportOverHTTP2$SessionListenerPromise.failed(HttpClientTransportOverHTTP2.java:194)at
      org.eclipse.jetty.http2.client.HTTP2Client$ClientSelectorManager.connectionFailed(HTTP2Client.java:516)at org.eclipse.jetty.io.ManagedSelector$Connect.failed(ManagedSelector.java:929)at org.eclipse.jetty.io.ManagedSelector.processConnect(ManagedSelector.java:335)at org.eclipse.jetty.io.ManagedSelector.access$1600(ManagedSelector.java:62)at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processSelected(ManagedSelector.java:639)at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:501)at
      org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:360)at
      org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:184)at
      org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)at
      org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)at
      org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:375)at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:773)at
      org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:905)at java.base/java.lang.Thread.run(Thread.java:832)Caused by: java.net.ConnectException: Connection refusedat java.base/sun.nio.ch.Net.pollConnect(Native Method)at java.base/sun.nio.ch.Net.pollConnectNow(Net.java:660)at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:875)at org.eclipse.jetty.io.SelectorManager.doFinishConnect(SelectorManager.java:355)at org.eclipse.jetty.io.ManagedSelector.processConnect(ManagedSelector.java:313)at org.eclipse.jetty.io.ManagedSelector.access$1600(ManagedSelector.java:62)at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processSelected(ManagedSelector.java:639)at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:501)at
      org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:360)at
      org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:184)at
      org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)at
      org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)at
      org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:375)at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:773)at
      org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:905)at java.base/java.lang.Thread.run(Thread.java:832)

Resolution Steps

To resolve this issue, perform the following steps:

PCF is suspended with both primary and secondary NRF

Symptom

SM service receives status of services associated with NfType: PCF is Deregistration service warning.

Problem

When the SM service tries to establish communication with the NRF client, but PCF is suspended with both primary and secondary NRF, a warning message is received. User may search for the following log message:

logMsg=Status of services associated with NfType :PCF is Deregistration, nrfInstanceId=<>,
response=<>

Sample Error Logs

The following is a sample for error logs that the user may see for this issue:

{
  "instant": {
    "epochSecond": 1622098819,
    "nanoOfSecond": 615494329
  },
  "thread": "main",
  "level": "WARN",
  "loggerName": "com.oracle.cgbu.cnc.nrf.NRFManagement",
  "message": "Status of services associated with NfType :PCF is Deregistration",
  "endOfBatch": false,
  "loggerFqcn": "org.apache.logging.slf4j.Log4jLogger",
  "threadId": 1,
  "threadPriority": 5,
  "source": {
    "method": "registerNfInstance",
    "file": "NRFManagement.java",
    "line": 843,
    "class": "com.oracle.cgbu.cnc.nrf.NRFManagement"
  },
  "messageTimestamp": "2021-05-27T07:00:19.615+0000"
}

Resolution Steps

To resolve this issue, perform the following steps:

Check whether the primaryNrfApiRoot and secondaryNrfApiRoot point to the correct endpoints.
Check the status of the pcf-sm-servicepod.
Check the logs of nrf-management pod.
If the pcf-sm-service is down and NRF discovery is not able to register then restart the pcf-sm-service pod.
Check the logs of nrf-management pod again to verify if the registration has happened successfully.

Failed to write to database

Symptom

SM service receives error on trying to save data in database.

Problem

When the SM service tries to write to database, but the request is not processed and the following error message is generated:

logMsg="Could not create connection to database server"

Sample Error Logs

The following is a sample for error logs that the user may see for this issue:

{
  "instant": {
    "epochSecond": 1622819336,
    "nanoOfSecond": 250368963
  },
  "thread": "main",
  "level": "INFO",
  "loggerName": "ocpm.cne.common.db.JdbcDbClient",
  "message": "Maximum Pool Size is: 32 ",
  "endOfBatch": false,
  "loggerFqcn": "org.apache.logging.slf4j.Log4jLogger",
  "threadId": 1,
  "threadPriority": 5,
  "messageTimestamp": "2021-06-04T15:08:56.250+0000"
}
{
  "instant": {
    "epochSecond": 1622819336,
    "nanoOfSecond": 265776725
  },
  "thread": "main",
  "level": "WARN",
  "loggerName": "com.zaxxer.hikari.HikariConfig",
  "message": "HikariPool-1 - idleTimeout has been set but has no effect because the pool is operating as a fixed size pool.",
  "endOfBatch": false,
  "loggerFqcn": "org.apache.logging.slf4j.Log4jLogger",
  "threadId": 1,
  "threadPriority": 5,
  "messageTimestamp": "2021-06-04T15:08:56.265+0000"
}
{"instant":{"epochSecond":1622819342,"nanoOfSecond":432547862},"thread":"main","level":"ERROR","loggerName":"com.zaxxer.hikari.pool.HikariPool","message":"HikariPool-1
      - Exception during pool
      initialization.","thrown":{"commonElementCount":0,"localizedMessage":"Could not create
      connection to database server. Attempted reconnect 3 times. Giving up.","message":"Could
      not create connection to database server. Attempted reconnect 3 times. Giving
      up.","name":"java.sql.SQLNonTransientConnectionException","cause":{"commonElementCount":67,"localizedMessage":"Communications
      link failure\n\nThe last packet sent successfully to the server was 0 milliseconds ago. The
      driver has not received any packets from the server.","message":"Communications link
      failure\n\nThe last packet sent successfully to the server was 0 milliseconds ago. The driver
      has not received any packets from the
      server.","name":"com.mysql.cj.exceptions.CJCommunicationsException","cause":{"commonElementCount":67,"localizedMessage":"Connection
      refused","message":"Connection refused","name":"java.net.ConnectException","extendedStackTrace":"java.net.ConnectException: Connection refused\n\tat sun.nio.ch.Net.pollConnect(Native Method) ~[?:?]\n\tat sun.nio.ch.Net.pollConnectNow(Net.java:660) ~[?:?]\n\tat sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:542)
      ~[?:?]\n\tat sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:597) ~[?:?]\n\tat java.net.SocksSocketImpl.connect(SocksSocketImpl.java:333) ~[?:?]\n\tat java.net.Socket.connect(Socket.java:648) ~[?:?]\n\tat
      com.mysql.cj.protocol.StandardSocketFactory.connect(StandardSocketFactory.java:155)
      ~[mysql-connector-java-8.0.23.jar!/:8.0.23]\n\tat
      com.mysql.cj.protocol.a.NativeSocketConnection.connect(NativeSocketConnection.java:63)
      ~[mysql-connector-java-8.0.23.jar!/:8.0.23]\n\tat
      com.mysql.cj.NativeSession.connect(NativeSession.java:144)
      ~[mysql-connector-java-8.0.23.jar!/:8.0.23]\n\tat
      com.mysql.cj.jdbc.ConnectionImpl.connectWithRetries(ConnectionImpl.java:847)
      ~[mysql-connector-java-8.0.23.jar!/:8.0.23]\n"},"extendedStackTrace":"com.mysql.cj.exceptions.CJCommunicationsException:
      Communications link failure\n\nThe last packet sent successfully to the server was 0
      milliseconds ago. The driver has not received any packets from the server.\n\tat
      jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:?]\n\tat
      jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:64)
      ~[?:?]\n\tat
      jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
      ~[?:?]\n\tat java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:500)
      ~[?:?]\n\tat java.lang.reflect.Constructor.newInstance(Constructor.java:481) ~[?:?]\n\tat
      com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:61)
      ~[mysql-connector-java-8.0.23.jar!/:8.0.23]\n\tat
      com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:105)
      ~[mysql-connector-java-8.0.23.jar!/:8.0.23]\n\tat
      com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:151)
      ~[mysql-connector-java-8.0.23.jar!/:8.0.23]\n\tat
      com.mysql.cj.exceptions.ExceptionFactory.createCommunicationsException(ExceptionFactory.java:167)
      ~[mysql-connector-java-8.0.23.jar!/:8.0.23]\n\tat
      com.mysql.cj.protocol.a.NativeSocketConnection.connect(NativeSocketConnection.java:89)
      ~[mysql-connector-java-8.0.23.jar!/:8.0.23]\n\tat
      com.mysql.cj.NativeSession.connect(NativeSession.java:144)
      ~[mysql-connector-java-8.0.23.jar!/:8.0.23]\n\tat
      com.mysql.cj.jdbc.ConnectionImpl.connectWithRetries(ConnectionImpl.java:847)
      ~[mysql-connector-java-8.0.23.jar!/:8.0.23]

Resolution Steps

To resolve this issue, perform the following steps:

Check the database connection details of the specific deployment file.
Update any missing or incorrect details.
Check whether the database tables are created properly and include all the required columns.

4.3.2 CM Service Issues

This section describes the most common Configuration Management (CM) service issues and their resolution steps. It is recommended for users to attempt the resolution steps provided in this guide before contacting Oracle Support.

Configuration Management GUI not loading configuration data

Symptom

Configuration data is not updated in the GUI, that is, Cloud Native Configuration Console (CNC Console).

Problem

When the configuration data is not loaded in the configuration management GUI, the following error message is generated:

logMsg=Error fetching config-items for topic: common.logging.config-mgmt, retry after 1000
      milliseconds and retry count = 1

Sample Error Logs

The following is a sample for error logs that the user may see for this issue:

{"instant":{"epochSecond":1622099002,"nanoOfSecond":761814372},"thread":"pool-6-thread-1","level":"ERROR","loggerName":"ocpm.cne.common.configclient.ConfigServerConnectionWithRetry","message":"Could
      not fetch config-items for topic: common.logging.config-mgmt, maxRetry is
      exhausted.","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":28,"threadPriority":5,"messageTimestamp":"2021-05-27T07:03:22.761+0000"}{"instant":{"epochSecond":1622099005,"nanoOfSecond":792401204},"thread":"pool-6-thread-1","level":"WARN","loggerName":"ocpm.cne.common.configclient.ConfigServerConnectionWithRetry","message":"Error
      fetching config-items for topic: common.logging.config-mgmt, retry after 1000 milliseconds and
      retry count = 1. Exception:
      ","thrown":{"commonElementCount":0,"localizedMessage":"java.net.ConnectException: Connection refused","message":"java.net.ConnectException: Connection refused","name":"javax.ws.rs.ProcessingException","cause":{"commonElementCount":16,"localizedMessage":"Connection
      refused","message":"Connection refused","name":"java.net.ConnectException","extendedStackTrace":"java.net.ConnectException: Connection refused\n\tat sun.nio.ch.Net.pollConnect(Native Method) ~[?:?]\n\tat sun.nio.ch.Net.pollConnectNow(Net.java:660) ~[?:?]\n\tat sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:549)
      ~[?:?]\n\tat sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:597) ~[?:?]\n\tat java.net.Socket.connect(Socket.java:648) ~[?:?]\n\tat sun.net.NetworkClient.doConnect(NetworkClient.java:177) ~[?:?]\n\tat sun.net.www.http.HttpClient.openServer(HttpClient.java:474) ~[?:?]\n\tat sun.net.www.http.HttpClient.openServer(HttpClient.java:569) ~[?:?]\n\tat sun.net.www.http.HttpClient.<init>(HttpClient.java:242) ~[?:?]\n\tat sun.net.www.http.HttpClient.New(HttpClient.java:341) ~[?:?]\n\tat sun.net.www.http.HttpClient.New(HttpClient.java:362) ~[?:?]\n\tat sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1261)
        ~[?:?]\n\tat sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1194)
        ~[?:?]\n\tat sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1082)
        ~[?:?]\n\tat sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:1016)
        ~[?:?]\n\tat sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1600)
        ~[?:?]\n\tat sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1528)
        ~[?:?]\n\tat java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:527) ~[?:?]\n\tat
      org.glassfish.jersey.client.internal.HttpUrlConnector._apply(HttpUrlConnector.java:367)
      ~[jersey-client-2.30.1.jar!/:?]\n\tat
      org.glassfish.jersey.client.internal.HttpUrlConnector.apply(HttpUrlConnector.java:259)
      ~[jersey-client-2.30.1.jar!/:?]\n"},"extendedStackTrace":"javax.ws.rs.ProcessingException: java.net.ConnectException: Connection refused\n\tat
      org.glassfish.jersey.client.internal.HttpUrlConnector.apply(HttpUrlConnector.java:261)
      ~[jersey-client-2.30.1.jar!/:?]\n\tat
      org.glassfish.jersey.client.ClientRuntime.invoke(ClientRuntime.java:296)
      ~[jersey-client-2.30.1.jar!/:?]\n\tat
      org.glassfish.jersey.client.JerseyInvocation.lambda$invoke$2(JerseyInvocation.java:643)
      ~[jersey-client-2.30.1.jar!/:?]\n\tat
      org.glassfish.jersey.internal.Errors.process(Errors.java:292)
      ~[jersey-common-2.30.1.jar!/:?]\n\tat
      org.glassfish.jersey.internal.Errors.process(Errors.java:274)
      ~[jersey-common-2.30.1.jar!/:?]\n\tat
      org.glassfish.jersey.internal.Errors.process(Errors.java:205)
      ~[jersey-common-2.30.1.jar!/:?]\n\tat
      org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:390)
      ~[jersey-common-2.30.1.jar!/:?]\n\tat
      org.glassfish.jersey.client.JerseyInvocation.invoke(JerseyInvocation.java:641)
      ~[jersey-client-2.30.1.jar!/:?]\n\tat
      org.glassfish.jersey.client.JerseyInvocation$Builder.method(JerseyInvocation.java:414)
      ~[jersey-client-2.30.1.jar!/:?]\n\tat
      org.glassfish.jersey.client.JerseyInvocation$Builder.get(JerseyInvocation.java:305)
      ~[jersey-client-2.30.1.jar!/:?]\n\tat
      ocpm.cne.common.configclient.ConfigServerConnectionWithRetry.getConfigurationItemByTopicWithRetry(ConfigServerConnectionWithRetry.java:313)
      [cne-common-1.11.0.jar!/:?]\n\tat
      ocpm.cne.common.configclient.ConfigClient.getConfigurationItemByTopic(ConfigClient.java:178)
      [cne-common-1.11.0.jar!/:?]\n\tat
      ocpm.cne.common.configclient.ConfigClient.getConfigurationItemByTopic(ConfigClient.java:149)
      [cne-common-1.11.0.jar!/:?]\n\tat
      ocpm.cne.common.logging.level.PullLogLevelConfigTask.run(PullLogLevelConfigTask.java:68)
      [cne-common-1.11.0.jar!/:?]\n\tat
      java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]\n\tat
      java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]\n\tat
      java.lang.Thread.run(Thread.java:832) [?:?]\nCaused by: java.net.ConnectException: Connection refused\n\tat sun.nio.ch.Net.pollConnect(Native Method) ~[?:?]\n\tat sun.nio.ch.Net.pollConnectNow(Net.java:660) ~[?:?]\n\tat sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:549)
      ~[?:?]\n\tat sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:597) ~[?:?]\n\tat java.net.Socket.connect(Socket.java:648) ~[?:?]\n\tat sun.net.NetworkClient.doConnect(NetworkClient.java:177) ~[?:?]\n\tat sun.net.www.http.HttpClient.openServer(HttpClient.java:474) ~[?:?]\n\tat sun.net.www.http.HttpClient.openServer(HttpClient.java:569) ~[?:?]\n\tat sun.net.www.http.HttpClient.<init>(HttpClient.java:242) ~[?:?]\n\tat sun.net.www.http.HttpClient.New(HttpClient.java:341) ~[?:?]\n\tat sun.net.www.http.HttpClient.New(HttpClient.java:362) ~[?:?]\n\tat sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1261)
        ~[?:?]\n\tat sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1194)
        ~[?:?]\n\tat sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1082)
        ~[?:?]\n\tat sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:1016)
        ~[?:?]\n\tat sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1600)
        ~[?:?]\n\tat sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1528)
        ~[?:?]\n\tat java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:527) ~[?:?]\n\tat
      org.glassfish.jersey.client.internal.HttpUrlConnector._apply(HttpUrlConnector.java:367)
      ~[jersey-client-2.30.1.jar!/:?]\n\tat
      org.glassfish.jersey.client.internal.HttpUrlConnector.apply(HttpUrlConnector.java:259)
      ~[jersey-client-2.30.1.jar!/:?]\n\t... 16
      more\n"},"endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":28,"threadPriority":5,"messageTimestamp":"2021-05-27T07:03:25.792+0000"}

Resolution Steps

To resolve this issue, perform the following steps:

Check the status of the config-management and config-server pods.
Check the logs of config-server pod and rule out errors related to database or connection.
If the config-server is itself down, restart the config-server pod. Then, check the logs of config-management.
After performing these steps, the data should be available on CNC Console.

4.3.3 Audit Service Issues

This section describes the most common Audit service issues and their resolution steps. It is recommended for users to attempt the resolution steps provided in this guide before contacting Oracle Support.

Audit service unable to notify services about stale session

Symptom

Audit service receives error message on sending a notification request.

Problem

When the audit service detects a stale session, it sends a notification to the owner service about the stale records. When the notification request sent by Audit service is not successful, it receives a response similar to the following error:

logMsg=Error sending notification request to http://my-cnpolicy-occnp-pcf-sm:8005/audit/notify

Sample Error Logs

The following is a sample for error logs that the user may see for this issue:

{
  "instant": {
    "epochSecond": 1623075339,
    "nanoOfSecond": 71945796
  },
  "thread": "main",
  "level": "WARN",
  "loggerName": "org.hibernate.orm.connections.pooling",
  "message": "HHH10001002: Using Hibernate built-in connection pool (not for production use!)",
  "endOfBatch": false,
  "loggerFqcn": "org.hibernate.internal.log.ConnectionPoolingLogger_$logger",
  "threadId": 1,
  "threadPriority": 5,
  "messageTimestamp": "2021-06-07T14:15:39.071+0000"
}
{
  "instant": {
    "epochSecond": 1623075340,
    "nanoOfSecond": 282816509
  },
  "thread": "Thread-6",
  "level": "ERROR",
  "loggerName": "ocpm.common.service.audit.services.NotifyTask",
  "message": "Error sending notification request to http://my-cnpolicy-occnp-pcf-sm:8005/audit/notify",
  "endOfBatch": false,
  "loggerFqcn": "org.apache.logging.slf4j.Log4jLogger",
  "threadId": 69,
  "threadPriority": 5,
  "messageTimestamp": "2021-06-07T14:15:40.282+0000"
}
{
  "instant": {
    "epochSecond": 1623075340,
    "nanoOfSecond": 285152533
  },
  "thread": "pool-4-thread-1",
  "level": "ERROR",
  "loggerName": "ocpm.common.service.audit.services.NotifyTask",
  "message": "Notification was not sent to http://my-cnpolicy-occnp-pcf-sm:8005/audit/notify due to an error",
  "endOfBatch": false,
  "loggerFqcn": "org.apache.logging.slf4j.Log4jLogger",
  "threadId": 62,
  "threadPriority": 5,
  "messageTimestamp": "2021-06-07T14:15:40.285+0000"
}

Resolution Steps

To resolve this issue, perform the following steps:

Verify if the correct value has been entered for AUDIT_NOTIFICATION_URL in the deployment file of SM service.
Ensure that the value of notificationUri in the AuditRegistrations database table has the same value as mentioned in the SM service deployment file.
After performing these steps, check the logs again.

4.3.4 UDR Connector Issues

This section describes the most common UDR Connector issues and their resolution steps. Users are recommended to attempt the resolution steps provided in this guide before contacting Oracle Support.

Failed or no UDR on-demand discovery to NRF on Egress Gateway

Symptom

UDR returns status code: 424 FAILED_DEPENDENCY on receiving a request from UDR connector.

Problem

When the UDR connector sends a request to UDR to fetch for example SmPolicyData, UDR tries to establish connection with NRF on Egress Gateway to process the on-demand discovery request. However, when it is unable to establish the connection, it returns the following status code:

ClientResponse has erroneous status code: 424 FAILED_DEPENDENCY

Sample Error Logs

The following is a sample for error logs that the user may see for this issue:

{
  "instant": {
    "epochSecond": 1622112585,
    "nanoOfSecond": 184133637
  },
  "thread": "XNIO-1 task-5",
  "level": "INFO",
  "loggerName": "ocpm.pcf.service.ud.intf.restful.api.UserDataUniformApiController",
  "message": "Received GET request, ueIdList: [imsi-650081000000606], reqParam: {\"smPolicyDataReq\":{\"subscription\":false,\"params\":null,\"snssai\":{\"sst\":11,\"sd\":\"abc123\"},\"dnn\":\"dnn1\",\"fields\":null},\"ldapDataReq\":{\"subscription\":false,\"params\":null},\"ssvEnabled\":false}",
  "endOfBatch": false,
  "loggerFqcn": "org.apache.logging.slf4j.Log4jLogger",
  "threadId": 86,
  "threadPriority": 5,
  "messageTimestamp": "2021-05-27T10:49:45.184+0000"
}
{
  "instant": {
    "epochSecond": 1622112585,
    "nanoOfSecond": 186728033
  },
  "thread": "UserService_ThreadPool_6",
  "level": "DEBUG",
  "loggerName": "ocpm.pcf.service.ud.core.AbstractCommonService",
  "message": "Initialize user for [imsi-650081000000606], result: {\"pk\":\"6966884214826339058\",\"ueIdList\":[\"imsi-650081000000606\"],\"policyDataProfile\":{\"subscriptionMap\":{}}}",
  "endOfBatch": false,
  "loggerFqcn": "org.apache.logging.slf4j.Log4jLogger",
  "threadId": 115,
  "threadPriority": 5,
  "messageTimestamp": "2021-05-27T10:49:45.186+0000"
}
{
  "instant": {
    "epochSecond": 1622112585,
    "nanoOfSecond": 189665164
  },
  "thread": "UserService_ThreadPool_6",
  "level": "INFO",
  "loggerName": "ocpm.pcf.service.ud.dbplugin.ds.udr.UdrDataSourceService",
  "message": "discover UDR instance on demand: http://pcf1111-occnp-nrf-client-nfdiscovery:8000/nnrf-disc/v1/nf-instances?target-nf-type=UDR&requester-nf-type=PCF&service-names=nudr-dr&supi=imsi-650081000000606",
  "endOfBatch": false,
  "loggerFqcn": "org.apache.logging.slf4j.Log4jLogger",
  "threadId": 115,
  "threadPriority": 5,
  "messageTimestamp": "2021-05-27T10:49:45.189+0000"
}

{"instant":{"epochSecond":1622112585,"nanoOfSecond":184133637},"thread":"XNIO-1 task-5","level":"INFO","loggerName":"ocpm.pcf.service.ud.intf.restful.api.UserDataUniformApiController","message":"Received GET request, ueIdList: [imsi-650081000000606], reqParam: {\"smPolicyDataReq\":{\"subscription\":false,\"params\":null,\"snssai\":{\"sst\":11,\"sd\":\"abc123\"},\"dnn\":\"dnn1\",\"fields\":null},\"ldapDataReq\":{\"subscription\":false,\"params\":null},\"ssvEnabled\":false}","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":86,"threadPriority":5,"messageTimestamp":"2021-05-27T10:49:45.184+0000"}
{"instant":{"epochSecond":1622112585,"nanoOfSecond":186728033},"thread":"UserService_ThreadPool_6","level":"DEBUG","loggerName":"ocpm.pcf.service.ud.core.AbstractCommonService","message":"Initialize user for [imsi-650081000000606], result: {\"pk\":\"6966884214826339058\",\"ueIdList\":[\"imsi-650081000000606\"],\"policyDataProfile\":{\"subscriptionMap\":{}}}","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":115,"threadPriority":5,"messageTimestamp":"2021-05-27T10:49:45.186+0000"}
{"instant":{"epochSecond":1622112585,"nanoOfSecond":189665164},"thread":"UserService_ThreadPool_6","level":"INFO","loggerName":"ocpm.pcf.service.ud.dbplugin.ds.udr.UdrDataSourceService","message":"discover UDR instance on demand: http://pcf1111-occnp-nrf-client-nfdiscovery:8000/nnrf-disc/v1/nf-instances?target-nf-type=UDR&requester-nf-type=PCF&service-names=nudr-dr&supi=imsi-650081000000606","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":115,"threadPriority":5,"messageTimestamp":"2021-05-27T10:49:45.189+0000"}

{"instant":{"epochSecond":1622112586,"nanoOfSecond":782028662},"thread":"UserService_ThreadPool_6","level":"WARN","loggerName":"ocpm.pcf.service.ud.dbplugin.ds.udr.UdrDataSourceService","message":"ClientResponse has erroneous status code: 424 FAILED_DEPENDENCY, body: ","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":115,"threadPriority":5,"messageTimestamp":"2021-05-27T10:49:46.782+0000"}
{"instant":{"epochSecond":1622112586,"nanoOfSecond":782274717},"thread":"UserService_ThreadPool_6","level":"DEBUG","loggerName":"ocpm.pcf.service.ud.dbplugin.ds.udr.UdrDataSourceService","message":"Check for response:null","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":115,"threadPriority":5,"messageTimestamp":"2021-05-27T10:49:46.782+0000"}
{"instant":{"epochSecond":1622112586,"nanoOfSecond":782622756},"thread":"UserService_ThreadPool_6","level":"DEBUG","loggerName":"ocpm.pcf.service.ud.dbplugin.ds.udr.UdrDataSourceService","message":"Retry Exception result: true","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":115,"threadPriority":5,"messageTimestamp":"2021-05-27T10:49:46.782+0000"}
{"instant":{"epochSecond":1622112586,"nanoOfSecond":782858203},"thread":"UserService_ThreadPool_6","level":"DEBUG","loggerName":"ocpm.pcf.service.ud.dbplugin.ds.AlternateRouteServiceHelper","message":"Check Retry: Profile: RetryProfilebject [name = udr-retry, enableRetrySettings = true, enableAlternateRouting = true] SubQuery_Count: 0 retryCount: 2","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":115,"threadPriority":5,"messageTimestamp":"2021-05-27T10:49:46.782+0000"}
{"instant":{"epochSecond":1622112586,"nanoOfSecond":783021549},"thread":"UserService_ThreadPool_6","level":"DEBUG","loggerName":"ocpm.pcf.service.ud.dbplugin.ds.AlternateRouteServiceHelper","message":"RETRY status: true, subQueryRetry: 1, ConfiguredRetryCount: 2","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":115,"threadPriority":5,"messageTimestamp":"2021-05-27T10:49:46.783+0000"}
{"instant":{"epochSecond":1622112586,"nanoOfSecond":783292038},"thread":"UserService_ThreadPool_6","level":"INFO","loggerName":"ocpm.pcf.service.ud.dbplugin.ds.udr.UdrDataSourceService","message":"discover UDR instance on demand: http://pcf1111-occnp-nrf-client-nfdiscovery:8000/nnrf-disc/v1/nf-instances?target-nf-type=UDR&requester-nf-type=PCF&service-names=nudr-dr&supi=imsi-650081000000606","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":115,"threadPriority":5,"messageTimestamp":"2021-05-27T10:49:46.783+0000"}

Resolution Steps

To resolve this issue, perform the following steps:

In the application-config yaml file, check whether UDR is present in nrfClientSubscribeTypes values or not. If it is not present, then add UDR.
Restart nrf-management pod.
Verify on-demand flag in PCF user connector GUI.
After resending the request, check the UDR and NRF-management logs to verify if the request for on-demand discovery has been processed successfully.

Failed or no Policy data request to UDR on Egress Gateway

Symptom

UDR returns Could NOT find any NFProfile, set NullDataSource for UDR on receiving a policy data request from UDR connector on Egress Gateway.

Problem

To fetch policy data, for example SmPolicyData, UDR connector sends a request to UDR. However, when the UDR is unable to process the policy data request successfully, the following response is received in the log message:

Could NOT find any NFProfile, set NullDataSource for UDR

Sample Error Logs

The following is a sample for error logs that the user may see for this issue:

{"instant":{"epochSecond":1622025233,"nanoOfSecond":476028210},"thread":"main","level":"WARN","loggerName":"io.undertow.websockets.jsr","message":"UT026010:
      Buffer pool was not set on WebSocketDeploymentInfo, the default pool will be
      used","endOfBatch":false,"loggerFqcn":"io.undertow.websockets.jsr.JsrWebSocketLogger_$logger","threadId":1,"threadPriority":5,"messageTimestamp":"2021-05-26T10:33:53.476+0000"}{"instant":{"epochSecond":1622027281,"nanoOfSecond":565754129},"thread":"UserService_ThreadPool_1","level":"WARN","loggerName":"ocpm.pcf.service.ud.dbplugin.ds.udr.UdrDataSourceService","message":"Could
      NOT find any NFProfile, set NullDataSource for
      UDR","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":50,"threadPriority":5,"messageTimestamp":"2021-05-26T11:08:01.565+0000"}{"instant":{"epochSecond":1622027281,"nanoOfSecond":576403066},"thread":"UserService_ThreadPool_1","level":"WARN","loggerName":"ocpm.pcf.service.ud.core.UdrService","message":"Failed
      GET class ocpm.pcf.service.ud.domain.SmPolicyData, ueId: imsi-650081000000606, result:
      FAILURE_DATASOURCENOTFOUND","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":50,"threadPriority":5,"messageTimestamp":"2021-05-26T11:08:01.576+0000"}{"instant":{"epochSecond":1622027281,"nanoOfSecond":580718514},"thread":"UserService_ThreadPool_1","level":"WARN","loggerName":"ocpm.pcf.service.ud.intf.restful.api.ApiControllerHelper","message":"A
      child [GET] request failed, <500 INTERNAL_SERVER_ERROR Internal Server
      Error,[]>","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":50,"threadPriority":5,"messageTimestamp":"2021-05-26T11:08:01.580+0000"}{"instant":{"epochSecond":1622027281,"nanoOfSecond":582259973},"thread":"UserService_ThreadPool_1","level":"WARN","loggerName":"ocpm.pcf.service.ud.intf.restful.api.ApiControllerHelper","message":"Request
      failed 500 INTERNAL_SERVER_ERROR,
      requestContext=RequestContext{userIds=[imsi-650081000000606],
      requestParams='{\"smPolicyDataReq\":{\"subscription\":false,\"params\":null,\"snssai\":{\"sst\":11,\"sd\":\"abc123\"},\"dnn\":\"dnn1\",\"fields\":null},\"ldapDataReq\":{\"subscription\":false,\"params\":null}}',
      requestType='GET'}.","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":50,"threadPriority":5,"messageTimestamp":"2021-05-26T11:08:01.582+0000"}{"instant":{"epochSecond":1622027658,"nanoOfSecond":279448141},"thread":"UserService_ThreadPool_2","level":"WARN","loggerName":"ocpm.pcf.service.ud.dbplugin.ds.udr.UdrDataSourceService","message":"Could
      NOT find any NFProfile, set NullDataSource for
      UDR","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":55,"threadPriority":5,"messageTimestamp":"2021-05-26T11:14:18.279+0000"}{"instant":{"epochSecond":1622027658,"nanoOfSecond":280438989},"thread":"UserService_ThreadPool_2","level":"WARN","loggerName":"ocpm.pcf.service.ud.core.UdrService","message":"Failed
      GET class ocpm.pcf.service.ud.domain.SmPolicyData, ueId: imsi-650081000000606, result:
      FAILURE_DATASOURCENOTFOUND","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":55,"threadPriority":5,"messageTimestamp":"2021-05-26T11:14:18.280+0000"}{"instant":{"epochSecond":1622027658,"nanoOfSecond":280904616},"thread":"UserService_ThreadPool_2","level":"WARN","loggerName":"ocpm.pcf.service.ud.intf.restful.api.ApiControllerHelper","message":"A
      child [GET] request failed, <500 INTERNAL_SERVER_ERROR Internal Server
      Error,[]>","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":55,"threadPriority":5,"messageTimestamp":"2021-05-26T11:14:18.280+0000"}{"instant":{"epochSecond":1622027658,"nanoOfSecond":282162222},"thread":"UserService_ThreadPool_2","level":"WARN","loggerName":"ocpm.pcf.service.ud.intf.restful.api.ApiControllerHelper","message":"Request
      failed 500 INTERNAL_SERVER_ERROR,
      requestContext=RequestContext{userIds=[imsi-650081000000606],
      requestParams='{\"smPolicyDataReq\":{\"subscription\":false,\"params\":null,\"snssai\":{\"sst\":11,\"sd\":\"abc123\"},\"dnn\":\"dnn1\",\"fields\":null},\"ldapDataReq\":{\"subscription\":false,\"params\":null}}',
      requestType='GET'}.","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":55,"threadPriority":5,"messageTimestamp":"2021-05-26T11:14:18.282+0000"}

Resolution Steps

To resolve this issue, perform the following steps:

Check in the application-config yaml file whether UDR is present in nrfClientSubscribeTypes values or not. If it is not present, then add UDR.
Restart nrf-management pod.
Check the logs of udr-connector to verify if the policy data request registration has been sent successfully.

UDR profile is found, but UDR request fails

Symptom

UDR returns error response with code 503 on receiving a request from UDR connector.

Problem

To fetch policy data, for example SmPolicyData, UDR connector sends a request to UDR. However, when the UDR is unable to process the policy data request despite finding the UDR profile, the following response is received in the log message:

logMsg=<500 INTERNAL_SERVER_ERROR Internal Server
      Error,{\"type\":null,\"title\":\"Service
    Unavailable\",\"status\":503,\

Sample Error Logs

The following is a sample for error logs that the user may see for this issue:

{
  "instant": {
    "epochSecond": 1627388645,
    "nanoOfSecond": 749163068
  },
  "thread": "UserService_ThreadPool_16",
  "level": "WARN",
  "loggerName": "ocpm.pcf.service.ud.intf.restful.api.ApiControllerHelper",
  "message": "A child [GET] request failed, <500 INTERNAL_SERVER_ERROR Internal Server Error,{\"type\":null,\"title\":\"Service Unavailable\",\"status\":503,\"detail\":\"Service Unavailable\",\"instance\":null,\"cause\":null,\"invalidParams\":null},[]>",
  "endOfBatch": false,
  "loggerFqcn": "org.apache.logging.slf4j.Log4jLogger",
  "threadId": 3092,
  "threadPriority": 5,
  "messageTimestamp": "2021-07-27T12:24:05.749+0000"
}
{
  "instant": {
    "epochSecond": 1627388645,
    "nanoOfSecond": 749294213
  },
  "thread": "UserService_ThreadPool_16",
  "level": "WARN",
  "loggerName": "ocpm.pcf.service.ud.intf.restful.api.ApiControllerHelper",
  "message": "Request failed 500 INTERNAL_SERVER_ERROR, requestContext=RequestContext{userIds=[imsi-450081000000001], requestParams='{\"smPolicyDataReq\":{\"subscription\":false,\"params\":null,\"snssai\":{\"sst\":11,\"sd\":\"abc123\"},\"dnn\":\"dnn1\",\"fields\":null},\"ldapDataReq\":{\"subscription\":false,\"params\":null},\"ssvEnabled\":true}', requestType='GET'}.",
  "endOfBatch": false,
  "loggerFqcn": "org.apache.logging.slf4j.Log4jLogger",
  "threadId": 3092,
  "threadPriority": 5,
  "messageTimestamp": "2021-07-27T12:24:05.749+0000"
}

Resolution Steps

To resolve this issue, perform the following steps:

As UDR discovery is on-demand, when a request is received, check whether NRF-Management returns correct FQDN for UDR. If the FQDN is incorrect, update with the current value and initiate the request again.

Retry to CHF or UDR alternate route on timeout or error

Symptom

UDR returns error response for retrying to CHF or UDR alternate route on timeout or error in previous attempt.

Problem

logMsg=Error performing GET operation for URI
      /nf-common-component/v1/nrf-client-nfmanagement/nfProfileList"

Sample Error Logs

The following is a sample for error logs that the user may see for this issue:

{"instant":{"epochSecond":1627368379,"nanoOfSecond":172423065},"thread":"CmAgentTask1","level":"ERROR","loggerName":"ocpm.cne.common.cmclient.CmRestClient","message":"Error
      performing GET operation for URI
      /nf-common-component/v1/nrf-client-nfmanagement/nfProfileList","thrown":{"commonElementCount":0,"localizedMessage":"I/O
      error on GET request for \"http://localhost:5000/nf-common-component/v1/nrf-client-nfmanagement/nfProfileList\": Connect to localhost:5000 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed:
      Connection refused; nested exception is org.apache.http.conn.HttpHostConnectException: Connect
      to localhost:5000 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection
      refused","message":"I/O error on GET request for \"http://localhost:5000/nf-common-component/v1/nrf-client-nfmanagement/nfProfileList\": Connect to localhost:5000 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed:
      Connection refused; nested exception is org.apache.http.conn.HttpHostConnectException: Connect
      to localhost:5000 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection
      refused","name":"org.springframework.web.client.ResourceAccessException","cause":{"commonElementCount":14,"localizedMessage":"Connect
      to localhost:5000 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection
      refused","message":"Connect to localhost:5000 [localhost/127.0.0.1,
      localhost/0:0:0:0:0:0:0:1] failed: Connection
      refused","name":"org.apache.http.conn.HttpHostConnectException","cause":{"commonElementCount":14,"localizedMessage":"Connection
      refused","message":"Connection refused","name":"java.net.ConnectException","extendedStackTrace":"java.net.ConnectException: Connection refused\n\tat sun.nio.ch.Net.pollConnect(Native Method) ~[?:?]\n\tat sun.nio.ch.Net.pollConnectNow(Net.java:669) ~[?:?]\n\tat sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:542)
      ~[?:?]\n\tat sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:597) ~[?:?]\n\tat java.net.SocksSocketImpl.connect(SocksSocketImpl.java:333) ~[?:?]\n\tat java.net.Socket.connect(Socket.java:645) ~[?:?]\n\tat
      org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:75)
      ~[httpclient-4.5.13.jar!/:4.5.13]\n\tat
      org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
      ~[httpclient-4.5.13.jar!/:4.5.13]\n\tat
      org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376)
      ~[httpclient-4.5.13.jar!/:4.5.13]\n\tat
      org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
      ~[httpclient-4.5.13.jar!/:4.5.13]\n\tat
      org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
      ~[httpclient-4.5.13.jar!/:4.5.13]\n\tat
      org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
      ~[httpclient-4.5.13.jar!/:4.5.13]\n\tat
      org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
      ~[httpclient-4.5.13.jar!/:4.5.13]\n\tat
      org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
      ~[httpclient-4.5.13.jar!/:4.5.13]\n\tat
      org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
      ~[httpclient-4.5.13.jar!/:4.5.13]\n\tat
      org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
      ~[httpclient-4.5.13.jar!/:4.5.13]\n\tat
      org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
      ~[httpclient-4.5.13.jar!/:4.5.13]\n\tat
      org.springframework.http.client.HttpComponentsClientHttpRequest.executeInternal(HttpComponentsClientHttpRequest.java:87)
      ~[spring-web-5.3.4.jar!/:5.3.4]\n\tat
      org.springframework.http.client.AbstractBufferingClientHttpRequest.executeInternal(AbstractBufferingClientHttpRequest.java:48)
      ~[spring-web-5.3.4.jar!/:5.3.4]\n\tat
      org.springframework.http.client.AbstractClientHttpRequest.execute(AbstractClientHttpRequest.java:66)
      ~[spring-web-5.3.4.jar!/:5.3.4]\n\tat
      org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:776)
      ~[spring-web-5.3.4.jar!/:5.3.4]\n"},"extendedStackTrace":"org.apache.http.conn.HttpHostConnectException:
      Connect to localhost:5000 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection
      refused\n\tat
      org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:156)
      ~[httpclient-4.5.13.jar!/:4.5.13]\n\tat
      org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376)
      ~[httpclient-4.5.13.jar!/:4.5.13]\n\tat
      org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
      ~[httpclient-4.5.13.jar!/:4.5.13]\n\tat
      org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
      ~[httpclient-4.5.13.jar!/:4.5.13]\n\tat
      org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
      ~[httpclient-4.5.13.jar!/:4.5.13]\n\tat
      org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
      ~[httpclient-4.5.13.jar!/:4.5.13]\n\tat
      org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
      ~[httpclient-4.5.13.jar!/:4.5.13]\n\tat
      org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
      ~[httpclient-4.5.13.jar!/:4.5.13]\n\tat
      org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
      ~[httpclient-4.5.13.jar!/:4.5.13]\n\tat
      org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
      ~[httpclient-4.5.13.jar!/:4.5.13]\n\tat
      org.springframework.http.client.HttpComponentsClientHttpRequest.executeInternal(HttpComponentsClientHttpRequest.java:87)
      ~[spring-web-5.3.4.jar!/:5.3.4]\n\tat
      org.springframework.http.client.AbstractBufferingClientHttpRequest.executeInternal(AbstractBufferingClientHttpRequest.java:48)
      ~[spring-web-5.3.4.jar!/:5.3.4]\n\tat
      org.springframework.http.client.AbstractClientHttpRequest.execute(AbstractClientHttpRequest.java:66)
      ~[spring-web-5.3.4.jar!/:5.3.4]\n\tat
      org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:776)
      ~[spring-web-5.3.4.jar!/:5.3.4]\nCaused by: java.net.ConnectException: Connection refused\n\tat sun.nio.ch.Net.pollConnect(Native Method) ~[?:?]\n\tat sun.nio.ch.Net.pollConnectNow(Net.java:669) ~[?:?]\n\tat sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:542)
      ~[?:?]\n\tat sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:597) ~[?:?]\n\tat java.net.SocksSocketImpl.connect(SocksSocketImpl.java:333) ~[?:?]\n\tat java.net.Socket.connect(Socket.java:645) ~[?:?]\n\tat
      org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:75)

Resolution Steps

4.3.5 CHF Connector Issues

This section describes the most common CHF connector issues and their resolution steps. It is recommended for users to attempt the resolution steps provided in this guide before contacting Oracle Support.

No CHF profile found

Symptom

CHF Connector receives an error response message saying no CHF Profile found.

Problem

When the CHF connector tries to establish communication with the CHF to process a request, but the request is rejected by CHF because the end user specified in the request cannot be served by the CHF. In the response message, it sends the following response:

message":"Not found matching CHF, refuse this request

Sample Error Logs

The following is a sample for error logs that the user may see for this issue:

{"instant":{"epochSecond":1621844768,"nanoOfSecond":666347628},"thread":"UserService_ThreadPool_7","level":"WARN","loggerName":"ocpm.pcf.service.ud.dbplugin.ds.chf.ChfDataSourceService","message":"Not
      found matching CHF, refuse this
      request","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":103,"threadPriority":5,"messageTimestamp":"2021-05-24T08:26:08.666+0000"}{"instant":{"epochSecond":1621844768,"nanoOfSecond":666674276},"thread":"UserService_ThreadPool_7","level":"ERROR","loggerName":"ocpm.pcf.service.ud.core.SpendingLimitService","message":"No
      Data Source found for
      op:SUBSCRIBE","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":103,"threadPriority":5,"messageTimestamp":"2021-05-24T08:26:08.666+0000"}{"instant":{"epochSecond":1621844768,"nanoOfSecond":667139735},"thread":"UserService_ThreadPool_7","level":"WARN","loggerName":"ocpm.pcf.service.ud.intf.restful.api.ApiControllerHelper","message":"A
      child [GET] request failed, <500 INTERNAL_SERVER_ERROR Internal Server
      Error,[]>","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":103,"threadPriority":5,"messageTimestamp":"2021-05-24T08:26:08.667+0000"}{"instant":{"epochSecond":1621844768,"nanoOfSecond":667602486},"thread":"UserService_ThreadPool_7","level":"WARN","loggerName":"ocpm.pcf.service.ud.intf.restful.api.ApiControllerHelper","message":"Request
      failed 500 INTERNAL_SERVER_ERROR, requestContext=RequestContext{userIds=[imsi-10000000002],
      requestParams='{\"spendingLimitReq\":{\"gpsi\":\"msisdn-10000000001\",\"plmn\":{\"mcc\":\"450\",\"mnc\":\"08\"},\"policyCounterIds\":null,\"supportedFeatures\":null,\"asyncQuery\":false},\"ldapDataReq\":{\"subscription\":false,\"params\":null}}',
      requestType='GET'}.","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":103,"threadPriority":5,"messageTimestamp":"2021-05-24T08:26:08.667+0000"}{"instant":{"epochSecond":1621844768,"nanoOfSecond":672460909},"thread":"XNIO-1
      task-1","level":"INFO","loggerName":"ocpm.pcf.service.ud.intf.restful.api.ApiControllerHelper","message":"Send
      reply: \n<500 INTERNAL_SERVER_ERROR Internal Server Error,All sub request
      failed.,[]>","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":65,"threadPriority":5,"messageTimestamp":"2021-05-24T08:26:08.672+0000"}

Resolution Steps

To resolve this issue, perform the following steps:

In the application-config yaml file, check whether CHF is added as a value to the nrfClientSubscriberTypes parameter. If it is not added, add CHF.
Restart nrf-management pod.
Initiate a request again.
Check the logs of chf-connector again to verify if the request has been sent successfully.

CHF Profile found, but CHF request fails

Symptom

CHF Connector receives an error response with code 503.

Problem

When the CHF connector tries to establish communication with the CHF to process a request, but the request cannot be served by the CHF despite finding the CHF profile. In the response message, it sends the following response:

logMsg=:"WARN","loggerName":"com.oracle.cgbu.cnc.nrf.api.NRFClientApi","message":"Error
      Response received with code 503

Sample Error Logs

The following is a sample for error logs that the user may see for this issue:

{
  "instant": {
    "epochSecond": 1627391543,
    "nanoOfSecond": 982425153
  },
  "thread": "HttpLoggingJettyHttpClient@720ce1b8-7778",
  "level": "WARN",
  "loggerName": "ocpm.pcf.service.ud.intf.restful.api.ApiControllerHelper",
  "message": "A child [GET] request failed, <503 SERVICE_UNAVAILABLE Service Unavailable,{\"type\":null,\"title\":\"Service Unavailable\",\"status\":503,\"detail\":\"Service Unavailable\",\"instance\":null,\"cause\":null,\"invalidParams\":null},[]>",
  "endOfBatch": false,
  "loggerFqcn": "org.apache.logging.slf4j.Log4jLogger",
  "threadId": 7778,
  "threadPriority": 5,
  "messageTimestamp": "2021-07-27T13:12:23.982+0000"
}
{
  "instant": {
    "epochSecond": 1627391543,
    "nanoOfSecond": 982547688
  },
  "thread": "HttpLoggingJettyHttpClient@720ce1b8-7778",
  "level": "WARN",
  "loggerName": "ocpm.pcf.service.ud.intf.restful.api.ApiControllerHelper",
  "message": "Request failed 500 INTERNAL_SERVER_ERROR, requestContext=RequestContext{userIds=[imsi-450081000011001], requestParams='{\"spendingLimitReq\":{\"gpsi\":\"msisdn-81000000002\",\"plmn\":{\"mcc\":\"450\",\"mnc\":\"08\"},\"policyCounterIds\":null,\"supportedFeatures\":null,\"asyncQuery\":false},\"ldapDataReq\":{\"subscription\":false,\"params\":null},\"ssvEnabled\":true}', requestType='GET'}.",
  "endOfBatch": false,
  "loggerFqcn": "org.apache.logging.slf4j.Log4jLogger",
  "threadId": 7778,
  "threadPriority": 5,
  "messageTimestamp": "2021-07-27T13:12:23.982+0000"
}
{
  "instant": {
    "epochSecond": 1627391543,
    "nanoOfSecond": 982811514
  },
  "thread": "HttpLoggingJettyHttpClient@720ce1b8-7778",
  "level": "DEBUG",
  "loggerName": "ocpm.pcf.service.ud.common.metrics.ChfDataSourceMetricsHelper",
  "message": "Pegging CHF response metric. OperationType : SUBSCRIBE, nfInstanceId : fe7d992b-0541-4c7d-ab84-666666666666, ServiceName : nchf-spendinglimitcontrol, ServiceVersion : v1, ServiceResource : subscriptions, ResponseCode : 5xx",
  "endOfBatch": false,
  "loggerFqcn": "org.apache.logging.slf4j.Log4jLogger",
  "threadId": 7778,
  "threadPriority": 5,
  "messageTimestamp": "2021-07-27T13:12:23.982+0000"
}
{
  "instant": {
    "epochSecond": 1627391543,
    "nanoOfSecond": 984365884
  },
  "thread": "XNIO-1 task-2",
  "level": "INFO",
  "loggerName": "ocpm.pcf.service.ud.intf.restful.api.ApiControllerHelper",
  "message": "Send reply: \n<500 INTERNAL_SERVER_ERROR Internal Server Error,All sub request failed.,[]>",
  "endOfBatch": false,
  "loggerFqcn": "org.apache.logging.slf4j.Log4jLogger",
  "threadId": 7692,
  "threadPriority": 5,
  "messageTimestamp": "2021-07-27T13:12:23.984+0000"
}
{
  "instant": {
    "epochSecond": 1627391586,
    "nanoOfSecond": 37087769
  },
  "thread": "Thread-2",
  "level": "INFO",
  "loggerName": "ocpm.cne.common.configclient.ConfigurationAgent",
  "message": "Configuration removed from topic=NRF.UDR, key=fe7d992b-0541-4c7d-ab84-555552222222",
  "endOfBatch": false,
  "loggerFqcn": "org.apache.logging.slf4j.Log4jLogger",
  "threadId": 27,
  "threadPriority": 5,
  "messageTimestamp": "2021-07-27T13:13:06.037+0000"
}

Resolution Steps

To resolve this issue, perform the following steps:

Check whether CHF simulator is registered with NRF management. If it is not registered, register it.
If it is registered, verify that correct FQDN is added.
Initiate a request again.
Check the logs of chf-connector again to verify if the request has been processed successfully.

Failed or No Spending Limit data request to CHF on egress

Symptom

CHF Connector receives an error response message saying no CHF Profile found.

Problem

When the CHF connector tries to establish communication with the CHF to process a request from PCF to retrieve policy counter status information for a specific UE, but the request is rejected by CHF because the end user specified in the request cannot be served by the CHF. In the response message, it sends the following response:

message":"No CHF NFProfile found

Sample Error Logs

The following is a sample for error logs that the user may see for this issue:

{"instant":{"epochSecond":1622028135,"nanoOfSecond":231441903},"thread":"UserService_ThreadPool_3","level":"WARN","loggerName":"ocpm.pcf.service.ud.dbplugin.ds.chf.ChfDataSourceService","message":"No CHF NFProfile found.","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":76,"threadPriority":5,"messageTimestamp":"2021-05-26T11:22:15.231+0000"}
{"instant":{"epochSecond":1622028135,"nanoOfSecond":232856398},"thread":"UserService_ThreadPool_3","level":"ERROR","loggerName":"ocpm.pcf.service.ud.dbplugin.ds.chf.ChfDataSourceService","message":"No CHF DataSource found","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":76,"threadPriority":5,"messageTimestamp":"2021-05-26T11:22:15.232+0000"}
{"instant":{"epochSecond":1622028135,"nanoOfSecond":233241908},"thread":"UserService_ThreadPool_3","level":"WARN","loggerName":"ocpm.pcf.service.ud.dbplugin.ds.chf.ChfDataSourceService","message":"Not found matching CHF, refuse this request","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":76,"threadPriority":5,"messageTimestamp":"2021-05-26T11:22:15.233+0000"}
{"instant":{"epochSecond":1622028135,"nanoOfSecond":238565694},"thread":"UserService_ThreadPool_3","level":"ERROR","loggerName":"ocpm.pcf.service.ud.core.SpendingLimitService","message":"No Data Source found for op:SUBSCRIBE","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":76,"threadPriority":5,"messageTimestamp":"2021-05-26T11:22:15.238+0000"}
{"instant":{"epochSecond":1622028135,"nanoOfSecond":239436215},"thread":"UserService_ThreadPool_3","level":"WARN","loggerName":"ocpm.pcf.service.ud.intf.restful.api.ApiControllerHelper","message":"A child [GET] request failed, <500 INTERNAL_SERVER_ERROR Internal Server Error,[]>","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":76,"threadPriority":5,"messageTimestamp":"2021-05-26T11:22:15.239+0000"}
{"instant":{"epochSecond":1622028135,"nanoOfSecond":240114557},"thread":"UserService_ThreadPool_3","level":"WARN","loggerName":"ocpm.pcf.service.ud.intf.restful.api.ApiControllerHelper","message":"Request failed 500 INTERNAL_SERVER_ERROR, requestContext=RequestContext{userIds=[imsi-650081000000606], requestParams='{\"spendingLimitReq\":{\"gpsi\":\"msisdn-20000000606\",\"plmn\":{\"mcc\":\"313\",\"mnc\":\"350\"},\"policyCounterIds\":null,\"supportedFeatures\":null,\"asyncQuery\":false},\"ldapDataReq\":{\"subscription\":false,\"params\":null}}', requestType='GET'}.","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":76,"threadPriority":5,"messageTimestamp":"2021-05-26T11:22:15.240+0000"}

Resolution Steps

To resolve this issue, perform the following steps:

In the application-config yaml file, check whether CHF is added as a value to the nrfClientSubscriberTypes parameter. If it is not added, add CHF.
Restart nrf-management pod.
Initiate a request again.
Check the logs of chf-connector again to verify if the request has been sent successfully.

4.4 Upgrade or Rollback Failure

When Policy upgrade or rollback fails, perform the following procedure.

Check the pre or post upgrade or rollback hook logs in Kibana as applicable.
Users can filter upgrade or rollback logs using the following filters:
- For upgrade: lifeCycleEvent=9001 or 9011
- For rollback: lifeCycleEvent=9002
Check the pod logs in Kibana to analyze the cause of failure.
After detecting the cause of failure, do the following:
- For upgrade failure:
  - If the cause of upgrade failure is database or network connectivity issue, contact your system administrator. When the issue is resolved, rerun the upgrade command.
  - If the cause of failure is not related to database or network connectivity issue and is observed during the preupgrade phase, do not perform rollback because Policy deployment remains in the source or older release.
  - If the upgrade failure occurs during the postupgrade phase, for example, post upgrade hook failure due to target release pod not moving to ready state, then perform a rollback.
- For rollback failure: If the cause of rollback failure is database or network connectivity issue, contact your system administrator. When the issue is resolved, rerun the rollback command.
If the issue persists, contact My Oracle Support.

Upgrade failure with specific error in nrf-client-nfmanagement

If the upgrade procedure fails due to the below error in nrf-client-nfmanagement-pre-upgrade hooks:

Upgrade to same or higher versions is only supported. Can not proceed with upgrade. Exiting...

Verify the release version in ReleaseConfig table in nrf-client-nfmanagement. If needed, manually update the version number following the below procedure:

Important:

Perform this procedure in consultation with Oracle Engineering team.

If the release version in ReleaseConfig table is incorrect:

Backup ReleaseConfig table.

mysqldump -u<privileged-user> –p<privileged-password> <release-db-name> ReleaseConfig > ReleaseConfig.sql

Copy the backup to Bastion server.

Run the following command to manually update the ReleaseConfig table.

use <release-db-name>;

select * from ReleaseConfig where CfgKey='nrf-client-nfmanagement';

After the rollback procedure, if the CfgValue is as shown below:

{"currentVersion":{"version":2300100,"jsonSchemaVersionMap":{}},"rollbackVersionSet":[{"version":2200304,"jsonSchemaVersionMap":{}},{"version":2300100,"jsonSchemaVersionMap":{}},{"version":2200401,"jsonSchemaVersionMap":{}}]}

upadte the CfgValue:

update ReleaseConfig set CfgValue='{"currentVersion":{"version":2200304,"jsonSchemaVersionMap":{}},"rollbackVersionSet":[{"version":2200304,"jsonSchemaVersionMap":{}},{"version":2200401,"jsonSchemaVersionMap":{}}]}' where CfgKey='nrf-client-nfmanagement';

Retry the upgrade procedure.