8 Troubleshooting Provisioning Gateway
This section describes how to troubleshoot Provisioning Gateway when provisioning fails.
Verifying Provisioning Gateway Installation
- Run the following command to verify if the Provisioning Gateway services are up
and running:
kubectl get pods -n <provgw-namespace>
Figure 8-1 Sample: Provisioning Gateway Pod Status

Note:
All the pods in the above image are in the Running state. The pod, service, and deployment names are prefixed with the helm release name, which is provided at the time of helm install --name <release name>.
- If any pod is not running, check the auditor service logs using the steps in the following section.
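The pod-status check above can also be scripted; a minimal sketch (pod names and namespace output below are hypothetical) that prints only pods not in the Running state:

```shell
# Sample output of "kubectl get pods -n <provgw-namespace>" fed through awk;
# NR > 1 skips the header row, and $3 is the STATUS column.
# An empty result means all pods are up.
cat <<'EOF' | awk 'NR > 1 && $3 != "Running" { print $1, $3 }'
NAME                       READY   STATUS             RESTARTS   AGE
provgw-provgw-service-0    1/1     Running            0          2d
provgw-auditor-service-0   0/1     ImagePullBackOff   3          2d
EOF
```

Against a live cluster, the same filter can be piped from `kubectl get pods -n <provgw-namespace> --no-headers`.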
Checking Auditor Service Logs
$ kubectl logs <provgw-auditor-service pod> -n <provgw-namespace>
Note:
Alternatively, use Kibana to view logs.
Run the following command to check the logs directly on the pod:
kubectl exec -it provgw-auditor-service-0 -n <namespace> bash
Figure 8-2 Verifying Auditor Service Logs

To change the logging level for auditor service using helm:
- Open the latest provgw_value.yaml file used to install or upgrade Provisioning Gateway.
- Change the logging level root value under auditor-service to 'INFO'.
auditor-service:
  ...
  logging:
    level:
      root: "WARN"
Note:
The supported logging level values are DEBUG, INFO, WARN, and ERROR.
- Run the following command to change the log level:
helm upgrade provgw ocudr-helm-repo/provgw -f <updated values.yaml with logging level as INFO> --version <helm version>
Checking PROVGW-SERVICE Logs
In the SLF mode, PROVGW-SERVICE logs the response status from the active UDR in each of the segments configured in values.yaml. In the UDR mode, PROVGW-SERVICE logs the responses from the single UDR configured in values.yaml. To view logs, run the following command:
kubectl logs <provgw-service pod> -n <provgw-namespace>
Alternatively, run the following command to check the logs directly on the pods:
kubectl exec -it <provgw-service pod> -n <provgw-namespace> bash
Figure 8-3 PROVGW-SERVICE Logs

Check the logs in the application.log file.
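When scanning application.log, filtering for WARN and ERROR entries usually narrows a failure quickly; a hedged sketch over sample log lines (the lines themselves are illustrative, not actual ProvGw output):

```shell
# grep -E keeps only WARN/ERROR entries from sample application.log lines;
# against a live pod the same filter can be applied to the real file.
cat <<'EOF' | grep -E 'WARN|ERROR'
2021-02-17 07:36:05 INFO  ProvRequestHandler - Request received
2021-02-17 07:36:06 WARN  DbHandler - Subscriber does not exist
2021-02-17 07:36:06 ERROR SoapConverter - Conversion failed
EOF
```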
Changing PROVGW-SERVICE Logging Level
Note:
Redeploy the setup for the changes to take effect.
- Open the provgw_value.yaml file used during Provisioning Gateway installation.
- Under provgw-service, change the value of the logging level root attribute to "INFO" or "DEBUG".
Extract from provgw_values.yaml
provgw-service:
  ...
  logging:
    level:
      root: "WARN"
Other supported logging level values are DEBUG, INFO, WARN, and ERROR.
Checking PROVGW-CONFIG Logs
Run the following command to view logs:
$ kubectl logs <provgw-config pod> -n <provgw-namespace>
Note:
You can also use Kibana to view logs.
Figure 8-4 Checking Logs on Pods

Changing provgw-config Service Logging Level
- Open the latest provgw_value.yaml file used during Provisioning Gateway installation or upgrade.
- Change the value of the logging level root parameter under provgw-config to "INFO". Other supported logging level values are DEBUG, INFO, WARN, and ERROR.
Sample snippet from provgw_value.yaml
provgw-config:
  ...
  logging:
    level:
      root: "WARN"
- Run the following command to change the log level:
helm upgrade provgw ocudr-helm-repo/provgw -f <updated values.yaml with logging level as INFO> --version <helm version>
Checking Logs in PROVGW-CONFIG-SERVER
Run the following command to view logs:
$ kubectl logs <provgw-config-server pod> -n <provgw-namespace>
Note:
You can also use Kibana to view logs.
Changing provgw-config-server Service Logging Level
- Open the latest provgw_value.yaml file used during provisioning gateway installation or upgrade.
- Change the value of the logging level attribute (envLoggingLevelApp) under config-server to "INFO".
Sample snippet from provgw_value.yaml
config-server:
  ...
  envLoggingLevelApp: WARN
The supported logging level values are DEBUG, INFO, WARN, and ERROR.
- Run the following command to change the log level:
helm upgrade provgw ocudr-helm-repo/provgw -f <updated values.yaml with logging level as INFO> --version <helm version>
Changing provgw-service usageMode from "SLF" to "UDR"
- Open the global values.yaml file used during Provisioning Gateway installation.
- Change the usageMode value in global section from "SLF" to "UDR".
- Change the udrIp mentioned inside the soapService tag under provgw-service.
Sample snippet from provgw-service values.yaml
usageMode: "UDR"
provgw-service:
  ...
  soapService:
    port: 62001
    udrIp: ocudr-ingressgateway.ocudr # Mention the udr-ingress provisioning ip
    udrSignallingIp: ocudr-ingressgateway-sig.ocudr # Mention the udr-ingress signalling ip
    convertToSec: true
    secEntity: |
      - name: QuotaEntity
        elementString: usage
        innerElemenString: quota
        outerFields:
        - version
        - quota
        - SequenceNumber
        innerFields:
        - name
        - cid
        - time
        - totalVolume
        - inputVolume
        - outputVolume
        - serviceSpecific
        - nextResetTime
        - Type
        - grantedTotalVolume
        - grantedInputVolume
        - grantedOutputVolume
        - grantedTime
        - grantedServiceSpecific
        - QuotaState
        - RefInstanceId
      - name: DynamicQuotaEntity
        elementString: definition
        innerElemenString: DynamicQuota
        outerFields:
        - version
        - DynamicQuota
        - SequenceNumber
        innerFields:
        - Type
        - name
        - InstanceId
        - Priority
        - InitialTime
        - InitialTotalVolume
        - InitialInputVolume
        - InitialOutputVolume
        - InitialServiceSpecific
        - activationdatetime
        - expirationdatetime
        - purchaseddatetime
        - Duration
        - InterimReportingInterval
      - name: Subscriber
        elementString: subscriber
        outerFields:
        - IMSI
        - MSISDN
        - NAI
        - EXTID
        - ACCOUNTID
        - BillingDay
        - Entitlement
        - Tier
        - SequenceNumber
      - name: StateEntity
        elementString: state
        innerElemenString: property
        outerFields:
        - version
        - property
        - SequenceNumber
        innerFields:
        - name
        - value
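The usageMode change can also be made non-interactively before running helm upgrade; a minimal sketch using sed (the file layout below is an assumption based on the snippet above):

```shell
# Flip usageMode from "SLF" to "UDR" in a values fragment; against a real
# deployment the same substitution would be applied to the values.yaml file
# (for example: sed -i 's/usageMode: "SLF"/usageMode: "UDR"/' values.yaml).
cat <<'EOF' | sed 's/usageMode: "SLF"/usageMode: "UDR"/'
global:
  usageMode: "SLF"
EOF
```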
Debug Pod Creation Failure
- Incorrect Pod Image: Check if any pod is in the ImagePullBackOff state. If so, the image name used for one of the pods is incorrect. Check the following values in the values.yaml file:
global:
  dockerRegistry: ocudr-registry.us.oracle.com:5000/provgw
provgw-service:
  image:
    name: provgateway
    tag: 25.1.100
prov-ingressgateway:
  image:
    name: ocingress_gateway
    tag: 25.1.102
prov-egressgateway:
  image:
    name: provgw/ocegress_gateway
    tag: 25.1.102
auditor-service:
  image:
    name: provgw/auditor_service
    tag: 25.1.100
After updating the values.yaml file, run the following command to install helm:
helm install --name <release-name> --namespace <release-namespace>
This helps you purge the old setup and reinstall or upgrade the Provisioning Gateway instance.
- Resource Allocation Failure: Check if any pod is in the Pending state. If yes, run the following command for those pods:
kubectl describe pod <provgw-service pod id> -n <provgw-namespace>
Check the output for the Insufficient CPU warning. This warning means there are not enough CPUs available to start the pod. To resolve it, either add CPUs at the hardware level or reduce the number of CPUs allotted to a pod in the values.yaml file:
provgw-service:
  resources:
    limits:
      cpu: 2
      memory: 2Gi
    requests:
      cpu: 2
      memory: 2Gi
prov-ingressgateway:
  resources:
    limits:
      cpu: 2
      memory: 2Gi
      initServiceCpu: 1
      initServiceMemory: 1Gi
      updateServiceCpu: 1
      updateServiceMemory: 1Gi
    requests:
      cpu: 2
      memory: 2Gi
      initServiceCpu: 1
      initServiceMemory: 1Gi
      updateServiceCpu: 1
      updateServiceMemory: 1Gi
prov-egressgateway:
  resources:
    limits:
      cpu: 2
      memory: 2Gi
      initServiceCpu: 1
      initServiceMemory: 1Gi
      updateServiceCpu: 1
      updateServiceMemory: 1Gi
    requests:
      cpu: 2
      memory: 2Gi
      initServiceCpu: 1
      initServiceMemory: 1Gi
      updateServiceCpu: 1
      updateServiceMemory: 1Gi
auditor-service:
  resources:
    limits:
      cpu: 2
      memory: 2Gi
    requests:
      cpu: 2
      memory: 2Gi
After updating the values.yaml file, run the following command to install helm:
helm install --name <release-name> --namespace <release-namespace>
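The Insufficient CPU warning appears in the Events section of the kubectl describe output; a sketch that scans sample event lines for it (the event lines below are illustrative, not captured from a real cluster):

```shell
# Case-insensitive grep for the scheduling failure; a match confirms the pod
# is Pending for lack of CPU before you start editing values.yaml.
cat <<'EOF' | grep -i 'insufficient cpu'
Warning  FailedScheduling  default-scheduler  0/3 nodes are available: 3 Insufficient cpu.
Normal   Pulled            kubelet            Container image already present
EOF
```

Against a live cluster, pipe `kubectl describe pod <provgw-service pod id> -n <provgw-namespace>` into the same grep.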
Debugging UDR Registration with Provisioning Gateway in SLF Mode
Before deploying Provisioning Gateway, you need to ensure that UDR pods are in the running state and FQDNs of UDRs are correct.
Figure 8-5 UDR FQDN Info

- If the non-preferred FQDN is in the active state, ProvGw registers with the non-preferred FQDN.
- SLF switchover to the non-preferred FQDN in a segment happens if the preferred FQDN responds with a 503 status code and either no proper response body, or a proper response body that has a corresponding Error Code Profile set under Provisioning Gateway Global Configurations. A proper response body means the response includes the status code, title, cause, and so on. A common improper response body is no healthy upstream.
- If the non-preferred FQDN is also not available (down), then the preferred one is kept as the active SLF in that segment.
- All the pods of UDR are in the running state
- The FQDN of the UDR is correctly mentioned
- The preferred FQDN is considered active when all the FQDNs are down in one segment
- The active FQDN is checked every 15 seconds (this value is configurable)
- The GET requests sent to UDR for updating the active FQDN are not dumped in the ProvGw logs
- There is only one provgw-service that receives all the requests from Ingress Gateway. If any request fails:
- With a 404 status code without Problem Details information, there may be issues with the routeConfig in the ingressgateway custom values file.
{"title":"404 NOT_FOUND","status":404,"detail":"udr001.oracle.com: ingressgateway: NOT_FOUND: OUDR-IGWSIG-E183"}
You must check the custom values.yaml file for the essential route configurations. If the essential route configurations are not present, you must add them.
- With a 503 status code with SERVICE_UNAVAILABLE in Problem Details, the nudr-drservice pod is not reachable for some reason.
{"title":"Service Unavailable","status":503,"detail":"udr001.oracle.com: ingressgateway: Service Unavailable: OUDR-IGWSIG-E003","cause":"Encountered unknown host exception at IGW"}
Check the ocudr-ingressgateway-sig and ocudr-ingressgateway-prov pod logs for errors or exceptions. Check the ocudr-nudr-drservice and ocudr-nudr-dr-provservice pod status and fix the issue.
- Try to find the root cause using metrics as follows:
- If the count of the oc_ingressgateway_http_requests_total measurement increases, check the content of the incoming requests. Ensure that the incoming JSON data blob is as per the specification.
- If the udr_rest_request measurement increases more than once per request, ensure that UDR is working fine and the Ingress Gateway of UDR is not down.
- To debug HTTPS related issues, see the Troubleshooting
Unified Data Repository section in the Oracle Communications
Cloud Native Core, Unified Data Repository Installation, Upgrade, and
Fault Recovery Guide.
Figure 8-6 ProvGw - HTTPS Port Exposure

- To debug HPA Issues, see the Troubleshooting Unified Data Repository section in the Oracle Communications Cloud Native Core, Unified Data Repository Installation, Upgrade, and Fault Recovery Guide.
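The 404 and 503 checks above depend on telling a proper ProblemDetails body apart from an improper one such as no healthy upstream; a hedged sketch of that distinction (the presence test on the status field is an assumed heuristic, not the product's actual logic):

```shell
# A "proper" ProblemDetails body carries a "status" field; a bare string such
# as "no healthy upstream" does not, so Error Code Profile matching cannot
# apply to it.
check_body() {
  if printf '%s' "$1" | grep -q '"status"'; then
    echo proper
  else
    echo improper
  fi
}
check_body '{"title":"Service Unavailable","status":503}'   # proper
check_body 'no healthy upstream'                            # improper
```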
Debugging Provisioning Gateway with Service Mesh Failures
- Istio-Proxy sidecar container not attached to the pod: This failure arises when istio injection is not enabled on the namespace where UDR is installed. Run the following command to verify:
kubectl get namespace -L istio-injection
Figure 8-7 Verifying Istio-Proxy

To enable the istio injection, run the following command:
kubectl label --overwrite namespace <nf-namespace> istio-injection=enabled
If any of the hook pods is stuck in the NotReady state and is not cleared after completion, check whether the following configuration is set to true under the global section. Also, ensure that the URL configured for istioSidecarQuitUrl is correct.
Figure 8-8 Service Mesh Related Configuration
There can be cases where Prometheus does not scrape metrics from nudr-nrf-client-service. In these cases, ensure the following annotation is present under nudr-nrf-client-service.
Figure 8-9 When Prometheus Does Not Scrape Metrics

- If a provisioning system outside the service mesh is not able to contact the Provisioning Gateway service through its Ingress Gateway, then:
- Exclude the HTTP container port traffic for prov-ingressgateway from istio side car.
- Ensure the following configuration is enabled under
prov-ingressgateway section.
Figure 8-10 Enabling istioIngressTlsSupport

- If there are issues in viewing Provisioning Gateway metrics on
OSO Prometheus then add the following annotation to all the deployments for
the NF.
Figure 8-11 Annotation to View Provisioning Gateway Metrics

- When vDBTier is used as the backend and there are connectivity issues while nudr-preinstall communicates with the database (seen in the error logs of the pre-install hook pod), create the destination rule and service entry for mysql-connectivity-service in the occne-infra namespace. For information about creating the destination rule and service entry, see Oracle Communications Cloud Native Core, Unified Data Repository Installation, Upgrade, and Fault Recovery Guide, where the ASM specific changes are described.
Troubleshooting provgw-service through Metrics
If provgw-service requests fail, try to find the root cause from metrics as well. Some of the troubleshooting tips are as follows:
- If the count of the oc_ingressgateway_http_requests_total measurement increases, check the content of the incoming request and make sure that the incoming JSON data blob is proper and as per the specification.
- If on one request, the udr_rest_request measurement increases
more than once then ensure:
- UDRs are working fine
- Ingress Gateways of UDRs are not down
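The more-than-once-per-request symptom can be confirmed by comparing counter deltas over the same scrape window; a sketch with hypothetical sample deltas (the numbers are illustrative):

```shell
# If udr_rest_request grew faster than oc_ingressgateway_http_requests_total
# over the same window, each incoming request fanned out to more than one UDR
# attempt (retries), pointing at an unhealthy UDR or its Ingress Gateway.
awk 'BEGIN {
  ingress_delta = 120   # hypothetical oc_ingressgateway_http_requests_total increase
  udr_delta     = 360   # hypothetical udr_rest_request increase
  printf "udr requests per ingress request: %.1f\n", udr_delta / ingress_delta
}'
```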
Troubleshooting provgw-config through Metrics
- Measurements of metrics such as provgw_config_total_requests_total{Method='GET'}, provgw_config_total_requests_total{Method='POST'}, and provgw_config_total_requests_total{Method='PUT'} give information about the total requests pegged for the GET, POST, and PUT methods respectively.
- If the count of the provgw_config_total_responses_total{Method='GET/POST/PUT',StatusCode="400/404/405/500"} measurement increases, it means the requests are not being processed and are resulting in failures.
Troubleshooting provgw-config through Configdb
If provgw-config requests fail, try to find the root cause from configdb. Some of the troubleshooting tips are as follows:
- If you get a BAD REQUEST for the GET API, make sure all the tables shown below are present in configdb.
Figure 8-12 Configdb Table

- If all the tables are present and you are still getting a BAD REQUEST for the GET API, verify the configuration item table shown below.
Figure 8-13 Configuration Item Table

- If you get a BAD REQUEST or NOT FOUND for the import and export API, verify the import and export table shown below.
Figure 8-14 Import and Export Data

Troubleshooting Auditor Service issues Using Logs
- Make sure that files are transferred completely to both segment folders from Subscriber Export Tool pod.
- Make sure that you do not have the old files with the same subscriber data as the new files at any given time. The old files must be deleted manually from both the segment folders.
- When the auditor service pod starts, audit is scheduled based on the configurations. You may observe warning or error messages in the logs.
- If the message shows as "No audit eligible exported CSV file present for", you must place the CSV files in both the segment folders. The audit will be unsuccessful if the CSV files are placed in only one segment folder.
- If the message shows as "More than configured discrepancies. Cannot process audit", you must correct the files. By default, only 100000 discrepancies are allowed. If the discrepancies exceed the default value, check the files on both sites where the subscriber dump was taken.
- Only one audit process must be running at a given point of time. If you schedule a second audit when the first audit is in progress, the application responds with an error stating "Audit is in progress, so not launching scheduler".
- If the message shows as "Audit process stopped due to audit pause enabled", set the auditPause parameter to false to resume the audit.
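Resuming a paused audit is a values.yaml change followed by a helm upgrade, as in the logging-level procedures above; a sketch of the fragment (the exact key path under auditor-service is an assumption, not confirmed by this guide):

```yaml
auditor-service:
  auditPause: false   # assumed key placement; false resumes a paused audit
```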
8.1 Debugging Service Mesh related Issues
Figure 8-15 Debugging Service Mesh related Issue

Figure 8-16 Annotation: Issue in Viewing ProvGw Metric on OSO Prometheus

8.2 Log Attribute Details
Table 8-1 Log Attribute Details
| Log Attribute | Details | Example Value | Data Type | Source |
|---|---|---|---|---|
| thread | Thread Name Internal by Spring boot | XNIO-1 task-1 | String | log4j |
| level | Log Level of the log printed | WARN | String | log4j |
| loggerName | Class which printed the log | ocudr.udr.services.service.DbHandler | String | log4j |
| message | Outputs the application supplied message | Subscriber does not exist | String | Application |
| endOfBatch | Log4j2 Internal | false | boolean | log4j |
| loggerFqcn | Log4j2 Internal | org.apache.logging.slf4j.Log4jLogger | String | log4j |
| instant | Epoch time | {"epochSecond":1599703750,"nanoOfSecond":210064000} | Object | log4j |
| threadId | Outputs the ID of the thread that generated the logging event, set internally by Log4j2 | 23 | Integer | log4j |
| threadPriority | Thread Priority set internally by Log4j2 | 5 | Integer | log4j |
| messageTimestamp | Timestamp when log was printed | 21-02-17 07:36:06.343+0000 | String | Application |
| application | NF application name | ocudr | String | Application |
| engVersion | Engineering version of software | 1.10.20 | String | Application |
| mktgVersion | Marketing version of software | 1.10.20.0.0 | String | Application |
| microservice | Microservice name | ocudr-nudr-drservice | String | Application |
| vendor | Vendor name | Oracle | String | Application |
| subscriberId | SubscriberId for which request received | msisdn-1111111113 | String | Application |
| resourceId | Request Uri | nudr-group-id-map/v1/nf-group-ids | String | Application |
| resultCode | Response statusCode | 404 | String | Application |
| ocLogId | Inter NF logId for tracing | 1613547369374_225_ocudr-ingressgateway-6f585c76d4-tp622 | String | Application |
| sbiCorrelationHeader | SBI Correlation Header for request received | msisdn-1111111113 | String | Application |
| requestType | request type received | GET | String | Application |
| kubernetes.container_name | Container name generating log | nudr-dr-service | String | fluentd |
| kubernetes.namespace_name | Namespace of service | ocudr | String | fluentd |
| kubernetes.pod_name | Pod name | ocudr-nudr-drservice-7f8c47f5c9-flkmz | String | fluentd |
| kubernetes.container_image | Container image | cne-170-ga-bastion-1:5000/ocudr/nudr_datarepository_service:1.9.50 | String | fluentd |
| kubernetes.container_image_id | Container image ID | cne-170-ga-bastion-1:5000/ocudr/nudr_datarepository_service@sha256:1141f245a3a437f1423496aebf616a6d3315e22ad09904a868bf1b471759b616 | String | fluentd |
| kubernetes.pod_id | POD id | 6dfa91f8-2d0a-4d8c-a339-381ce98264df | String | fluentd |
| kubernetes.host | Worker node name | cne-170-ga-k8s-node-8 | String | fluentd |
| kubernetes.namespace_id | Unique namespace ID assigned by K8 | d932a8ae-54e9-4df1-8e30-e0335f0b1303 | String | fluentd |
| labels | All the labels on the pod that generates the logs | "labels": { "pod-template-hash": "7f8c47f5c9", "app_kubernetes_io/instance": "ocudr", "app_kubernetes_io/managed-by": "Tiller", "app_kubernetes_io/name": "nudr-drservice", "app_kubernetes_io/part-of": "ocudr", "app_kubernetes_io/version": "1.6.0.0.0", "helm_sh/chart": "nudr-drservice-1.9.50", "io_kompose_service": "nudr-drservice" } | object | fluentd |
| originHost | Diameter client FQDN | diamcli1.oracle.com | String | Application (Note: Only in diameterproxy and diameter-gateway) |
| originRealm | Diameter client realm | oracle.com | String | Application (Note: Only in diameter-gateway) |
| serviceIndications | Diameter service indications for GET operations | "serviceIndications" : [ "CamiantUserData", "CamiantStateData"] | Array | Application (Note: Only in diameterproxy) |
{"instant":{"epochSecond":1613547366,"nanoOfSecond":343417698},"thread":"XNIO-1 task-1","level":"WARN","loggerName":"ocudr.udr.services.service.DbHandler","message":"Subscriber does not exist","endOfBatch":false,"loggerFqcn":"org.apache.logging.slf4j.Log4jLogger","threadId":46,"threadPriority":5,"messageTimestamp":"21-02-17 07:36:06.343+0000","application":"ocudr","sbiCorrelationHeader":"msisdn-1111111113","engVersion":"1.10.20","mktgVersion":"1.10.20.0.0","microservice":"ocudr-nudr-drservice","vendor":"Oracle","subscriberId":"msisdn-1111111113","resourceId":"nudr-group-id-map/v1/nf-group-ids","resultCode":"404","ocLogId":"1613547369374_225_ocudr-ingressgateway-6f585c76d4-tp622","requestType":"GET"}
{
"_index": "logstash-2021-02-19",
"_type": "_doc",
"_id": "yyQXuHcBwFjE8wmhickN",
"_version": 1,
"_score": 0,
"_source": {
"stream": "stdout",
"docker": {
"container_id": "b1b78faa1043132f77148f16777e60b1db8ae30e9e6b5c2f5af45248063f7d6a"
},
"kubernetes": {
"container_name": "nudr-drservice",
"namespace_name": "bharathudr1",
"pod_name": "bharathudr1-nudr-drservice-7bd66864c6-fw7cq",
"container_image": "cne-172-bastion-1:5000/ocudr/nudr_datarepository_service:ocLogIdTest1",
"container_image_id": "cne-172-bastion-1:5000/ocudr/nudr_datarepository_service@sha256:24875dad7fd363bcb8ec300b007491b2259796e152446fd3d707b17f62fcd6b4",
"pod_id": "ea92c753-48af-4b31-9356-1ce24f27025e",
"host": "cne-172-k8s-node-10",
"labels": {
"pod-template-hash": "7bd66864c6",
"app_kubernetes_io/instance": "bharathudr1",
"app_kubernetes_io/managed-by": "Helm",
"app_kubernetes_io/name": "nudr-drservice",
"app_kubernetes_io/part-of": "ocudr",
"app_kubernetes_io/version": "1.6.0.0.0",
"helm_sh/chart": "nudr-drservice-1.10.20",
"io_kompose_service": "nudr-drservice"
},
"master_url": "https://10.233.0.1:443/api",
"namespace_id": "260e3d4a-d455-4afc-9cff-9aaf8eba79d1"
},
"instant": {
"epochSecond": 1613701232,
"nanoOfSecond": 431593919
},
"thread": "XNIO-1 task-1",
"level": "WARN",
"loggerName": "ocudr.udr.services.service.DbHandler",
"message": "Subscriber does not exist",
"endOfBatch": false,
"loggerFqcn": "org.apache.logging.slf4j.Log4jLogger",
"threadId": 46,
"threadPriority": 5,
"messageTimestamp": "21-02-19 02:20:32.431+0000",
"application": "bharathudr1",
"engVersion": "1.10.20",
"mktgVersion": "1.6.0.0.0",
"microservice": "bharathudr1-nudr-drservice",
"vendor": "Oracle",
"subscriberId": "imsi-100000002",
"sbiCorrelationHeader": "imsi-100000002",
"resourceId": "nudr-group-id-map-prov/v1/slf-group",
"resultCode": "404",
"ocLogId": "1613701211520_2220_bharathudr1-ingressgateway-7d7659c58b-5k6b2",
"requestType": "GET",
"@timestamp": "2021-02-19T02:20:32.432795705+00:00",
"tag": "kubernetes.var.log.containers.bharathudr1-nudr-drservice-7bd66864c6-fw7cq_bharathudr1_nudr-drservice-b1b78faa1043132f77148f16777e60b1db8ae30e9e6b5c2f5af45248063f7d6a.log"
},
"fields": {
"@timestamp": [
"2021-02-19T02:20:32.432Z"
],
"timestamp": []
},
"highlight": {
"resultCode": [
"@kibana-highlighted-field@404@/kibana-highlighted-field@"
],
"message": [
"@kibana-highlighted-field@Subscriber@/kibana-highlighted-field@ @kibana-highlighted-field@does@/kibana-highlighted-field@ @kibana-highlighted-field@not@/kibana-highlighted-field@ @kibana-highlighted-field@exist@/kibana-highlighted-field@"
],
"kubernetes.namespace_name": [
"@kibana-highlighted-field@bharathudr1@/kibana-highlighted-field@"
]
}
}