4 Troubleshooting NSSF
This section provides information about how to identify problems and a systematic approach to resolve the identified issues. It also includes a generic checklist to help identify the problem.
Note:
The performance and capacity of the NSSF system may vary based on the call model, feature or interface configuration, and the underlying CNE and hardware environment.

4.1 Generic Checklist
The following sections provide a generic checklist of troubleshooting tips.
Deployment related tips
- Are the NSSF deployment, pods, and services created, running, and available?
Run the following command:
# kubectl -n <namespace> get deployments,pods,svc
Inspect the output and check the following columns:
- AVAILABLE of deployment
- READY, STATUS, and RESTARTS of pods
- PORT(S) of service
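For example, assuming the deployment namespace is ocnssf (a minimal sketch; names and replica counts vary with your configuration), the output can be checked in one pass:
# kubectl -n ocnssf get deployments,pods,svc
To flag pods that are not fully ready or not running, the listing can be filtered with awk:
# kubectl -n ocnssf get pods --no-headers | awk '{split($2,a,"/"); if (a[1]!=a[2] || $3!="Running") print "NOT READY: "$0}'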
- Is the correct image used and are the correct environment variables set in the deployment? Run the following command:
# kubectl -n <namespace> get deployment <deployment-name> -o yaml
Inspect the output and check the environment variables and image.
Example:
# kubectl -n nssf-svc get deployment ocnssf-nfregistration -o yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"name":"ocnssf-nfregistration","namespace":"nssf-svc"},"spec":{"replicas":1,"selector":{"matchLabels":{"app":"ocnssf-nfregistration"}},"template":{"metadata":{"labels":{"app":"ocnssf-nfregistration"}},"spec":{"containers":[{"env":[{"name":"MYSQL_HOST","value":"mysql"},{"name":"MYSQL_PORT","value":"3306"},{"name":"MYSQL_DATABASE","value":"nssfdb"},{"name":"nssf_REGISTRATION_ENDPOINT","value":"ocnssf-nfregistration"},{"name":"nssf_SUBSCRIPTION_ENDPOINT","value":"ocnssf-nfsubscription"},{"name":"NF_HEARTBEAT","value":"120"},{"name":"DISC_VALIDITY_PERIOD","value":"3600"}],"image":"dsr-master0:5000/ocnssf-nfregistration:latest","imagePullPolicy":"Always","name":"ocnssf-nfregistration","ports":[{"containerPort":8080,"name":"server"}]}]}}}}
  creationTimestamp: 2018-08-27T15:45:59Z
  generation: 1
  name: ocnssf-nfregistration
  namespace: nssf-svc
  resourceVersion: "2336498"
  selfLink: /apis/extensions/v1beta1/namespaces/nssf-svc/deployments/ocnssf-nfregistration
  uid: 4b82fe89-aa10-11e8-95fd-fa163f20f9e2
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: ocnssf-nfregistration
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: ocnssf-nfregistration
    spec:
      containers:
      - env:
        - name: MYSQL_HOST
          value: mysql
        - name: MYSQL_PORT
          value: "3306"
        - name: MYSQL_DATABASE
          value: nssfdb
        - name: nssf_REGISTRATION_ENDPOINT
          value: ocnssf-nfregistration
        - name: nssf_SUBSCRIPTION_ENDPOINT
          value: ocnssf-nfsubscription
        - name: NF_HEARTBEAT
          value: "120"
        - name: DISC_VALIDITY_PERIOD
          value: "3600"
        image: dsr-master0:5000/ocnssf-nfregistration:latest
        imagePullPolicy: Always
        name: ocnssf-nfregistration
        ports:
        - containerPort: 8080
          name: server
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: 2018-08-27T15:46:01Z
    lastUpdateTime: 2018-08-27T15:46:01Z
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: 2018-08-27T15:45:59Z
    lastUpdateTime: 2018-08-27T15:46:01Z
    message: ReplicaSet "ocnssf-nfregistration-7898d657d9" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  observedGeneration: 1
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1
- Check if the microservices can access each other through the REST interface. Run the following command:
# kubectl -n <namespace> exec <pod name> -- curl <uri>
Example:
# kubectl -n nssf-svc exec $(kubectl -n nssf-svc get pods -o name|cut -d'/' -f2|grep nfs) -- curl http://ocnssf-nfregistration:8080/nnssf-nfm/v1/nf-instances
# kubectl -n nssf-svc exec $(kubectl -n nssf-svc get pods -o name|cut -d'/' -f2|grep nfr) -- curl http://ocnssf-nfsubscription:8080/nnssf-nfm/v1/nf-instances
Note:
These commands are in their simplest form and work only if exactly one nfregistration and one nfsubscription pod is deployed.
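If more than one registration or subscription pod is deployed, you can run the same check from every matching pod instead; a hedged sketch, assuming the pod names contain the same strings used in the grep filters above:
# for pod in $(kubectl -n nssf-svc get pods -o name | cut -d'/' -f2 | grep nfs); do kubectl -n nssf-svc exec "$pod" -- curl -s http://ocnssf-nfregistration:8080/nnssf-nfm/v1/nf-instances; done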
Application related tips
- Run the following command to check the application logs and look for
exceptions:
# kubectl -n <namespace> logs -f <pod name>
You can use '-f' to follow the logs or 'grep' to search for specific patterns in the log output.
Example:
# kubectl -n nssf-svc logs -f $(kubectl -n nssf-svc get pods -o name|cut -d'/' -f2|grep nfr)
# kubectl -n nssf-svc logs -f $(kubectl -n nssf-svc get pods -o name|cut -d'/' -f2|grep nfs)
Note:
These commands are in their simplest form and work only if exactly one nfregistration and one nfsubscription pod is deployed.
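To scan recent logs for exceptions without following the stream, a grep-based sketch (the deployment names follow the examples used earlier in this section):
# kubectl -n nssf-svc logs deployment/ocnssf-nfregistration --tail=1000 | grep -iE 'exception|error'
# kubectl -n nssf-svc logs deployment/ocnssf-nfsubscription --tail=1000 | grep -iE 'exception|error'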
4.2 Deployment Related Issues
This section describes the most common deployment related issues and their resolution steps. It is recommended to attempt the resolution steps provided in this guide before contacting Oracle Support.
4.2.1 Preinstallation
This section describes the common preinstallation issues and their resolution steps.
4.2.1.1 Debugging General CNE
Problem: The environment is not working as expected.
Solution:
Run the following command to check the events in the NSSF namespace:
kubectl get events -n <ocnssf_namespace>

4.2.1.1.1 The Environment is Not Working As Expected
Problem: The environment is not working as expected.
Solution:
- Check if kubectl is installed and working as expected:
  - Check if the kubectl version command works. It displays the Kubernetes client and server versions.
  - Check if the kubectl create namespace test command works.
  - Check if the kubectl delete namespace test command works.
- Check if Helm is installed and working as expected:
  - Check if the helm version command works. It displays the Helm client and server versions.
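The checks above can be combined into a single preinstallation sanity test; a minimal sketch:
kubectl version || echo "kubectl check failed"
kubectl create namespace test && kubectl delete namespace test
helm version || echo "helm check failed"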
4.2.1.2 Curl HTTP2 Not Supported
Problem
The installed curl binary does not support HTTP2.
Error Code or Error Message
An unsupported protocol error is thrown, or the connection is established with HTTP/1.1 200 OK.
Symptom
If an unsupported protocol error is thrown or the connection is established with HTTP/1.1, it indicates that curl HTTP2 support is unavailable on your machine.
Solution
Following is the procedure to install Curl with HTTP2 support:
1. Make sure git is installed:
$ sudo yum install git -y
2. Install nghttp2:
$ git clone https://github.com/tatsuhiro-t/nghttp2.git
$ cd nghttp2
$ autoreconf -i
$ automake
$ autoconf
$ ./configure
$ make
$ sudo make install
$ sudo sh -c "echo '/usr/local/lib' > /etc/ld.so.conf.d/custom-libs.conf"
$ sudo ldconfig
3. Install the latest curl:
$ wget http://curl.haxx.se/download/curl-7.46.0.tar.bz2
Note: Check for the latest version during installation.
$ tar -xvjf curl-7.46.0.tar.bz2
$ cd curl-7.46.0
$ ./configure --with-nghttp2=/usr/local --with-ssl
$ make
$ sudo make install
$ sudo ldconfig
4. Run the following command to verify that HTTP2 is listed in the features:
$ curl --http2-prior-knowledge -v "http://10.75.204.35:32270/nnrf-disc/v1/nf-instances?requester-nf-type=AMF&target-nf-type=SMF"
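Before sending the request, you can also confirm that the rebuilt curl binary advertises HTTP2 support; HTTP2 should appear in the Features line:
$ curl --version | grep -i http2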
4.2.2 Installation
This section describes the common installation related issues and their resolution steps.
4.2.2.1 Helm Install Failure
This section describes various scenarios in which helm install might fail:
- Incorrect image name in ocnssf-custom-values files
- Docker registry is configured incorrectly
- Continuous Restart of Pods
4.2.2.1.1 Incorrect image name in ocnssf-custom-values files
Problem
helm install might fail if an incorrect image name is
provided in the ocnssf_custom_values_25.1.201.yaml
file.
Error Code/Error Message
When kubectl get pods -n <ocnssf_namespace> is
performed, the status of the pods might be ImagePullBackOff or ErrImagePull.
For example:
$ kubectl get pods -n ocnssf
NAME READY STATUS RESTARTS AGE
ocnssf-appinfo-7969c9fbf7-4fmgj 1/1 Running 0 18m
ocnssf-config-server-54bf4bc8f9-s82cv 1/1 Running 0 18m
ocnssf-egress-6b6bff8949-2mf7b 0/1 ImagePullBackOff 0 18m
ocnssf-ingress-68d76954f5-9fsfq 1/1 Running 0 18m
ocnssf-nrf-client-nfdiscovery-cf48cd8d8-l4q2q 1/1 Running 0 18m
ocnssf-nrf-client-nfdiscovery-cf48cd8d8-vmt5v 1/1 Running 0 18m
ocnssf-nrf-client-nfmanagement-7db4598fbb-672hc 1/1 Running 0 18m
ocnssf-nsavailability-644999bbfb-9gcm5 1/1 Running 0 18m
ocnssf-nsconfig-577446c487-dzsh6 1/1 Running 0 18m
ocnssf-nsdb-585f7bd7d-tdth4 1/1 Running 0 18m
ocnssf-nsselection-5dfcc94bc7-q9gct 1/1 Running 0 18m
ocnssf-nssubscription-5c898fbbb9-fqcw6 1/1 Running 0 18m
ocnssf-performance-6d75c7f966-qm5fq 1/1 Running 0 18m
Solution
- Check that the ocnssf_custom_values_25.1.201.yaml file has the release-specific image names and tags. For details of the NSSF images, see "Customizing NSSF" in Oracle Communications Cloud Native Core, Network Slice Selection Function Installation, Upgrade, and Fault Recovery Guide.
vi ocnssf_custom_values_25.1.201.yaml
To confirm which image a failing pod is trying to pull, see the sketch after this list.
- Edit the ocnssf_custom_values_25.1.201.yaml file in case the release-specific image names and tags must be modified.
- Save the file.
- Run the following command to delete the deployment:
helm delete --purge <release_namespace>
Sample command:
helm delete --purge ocnssf
In case helm purge does not clean the deployment and Kubernetes objects completely, see the "Cleaning NSSF deployment" section in Oracle Communications Cloud Native Core, Network Slice Selection Function Installation, Upgrade, and Fault Recovery Guide.
- Run the helm install command. For the helm install command, see the "Customizing NSSF" section in Oracle Communications Cloud Native Core, Network Slice Selection Function Installation, Upgrade, and Fault Recovery Guide.
- Run kubectl get pods -n <ocnssf_namespace> to verify that the status of all the pods is Running.
For example:
$ kubectl get pods -n ocnssf
NAME READY STATUS RESTARTS AGE
ocnssf-appinfo-7969c9fbf7-4fmgj 1/1 Running 0 18m
ocnssf-config-server-54bf4bc8f9-s82cv 1/1 Running 0 18m
ocnssf-egress-6b6bff8949-2mf7b 1/1 Running 0 18m
ocnssf-ingress-68d76954f5-9fsfq 1/1 Running 0 18m
ocnssf-nrf-client-nfdiscovery-cf48cd8d8-l4q2q 1/1 Running 0 18m
ocnssf-nrf-client-nfdiscovery-cf48cd8d8-vmt5v 1/1 Running 0 18m
ocnssf-nrf-client-nfmanagement-7db4598fbb-672hc 1/1 Running 0 18m
ocnssf-nsavailability-644999bbfb-9gcm5 1/1 Running 0 18m
ocnssf-nsconfig-577446c487-dzsh6 1/1 Running 0 18m
ocnssf-nsdb-585f7bd7d-tdth4 1/1 Running 0 18m
ocnssf-nsselection-5dfcc94bc7-q9gct 1/1 Running 0 18m
ocnssf-nssubscription-5c898fbbb9-fqcw6 1/1 Running 0 18m
ocnssf-performance-6d75c7f966-qm5fq 1/1 Running 0 18m
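To confirm which image a failing pod is trying to pull, a hedged example using the egress pod name from the listing above (substitute your own pod name):
$ kubectl -n ocnssf describe pod ocnssf-egress-6b6bff8949-2mf7b | grep -iE 'image|pull'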
4.2.2.1.2 Docker registry is configured incorrectly
Problem
helm install might fail if the Docker registry is not configured on all the primary and secondary nodes.
Error Code/Error Message
When kubectl get pods -n <ocnssf_namespace> is
performed, the status of the pods might be ImagePullBackOff or ErrImagePull.
For example:
$ kubectl get pods -n ocnssf
NAME READY STATUS RESTARTS AGE
ocnssf-appinfo-7969c9fbf7-4fmgj 1/1 Running 0 18m
ocnssf-config-server-54bf4bc8f9-s82cv 1/1 Running 0 18m
ocnssf-egress-6b6bff8949-2mf7b 0/1 ImagePullBackOff 0 18m
ocnssf-ingress-68d76954f5-9fsfq 1/1 Running 0 18m
ocnssf-nrf-client-nfdiscovery-cf48cd8d8-l4q2q 1/1 Running 0 18m
ocnssf-nrf-client-nfdiscovery-cf48cd8d8-vmt5v 1/1 Running 0 18m
ocnssf-nrf-client-nfmanagement-7db4598fbb-672hc 1/1 Running 0 18m
ocnssf-nsavailability-644999bbfb-9gcm5 1/1 Running 0 18m
ocnssf-nsconfig-577446c487-dzsh6 1/1 Running 0 18m
ocnssf-nsdb-585f7bd7d-tdth4 1/1 Running 0 18m
ocnssf-nsselection-5dfcc94bc7-q9gct 1/1 Running 0 18m
ocnssf-nssubscription-5c898fbbb9-fqcw6 1/1 Running 0 18m
ocnssf-performance-6d75c7f966-qm5fq 1/1 Running 0 18m

Solution
Configure docker registry on all primary and secondary nodes. For more information on configuring the docker registry, see Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide.
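To verify that a node can actually reach the registry, a hedged check that can be run on each primary and secondary node (the registry and image names are taken from the deployment example earlier in this chapter; substitute your own):
$ docker pull dsr-master0:5000/ocnssf-nfregistration:latest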
4.2.2.1.3 Continuous Restart of Pods
Problem
helm install might fail if the MySQL primary and
secondary hosts are not configured properly in
ocnssf-custom-values.yaml.
Error Code/Error Message
When kubectl get pods -n <ocnssf_namespace> is run, the pod restart count increases continuously.
For example:
$ kubectl get pods -n ocnssf
NAME READY STATUS RESTARTS AGE
ocnssf-appinfo-7969c9fbf7-4fmgj 1/1 Running 0 18m
ocnssf-config-server-54bf4bc8f9-s82cv 1/1 Running 0 18m
ocnssf-egress-6b6bff8949-2mf7b 1/1 Running 0 18m
ocnssf-ingress-68d76954f5-9fsfq 1/1 Running 0 18m
ocnssf-nrf-client-nfdiscovery-cf48cd8d8-l4q2q 1/1 Running 0 18m
ocnssf-nrf-client-nfdiscovery-cf48cd8d8-vmt5v 1/1 Running 0 18m
ocnssf-nrf-client-nfmanagement-7db4598fbb-672hc 1/1 Running 0 18m
ocnssf-nsavailability-644999bbfb-9gcm5 1/1 Running 0 18m
ocnssf-nsconfig-577446c487-dzsh6 1/1 Running 0 18m
ocnssf-nsdb-585f7bd7d-tdth4 1/1 Running 0 18m
ocnssf-nsselection-5dfcc94bc7-q9gct 1/1 Running 0 18m
ocnssf-nssubscription-5c898fbbb9-fqcw6 1/1 Running 0 18m
ocnssf-performance-6d75c7f966-qm5fq 1/1 Running 0 18m

Solution
The MySQL server(s) may not be configured properly according to the preinstallation steps. For configuring the MySQL servers, see the "Configuring MySQL Database and User" section in Oracle Communications Cloud Native Core, Network Slice Selection Function Installation, Upgrade, and Fault Recovery Guide.
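To confirm that the configured MySQL host is reachable from inside the cluster before reinstalling, a minimal sketch (the host name, user, and namespace are assumptions; align them with your ocnssf-custom-values.yaml):
$ kubectl -n ocnssf run mysql-check --rm -it --restart=Never --image=mysql:8.0 -- mysql -h mysql-connectivity-service -u <mysql-user> -p -e "SELECT 1;"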
4.2.2.1.4 Tiller Pod Failure
Problem
The Tiller pod is not ready, so helm install cannot run.
Error Code/Error Message
The 'could not find a ready tiller pod' error message is received.
Symptom
When you run the helm ls command, the 'could not find a ready tiller pod' error message is received.
Solution
Following is the procedure to reinstall Helm and Tiller:
1. Delete the preinstalled helm:
kubectl delete svc tiller-deploy -n kube-system
kubectl delete deploy tiller-deploy -n kube-system
2. Install helm and tiller using these commands:
helm init --client-only
helm plugin install https://github.com/rimusz/helm-tiller
helm tiller install
helm tiller start kube-system

4.2.2.2 Custom Value File Parsing Failure
Problem
Unable to parse the ocnssf_custom_values_25.1.201.yaml file while running helm install.
Error Code/Error Message
Error: failed to parse ocnssf_custom_values_25.1.201.yaml: error converting YAML to JSON: yaml
Symptom
While creating the ocnssf_custom_values_25.1.201.yaml file, if the above error is received, it means that the file is not created properly: the tree structure may not have been followed, or the file may contain tab characters.
Solution
- Download the latest NSSF templates zip file from MOS. For more information, see the "Downloading the NSSF Package and Custom Template ZIP file" section in Oracle Communications Cloud Native Core, Network Slice Selection Function Installation, Upgrade, and Fault Recovery Guide.
- Follow the steps mentioned in the "Installation Tasks" section in Oracle Communications Cloud Native Core, Network Slice Selection Function Installation, Upgrade, and Fault Recovery Guide.
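Before rerunning the installation, you can verify that the rebuilt values file parses cleanly; a hedged check using Helm's offline rendering (the chart directory is a placeholder):
helm template <nssf-chart-directory> -f ocnssf_custom_values_25.1.201.yaml > /dev/null && echo "values file parsed OK"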
4.2.3 Postinstallation
This section describes the common post installation issues and their resolution steps.
4.2.3.1 Helm Test Error Scenarios
Identify error scenarios using the helm test as follows:
- Run the following command to get the Helm test pod name:
kubectl get pods -n <deployment-namespace>
- Check for the Helm test pod that is in the error state.
- Run the following command to get the logs:
kubectl logs <podname> -n <namespace>
Example:
kubectl logs <helm_test_pod> -n ocnssf
- Depending on the failure reason, perform the resolution steps.
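A hedged one-liner to list only the test pods that ended in the error state before pulling their logs:
kubectl get pods -n ocnssf --field-selector=status.phase=Failed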
For further assistance, collect the logs and contact MOS.
4.3 Upgrade or Rollback Failure
When Oracle Communications Cloud Native Core, Network Slice Selection Function (NSSF) upgrade or rollback fails, perform the following procedure.
- Check the preupgrade or postupgrade hook logs, or the rollback hook logs, in Kibana as applicable.
You can filter upgrade or rollback logs using the following filters:
- For upgrade: lifeCycleEvent=9001
- For rollback: lifeCycleEvent=9002
Example log:
{
  "time_stamp":"2021-08-23 06:45:57.698+0000",
  "thread":"main",
  "level":"INFO",
  "logger":"com.oracle.cgbu.cne.ocnssf.hooks.releases.ReleaseHelmHook_1_14_1",
  "message":"{logMsg=Starting Pre-Upgrade hook Execution, lifeCycleEvent=9001 | Upgrade, sourceRelease=101400, targetRelease=101401}",
  "loc":"com.oracle.cgbu.ocnssf.common.utils.EventSpecificLogger.submit(EventSpecificLogger.java:94)"
}
- Check the pod logs in Kibana to analyze the cause of failure.
- After detecting the cause of failure, do the following:
- For upgrade failure:
- If the cause of upgrade failure is a database or network connectivity issue, then resolve the issue and rerun the upgrade command.
- If the cause of failure is not related to a database or network connectivity issue and is observed during the preupgrade phase, then do not perform rollback because NSSF deployment remains in the source or older release.
- If the upgrade failure occurs during the postupgrade phase, for example, post upgrade hook failure due to target release pod not moving to ready state, then perform a rollback.
- For rollback failure: If the cause of rollback failure is a database or network connectivity issue, contact your system administrator. When the issue is resolved, rerun the rollback command.
- If the issue persists, contact My Oracle Support.
4.3.1 Replication Channel Breaks While Rolling Back cnDBTier from 23.4.x to 23.3.x
Scenario
The replication channel breaks while rolling back cnDBTier from 23.4.x to 23.3.x.
Problem
Intermittently, during a rollback of cnDBTier from 23.4.x to 23.3.x in a georedundant scenario, replication goes down.
Solution
As a workaround, follow the recovery procedures in the sections "Resolving Georeplication Failure Between cnDBTier Clusters in a Two Site Replication" and "Resolving Georeplication Failure Between cnDBTier Clusters in a Three Site Replication" in Oracle Communications Cloud Native Core, cnDBTier Installation, Upgrade, and Fault Recovery Guide to recover the replication.
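Before and after the recovery procedure, the replication channel state can be checked with standard MySQL commands on the cnDBTier replication SQL pod; a sketch in which the namespace, pod name, and credentials are assumptions specific to your deployment:
kubectl -n <cndbtier-namespace> exec <replication-sql-pod> -- mysql -u <user> -p<password> -e "SHOW SLAVE STATUS\G" | grep -E 'Slave_IO_Running|Slave_SQL_Running'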
4.3.2 Helm Hook Failure and NrfClient Discovery Restart
Scenario
Helm Hook Failure and NrfClient Discovery Restart
Problem
During the upgrade process, the NSSF hooks are responsible for updating the
common_configuration table, which maintains one record per service
per version. The hooks perform the following actions:
- Preinstall/Preupgrade: Each service adds a record to the table with its name, release, and current configuration.
- Postupgrade: Removes entries related to the previous release of the same service.
If a Helm hook fails due to issues such as connectivity problems or network glitches, subsequent hooks do not execute. This can result in the postupgrade hook failing to run, leaving the common_configuration table in an inconsistent state, such as having duplicate entries or missing records for the current release.
Impact:
When the postupgrade hook fails and the table is not updated correctly,
it can cause the NrfClient Discovery service to restart. This occurs because the
NrfClient relies on receiving a 200 OK response from the config_server,
which is only returned when a valid record exists in the
common_configuration table for the current release. The
inconsistency leads to delays in the NrfClient's startup process.
Solution
Due to Helm's behavior, a failed hook prevents subsequent hooks from running. As a result, cleanup actions, such as removing old release entries and inserting valid records for the current release, may not occur. This prevents the NrfClient from starting correctly.
To ensure the NrfClient Discovery service starts as expected, perform the following steps:
- Manual Cleanup: If a Helm hook failure occurs, manually remove any entries related to previous releases from the common_configuration table. Ensure there is only one entry per service for the current release. A hedged inspection sketch follows this list.
- Verify Current Release Record: Confirm that the common_configuration table contains a valid record for each service corresponding to the current release.
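A hedged inspection sketch for the cleanup step (the table name comes from this section; the database name, pod name, and credentials are assumptions, and the column layout should be confirmed before deleting anything):
kubectl -n <namespace> exec <mysql-sql-pod> -- mysql -u <user> -p<password> <nssf-database> -e "DESCRIBE common_configuration; SELECT * FROM common_configuration;"
After confirming the schema, delete only the rows that belong to the previous release, leaving exactly one entry per service for the current release.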
4.4 Database Related Issues
This section describes the most common database related issues and their resolution steps. It is recommended to attempt the resolution steps provided in this guide before contacting Oracle Support.
4.4.1 NSSF MySQL Database Access
Problem
Keyword: wait-for-db
Tags: "config-server" "database" "readiness" "init" "SQLException" "access denied"
Due to database accessibility issues from the NSSF service, pods will stay in the init state.
Even though some pods are up, they keep receiving the following exception: "Cannot connect to database server java.sql.SQLException".
Reasons:
- The MySQL host IP address or the MySQL service name (in the case of OCCNE infrastructure) is incorrect.
- Some MySQL nodes may be down.
- The username or password given in the secrets is not created in the database or does not have the proper grants or access to the service databases.
- The databases are not created with the same names as mentioned in the ocnssf_custom_values_25.1.201.yaml file while installing NSSF.
Resolution Steps
- Check whether the database IP address is correct and pingable from the worker nodes of the Kubernetes cluster. Update the database IP address and service accordingly. If required, you can use a floating IP address as well. If the database connectivity issue persists, update to the correct IP address.
In the case of OCCNE infrastructure, instead of specifying an IP address for the MySQL connection, use the FQDN of mysql-connectivity-service to connect to the database.
- Manually log in to MySQL using the same database IP address as mentioned in the ocnssf_custom_values_25.1.201.yaml file. If a MySQL service name is used instead, run the following command to describe the service:
kubectl describe svc <mysql-servicename> -n <namespace>
Log in to the MySQL database with each of the IP addresses listed in the MySQL service description. If any SQL node is down, it can lead to intermittent database query failures, so make sure that you can log in to MySQL from all the nodes listed by the describe command. Also make sure that all the MySQL nodes are up and running before installing NSSF.
- Check the existing users in the database using the SQL query: "select user from mysql.user;"
Check that all the users mentioned in the custom values of the NSSF installation are present in the database.
Note:
Create the user with the correct password as mentioned in the secret file of the NSSF.
- Check the grants of all the users mentioned in the ocnssf_custom_values_25.1.201.yaml file using the SQL query: "show grants for <username>;" (see the sketch after this list). If a username or password issue persists, create the user with the required password and provide the grants as per the Oracle Communications Cloud Native Core, Network Slice Selection Function Installation, Upgrade, and Fault Recovery Guide.
- Check whether the databases are created with the same names as mentioned in the ocnssf_custom_values_25.1.201.yaml file for the services.
Note:
Create the databases as per the ocnssf_custom_values_25.1.201.yaml file.
- Check whether the problematic pods are all being created on one particular worker node. If so, the worker node may be the cause of the error. Try draining the problematic worker node and allowing the pods to move to another node.
4.4.2 "Communications link failure" Error
Problem
In a two-site georedundant deployment, the "Communications link failure" error is observed in the logs of every Ns-Selection pod at both sites. This error indicates a connection problem between the NSSF (Network Slice Selection Function) pods and cnDBTier (Cloud Native Database Tier), potentially disrupting data exchange and regular functionality.
The
com.mysql.cj.jdbc.exceptions.CommunicationsException error
typically signifies a failed connection attempt between NSSF and the MySQL database
server.
Resolution Steps
When there is no data or message loss, this error can be ignored. The following explains why ignoring this error is acceptable under certain circumstances.
Temporary Network Hiccups: Brief network interruptions, often caused by switch issues, router fluctuations, or temporary network congestion, can lead to connection failures without impacting data integrity. These interruptions result in the "CommunicationsException" error, but because they are momentary and no data transfer occurs in that short window, no data is lost.
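To confirm that the failures are transient rather than sustained, you can count recent occurrences in the logs; a hedged example (the pod name is a placeholder):
kubectl -n <namespace> logs <nsselection-pod> --since=10m | grep -c 'CommunicationsException'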
4.5 Troubleshooting TLS Related Issues
This section describes the TLS related issues and their resolution steps. It is recommended to attempt the resolution steps provided in this guide before contacting Oracle Support.
Problem: Handshake is not established between NSSFs.
Scenario: When the client version is TLSv1.2 and the server version is TLSv1.3.
Server Error Message
The client supported protocol versions [TLSv1.2] are not accepted by server preferences [TLSv1.3]
Client Error Message
Received fatal alert: protocol_version
Scenario: When the client version is TLSv1.3 and the server version is TLSv1.2.
Server Error Message
The client supported protocol versions [TLSv1.3] are not accepted by server preferences [TLSv1.2]
Client Error Message
Received fatal alert: protocol_version
Solution:
If the error logs have the SSL exception, do the following:
Check the TLS versions of both NSSFs. If each supports a different single TLS version (that is, NSSF 1 supports only TLS 1.2 and NSSF 2 supports only TLS 1.3, or vice versa), the handshake fails. Ensure that the TLS version is the same for both NSSFs, or revert to the default configuration for both NSSFs. The supported TLS version combinations are:
Table 4-1 TLS Version Used
| Client TLS Version | Server TLS Version | TLS Version Used |
|---|---|---|
| TLSv1.2, TLSv1.3 | TLSv1.2, TLSv1.3 | TLSv1.3 |
| TLSv1.3 | TLSv1.3 | TLSv1.3 |
| TLSv1.3 | TLSv1.2, TLSv1.3 | TLSv1.3 |
| TLSv1.2, TLSv1.3 | TLSv1.3 | TLSv1.3 |
| TLSv1.2 | TLSv1.2, TLSv1.3 | TLSv1.2 |
| TLSv1.2, TLSv1.3 | TLSv1.2 | TLSv1.2 |
Check the cipher suites supported by both NSSFs; they must either be the same or have common cipher suites present. If not, revert to the default configuration. A hedged probe for verifying the negotiated version and cipher follows.
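To check which TLS version and cipher suite a server endpoint actually negotiates, a hedged probe with openssl (host and port are placeholders):
openssl s_client -connect <nssf-host>:<port> -tls1_2 < /dev/null 2>/dev/null | grep -E 'Protocol|Cipher'
openssl s_client -connect <nssf-host>:<port> -tls1_3 < /dev/null 2>/dev/null | grep -E 'Protocol|Cipher'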
Problem: Pods are not coming up after populating the
clientDisabledExtension or serverDisabledExtension
Helm parameter.
Solution:
- Check the value of the clientDisabledExtension or serverDisabledExtension parameters. The following extensions must not be present in these parameters:
  - supported_versions
- key_share
- supported_groups
- signature_algorithms
- pre_shared_key
If any of the above values are present, remove them or revert to the default configuration so that the pod can come up.
Problem: Pods are not coming up after populating the
clientSignatureSchemes Helm parameter.
Solution:
- Check the value of the clientSignatureSchemes parameter.
- The following values should be present for this parameter:
- rsa_pkcs1_sha512
- rsa_pkcs1_sha384
- rsa_pkcs1_sha256