6 Troubleshooting SCP
This section provides information to troubleshoot the common errors that can be encountered during the installation and upgrade of Service Communication Proxy (SCP).
Generic Checklist
The following sections provide a generic checklist of troubleshooting tips.
Deployment related tips
- Are the OCSCP deployment, pods, and services created, running, and available?
Execute the following command:
# kubectl -n <namespace> get deployments,pods,svc
Inspect the output and check the following columns:
- AVAILABLE of the deployment
- READY, STATUS, and RESTARTS of the pods
- PORT(S) of the services
- Is the correct image used and are the correct environment variables set in the deployment?
Execute the following command:
# kubectl -n <namespace> get deployment <deployment-name> -o yaml
- Check whether the microservices can access each other through the REST interface.
Execute the following command:
# kubectl -n <namespace> exec <pod name> -- curl <uri>
Example:
# kubectl -n scp-svc exec $(kubectl -n scp-svc get pods -o name|cut -d'/' -f2|grep nfs) -- curl http://ocscp-nfregistration:8080/nscp-nfm/v1/nf-instances
# kubectl -n scp-svc exec $(kubectl -n scp-svc get pods -o name|cut -d'/' -f2|grep nfr) -- curl http://ocscp-nfsubscription:8080/nscp-nfm/v1/nf-instances
Note:
These commands are in their simplest form and work as intended only if exactly one nfr (registration) pod and one nfs (subscription) pod are deployed.
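When more than one pod matches the grep pattern, the command substitution in the examples above expands to several names instead of one. The following is a minimal sketch of one way around this, assuming that any one of the matching pods is acceptable for the check. The name first_pod is a hypothetical helper, not part of SCP or kubectl.

```shell
# first_pod PATTERN: read `kubectl get pods -o name` output on stdin and
# print the name of the first pod that matches PATTERN (hypothetical helper).
first_pod() {
  cut -d'/' -f2 | grep -- "$1" | head -n 1
}

# Usage, with the namespace and pattern taken from the examples above:
#   pod=$(kubectl -n scp-svc get pods -o name | first_pod nfs)
#   kubectl -n scp-svc exec "$pod" -- curl http://ocscp-nfregistration:8080/nscp-nfm/v1/nf-instances
```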
Application related tips
Execute the following command to view the logs of a pod:
# kubectl -n <namespace> logs -f <pod name>
Use '-f' to follow the logs, or pipe the output to 'grep' to search for a specific pattern.
Example:
# kubectl -n scp-svc logs -f $(kubectl -n scp-svc get pods -o name|cut -d'/' -f2|grep nfr)
# kubectl -n scp-svc logs -f $(kubectl -n scp-svc get pods -o name|cut -d'/' -f2|grep nfs)
Note:
These commands are in their simplest form and display the logs only if exactly one nfr (registration) pod and one nfs (subscription) pod are deployed.
Helm Install Failure
The helm install command might fail in the following scenarios:
Incorrect image name in ocscp-custom-values file
Problem
helm install might fail if an incorrect image name is provided in the ocscp-custom-values file.
Error Code/Error Message
When kubectl get pods -n <ocscp_namespace> is executed, the status of the pods might be ImagePullBackOff or ErrImagePull.
Solution
- Edit the ocscp-custom-values file and provide the release-specific image names and tags.
- Execute the helm install command.
- Execute kubectl get pods -n <ocscp_namespace> to verify that the status of all the pods is Running.
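To spot the affected pods quickly, the STATUS column can be filtered. The following is a minimal sketch, assuming the default kubectl get pods column layout (NAME, READY, STATUS, RESTARTS, AGE). The name image_pull_failures is a hypothetical helper.

```shell
# image_pull_failures: read `kubectl get pods` output on stdin and print the
# names of pods whose STATUS is ImagePullBackOff or ErrImagePull.
# Assumes the default column layout, where STATUS is the third column.
image_pull_failures() {
  awk 'NR > 1 && ($3 == "ImagePullBackOff" || $3 == "ErrImagePull") { print $1 }'
}

# Usage:
#   kubectl get pods -n <ocscp_namespace> | image_pull_failures
```

For each pod that is listed, the Events section of kubectl describe pod <pod name> -n <ocscp_namespace> usually shows the image reference that could not be pulled.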
Docker registry is configured incorrectly
Problem
helm install might fail if the docker registry is not configured on all primary and secondary nodes.
Error Code/Error Message
When kubectl get pods -n <ocscp_namespace> is executed, the status of the pods might be ImagePullBackOff or ErrImagePull.
Solution
Configure docker registry on all primary and secondary nodes.
Continuous Restart of Pods
Problem
helm install might fail if the MySQL primary and secondary hosts are not configured properly in ocscp-custom-values.yaml.
Error Code/Error Message
When kubectl get pods -n <ocscp_namespace> is executed, the restart count of the pods increases continuously.
Solution
The MySQL server(s) may not be configured properly. Refer to Installation Tasks for more information on MySQL configuration.
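The restart symptom can be checked without reading the whole pod table. The following is a minimal sketch, assuming the default kubectl get pods column layout, where RESTARTS is the fourth column. The name restarting_pods and the default threshold of 3 are arbitrary choices made for this example.

```shell
# restarting_pods [THRESHOLD]: read `kubectl get pods` output on stdin and
# print "name restarts" for pods whose RESTARTS count exceeds THRESHOLD
# (default 3, an arbitrary value chosen for this sketch).
restarting_pods() {
  awk -v limit="${1:-3}" 'NR > 1 && $4 + 0 > limit + 0 { print $1, $4 }'
}

# Usage:
#   kubectl get pods -n <ocscp_namespace> | restarting_pods 5
```

For a pod that is listed, kubectl logs --previous <pod name> -n <ocscp_namespace> prints the log of the last terminated container, which may show the database connection error.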
Custom Value File Parse Failure
Problem
helm install is unable to parse ocscp-custom-values-x.x.x.yaml.
Error Code/Error Message
Error: failed to parse ocscp-custom-values-x.x.x.yaml: error converting YAML to JSON: yaml
Symptom
If this error is received, the ocscp-custom-values-x.x.x.yaml file was not created properly. The tree structure may not have been followed, or the file may contain tab characters.
Solution
- Download the latest SCP templates zip file from OHC. Refer to Installation Tasks for more information.
- Follow the steps mentioned in the Installation Tasks section.
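Before recreating the file, it can help to locate stray tab characters, because YAML does not allow tabs for indentation. The following is a minimal sketch. The name find_tabs is a hypothetical helper.

```shell
# find_tabs FILE: print every line of FILE that contains a tab character,
# prefixed with its line number. Prints nothing if the file has no tabs.
find_tabs() {
  grep -n "$(printf '\t')" "$1"
}

# Usage:
#   find_tabs ocscp-custom-values-x.x.x.yaml
```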
Curl HTTP2 Not Supported
Problem
curl on the system does not support HTTP/2.
Error Code/Error Message
An unsupported protocol error is thrown, or the connection is established with HTTP/1.1 200 OK.
Symptom
If an unsupported protocol error is thrown or the connection is established with HTTP/1.1, curl on your machine may lack HTTP/2 support.
Solution
Use the following procedure to install curl with HTTP2 support:
- Make sure git is installed:
$ sudo yum install git -y
- Install nghttp2:
$ git clone https://github.com/tatsuhiro-t/nghttp2.git
$ cd nghttp2
$ autoreconf -i
$ automake
$ autoconf
$ ./configure
$ make
$ sudo make install
$ echo '/usr/local/lib' | sudo tee /etc/ld.so.conf.d/custom-libs.conf
$ sudo ldconfig
- Install the latest curl:
$ wget http://curl.haxx.se/download/curl-7.46.0.tar.bz2
(NOTE: Check for the latest version during installation.)
$ tar -xvjf curl-7.46.0.tar.bz2
$ cd curl-7.46.0
$ ./configure --with-nghttp2=/usr/local --with-ssl
$ make
$ sudo make install
$ sudo ldconfig
- Make sure HTTP2 support is available by executing the following command:
$ curl --http2-prior-knowledge -v "http://10.75.204.35:32270/nscp-disc/v1/nf-instances?requester-nf-type=AMF&target-nf-type=SMF"
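HTTP2 support can also be checked without a reachable SCP endpoint, because curl --version prints a "Features:" line that lists HTTP2 when the binary was built with nghttp2. The following is a minimal sketch. The name has_http2 is a hypothetical helper.

```shell
# has_http2: read `curl --version` output on stdin and succeed (exit 0)
# only if the "Features:" line lists HTTP2.
has_http2() {
  grep '^Features:' | grep -qw 'HTTP2'
}

# Usage:
#   curl --version | has_http2 && echo "HTTP2 supported" || echo "HTTP2 missing"
```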
Kubernetes Node Failure
Problem
A Kubernetes node goes down.
Error Code/Error Message
"NotReady" status is displayed against the Kubernetes node.
Symptom
Figure 6-1 Kubernetes Nodes Output
Solution
- Execute the following command to describe the node:
kubectl describe node <kubernetes_node_name>
Example:
kubectl describe node k8s-1.odyssey.morrisville.us.lab.oracle.com
- Check node utilization by running the following command:
kubectl top nodes
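On a cluster with many nodes, the describe step can be limited to the nodes that are not ready. The following is a minimal sketch, assuming the default kubectl get nodes column layout, where STATUS is the second column. The name not_ready_nodes is a hypothetical helper.

```shell
# not_ready_nodes: read `kubectl get nodes` output on stdin and print the
# names of nodes whose STATUS starts with NotReady.
not_ready_nodes() {
  awk 'NR > 1 && $2 ~ /^NotReady/ { print $1 }'
}

# Usage: describe every node that is not ready.
#   kubectl get nodes | not_ready_nodes | while read -r node; do
#     kubectl describe node "$node"
#   done
```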
SCP DB goes into deadlock state
Problem
MySQL locks get stuck.
Error Code/Error Message
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
Symptom
Unable to access MySQL.
Solution
- Execute the following command on each SQL node:
SELECT CONCAT('KILL ', id, ';') FROM INFORMATION_SCHEMA.PROCESSLIST WHERE `User` = '<DbUsername>' AND `db` = '<DbName>';
This query returns a list of KILL statements, one for each matching connection.
Example:
select CONCAT('KILL ', id, ';') FROM INFORMATION_SCHEMA.PROCESSLIST where `User` = 'scpuser' AND `db` = 'ocscpdb';
+--------------------------+
| CONCAT('KILL ', id, ';') |
+--------------------------+
| KILL 204491;             |
| KILL 200332;             |
| KILL 202845;             |
+--------------------------+
3 rows in set (0.00 sec)
- Execute the returned KILL statements on each SQL node.
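The two steps above can be scripted so that the KILL statements are reviewed before they run. The following is a minimal sketch, assuming that the mysql client can log in without further options. The name kill_query and the file name kill.sql are hypothetical, and the user and database values are inserted without escaping.

```shell
# kill_query USER DB: print the query from the first step with the
# placeholders filled in (hypothetical helper; values are not escaped).
kill_query() {
  printf "SELECT CONCAT('KILL ', id, ';') FROM INFORMATION_SCHEMA.PROCESSLIST WHERE \`User\` = '%s' AND \`db\` = '%s';\n" "$1" "$2"
}

# Usage on each SQL node:
#   mysql -N -B -e "$(kill_query scpuser ocscpdb)" > kill.sql   # -N omits the header row
#   cat kill.sql        # review the generated statements
#   mysql < kill.sql    # execute them
```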
Tiller Pod Failure
Problem
Tiller Pod is not ready to run helm install.
Error Code/Error Message
The 'could not find a ready tiller pod' error message is received.
Symptom
When helm ls is executed, the 'could not find a ready tiller pod' message is received.
Solution
- Delete the pre-installed helm:
kubectl delete svc tiller-deploy -n kube-system
kubectl delete deploy tiller-deploy -n kube-system
- Install helm and tiller using the following commands:
helm init --client-only
helm plugin install https://github.com/rimusz/helm-tiller
helm tiller install
helm tiller start kube-system