7 Troubleshooting Cloud Native Core Policy (CNC Policy)
This section provides information for troubleshooting the common errors that can be encountered during the installation and upgrade of Cloud Native Core Policy (CNC Policy).
If helm install Command Fails
This section covers the likely reasons and the troubleshooting procedures to follow if the helm install command fails.
Reasons for helm install failure:
- Chart syntax issue [This issue appears within the first few seconds]
Resolve the chart-specific issues and rerun the helm install command. In this case, no hooks should have started.
- Most likely reason [TIMEOUT]
If any job is stuck in a pending or error state and cannot execute, the command times out after 5 minutes, which is the default timeout for the helm command. In this case, follow the troubleshooting steps below.
- helm install command fails because of a duplicated chart
helm install /home/cloud-user/pcf_1.6.1/sprint3.1/ocpcf-1.6.1-sprint.3.1.tgz --name ocpcf2 --namespace ocpcf2 -f <custom-value-file>
Error: release ocpcf2 failed: configmaps "perfinfo-config-ocpcf2" already exists
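The error line names the kind and the name of the object that already exists. The following is a minimal sketch, assuming a POSIX shell and an error line in exactly the format shown above; the helper name conflicting_object is hypothetical and is not part of Helm or CNC Policy.

```shell
# Hypothetical helper: read Helm output on standard input and print the
# resource kind and name from an "already exists" error line.
# Lines that do not match the expected format produce no output.
conflicting_object() {
  sed -n 's/^Error: release .* failed: \([a-z]*\) "\([^"]*\)" already exists$/\1 \2/p'
}

# Example using the error text from the failed install.
echo 'Error: release ocpcf2 failed: configmaps "perfinfo-config-ocpcf2" already exists' | conflicting_object
# prints: configmaps perfinfo-config-ocpcf2
```

The two printed values can then be passed to kubectl get <kind> <name> -n <release-namespace> to confirm that the object is still present in the namespace.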
In this example, the configmap 'perfinfo-config-ocpcf2' already exists, so creating the Kubernetes objects after the pre-upgrade hooks fails. In this case also, follow the troubleshooting steps below.
Troubleshooting steps:
- Check the describe output and the logs of the failed pods and fix them accordingly. To verify what went wrong during the installation of CNC Policy, check the following points:
For pods that were not started, run the following command to check the failed pods:
kubectl describe pod <pod-name> -n <release-namespace>
For pods that were started but failed to reach the "READY" state, run the following command to check the logs of the failed pods:
kubectl logs <pod-name> -n <release-namespace>
- Execute the following command to list the Kubernetes objects:
kubectl get all -n <release_namespace>
This gives a detailed overview of which objects are stuck or in a failed state.
- Execute the following command to delete all Kubernetes objects:
kubectl delete all --all -n <release_namespace>
- Execute the following command to delete all current configmaps:
kubectl delete cm --all -n <release-namespace>
- Execute the following statements to clean up the databases created by the helm install command and create the databases again:
DROP DATABASE IF EXISTS occnp_audit_service;
DROP DATABASE IF EXISTS occnp_config_server;
DROP DATABASE IF EXISTS occnp_pcf_am;
DROP DATABASE IF EXISTS occnp_pcf_sm;
DROP DATABASE IF EXISTS occnp_pcf_user;
DROP DATABASE IF EXISTS occnp_pcrf_core;
DROP DATABASE IF EXISTS occnp_release;
DROP DATABASE IF EXISTS occnp_binding;
CREATE DATABASE IF NOT EXISTS occnp_audit_service;
CREATE DATABASE IF NOT EXISTS occnp_config_server;
CREATE DATABASE IF NOT EXISTS occnp_pcf_am;
CREATE DATABASE IF NOT EXISTS occnp_pcf_sm;
CREATE DATABASE IF NOT EXISTS occnp_pcf_user;
CREATE DATABASE IF NOT EXISTS occnp_pcrf_core;
CREATE DATABASE IF NOT EXISTS occnp_release;
CREATE DATABASE IF NOT EXISTS occnp_binding;
- Execute the following command to check the state of the release:
helm ls --all
If the release is in a failed state, purge it using the following command:
helm delete --purge <release-name>
Note:
If this command takes a long time to complete, run the following loop in parallel in another session to clear all the delete jobs:
while true; do kubectl delete jobs --all -n <release_namespace>; sleep 5; done
While the loop runs, wait for the purge command in the original session to finish:
helm delete --purge <release-name>
Once it succeeds, press "ctrl+c" to stop the loop.
- After cleaning up and recreating the databases, run the helm install command again.
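The database cleanup step can also be scripted. The following is a minimal sketch, assuming a POSIX shell; the helper name reset_sql is hypothetical and is not part of CNC Policy. It prints the DROP and CREATE statements for the eight databases named in the cleanup step so that they can be reviewed before they are run.

```shell
# Database names taken from the cleanup step in this section.
DATABASES="occnp_audit_service occnp_config_server occnp_pcf_am occnp_pcf_sm occnp_pcf_user occnp_pcrf_core occnp_release occnp_binding"

# Hypothetical helper: print a DROP statement for every database first,
# then a CREATE statement for every database, in the listed order.
reset_sql() {
  for db in $DATABASES; do
    printf 'DROP DATABASE IF EXISTS %s;\n' "$db"
  done
  for db in $DATABASES; do
    printf 'CREATE DATABASE IF NOT EXISTS %s;\n' "$db"
  done
}

# Print the 16 statements for review.
reset_sql
```

The output could then be piped to a SQL client, for example reset_sql | mysql -h <db-host> -u <db-user> -p. The client and its connection options are assumptions here and depend on the database used in the deployment.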
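To narrow down which pods need a closer look with kubectl describe or kubectl logs, the output of kubectl get pods can be filtered. The following is a minimal sketch, assuming a POSIX shell with awk and the default column layout of kubectl get pods (NAME, READY, STATUS, and so on, with a header line). The helper name not_ready_pods and the sample pod names are hypothetical.

```shell
# Hypothetical helper: read "kubectl get pods" output on standard input and
# print the name and status of every pod whose READY count is incomplete.
# The first line is treated as the header and skipped. Pods in the
# Completed state (finished jobs) are also skipped.
not_ready_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")
    if (ready[1] != ready[2] && $3 != "Completed") print $1, $3
  }'
}

# Illustrative sample input; the pod names are made up.
not_ready_pods <<'EOF'
NAME             READY   STATUS             RESTARTS   AGE
example-pod-a    1/1     Running            0          5m
example-pod-b    0/1     Pending            0          5m
example-pod-c    0/1     CrashLoopBackOff   3          5m
example-job-d    0/1     Completed          0          5m
EOF
```

Against a live cluster, the same filter would be used as kubectl get pods -n <release-namespace> | not_ready_pods.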