2 Troubleshooting Service Communication Proxy

This section provides information to troubleshoot common errors that occur while installing and upgrading Service Communication Proxy (SCP).

Note:

kubectl commands might vary based on the platform deployment. Replace kubectl with the Kubernetes environment-specific command line tool used to configure Kubernetes resources through the kube-api server. The instructions in this document are based on the Oracle Communications Cloud Native Environment (OCCNE) version of the kube-api server.

2.1 Generic Checklist

The following sections provide a generic checklist of troubleshooting tips.

Deployment Related Tips

Perform the following checks to verify the deployment:
  • Are OCSCP deployment, pods, and services created, running, and available?
    To check this, run the following command:
    # kubectl -n <namespace> get deployments,pods,svc
    Inspect the output and check the following columns (a sample session follows this checklist):
    • READY, STATUS, and RESTARTS
    • PORT(S)
  • Is the correct image used and the correct environment variables set in the deployment?
    To check this, run the following command:
    # kubectl -n <namespace> get deployment <deployment-name> -o yaml
  • Check if the microservices can access each other through the REST interface.

    To check this, run the following command:

    # kubectl -n <namespace> exec <pod name> -- curl <uri>
    Example:
    kubectl exec -it ocscp-scpc-subscription-6bf9b7d69f-qvcnn -n ocscp -- curl http://ocscp-scpc-notification:8082/ocscp/scpc-notification/v2/compositecustomobjects
    kubectl exec -it ocscp-scpc-notification-dd5c74869-cswkb -n ocscp -- curl http://ocscp-scpc-configuration:8081/ocscp/scpc-configuration/v1/systemoptions/sse

    Note:

    These commands are in their simplest form and return a response only if the scpc-notification and scpc-configuration pods are deployed.

The list of URIs for all the microservices:

  • scp-worker: http://ocscp-scp-worker:8000/hostNFMapper
  • scpc-configuration: http://ocscp-scpc-configuration:8081/ocscp/scpc-configuration/v1/systemoptions/sse Or any other configuration URI from SWAGGER-UI
  • scpc-notification: http://ocscp-scpc-notification:8082/ocscp/scpc-notification/v2/compositecustomobjects
  • scpc-subscription: http://ocscp-scpc-subscription:8080/ocscp/scpc-subscription/v1/appstate
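
For example, assuming SCP is deployed in a namespace named ocscp (the namespace is illustrative), a quick health check looks as follows:

# kubectl -n ocscp get deployments,pods,svc

In the output, confirm that each deployment reports the expected READY count, each pod shows the STATUS Running with a stable RESTARTS count, and each service lists the expected PORT(S), such as 8000, 8081, and 8082 for the worker, configuration, and notification services.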

Application Related Tips

Run the following command to check the application logs and look for exceptions:
# kubectl -n <namespace> logs -f <pod name>

You can use '-f' to follow the logs or 'grep' to search for a specific pattern in the log output (see the sketch after the note below).

Example:

# kubectl -n scp-svc logs -f $(kubectl -n scp-svc get pods -o name|cut -d'/' -f 2|grep nfr)

Note:

These commands are in their simplest form and display the logs only if a single scp<registration> pod and a single nf<subscription> pod are deployed.
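
As an illustration, the following commands filter the worker logs for errors and follow the last 200 lines; the pod name placeholder follows the convention used elsewhere in this guide:

# kubectl -n ocscp logs ocscp-scp-worker-xxxxx | grep -i 'error\|exception'
# kubectl -n ocscp logs -f --tail 200 ocscp-scp-worker-xxxxx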

2.2 Helm Install Failure

This section describes Helm installation failure scenarios.

2.2.1 Incorrect Image Name in the ocscp-custom-values File

Problem

helm install might fail if an incorrect image name is provided in the ocscp_values.yaml file.

Error Code or Error Message

When you run kubectl get pods -n <ocscp_namespace>, the status of the pods might be ImagePullBackOff or ErrImagePull.

Solution

Perform the following steps to verify and correct the image name:
  1. Edit the ocscp_values.yaml file and provide release-specific image names and tags.
  2. Run the helm install command.
  3. Run the kubectl get pods -n <ocscp_namespace> command to verify if the status of all the pods is Running.
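
To quickly compare the configured images against the release-specific values in ocscp_values.yaml, you can list the image used by every deployment; this is a generic kubectl sketch rather than an SCP-specific command:

$ kubectl -n <ocscp_namespace> get deployments -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.template.spec.containers[*].image}{"\n"}{end}'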

2.2.2 Docker Registry is Incorrectly Configured

Problem

helm install might fail if the docker registry is not configured on all primary and secondary nodes.

Error Code or Error Message

When you run kubectl get pods -n <ocscp_namespace>, the status of the pods might be ImagePullBackOff or ErrImagePull.

Solution

Configure docker registry on all primary and secondary nodes.

For information about docker registry configuration, see Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide.

2.2.3 Continuous Restart of Pods

Problem

helm install might fail if the MySQL primary and secondary hosts are not configured properly in the ocscp-custom-values.yaml file.

Error Code/Error Message

When you run kubectl get pods -n <ocscp_namespace>, the pod restart count increases continuously.

Solution

MySQL servers may not be configured properly. For more information about the MySQL configuration, see Oracle Communications Cloud Native Core, Service Communication Proxy Installation, Upgrade, and Fault Recovery Guide.

2.3 Custom Value File Parse Failure

This section explains the troubleshooting procedure for failures that occur while parsing the custom-values.yaml file.

Problem

Unable to parse the ocscp_values-x.x.x.yaml file while running Helm install.

Error Code or Error Message

Error: failed to parse ocscp_values-x.x.x.yaml: error converting YAML to JSON: yaml

Symptom

When parsing the ocscp-custom-values-x.x.x.yaml file, if the above-mentioned error is received, it indicates that the file could not be parsed because of the following reasons:
  • The tree structure may not have been followed
  • There may be tab characters in the file

Solution

Download the ocscp_csar_23_2_0_0_0.zip file from MOS and complete the steps as described in Oracle Communications Cloud Native Core, Service Communication Proxy Installation, Upgrade, and Fault Recovery Guide.
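
Before re-downloading, a quick local check can often locate the problem. The following sketch assumes GNU grep and a Python interpreter with the PyYAML module are available, and the file name is an example:

$ grep -nP '\t' ocscp_custom_values.yaml
$ python3 -c "import yaml, sys; yaml.safe_load(open(sys.argv[1])); print('YAML OK')" ocscp_custom_values.yaml

The first command lists any lines containing tab characters, and the second reports the line and column of the first syntax error, if any.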

2.4 Curl HTTP2 Not Supported

Problem

curl http2 is not supported on the system.

Error Code or Error Message

Unsupported protocol error is thrown or connection is established with HTTP/1.1 200 OK

Symptom

If unsupported protocol error is thrown or connection is established with http1.1, it is an indication that curl http2 support may not be present on your machine.

Solution

Note:

You must check the software platform policies before executing the following procedure.

Following is the procedure to install curl with HTTP2 support:

  1. Run the following command to ensure that Git is installed:
    $ sudo yum install git -y 
  2. Run the following commands to install nghttp2:
    $ git clone https://github.com/tatsuhiro-t/nghttp2.git
    $ cd nghttp2
    $ autoreconf -i
    $ automake
    $ autoconf 
    $ ./configure
    $ make
    $ sudo make install
    $ sudo sh -c "echo '/usr/local/lib' > /etc/ld.so.conf.d/custom-libs.conf"
    $ sudo ldconfig
  3. Run the following commands to install the latest Curl:
    $ wget http://curl.haxx.se/download/curl-7.46.0.tar.bz2  (NOTE: Check for latest version during Installation)
    $ tar -xvjf curl-7.46.0.tar.bz2
    $ cd curl-7.46.0 
    $ ./configure --with-nghttp2=/usr/local --with-ssl 
    $ make 
    $ sudo make install 
    $ sudo ldconfig
  4. Run the following command to ensure that HTTP2 is added in features:
    $ curl --http2-prior-knowledge -v "http://10.75.204.35:32270/nscp-disc/v1/nf-instances?requester-nf-type=AMF&target-nf-type=SMF"
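
You can also confirm that the rebuilt curl advertises HTTP2 support before testing against SCP:

$ curl -V | grep -i http2

The Features line in the output should include HTTP2; if it does not, the build did not pick up nghttp2.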

2.5 SCP DB goes into the Deadlock State

Problem

MySQL locks get stuck.

Error Code/Error Message

ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction.

Symptom

Unable to access MySQL.

Solution

Following is the procedure to remove the deadlock:
  1. Run the following command on each SQL node:
    SELECT
    CONCAT('KILL ', id, ';')
    FROM INFORMATION_SCHEMA.PROCESSLIST
    WHERE `User` = <DbUsername>
    AND `db` = <DbName>;

    This command retrieves the list of commands to kill each connection.

    Example:
    select
     CONCAT('KILL ', id, ';')
     FROM INFORMATION_SCHEMA.PROCESSLIST
     where `User` = 'scpuser'
          AND `db` = 'ocscpdb';
    +--------------------------+
    | CONCAT('KILL ', id, ';') |
    +--------------------------+
    | KILL 204491;             |
    | KILL 200332;             |
    | KILL 202845;             |
    +--------------------------+
    3 rows in set (0.00 sec)
  2. Run the generated KILL commands on each SQL node.
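
As an illustration, both steps can be combined from a shell on each SQL node. The host, credentials, user, and database names below are examples and must match your deployment:

$ mysql -h <sql-node-host> -u root -p -N -B -e "SELECT CONCAT('KILL ', id, ';') FROM INFORMATION_SCHEMA.PROCESSLIST WHERE User='scpuser' AND db='ocscpdb';" > /tmp/kill_connections.sql
$ mysql -h <sql-node-host> -u root -p < /tmp/kill_connections.sql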

2.6 Helm Test Error Scenarios

Following are the error scenarios that may be identified using helm test.

Note:

You must clean up any residual job from the SCP deployment before performing the following procedure.
  1. Run the following command to retrieve the Helm Test pod name:
    kubectl get pods -n <deployment-namespace>
  2. Check for the Helm Test pod that is in the error state:

    Figure 2-1 Helm test pod
  3. Run the following command to retrieve the logs:
    kubectl logs <podname> -n <namespace>
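
If a residual Helm test job is found, a typical cleanup and rerun sequence looks like the following; the release and job names are placeholders:

$ kubectl get jobs -n <deployment-namespace>
$ kubectl delete job <residual-helm-test-job> -n <deployment-namespace>
$ helm test <release-name> -n <deployment-namespace>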

Readiness Probe Failure

Helm install might fail due to the readiness probe URL failure.

If the following error appears, check the readiness probe URL in the respective microservice Helm charts under the charts folder:

Figure 2-2 Readiness Probe Failure

Low Resources

Helm install might fail due to low resources, and the following error may appear:

Figure 2-3 Low Resource

In this case, check the CPU and memory availability in the Kubernetes cluster.

2.7 Using Debug Tool

The Debug Tool provides third-party troubleshooting tools for debugging runtime issues in a lab environment. The following tools are available:

  • tcpdump
  • ip
  • netstat
  • curl
  • ping
  • nmap
  • dig

2.7.1 Prerequisites to Use the Debug Tool

This section describes the prerequisites for using the debug tool.

Note:

  • For CNE 23.2.0 and later versions, follow Step 1.
  • For CNE versions prior to 23.2.0, follow Step 2.
  • The debug tool requires security context with the following permissions:
    securityContext:
      allowPrivilegeEscalation: true
      capabilities:
        drop:
        - ALL
        add:
        - NET_RAW
        - NET_ADMIN
      runAsUser: <user-id>
    volumeMounts:
    - mountPath: /tmp/tools
      name: debug-tools-dir

    For an OpenShift environment, a security context constraint that allows the above permissions must exist to enable debug tool deployment.

Ensure that you have admin privileges.
  1. If you are using CNE 23.2.0 or a later version, verify whether the current disallow-capabilities cluster policy already has a namespace in it by running the following command:
    $ kubectl get clusterpolicies disallow-capabilities -oyaml

    Sample output:

    apiVersion: kyverno.io/v1
    kind: ClusterPolicy
    ...
    ...
    spec:
      rules:
      - exclude:
          any:
          - resources: {}
    1. If there are no namespaces, then patch the policy using the following command to add <namespace> under resources:
      $ kubectl patch clusterpolicy disallow-capabilities --type=json \
        -p='[{"op": "add", "path": "/spec/rules/0/exclude/any/0/resources", "value": {"namespaces":["<namespace>"]} }]'

      Example:

      $ kubectl patch clusterpolicy disallow-capabilities --type=json \
        -p='[{"op": "add", "path": "/spec/rules/0/exclude/any/0/resources", "value": {"namespaces":["ocscp"]} }]'

      Sample output:

      apiVersion: kyverno.io/v1
      kind: ClusterPolicy
      ...
      ...
      spec:
        rules:
        - exclude:
            any:
            - resources:
                namespaces:
                - ocscp
    2. To remove the namespace added in the previous step, run the following command:
      $ kubectl patch clusterpolicy disallow-capabilities --type=json \
        -p='[{"op": "replace", "path": "/spec/rules/0/exclude/any/0/resources", "value": {} }]'

      Sample output:

      apiVersion: kyverno.io/v1
      kind: ClusterPolicy
      ...
      ...
      spec:
        rules:
        - exclude:
            any:
            - resources: {}
    3. To add a namespace to an existing namespace list, run the following command to verify if the current disallow-capabilities cluster policy has namespaces in it:
      $ kubectl get clusterpolicies disallow-capabilities -oyaml

      Sample output:

      apiVersion: kyverno.io/v1
      kind: ClusterPolicy
      ...
      ...
      spec:
        rules:
        - exclude:
            any:
            - resources:
                namespaces:
                - namespace1
                - namespace2
                - namespace3
    4. If namespaces are already added, then patch the policy by running the following command to add <namespace> to the existing list:
      $ kubectl patch clusterpolicy disallow-capabilities --type=json \
        -p='[{"op": "add", "path": "/spec/rules/0/exclude/any/0/resources/namespaces/-", "value": "<namespace>" }]'

      Example:

      $ kubectl patch clusterpolicy disallow-capabilities --type=json \
        -p='[{"op": "add", "path": "/spec/rules/0/exclude/any/0/resources/namespaces/-", "value": "ocscp" }]'

      Sample output:

      apiVersion: kyverno.io/v1
      kind: ClusterPolicy
      ...
      ...
      spec:
        rules:
        - exclude:
            any:
            - resources:
                namespaces:
                - namespace1
                - namespace2
                - namespace3
                - ocscp
    5. To remove the namespace added in the previous step, run the following command:
      $ kubectl patch clusterpolicy disallow-capabilities --type=json \
        -p='[{"op": "remove", "path": "/spec/rules/0/exclude/any/0/resources/namespaces/<index>"}]'

      Example:

      $ kubectl patch clusterpolicy disallow-capabilities --type=json \
        -p='[{"op": "remove", "path": "/spec/rules/0/exclude/any/0/resources/namespaces/3"}]'

      Sample output:

      apiVersion: kyverno.io/v1
      kind: ClusterPolicy
      ...
      ...
      spec:
        rules:
        - exclude:
            any:
            - resources:
                namespaces:
                - namespace1
                - namespace2
                - namespace3

      Note:

      While removing the namespace, provide the index value for namespace within the array. The index starts from '0'.
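
      To view the current namespace array, and therefore the index of each entry starting at 0, the following generic kubectl query can be used:

      $ kubectl get clusterpolicy disallow-capabilities -o jsonpath='{.spec.rules[0].exclude.any[0].resources.namespaces}'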
  2. If you are using CNE versions prior to 23.2.0, create a PodSecurityPolicy (PSP) on the Bastion Host by following these steps:
    1. Log in to the Bastion Host.
    2. Run the following command to create a new PSP:

      Note:

      The readOnlyRootFilesystem, allowPrivilegeEscalation, and allowedCapabilities parameters are required by the debug container. Other parameters are mandatory for PSP creation and can be customized as per the CNE environment. It is recommended to use the default values.
      $ kubectl apply -f - <<EOF
      
      apiVersion: policy/v1beta1
      kind: PodSecurityPolicy
      metadata:
        name: debug-tool-psp
      spec:
        readOnlyRootFilesystem: false
        allowPrivilegeEscalation: true
        allowedCapabilities:
        - NET_ADMIN
        - NET_RAW
        fsGroup:
          ranges:
          - max: 65535
            min: 1
          rule: MustRunAs
        runAsUser:
          rule: MustRunAsNonRoot
        seLinux:
          rule: RunAsAny
        supplementalGroups:
          rule: RunAsAny
        volumes:
        - configMap
        - downwardAPI
        - emptyDir
        - persistentVolumeClaim
        - projected
        - secret
      EOF
    3. Run the following command to create a role for PSP:
      kubectl apply -f - <<EOF
      apiVersion: rbac.authorization.k8s.io/v1
      kind: Role
      metadata:
        name: debug-tool-role
        namespace: ocscp
      rules:
      - apiGroups:
        - policy
        resources:
        - podsecuritypolicies
        verbs:
        - use
        resourceNames:
        - debug-tool-psp
      EOF
    4. Run the following command to associate the service account for your NF namespace with the role created for the tool PSP:
      $ kubectl apply -f - <<EOF
      apiVersion: rbac.authorization.k8s.io/v1
      kind: RoleBinding
      metadata:
        name: debug-tool-rolebinding
        namespace: ocscp
      roleRef:
        apiGroup: rbac.authorization.k8s.io
        kind: Role
        name: debug-tool-role
      subjects:
      - kind: Group
        apiGroup: rbac.authorization.k8s.io
        name: system:serviceaccounts
      EOF
    For information about parameters, see Debug Tool Configuration Parameters.
  3. To complete the NF-specific Helm configurations, do the following:
    1. Log in to the NF server.
    2. Run the following command to open the ocscp_custom_values.yaml file:
      $ vim <ocscp_custom_values file>
    3. In the global configuration section, add the following:
      extraContainersTpl: |
          - command:
              - /bin/sleep
              - infinity
            name: tools
            resources:
              requests:
                cpu: "1"
                memory: {{ .Values.global.debugToolContainerMemoryLimit | quote }}
                ephemeral-storage: "2Gi"
              limits:
                cpu: "1"
                memory: {{ .Values.global.debugToolContainerMemoryLimit | quote }}
                ephemeral-storage: "4Gi"
            securityContext:
              allowPrivilegeEscalation: true
              capabilities:
                drop:
                - ALL
                add:
                - NET_RAW
                - NET_ADMIN
              runAsUser: <user-id>
            volumeMounts:
            - mountPath: /tmp/tools
              name: debug-tools-dir

      Note:

      • The Debug Tool Container has a default user ID: 7000. If you want to override this default value, use the runAsUser field; otherwise, the field can be skipped.

        Default value: uid=7000(debugtool) gid=7000(debugtool) groups=7000(debugtool)

      • If you want to customize the container name, replace the name field in the above mentioned values.yaml with the following:
        name: {{ printf "%s-tools-%s" (include "getprefix" .) (include "getsuffix" .) | trunc 63 | trimPrefix "-" | trimSuffix "-"  }}
        This ensures that the container name is prefixed and suffixed with the required values.
    4. In service specific configurations for which debugging is required, add the following:
      # Allowed Values: DISABLED, ENABLED, USE_GLOBAL_VALUE
      extraContainers: USE_GLOBAL_VALUE

      Note:

      • At the global level, the extraContainers parameter can be used to enable or disable injecting extra containers globally. This ensures that all the services that use this global value have extra containers enabled or disabled using a single parameter.
      • At the service level, the extraContainers parameter determines whether to use the extra container configuration from the global level, or enable or disable injecting extra containers for the specific service.

2.7.2 Running the Debug Tool

Perform the following procedure to run the debug tool.
  1. To enter the debug tool container, run the following command to retrieve the pod details:
    $ kubectl get pods -n <k8s namespace>

    Example:

    $ kubectl get pods -n ocscp
    Sample Output:
    NAME                                        READY   STATUS    RESTARTS   AGE
    ocscp-scp-worker-58c8469595-5d6gc           2/2     Running   0          3d12h
    ocscp-scpc-audit-d8f5cdc96-4nw7c            2/2     Running   0          3d12h
    ocscp-scpc-configuration-774f9bc65b-f9ttn   2/2     Running   0          3d12h
    ocscp-scpc-notification-779965766-9ckr5     2/2     Running   0          3d12h
    ocscp-scpc-subscription-7bf6d6c884-2r4vp    2/2     Running   0          3d12h
    
  2. Run the following command to enter the debug tool container:
    $ kubectl exec -it <pod name> -c <debug_container name> -n <namespace> -- bash
    Example:
    $ kubectl exec -it ocscp-scpc-notification-779965766-9ckr5 -c tools -n ocscp -- bash
  3. Run the debug tools:
    bash -4.2$ <debug_tools>
    Example:
    bash -4.2$ tcpdump
  4. Copy the output files from container to host:
    $ kubectl cp -c <debug_container name> <pod name>:<file location in container> -n <namespace> <destination location>
    Example:
    $ kubectl cp -c tools ocscp-scpc-notification-779965766-9ckr5:/tmp/capture.pcap -n ocscp /tmp/
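
As an end-to-end illustration, the following non-interactive capture assumes that the coreutils timeout utility is available in the tools container and that eth0 is the interface of interest:

$ kubectl exec -c tools ocscp-scpc-notification-779965766-9ckr5 -n ocscp -- timeout 60 tcpdump -i eth0 -w /tmp/capture.pcap
$ kubectl cp -c tools ocscp-scpc-notification-779965766-9ckr5:/tmp/capture.pcap -n ocscp /tmp/capture.pcap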

2.7.3 Debug Tool Configuration Parameters

Following are the parameters used to configure the debug tool.

CNE Parameters

Table 2-1 CNE Parameters

Parameter Description
apiVersion Defines the version schema of this representation of an object.
kind Indicates a string value representing the REST resource this object represents.
metadata Indicates the metadata of Standard object.
metadata.name Indicates the metadata name that must be unique within a namespace.
spec Defines the policy enforced.
spec.readOnlyRootFilesystem Controls whether the containers run with a read-only root filesystem, that is, no writable layer.
spec.allowPrivilegeEscalation Gates whether or not a user is allowed to set the security context of a container to allowPrivilegeEscalation=true.
spec.allowedCapabilities Provides a list of capabilities that are allowed to be added to a container.
spec.fsGroup Controls the supplemental group applied to some volumes. RunAsAny allows any fsGroup ID to be specified.
spec.runAsUser Controls which user ID the containers are run with. RunAsAny allows any runAsUser to be specified.
spec.seLinux RunAsAny allows any seLinuxOptions to be specified.
spec.supplementalGroups Controls which group IDs containers add. RunAsAny allows any supplementalGroups to be specified.
spec.volumes Provides a list of allowed volume types. The allowed values correspond to the volume sources that are defined when creating a volume.

Role Creation Parameters

Table 2-2 Role Creation

Parameter Description
apiVersion Defines the versioned schema of this representation of an object.
kind Indicates a string value representing the REST resource this object represents.
metadata Indicates the metadata of Standard object.
metadata.name Indicates the name of metadata that must be unique within a namespace.
metadata.namespace Defines the space within which each name must be unique.
rules Manages all the PolicyRules for this Role.
apiGroups Indicates the name of the APIGroup that contains the resources.
rules.resources Indicates the list of resources this rule applies to.
rules.verbs Indicates the list of Verbs that applies to ALL the ResourceKinds and AttributeRestrictions contained in this rule.
rules.resourceNames Indicates an optional white list of names that the rule applies to.

Table 2-3 Role Binding Creation

Parameter Description
apiVersion Defines the versioned schema of this representation of an object.
kind Indicates the string value representing the REST resource this object represents.
metadata Indicates the metadata of Standard object.
metadata.name Indicates the name that must be unique within a namespace.
metadata.namespace Defines the space within which each name must be unique.
roleRef References a Role in the current namespace or a ClusterRole in the global namespace.
roleRef.apiGroup Indicates the group for the resource being referenced.
roleRef.kind Indicates the type of resource being referenced.
roleRef.name Indicates the name of resource being referenced.
subjects Manages references to the objects the role applies to.
subjects.kind Indicates the type of object being referenced. Values defined by this API group are "User", "Group", and "ServiceAccount".
subjects.apiGroup Manages the API group of the referenced subject.
subjects.name Indicates the name of the object being referenced.

Debug Tool Configuration Parameters

Table 2-4 Debug Tool Configuration Parameters

Parameter Description
command Indicates the string array used for container command.
image Indicates the docker image name.
imagePullPolicy Indicates the Image Pull Policy.
name Indicates the name of the container.
resources Indicates the Compute Resources required by this container.
resources.limits Describes the maximum amount of compute resources allowed.
resources.requests Describes the minimum amount of compute resources required.
resources.limits.cpu Indicates the CPU limits.
resources.limits.memory Indicates the Memory limits.
resources.limits.ephemeral-storage Indicates the Ephemeral Storage limits.
resources.requests.cpu Indicates the CPU requests.
resources.requests.memory Indicates the Memory requests.
resources.requests.ephemeral-storage Indicates the Ephemeral Storage requests.
securityContext Indicates the Security options the container should run with.
securityContext.allowPrivilegeEscalation AllowPrivilegeEscalation controls whether a process can gain more privileges than its parent process. It directly controls whether the no_new_privs flag is set on the container process.
securityContext.readOnlyRootFilesystem Indicates whether this container has a read-only root filesystem. The default value is false.
securityContext.capabilities Indicates the capabilities to add or drop when running containers. It defaults to the default set of capabilities granted by the container runtime.
securityContext.capabilities.drop Indicates the removed capabilities.
securityContext.capabilities.add Indicates the added capabilities.
securityContext.runAsUser Indicates the UID to run the entrypoint of the container process.
extraContainersTpl.volumeMounts Indicates that the parameter is used for mounting the volume.
extraContainersTpl.volumeMounts.mountPath Indicates the path for volume mount.
extraContainersTpl.volumeMounts.name Indicates the name of the directory for debug tool logs storage.
extraContainersVolumesTpl.name Indicates the name of the volume for debug tool logs storage.
extraContainersVolumesTpl.emptyDir.medium Indicates where the emptyDir volume is stored.
extraContainersVolumesTpl.emptyDir.sizeLimit Indicates the emptyDir volume size.

2.7.4 Tools Tested in Debug Container

The following tables describe the list of debug tools that are tested.

Table 2-5 tcpdump

Options Tested Description Output Capabilities
-D Print the list of the network interfaces available on the system and on which tcpdump can capture packets. tcpdump -D
  1. eth0
  2. nflog (Linux netfilter log (NFLOG) interface)
  3. nfqueue (Linux netfilter queue (NFQUEUE) interface)
  4. any (Pseudo-device that captures on all interfaces)
  5. lo [Loopback]
NET_ADMIN, NET_RAW
-i Listen on interface. tcpdump -i eth0

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
12:10:37.381199 IP cncc-core-ingress-gateway-7ffc49bb7f-2kkhc.46519 > kubernetes.default.svc.cluster.local.https: Flags [P.], seq 1986927241:1986927276, ack 1334332290, win 626, options [nop,nop,TS val 849591834 ecr 849561833], length 35
12:10:37.381952 IP cncc-core-ingress-gateway-7ffc49bb7f-2kkhc.45868 > kube-dns.kube-system.svc.cluster.local.domain: 62870+ PTR? 1.0.96.10.in-addr.arpa. (40)

NET_ADMIN, NET_RAW
-w Write the raw packets to file rather than parsing and printing them out. tcpdump -w capture.pcap -i eth0 NET_ADMIN, NET_RAW
-r Read packets from file, which was created with the -w option. tcpdump -r capture.pcap

reading from file /tmp/capture.pcap, link-type EN10MB (Ethernet)
12:13:07.381019 IP cncc-core-ingress-gateway-7ffc49bb7f-2kkhc.46519 > kubernetes.default.svc.cluster.local.https: Flags [P.], seq 1986927416:1986927451, ack 1334332445, win 626, options [nop,nop,TS val 849741834 ecr 849711834], length 35
12:13:07.381194 IP kubernetes.default.svc.cluster.local.https > cncc-core-ingress-gateway-7ffc49bb7f-2kkhc.46519: Flags [P.], seq 1:32, ack 35, win 247, options [nop,nop,TS val 849741834 ecr 849741834], length 31
12:13:07.381207 IP cncc-core-ingress-gateway-7ffc49bb7f-2kkhc.46519 > kubernetes.default.svc.cluster.local.https: Flags [.], ack 32, win 626, options [nop,nop,TS val 849741834 ecr 849741834], length 0

NET_ADMIN, NET_RAW

Table 2-6 ip

Options Tested Description Output Capabilities
addr show Look at the protocol addresses. ip addr show

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default
    link/ipip 0.0.0.0 brd 0.0.0.0
4: eth0@if190: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1440 qdisc noqueue state UP group default
    link/ether aa:5a:27:8d:74:6f brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.219.112/32 scope global eth0
       valid_lft forever preferred_lft forever

-
route show List routes. ip route show

default via 169.254.1.1 dev eth0

169.254.1.1 dev eth0 scope link

-
addrlabel list List address labels ip addrlabel list

prefix ::1/128 label 0

prefix ::/96 label 3

prefix ::ffff:0.0.0.0/96 label 4

prefix 2001::/32 label 6

prefix 2001:10::/28 label 7

prefix 3ffe::/16 label 12

prefix 2002::/16 label 2

prefix fec0::/10 label 11

prefix fc00::/7 label 5

prefix ::/0 label 1

-

Table 2-7 netstat

Options Tested Description Output Capabilities
-a Show both listening and non-listening sockets. For TCP, this means established connections. netstat -a

Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:tproxy          0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:websm           0.0.0.0:*               LISTEN
tcp        0      0 cncc-core-ingress:websm 10-178-254-194.ku:47292 TIME_WAIT
tcp        0      0 cncc-core-ingress:46519 kubernetes.defaul:https ESTABLISHED
tcp        0      0 cncc-core-ingress:websm 10-178-254-194.ku:47240 TIME_WAIT
tcp        0      0 cncc-core-ingress:websm 10-178-254-194.ku:47347 TIME_WAIT
udp        0      0 localhost:59351         localhost:ambit-lm      ESTABLISHED
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags Type State I-Node Path
unix  2      [ ]   STREAM CONNECTED 576064861

-
-l Show only listening sockets. netstat -l

Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address  Foreign Address State
tcp        0      0 0.0.0.0:tproxy 0.0.0.0:*       LISTEN
tcp        0      0 0.0.0.0:websm  0.0.0.0:*       LISTEN
Active UNIX domain sockets (only servers)
Proto RefCnt Flags Type State I-Node Path

-
-s Display summary statistics for each protocol. netstat -s

Ip:
    4070 total packets received
    0 forwarded
    0 incoming packets discarded
    4070 incoming packets delivered
    4315 requests sent out
Icmp:
    0 ICMP messages received
    0 input ICMP message failed.
    ICMP input histogram:
    2 ICMP messages sent
    0 ICMP messages failed
    ICMP output histogram:
        destination unreachable: 2

-
-i Display a table of all network interfaces. netstat -i

Kernel Interface table
Iface   MTU   RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0    1440  4131  0      0      0      4355  0      0      0      BMRU
lo      65536 0     0      0      0      0     0      0      0      LRU

-

Table 2-8 curl

Options Tested Description Output Capabilities
-o Write output to <file> instead of stdout. curl -o file.txt http://abc.com/file.txt -
-x Use the specified HTTP proxy. curl -x proxy.com:8080 -o http://abc.com/file.txt -
--http2 Use the specified HTTP/2 curl --http2 -v http://cncc-iam-ingress-gateway.cncc.svc.cluster.local:30085/cncc/auth/admin -

Table 2-9 ping

Options Tested Description Output Capabilities
<ip> Run a ping test to see whether the target host is reachable or not. ping 10.178.254.194 NET_ADMIN, NET_RAW
-c Stop after sending 'c' number of ECHO_REQUEST packets. ping -c 5 10.178.254.194 NET_ADMIN, NET_RAW
-f (with non-zero interval) Flood ping. For every ECHO_REQUEST sent, a period '.' is printed, while for every ECHO_REPLY received, a backspace is printed. ping -f -i 2 10.178.254.194 NET_ADMIN, NET_RAW

Table 2-10 nmap

Options Tested Description Output Capabilities
<ip> Scan for Live hosts, Operating systems, packet filters, and open ports running on remote hosts. nmap 10.178.254.194
Starting Nmap 6.40 ( http://nmap.org ) at 2020-09-29 05:54 UTC
Nmap scan report for 10-178-254-194.kubernetes.default.svc.cluster.local (10.178.254.194)
Host is up (0.00046s latency).
Not shown: 995 closed ports
PORT STATE SERVICE
22/tcp open ssh
179/tcp open bgp
6666/tcp open irc
6667/tcp open irc
30000/tcp open unknown
Nmap done: 1 IP address (1 host up) scanned in 0.04 seconds
-
-v Increase verbosity level nmap -v 10.178.254.194
Starting Nmap 6.40 ( http://nmap.org ) at 2020-09-29 05:55 UTC
Initiating Ping Scan at 05:55
Scanning 10.178.254.194 [2 ports]
Completed Ping Scan at 05:55, 0.00s elapsed (1 total hosts)
Initiating Parallel DNS resolution of 1 host. at 05:55
Completed Parallel DNS resolution of 1 host. at 05:55, 0.00s elapsed
Initiating Connect Scan at 05:55
Scanning 10-178-254-194.kubernetes.default.svc.cluster.local (10.178.254.194) [1000 ports]
Discovered open port 22/tcp on 10.178.254.194
Discovered open port 30000/tcp on 10.178.254.194
Discovered open port 6667/tcp on 10.178.254.194
Discovered open port 6666/tcp on 10.178.254.194
Discovered open port 179/tcp on 10.178.254.194
Completed Connect Scan at 05:55, 0.02s elapsed (1000 total ports)
Nmap scan report for 10-178-254-194.kubernetes.default.svc.cluster.local (10.178.254.194)
Host is up (0.00039s latency).
Not shown: 995 closed ports
PORT STATE SERVICE
22/tcp open ssh
179/tcp open bgp
6666/tcp open irc
6667/tcp open irc
30000/tcp open unknown

Read data files from: /usr/bin/../share/nmap
Nmap done: 1 IP address (1 host up) scanned in 0.04 seconds
-
-iL Scan all the listed IP addresses in a file. Sample file nmap -iL sample.txt
Starting Nmap 6.40 ( http://nmap.org ) at 2020-09-29 05:57 UTC
Nmap scan report for localhost (127.0.0.1)
Host is up (0.00036s latency).
Other addresses for localhost (not scanned): 127.0.0.1
Not shown: 998 closed ports
PORT STATE SERVICE
8081/tcp open blackice-icecap
9090/tcp open zeus-admin

Nmap scan report for 10-178-254-194.kubernetes.default.svc.cluster.local (10.178.254.194)
Host is up (0.00040s latency).
Not shown: 995 closed ports
PORT STATE SERVICE
22/tcp open ssh
179/tcp open bgp
6666/tcp open irc
6667/tcp open irc
30000/tcp open unknown

Nmap done: 2 IP addresses (2 hosts up) scanned in 0.06 seconds
-

Table 2-11 dig

Options Tested Description Output Capabilities
<ip> It performs DNS lookups and displays the answers that are returned from the name servers that were queried. dig 10.178.254.194 (Note: The IP should be reachable from inside the container.) -
-x Query DNS Reverse Look-up. dig -x 10.178.254.194 -

2.8 Logs

This section explains the process to retrieve the logs and status information that can be used for effective troubleshooting. SCP provides various sources of information that may be helpful in the troubleshooting process.

Log Levels

Logs register system events along with their date and time of occurrence. They also provide important details about a chain of events that could have led to an error or problem.

Supported Log Levels

For SCP, the log level for a microservice can be set to any of the following valid values:

  • DEBUG: A log level used for events considered to be useful during software debugging when more granular information is required.
  • INFO: The standard log level indicating that something happened, the application entered a certain state, and so on.
  • WARN: Indicates that something unexpected happened in the application: a problem or a situation that might disturb one of the processes. However, it does not mean that the application has failed. The WARN level should be used in situations that are unexpected, but where the code can continue its work.
  • ERROR: The log level that should be used when the application reaches an issue preventing one or more functionalities from properly functioning.

Configuring Log Levels

To view logging configurations and update logging levels, use the Logging Config option on the Cloud Native Configuration Console. For more information, see "Configuring Global Options" in Oracle Communications Cloud Native Core, Service Communication Proxy User Guide.

2.8.1 Collecting Logs

Perform this procedure to collect logs from pods or containers.
  1. Run the following command to get the pods details:
    $ kubectl -n <namespace_name> get pods
  2. Run the following command to collect the logs from the specific pods or containers:
    $ kubectl logs <podname> -n <namespace>
    Example:
    $ kubectl logs ocscp-scp-worker-xxxxx -n ocscp
  3. Run the following command to store the log in a file:
    $ kubectl logs <podname> -n <namespace>  > <filename>
    Example:
    $ kubectl logs ocscp-scp-worker-xxxxx -n ocscp > logs.txt
  4. (Optional) Run the following command for the log stream with file redirection starting with last 100 lines of log:
    $ kubectl logs <podname> -n <namespace> -f --tail <number of lines> > <filename>
    Example:
    $ kubectl logs ocscp-scp-worker-xxxxx -n ocscp -f --tail 100 > logs.txt
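
To collect logs from all pods in a namespace at once, a simple shell loop can be used; the namespace and output directory are examples:

$ for p in $(kubectl get pods -n ocscp -o name); do kubectl logs "$p" -n ocscp > "/tmp/${p#pod/}.log"; done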

2.8.2 Understanding Logs

This section describes the logs to examine when handling different SCP debugging issues.

Sample logs are provided for the following SCP services:

  • SCPC-Subscription
  • SCPC-Notification
  • SCP-Worker

Sample Logs

Sample log statement for SCPC-Subscription:

{"instant":{"epochSecond":1614521111,"nanoOfSecond":908545000},"thread":"main","level":"INFO","loggerName":"com.oracle.cgbu.cne.scp.soothsayer.subscription.processor.SubscriptionDataConsumer","message":"{logMsg=Subscription consumer cycle completed.Thread Will now sleep for given time, cycle=232, threadSleepTimeInMs=100}","endOfBatch":false,"loggerFqcn":"org.apache.logging.log4j.spi.AbstractLogger","threadId":30,"threadPriority":5,"messageTimestamp":"21-06-07 12:38:56.784+0000","application":"ocscp","microservice":"scpc-subscription","engVersion":"1.12.0","mktgVersion":"1.12.0.0.0","vendor":"oracle","namespace":"scpsvc","node":"master","pod":"ocscp-scpc-subscription-7f5b7c8ccd-z89wb","subsystem":"subscription","instanceType":"prod","processId":"1"}

Sample log statement for SCPC-Notification:

{"instant":{"epochSecond":1623069485,"nanoOfSecond":496630558},"thread":"main","level":"INFO","loggerName":"com.oracle.cgbu.cne.scp.soothsayer.Process","message":"{logMsg=Successfully processed notification, nfInstanceId=6faf1bbc-6e4a-2828-a507-a14ef8e1bc5b, nfType=NRF}","endOfBatch":false,"loggerFqcn":"org.apache.logging.log4j.spi.AbstractLogger","threadId":1,"threadPriority":5,"messageTimestamp":"21-06-07 12:38:05.496+0000","application":"ocscp","microservice":"scpc-notification","engVersion":"1.12.0","mktgVersion":"1.12.0.0.0","vendor":"oracle","namespace":"scpsvc","node":"master","pod":"ocscp-scpc-notification-76597b5b7-wfmxb","subsystem":"notification","instanceType":"prod","processId":"1"}

Sample log statement for SCP-Worker

{"instant":\{"epochSecond":1623069702,"nanoOfSecond":672444454},"thread":"scp-upstream-worker-7","level":"WARN","loggerName":"com.oracle.cgbu.cne.scp.router.routelayer.MsgRouteChain","message":"MsgRouteChain::sendRsp(): SCP generated Response, Response Code = 503, body = {\"title\":\"Service Unavailable\",\"status\":\"503\",\"detail\":\"Service Unavailable:: Service Unavailable\"}, error category = Destination webclient Connection Failure, ingress request host = ocscp-scp-worker.scpsvc.svc.cluster.local:8000, ingress request path = /nnrf-nfm/v1/subscriptions, ingress 3gpp-sbi-target-apiRoot = http://nrf1svc.scpsvc.svc.cluster.local:8080, egress request Uri = http://nrf2svc.scpsvc.svc.cluster.local:8080/nnrf-nfm/v1/subscriptions, egress request Destination = nrf2svc.scpsvc.svc.cluster.local","endOfBatch":false,"loggerFqcn":"org.apache.logging.log4j.spi.AbstractLogger","threadId":139,"threadPriority":5,"messageId":"c49b3b48-afc1-4058-83e8-b719ee181ed8","messageTimestamp":"21-06-07 12:41:42.672+0000","application":"ocscp","microservice":"scp-worker","engVersion":"17.17.17","mktgVersion":"1.12.0.0.0","vendor":"oracle","namespace":"scpsvc","node":"master","pod":"ocscp-scp-worker-9567767dc-7bqg9","subsystem":"router","instanceType":"prod","processId":"1"}

The following table describes different log attributes:

Table 2-12 Log Attribute Details

Log Attribute Details Sample Value Data Type
instant Epoch time

Note: It is a group of two values such as epochSecond and nanoOfSecond.

{"epochSecond":1614521244,"nanoOfSecond":775702000} Object
thread Logging Thread Name "pool-4-thread-1" String
level Log Level of the log printed "ERROR" String
loggerName Class or Module which printed the log "com.oracle.cgbu.cne.scp.soothsayer.audit.process.AuditMaster" String
message Message related to the log providing brief details

Note: This WARN log level message indicates that the SCP connection with NRF is not established.

{logMsg=NRF health check did not complete successfully. Next health check will start in given interval, timeIntervalInSec=2} String
endOfBatch Log4j2 Internal false boolean
loggerFqcn Log4j2 Internal org.apache.logging.log4j.spi.AbstractLogger String
threadId Thread Id generated internally by Log4j2 31 Integer
threadPriority Thread Priority set internally by Log4j2 5 Integer
messageTimestamp Timestamp when log was printed "21-06-07 12:41:15.277+0000" String
application Application name "ocscp" String
scpFqdn SCP FQDN ocscp-scp-worker.ocscp-thrust5-06.svc.thrust5 String
microservice Name of the microservice "scpc-audit" String
engVersion Engineering version of the software "1.12.0" String
mktgVersion Marketing version of the software "1.12.0.0.0" String
vendor Vendor of the software "oracle" String
namespace Namespace where application is deployed "scpsvc" String
node Node where the pod resides "master" String
pod Pod name of deployed application "ocscp-scpc-audit-6c5ddb4c54-hf8kr" String
subsystem Subsystem inside microservice group "audit" String
instanceType Instance type "prod" String
processId Process ID internally assigned 1 Integer

2.9 Verifying the Availability of Multus Container Network Interface

Perform the following procedure to verify whether the Multus Container Network Interface (CNI) feature is enabled after SCP installation is complete.

Note:

Ensure that Multus Container Network Interface configuration is completed as described in Oracle Communications Cloud Native Core, Service Communication Proxy Installation, Upgrade, and Fault Recovery Guide.
  1. To verify whether the pod contains signaling network, go to Kubernetes cluster and run the following command:
    kubectl describe pod <pod-name> -n <namespace>
  2. Check whether the pod output contains the "net1" interface.

    Sample pod output:

    k8s.v1.cni.cncf.io/network-status:
                    [{
                        "name": "",
                        "ips": [
                            "192.168.219.111"
                        ],
                        "default": true,
                        "dns": {}
                    },{
                        "name": "scpsvc/macvlan-siga",
                        "interface": "net1",
                        "ips": [
                            "192.68.3.78"
                        ],
                        "mac": "b6:41:f9:a6:c8:8e",
                        "dns": {}
                    }]
                  k8s.v1.cni.cncf.io/networks: [{ "name": "macvlan-siga"}]
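
Alternatively, the network-status annotation can be read directly. The following generic kubectl query checks for the net1 interface without scanning the full pod description:

$ kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.metadata.annotations.k8s\.v1\.cni\.cncf\.io/network-status}' | grep '"interface": "net1"'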

2.10 Upgrade or Rollback Failure

When Service Communication Proxy (SCP) upgrade or rollback fails, perform the following procedure.

  1. Check the pre or post upgrade or rollback hook logs in Kibana as applicable.

    Users can filter upgrade or rollback logs using the following filters:

    • For upgrade: lifeCycleEvent=9001 or 9011
    • For rollback: lifeCycleEvent=9002
    Sample Log:
    {"time_stamp":"2021-11-22 10:28:11.820+0000","thread":"main","level":"INFO","logger":"com.oracle.cgbu.cne.scp.soothsayer.hooks.releases.ReleaseHelmHook_1_14_1","message":"{logMsg=Starting Pre-Upgrade hook Execution, lifeCycleEvent=9001 | Upgrade, sourceRelease=101400, targetRelease=101500}","loc":"com.oracle.cgbu.cne.scp.soothsayer.common.utils.EventSpecificLogger.submit(EventSpecificLogger.java:94)"
  2. Check the pod logs in Kibana to analyze the cause of failure.
  3. After detecting the cause of failure, do the following:
    • For upgrade failure:
      • If the cause of upgrade failure is a database or network connectivity issue, contact your system administrator. When the issue is resolved, rerun the upgrade command (a command sketch follows this procedure).
      • If the cause of failure is not related to database or network connectivity issue and is observed during the preupgrade phase, do not perform rollback because SCP deployment remains in the source or older release.
      • If the upgrade failure occurs during the postupgrade phase, for example, post upgrade hook failure due to target release pod not moving to ready state, then perform a rollback.
    • For rollback failure: If the cause of rollback failure is database or network connectivity issue, contact your system administrator. When the issue is resolved, rerun the rollback command.
  4. If the issue persists, contact My Oracle Support.
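
For reference, rerunning an upgrade or rollback typically takes the following form; the release name, chart reference, values file, and revision number are deployment specific and are described in Oracle Communications Cloud Native Core, Service Communication Proxy Installation, Upgrade, and Fault Recovery Guide:

$ helm upgrade <release-name> <ocscp-helm-chart> -f <ocscp_custom_values_file> -n <namespace>
$ helm rollback <release-name> <revision> -n <namespace>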

2.11 Error Messages for Mediation Rule Configuration

This section helps you troubleshoot and resolve problems in Mediation Rules that use Drools Rule Language (DRL).

The following types of errors are handled:
  • Custom Errors
  • Errors specific to the Drools library
The custom errors generated by the application are as follows:
  • Value not valid for type: This error is shown in cases where a field in the request body contains a value that is not expected.
  • Fields are required and missing for rules with state: This error is shown in cases where a required field is missing in the request.
  • Rule name already exists in the database: When cloning a rule and the new name exists in the database, this error appears.
  • Rule cannot be deleted on status: An APPLIED rule cannot be deleted. In the case of a delete request with APPLIED status, this exception will appear.
  • Request body ruleName does not match with param ruleName: If the value of the rule name in the URL does not match the value in the "name" field in the request body, this exception will appear.
  • Rule was not found: This error is shown when a get or delete request has a rule name that doesn’t exist in the database.
  • The State field is not valid at the creation of the rule; it must be SAVE state: The rule is not created if the value of state in the request body is different from the SAVE state.
  • State transition is not valid for status: For more information, refer "Mediation Rule API Migration" section in the Oracle Communications Cloud Native Core, Service Communication Proxy User Guide.
  • The APPLIED Status is not valid at the creation of the rule; it must be in the DRAFT status.

Note:

Once a rule is generated, the ruleName, Format, and Status fields cannot be changed. Some fields can be modified only in specific Status values. For more information, refer to the "Mediation Rule API Migration" section in the Oracle Communications Cloud Native Core, Service Communication Proxy User Guide.
For DRL errors, Drools provides standardized messages in the following format:
  • 1st Block: Indicates the error code.
  • 2nd Block: Indicates the DRL source line and column where the problem occurred.
  • 3rd Block: Indicates the description of the problem.
  • 4th Block: Indicates the Components in the DRL source where the issue occurred, such as rules, functions, and queries.
  • 5th Block: Indicates the DRL source pattern where the issue occurred.

Figure 2-4 Error Message Format

The following standardized error messages are supported by Drools:
  • 101: no viable alternative

    Indicates that the parser reached a decision point but was unable to find an alternative.

    Example rule with incorrect spelling
    //incorrect spelling
    //incorrect syntax
    package com.oracle.cgbu.ocmediation.nfmediation;
    import com.oracle.cgbu.ocmediation.factdetails.Request;
    
    dialect "mvel"
    
    rule "ruleTest"
    when
    Â req : Request(headers.has("header") == true) //special character
    then
     req.headers.add("newHeader","132465")
    end

    Note:

    The parser reported an invalid token due to a special character.
  • 102: mismatched input

    Indicates that the parser expects a specific input that is not present at the current input point.

    Example rule with an incomplete rule statement
    //incomplete rule statement
    // incorrect syntax
    package com.oracle.cgbu.ocmediation.nfmediation;
    import com.oracle.cgbu.ocmediation.factdetails.Request;
    
    dialect "mvel"
    
    //Must have a rule name
    
    when
     req : Request(headers.has("header1") == true)
    then
     req.headers.add("TEST","132465")
    end

    Note:

    The parser encountered an incomplete construct due to a missing rule name.
  • 103: failed predicate

    Indicates that a validating semantic predicate is evaluated as false. These semantic predicates are commonly used in DRL files to identify component terms such as declare, rule, exists, not, and others.

    Example rule with an invalid keyword
    package com.oracle.cgbu.ocmediation.nfmediation;
    import com.oracle.cgbu.ocmediation.factdetails.Request;
    
    dialect "mvel"
    
    fdsfdsfds
    
    rule "ruleTest"
    when
     req : Request(headers.has("header") == true)
    then
     req.headers.add("newHeader","132465")
    end

    Note:

    This text line is not a DRL keyword construct, hence the parser cannot validate the rest of the DRL file.
  • 105: did not match anything

    Indicates the parser reached a grammar sub-rule that must match an alternative at least once but did not match anything.

    Example rule with invalid text in an empty condition
    package com.oracle.cgbu.ocmediation.nfmediation;
    import com.oracle.cgbu.ocmediation.factdetails.Request;
    
    dialect "mvel"
    
    rule "ruleTest"
    when
     None // Must remove `None` if condition is empty
    then
     req.headers.add("TEST","132465")
    end

    Note:

    The condition should be empty, but the word None is used. This error is resolved by removing None, which is not a valid DRL keyword.

For more information on the list of errors specific to the Drools library, see the Drools User Guide.

2.12 Errors Observed on Grafana and OCI Dashboards

This section provides information to troubleshoot errors observed on Grafana and OCI dashboards. These errors occur when the number of records fetched by a query (an expression that uses application metrics and dimensions) exceeds the configured limit, which causes charts or widgets to become invisible on the dashboards.

The following sample error messages are displayed on the:

  • Grafana dashboard: execution: execution: query processing would load too many samples into memory in query execution.
  • OCI dashboard: Query cannot result in more than 2000 streams

To resolve this issue, perform the following tasks to view the data on these dashboards:

  1. On the Grafana or OCI dashboard, minimize the query interval to check if the data appears.
    1. To debug an issue, check the error logs and select the interval on the dashboard based on the timestamp of the observed error logs to minimize the search results.

    This task fetches only records within that query interval and reduces the number of records on the dashboard.

  2. You can add more filters using the metric dimensions (as per the traffic being run) to the query to minimize the search results. For more information about metric dimensions, see Oracle Communications Cloud Native Core, Service Communication Proxy User Guide. To query a particular metric, these dimensions act as filter keys as defined in Table 2-13 and Table 2-14.
  3. Ensure that the metric used in the query is pegged by the SCP application.

The following tables describe examples of queries on Grafana and OCI dashboards:

Table 2-13 Examples of Query on Grafana Dashboard

Metric Used Default Query on Dashboard Query after Applying Additional Filters
ocscp_metric_http_rx_req_total sum(irate(ocscp_metric_http_rx_req_total{namespace="scpsvc"}[2m])) by (pod) sum(irate(ocscp_metric_http_rx_req_total{namespace="scpsvc",ocscp_nf_service_type="nausf-auth"}[2m])) by (pod)
ocscp_metric_http_tx_req_total sum(irate(ocscp_metric_http_tx_req_total{namespace="ocscp"}[5m])) by (ocscp_nf_type,ocscp_producer_service_instance_id) sum(irate(ocscp_metric_http_tx_req_total{namespace="ocscp",ocscp_nf_service_type="nausf-auth"}[5m])) by (ocscp_nf_type,ocscp_producer_service_instance_id)

Table 2-14 Examples of Query on OCI Dashboard

Metric Used Default Query on Dashboard Query after Applying Additional Filters
ocscp_metric_http_rx_req_total ocscp_metric_http_rx_req_total[10m]{k8Namespace="ocscp-2"}.rate().groupBy(podname).sum()
  • ocscp_metric_http_rx_req_total[10m]{k8Namespace="ocscp-2",ocscp_nf_service_type="nausf-auth"}.rate().groupBy(podname).sum()  
  • ocscp_metric_http_rx_req_total[10m]{k8Namespace="ocscp-2",ocscp_nf_service_type="nausf-auth", ocscp_consumer_info="smf#smf8svc.ocscp-2.svc.cluster.local#NA "}.rate().groupBy(podname).sum()
ocscp_metric_http_tx_res_total ocscp_metric_http_tx_res_total[10m]{k8Namespace="ocscp-2"}.rate().groupBy(ocscp_consumer_info).sum()
  • ocscp_metric_http_tx_res_total[10m]{k8Namespace="ocscp-2", ocscp_nf_service_type="nausf-auth"}.rate().groupBy(ocscp_consumer_info).sum()
  • ocscp_metric_http_tx_res_total[10m]{k8Namespace="ocscp-2",ocscp_nf_service_type="nausf-auth", ocscp_consumer_info="smf#smf8svc.ocscp-2.svc.cluster.local#NA "}.rate().groupBy(ocscp_consumer_info).sum()

2.13 TLS Connection Failure

This section describes TLS-related issues and their resolution steps. It is recommended to attempt the resolution steps provided in this guide before contacting Oracle Support.

Problem: Ingress Connection Failure Due to TLS 1.3 Version Mismatch Between Consumer NF and SCP Server

Scenario: The Consumer NF has requested a connection using TLS 1.3, but the SCP server is currently configured to support only TLS 1.2. This version mismatch prevents successful communication between the client and server.

Server Error Message
The client supported protocol versions [TLSv1.3] are not accepted by server preferences [TLSv1.2]
Client Error Message
Received fatal alert: protocol_version
Resolution Steps:
  • Update the ocscp-custom-values.yaml file on the SCP server to set the sbiProxySslConfigurations.server.tlsVersion parameters to include TLS 1.3 and TLS 1.2. This configuration applies to fresh installations. For fixing issues or changing the version, you can use the REST API for configuration.
  • Verify TLS 1.3 for secure communication between the consumer NF and SCP to ensure that the issue has been resolved.
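
One way to verify the negotiated TLS version from a test client is openssl s_client; this assumes OpenSSL 1.1.1 or later is available, and the host and port are placeholders:

$ openssl s_client -connect <scp-host>:<tls-port> -tls1_3 </dev/null 2>/dev/null | grep -i 'protocol'

A successful handshake reports TLSv1.3 in the Protocol line of the output.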

Problem: Egress Connection Failure Due to Cipher Mismatch Between the SCP Client and the Producer Server for TLS v1.3

Scenario: The SCP client is configured to request a connection using TLSv1.3 with specific ciphers that are not supported by the Producer server. As a result, the connection fails due to the cipher mismatch, preventing secure communication between the client and server.

Client Error Message
No appropriate protocol (protocol is disabled or cipher suites are inappropriate)
Server Error Message
Received fatal alert: handshake failure
Resolution Steps:
  • Ensure that the following cipher suites are configured for the SCP client to use with TLS 1.3:
    • TLS_AES_128_GCM_SHA256
    • TLS_AES_256_GCM_SHA384
    • TLS_CHACHA20_POLY1305_SHA256
    Ensure that both the client and server have at least one common TLS 1.3 cipher configured.
  • Verify TLS 1.3 for secure communication between the SCP client and the Producer server to ensure that the issue has been resolved.

Problem: Egress Connection Failure for TLS v1.3 Due to Expired Certificates.

Scenario: The SCP client is attempting to establish an egress connection using TLSv1.3, but the connection fails due to expired certificates. Specifically, the SCP client is presenting TLSv1.3 certificates that have passed their validity period, which causes the Producer server to reject the connection.

Client Error Message
Service Unavailable for producer due to Certificate Expired
Server Error Message
Received fatal alert: handshake failure
Resolution Steps:
  • Verify the validity of the current certificate (client_ocscp.cer).
  • If the certificate has expired, renew it or extend its validity.
  • Attempt to establish a connection between the SCP client and the Producer server to confirm that the issue has been resolved.
  • Verify the TLSv1.3 for secure communication.
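
The certificate validity can be checked with openssl before and after renewal; the certificate file name follows the example used above:

$ openssl x509 -in client_ocscp.cer -noout -subject -enddate

If the notAfter date is in the past, the certificate has expired and must be renewed or replaced.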

Problem: Pods are not available after populating the clientDisabledExtension or serverDisabledExtension parameter.

Resolution Steps:
  • Check the value of the clientDisabledExtension or serverDisabledExtension parameters. The following values cannot be added to these parameters:
    • supported_versions
    • key_share
    • supported_groups
    • signature_algorithms
    • certificate_authorities

    If any of the above values are present, remove them or revert to the default configuration for the pod to start properly.

    Note:

    The certificate_authorities value can be added to the clientDisabledExtension parameter but cannot be added to the serverDisabledExtension parameter.

Problem: Pods are not available after populating the clientAllowedSignatureSchemes parameter.

Solution:
  • Check the value of the clientAllowedSignatureSchemes parameter.
  • The following values should be present for this parameter:
    • ecdsa_secp521r1_sha512
    • ecdsa_secp384r1_sha384
    • ecdsa_secp256r1_sha256
    • ed448
    • ed25519
    • rsa_pss_rsae_sha512
    • rsa_pss_rsae_sha384
    • rsa_pss_rsae_sha256
    • rsa_pss_pss_sha512
    • rsa_pss_pss_sha384
    • rsa_pss_pss_sha256
    • rsa_pkcs1_sha512
    • rsa_pkcs1_sha384
    • rsa_pkcs1_sha256
    If any of the above values are not present, add them or revert to the default configuration to ensure the pod starts properly.