3 Using Debug Tool
The Debug Tool provides third-party troubleshooting tools for debugging the runtime issues in a lab environment. The following tools are available to debug NEF issues:
- tcpdump
- ip
- netstat
- curl
- ping
- nmap
- dig
Prerequisites
Note:
- For CNE 23.2.0 and later versions, follow Step a of Configuration in CNE.
- For CNE versions prior to 23.2.0, follow Step b of Configuration in CNE.
- Configuration in CNE
  Perform the following configurations on the Bastion Host. You need admin privileges to perform these configurations.
  a. When NEF is installed on CNE version 23.2.0 or later
  Note:
  - In CNE version 23.2.0 or later, the default CNE Kyverno policy, disallow-capabilities, does not allow the NET_ADMIN and NET_RAW capabilities that the Debug Tool requires.
  - To run the Debug Tool on CNE 23.2.0 or later, modify the existing Kyverno policy, disallow-capabilities, as described below.
Adding a Namespace to an Empty Resource
- Run the following command to verify whether the current disallow-capabilities cluster policy has any namespaces in it.
  Example:
      $ kubectl get clusterpolicies disallow-capabilities -oyaml
  Sample output:
      apiVersion: kyverno.io/v1
      kind: ClusterPolicy
      ...
      ...
      spec:
        rules:
        - exclude:
            any:
            - resources: {}
- If there are no namespaces, patch the policy using the following command to add <namespace> under resources:
  Example:
      $ kubectl patch clusterpolicy disallow-capabilities --type=json \
        -p='[{"op": "add", "path": "/spec/rules/0/exclude/any/0/resources", "value": {"namespaces":["<namespace>"]} }]'
  Sample output:
      $ kubectl patch clusterpolicy disallow-capabilities --type=json \
        -p='[{"op": "add", "path": "/spec/rules/0/exclude/any/0/resources", "value": {"namespaces":["ocnef"]} }]'
      apiVersion: kyverno.io/v1
      kind: ClusterPolicy
      ...
      ...
      spec:
        rules:
        - exclude:
            any:
            - resources:
                namespaces:
                - ocnef
- To remove the namespace added in the previous step, use the following command:
  Example:
      $ kubectl patch clusterpolicy disallow-capabilities --type=json \
        -p='[{"op": "replace", "path": "/spec/rules/0/exclude/any/0/resources", "value": {} }]'
  Sample output:
      apiVersion: kyverno.io/v1
      kind: ClusterPolicy
      ...
      ...
      spec:
        rules:
        - exclude:
            any:
            - resources: {}
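The empty-resource case above and the existing-list case described next differ only in the JSON Patch body. As a minimal sketch, the helper below prints the appropriate patch for either case. The function name build_add_patch and the convention of passing the current namespaces as a space-separated string are assumptions made for illustration; they are not part of the product.

```shell
# Sketch only: build_add_patch is a hypothetical helper, not a product command.
RESOURCES_PATH="/spec/rules/0/exclude/any/0/resources"

# $1 = namespace to add
# $2 = namespaces already in the policy, space-separated ("" when resources is {})
build_add_patch() {
  local ns="$1" current="$2"
  if [ -z "$current" ]; then
    # Empty resource: create the namespaces list with one entry.
    printf '[{"op": "add", "path": "%s", "value": {"namespaces":["%s"]} }]\n' \
      "$RESOURCES_PATH" "$ns"
  else
    # Existing list: append using the JSON Patch "-" (end of array) index.
    printf '[{"op": "add", "path": "%s/namespaces/-", "value": "%s" }]\n' \
      "$RESOURCES_PATH" "$ns"
  fi
}

build_add_patch ocnef ""
build_add_patch ocnef "namespace1 namespace2 namespace3"

# Possible usage against a cluster (not executed here):
#   current=$(kubectl get clusterpolicy disallow-capabilities \
#     -o jsonpath='{.spec.rules[0].exclude.any[0].resources.namespaces[*]}')
#   kubectl patch clusterpolicy disallow-capabilities --type=json \
#     -p="$(build_add_patch ocnef "$current")"
```

Generating the patch from the current state avoids applying the append form to a policy whose resources field is still empty, where the namespaces path does not yet exist.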
Adding a Namespace to an Existing Namespace List
- Run the following command to verify whether the current disallow-capabilities cluster policy has namespaces in it.
  Example:
      $ kubectl get clusterpolicies disallow-capabilities -oyaml
  Sample output:
      apiVersion: kyverno.io/v1
      kind: ClusterPolicy
      ...
      ...
      spec:
        rules:
        - exclude:
            any:
            - resources:
                namespaces:
                - namespace1
                - namespace2
                - namespace3
- If namespaces are already present, patch the policy using the following command to add <namespace> to the existing list:
  Example:
      $ kubectl patch clusterpolicy disallow-capabilities --type=json \
        -p='[{"op": "add", "path": "/spec/rules/0/exclude/any/0/resources/namespaces/-", "value": "<namespace>" }]'
  Sample output:
      $ kubectl patch clusterpolicy disallow-capabilities --type=json \
        -p='[{"op": "add", "path": "/spec/rules/0/exclude/any/0/resources/namespaces/-", "value": "ocnef" }]'
      apiVersion: kyverno.io/v1
      kind: ClusterPolicy
      ...
      ...
      spec:
        rules:
        - exclude:
            any:
            - resources:
                namespaces:
                - namespace1
                - namespace2
                - namespace3
                - ocnef
- To remove the namespace added in the previous step, use the following command:
  Example:
      $ kubectl patch clusterpolicy disallow-capabilities --type=json \
        -p='[{"op": "remove", "path": "/spec/rules/0/exclude/any/0/resources/namespaces/<index>"}]'
  Sample output:
      $ kubectl patch clusterpolicy disallow-capabilities --type=json \
        -p='[{"op": "remove", "path": "/spec/rules/0/exclude/any/0/resources/namespaces/3"}]'
      apiVersion: kyverno.io/v1
      kind: ClusterPolicy
      ...
      ...
      spec:
        rules:
        - exclude:
            any:
            - resources:
                namespaces:
                - namespace1
                - namespace2
                - namespace3
Note:
While removing the namespace, provide the index value for namespace within the array. The index starts from '0'.
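Because the remove operation addresses the namespace by its position, it can help to compute the index rather than count entries by hand. The following is a minimal sketch under the assumption that the namespaces are passed in the same order in which they appear in the policy. The function names namespace_index and build_remove_patch are hypothetical.

```shell
# Sketch only: hypothetical helpers for locating a namespace in the exclude list.

# $1 = namespace to look up; remaining arguments = namespaces in policy order.
# Prints the zero-based index, or returns non-zero if the namespace is absent.
namespace_index() {
  local target="$1" i=0 ns
  shift
  for ns in "$@"; do
    if [ "$ns" = "$target" ]; then
      echo "$i"
      return 0
    fi
    i=$((i + 1))
  done
  return 1
}

# $1 = zero-based index of the entry to remove
build_remove_patch() {
  printf '[{"op": "remove", "path": "/spec/rules/0/exclude/any/0/resources/namespaces/%s"}]\n' "$1"
}

# With the list from the sample output above, ocnef is the fourth entry (index 3).
idx=$(namespace_index ocnef namespace1 namespace2 namespace3 ocnef)
build_remove_patch "$idx"
```

The printed patch matches the sample remove command shown earlier, and it can be passed to kubectl patch with the -p option in the same way.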
  b. When NEF is installed on a CNE version prior to 23.2.0
PodSecurityPolicy (PSP) Creation
Create a PSP by running the following command from the Bastion Host. The parameters readOnlyRootFilesystem, allowPrivilegeEscalation, and allowedCapabilities are required by the debug container.
Note:
Other parameters are mandatory for PSP creation and can be customized as per the CNE environment. The default values are recommended.
    $ kubectl apply -f - <<EOF
    apiVersion: policy/v1beta1
    kind: PodSecurityPolicy
    metadata:
      name: debug-tool-psp
    spec:
      readOnlyRootFilesystem: false
      allowPrivilegeEscalation: true
      allowedCapabilities:
      - NET_ADMIN
      - NET_RAW
      fsGroup:
        ranges:
        - max: 65535
          min: 1
        rule: MustRunAs
      runAsUser:
        rule: MustRunAsNonRoot
      seLinux:
        rule: RunAsAny
      supplementalGroups:
        rule: RunAsAny
      volumes:
      - configMap
      - downwardAPI
      - emptyDir
      - persistentVolumeClaim
      - projected
      - secret
    EOF
Role Creation
Run the following command to create a role for the PSP. Create the role in the NEF namespace (ocnef in this example), because a RoleBinding can reference only a Role in its own namespace:
    $ kubectl apply -f - <<EOF
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: debug-tool-role
      namespace: ocnef
    rules:
    - apiGroups:
      - policy
      resources:
      - podsecuritypolicies
      verbs:
      - use
      resourceNames:
      - debug-tool-psp
    EOF
RoleBinding Creation
Run the following command to associate the service account for the NEF namespace with the role created for the PSP:
    $ kubectl apply -f - <<EOF
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: debug-tool-rolebinding
      namespace: ocnef
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: debug-tool-role
    subjects:
    - kind: Group
      apiGroup: rbac.authorization.k8s.io
      name: system:serviceaccounts
    EOF
- Configuration in NF specific Helm
  The following updates must be performed in the custom-values.yaml file.
  - Log in to the NEF server.
  - Open the custom-values file:
      $ vim <custom-values file>
  - Under global configuration, add the following:
      # Allowed Values: DISABLED, ENABLED
      extraContainers: ENABLED
      debugToolContainerMemoryLimit: 4Gi
      extraContainersVolumesTpl: |
        - name: debug-tools-dir
          emptyDir:
            medium: Memory
            sizeLimit: {{ .Values.global.debugToolContainerMemoryLimit | quote }}
      extraContainersTpl: |
        - command:
            - /bin/sleep
            - infinity
          image: {{ .Values.global.dockerRegistry }}/ocdebug-tools:23.3.1
          imagePullPolicy: Always
          name: {{ printf "%s-tools-%s" (include "getprefix" .) (include "getsuffix" .) | trunc 63 | trimPrefix "-" | trimSuffix "-" }}
          resources:
            requests:
              ephemeral-storage: "512Mi"
              cpu: "0.5"
              memory: {{ .Values.global.debugToolContainerMemoryLimit | quote }}
            limits:
              ephemeral-storage: "512Mi"
              cpu: "0.5"
              memory: {{ .Values.global.debugToolContainerMemoryLimit | quote }}
          securityContext:
            allowPrivilegeEscalation: true
            capabilities:
              drop:
              - ALL
              add:
              - NET_RAW
              - NET_ADMIN
            runAsUser: 7000
          volumeMounts:
          - mountPath: /tmp/tools
            name: debug-tools-dir
  Note:
  - The Debug Tool container comes up with the default user ID 7000. To override this default value, set the `runAsUser` field; otherwise, the field can be omitted.
    Default value: uid=7000(debugtool) gid=7000(debugtool) groups=7000(debugtool)
  - To customize the container name, replace the `name` field in the above values.yaml with the following:
      name: {{ printf "%s-tools-%s" (include "getprefix" .) (include "getsuffix" .) | trunc 63 | trimPrefix "-" | trimSuffix "-" }}
    This ensures that the container name is prefixed and suffixed with the necessary values.
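To illustrate how the `name` template resolves, the sketch below mimics the same pipeline (printf, trunc 63, trimPrefix "-", trimSuffix "-") in plain shell. The function name debug_container_name and the sample prefix and suffix values are hypothetical; the real values come from the chart's getprefix and getsuffix helpers, whose output is not shown in this document.

```shell
# Sketch only: emulates the Helm naming pipeline for illustration.
# $1 = prefix (assumed output of "getprefix"), $2 = suffix (assumed output of "getsuffix")
debug_container_name() {
  local name
  name=$(printf '%s-tools-%s' "$1" "$2")   # printf "%s-tools-%s"
  name=$(printf '%.63s' "$name")           # trunc 63
  name=${name#-}                           # trimPrefix "-"
  name=${name%-}                           # trimSuffix "-"
  echo "$name"
}

debug_container_name "" ""          # empty prefix and suffix
debug_container_name "ocnef" ""     # prefix only
debug_container_name "ocnef" "dbg"  # prefix and suffix
```

With an empty prefix and suffix the result is simply tools, which is consistent with the container name used in the kubectl exec example in this chapter.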
  - Under the service specific configuration for each service that requires debugging, add the following:
      # Allowed Values: DISABLED, ENABLED, USE_GLOBAL_VALUE
      extraContainers: USE_GLOBAL_VALUE
  Note:
  - At the global level, the extraContainers flag can be used to enable or disable injecting extra containers globally. This ensures that all the services that use this global value have extra containers enabled or disabled using a single flag.
  - At the service level, the extraContainers flag determines whether to use the extra container configuration from the global level or to enable or disable injecting extra containers for the specific service.
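The interaction between the global and service-level flags described in the note above can be summarized with a small sketch. It only illustrates the precedence and is not the chart's actual logic. The function name effective_extra_containers is hypothetical.

```shell
# Sketch only: illustrates how the two extraContainers flags combine.
# $1 = global value (ENABLED | DISABLED)
# $2 = service-level value (ENABLED | DISABLED | USE_GLOBAL_VALUE)
effective_extra_containers() {
  local global="$1" service="$2"
  case "$service" in
    USE_GLOBAL_VALUE) echo "$global" ;;    # inherit the global setting
    ENABLED|DISABLED) echo "$service" ;;   # service-level override
    *) echo "invalid service value: $service" >&2; return 1 ;;
  esac
}

effective_extra_containers ENABLED USE_GLOBAL_VALUE
effective_extra_containers ENABLED DISABLED
effective_extra_containers DISABLED ENABLED
```

In other words, a service set to USE_GLOBAL_VALUE follows the global flag, while an explicit ENABLED or DISABLED at the service level applies to that service only.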
Run the Debug Tool
- Run the following command to retrieve the pod details:
      $ kubectl get pods -n <k8s namespace>
  Example:
      $ kubectl get pods -n ocnef
- Run the following command to enter the Debug Tool container:
      $ kubectl exec -it <pod name> -c <debug_container name> -n <namespace> -- bash
  Example:
      $ kubectl exec -it ocnef-nfaccesstoken-49fb96494c-k8w9q -c tools -n ocnef -- bash
- Run the debug tools:
      bash-4.2$ <debug_tools>
  Example:
      bash-4.2$ tcpdump
- Copy the output files from the container to the host:
      $ kubectl cp -c <debug_container name> <pod name>:<file location in container> -n <namespace> <destination location>
  Example:
      $ kubectl cp -c tools ocnef-nfaccesstoken-49fb96494c-k8w9q:/tmp/tools/capture.pcap -n ocnef /tmp/tools/
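When the container name is customized with a prefix or suffix, the value to pass to the -c option in the steps above is no longer plain tools. The sketch below picks the debug container from a pod's container names, assuming the name still contains the string tools as produced by the `name` template. The function name pick_debug_container is hypothetical.

```shell
# Sketch only: hypothetical helper to select the debug container name.
# Arguments = container names of one pod.
# Prints the first name containing "tools"; returns non-zero if none matches.
pick_debug_container() {
  local name
  for name in "$@"; do
    case "$name" in
      *tools*) echo "$name"; return 0 ;;
    esac
  done
  return 1
}

pick_debug_container nfaccesstoken tools
pick_debug_container nfaccesstoken ocnef-tools-dbg

# Possible usage against a cluster (not executed here; $names is left
# unquoted on purpose so that it splits into one argument per container):
#   names=$(kubectl get pod "$POD" -n ocnef -o jsonpath='{.spec.containers[*].name}')
#   kubectl exec -it "$POD" -c "$(pick_debug_container $names)" -n ocnef -- bash
```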
Tools Tested in Debug Container
Following is the list of debug tools that are tested.
Table 3-1 tcpdump
Options Tested | Description | Output | Capabilities |
---|---|---|---|
-D | Print the list of the network interfaces available on the system and on which tcpdump can capture packets. | tcpdump -D | NET_ADMIN, NET_RAW |
-i | Listen on interface | tcpdump -i eth0 tcpdump: verbose output suppressed, use -v or -vv for full protocol decode listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes 12:10:37.381199 IP cncc-core-ingress-gateway-7ffc49bb7f-2kkhc.46519 > kubernetes.default.svc.cluster.local.https: Flags [P.], seq 1986927241:1986927276, ack 1334332290, win 626, options [nop,nop,TS val 849591834 ecr 849561833], length 35 12:10:37.381952 IP cncc-core-ingress-gateway-7ffc49bb7f-2kkhc.45868 > kube-dns.kube-system.svc.cluster.local.domain: 62870+ PTR? 1.0.96.10.in-addr.arpa. (40) | NET_ADMIN, NET_RAW |
-w | Write the raw packets to file rather than parsing and printing them out. | tcpdump -w capture.pcap -i eth0 | NET_ADMIN, NET_RAW |
-r | Read packets from file (which was created with the -w option). | tcpdump -r capture.pcap reading from file /tmp/capture.pcap, link-type EN10MB (Ethernet) 12:13:07.381019 IP cncc-core-ingress-gateway-7ffc49bb7f-2kkhc.46519 > kubernetes.default.svc.cluster.local.https: Flags [P.], seq 1986927416:1986927451, ack 1334332445, win 626, options [nop,nop,TS val 849741834 ecr 849711834], length 35 12:13:07.381194 IP kubernetes.default.svc.cluster.local.https > cncc-core-ingress-gateway-7ffc49bb7f-2kkhc.46519: Flags [P.], seq 1:32, ack 35, win 247, options [nop,nop,TS val 849741834 ecr 849741834], length 31 12:13:07.381207 IP cncc-core-ingress-gateway-7ffc49bb7f-2kkhc.46519 > kubernetes.default.svc.cluster.local.https: Flags [.], ack 32, win 626, options [nop,nop,TS val 849741834 ecr 849741834], length 0 | NET_ADMIN, NET_RAW |
Table 3-2 ip
Options Tested | Description | Output | Capabilities |
---|---|---|---|
addr show | Look at protocol addresses. | ip addr show 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever 2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default link/ipip 0.0.0.0 brd 0.0.0.0 4: eth0@if190: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1440 qdisc noqueue state UP group default link/ether aa:5a:27:8d:74:6f brd ff:ff:ff:ff:ff:ff link-netnsid 0 inet 192.168.219.112/32 scope global eth0 valid_lft forever preferred_lft forever | -- |
route show | List routes. | ip route show default via 169.254.1.1 dev eth0 169.254.1.1 dev eth0 scope link | -- |
addrlabel list | List address labels. | ip addrlabel list prefix ::1/128 label 0 prefix ::/96 label 3 prefix ::ffff:0.0.0.0/96 label 4 prefix 2001::/32 label 6 prefix 2001:10::/28 label 7 prefix 3ffe::/16 label 12 prefix 2002::/16 label 2 prefix fec0::/10 label 11 prefix fc00::/7 label 5 prefix ::/0 label 1 | -- |
Table 3-3 netstat
Options Tested | Description | Output | Capabilities |
---|---|---|---|
-a | Show both listening and non-listening (for TCP, this means established connections) sockets. | netstat -a Active Internet connections (servers and established) Proto Recv-Q Send-Q Local Address Foreign Address State tcp 0 0 0.0.0.0:tproxy 0.0.0.0:* LISTEN tcp 0 0 0.0.0.0:websm 0.0.0.0:* LISTEN tcp 0 0 cncc-core-ingress:websm 10-178-254-194.ku:47292 TIME_WAIT tcp 0 0 cncc-core-ingress:46519 kubernetes.defaul:https ESTABLISHED tcp 0 0 cncc-core-ingress:websm 10-178-254-194.ku:47240 TIME_WAIT tcp 0 0 cncc-core-ingress:websm 10-178-254-194.ku:47347 TIME_WAIT udp 0 0 localhost:59351 localhost:ambit-lm ESTABLISHED Active UNIX domain sockets (servers and established) Proto RefCnt Flags Type State I-Node Path unix 2 [ ] STREAM CONNECTED 576064861 | -- |
-l | Show only listening sockets. | netstat -l Active Internet connections (only servers) Proto Recv-Q Send-Q Local Address Foreign Address State tcp 0 0 0.0.0.0:tproxy 0.0.0.0:* LISTEN tcp 0 0 0.0.0.0:websm 0.0.0.0:* LISTEN Active UNIX domain sockets (only servers) Proto RefCnt Flags Type State I-Node Path | -- |
-s | Display summary statistics for each protocol. | netstat -s Ip: 4070 total packets received 0 forwarded 0 incoming packets discarded 4070 incoming packets delivered 4315 requests sent out Icmp: 0 ICMP messages received 0 input ICMP message failed. ICMP input histogram: 2 ICMP messages sent 0 ICMP messages failed ICMP output histogram: destination unreachable: 2 | -- |
-i | Display a table of all network interfaces. | netstat -i Kernel Interface table Iface MTU RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg eth0 1440 4131 0 0 0 4355 0 0 0 BMRU lo 65536 0 0 0 0 0 0 0 0 LRU | -- |
Table 3-4 curl
Options Tested | Description | Output | Capabilities |
---|---|---|---|
-o | Write output to <file> instead of stdout. | curl -o file.txt http://abc.com/file.txt | -- |
-x | Use the specified HTTP proxy. | curl -x proxy.com:8080 -o file.txt http://abc.com/file.txt | -- |
Table 3-5 ping
Options Tested | Description | Output | Capabilities |
---|---|---|---|
<ip> | Run a ping test to see whether the target host is reachable or not. | ping 10.178.254.194 | NET_ADMIN, NET_RAW |
-c | Stop after sending 'c' number of ECHO_REQUEST packets. | ping -c 5 10.178.254.194 | NET_ADMIN, NET_RAW |
-f (with non zero interval) | Flood ping. For every ECHO_REQUEST sent, a period "." is printed, while for every ECHO_REPLY received a backspace is printed. | ping -f -i 2 10.178.254.194 | NET_ADMIN, NET_RAW |
Table 3-6 nmap
Options Tested | Description | Output | Capabilities |
---|---|---|---|
<ip> | Scan for live hosts, operating systems, packet filters, and open ports running on remote hosts. | nmap 10.178.254.194 | -- |
-v | Increase verbosity level. | nmap -v 10.178.254.194 | -- |
-iL | Scan all the IP addresses listed in a file (for example, sample.txt). | nmap -iL sample.txt | -- |
Table 3-7 dig
Options Tested | Description | Output | Capabilities |
---|---|---|---|
<ip> | Performs DNS lookups and displays the answers that are returned from the name servers that were queried. | dig 10.178.254.194 Note: The IP should be reachable from inside the container. | -- |
-x | Query DNS reverse lookup. | dig -x 10.178.254.194 | -- |
3.1 Debug Tool Configuration Parameters
Following are the parameters used to configure debug tool.
CNE Parameters
Table 3-8 CNE Parameters
Parameter | Description |
---|---|
apiVersion | APIVersion defines the versioned schema of this representation of an object. |
kind | Kind is a string value representing the REST resource this object represents. |
metadata | Standard object's metadata. |
metadata.name | Name must be unique within a namespace. |
spec | spec defines the policy enforced. |
spec.readOnlyRootFilesystem | Controls whether the containers run with a read-only root filesystem (that is, no writable layer). |
spec.allowPrivilegeEscalation | Gates whether or not a user is allowed to set the security context of a container to allowPrivilegeEscalation=true. |
spec.allowedCapabilities | Provides a list of capabilities that are allowed to be added to a container. |
spec.fsGroup | Controls the supplemental group applied to some volumes. RunAsAny allows any fsGroup ID to be specified. |
spec.runAsUser | Controls which user ID the containers are run with. RunAsAny allows any runAsUser to be specified. |
spec.seLinux | RunAsAny allows any seLinuxOptions to be specified. |
spec.supplementalGroups | Controls which group IDs containers add. RunAsAny allows any supplementalGroups to be specified. |
spec.volumes | Provides a list of allowed volume types. The allowable values correspond to the volume sources that are defined when creating a volume. |
Role Creation Parameters
Table 3-9 Role Creation
Parameter | Description |
---|---|
apiVersion | APIVersion defines the versioned schema of this representation of an object. |
kind | Kind is a string value representing the REST resource this object represents. |
metadata | Standard object's metadata. |
metadata.name | Name must be unique within a namespace. |
metadata.namespace | Namespace defines the space within which each name must be unique. |
rules | Rules holds all the PolicyRules for this Role |
rules.apiGroups | APIGroups is the name of the APIGroup that contains the resources. |
rules.resources | Resources is a list of resources this rule applies to. |
rules.verbs | Verbs is a list of Verbs that apply to ALL the ResourceKinds and AttributeRestrictions contained in this rule. |
rules.resourceNames | ResourceNames is an optional allowed list of names that the rule applies to. |
Table 3-10 Role Binding Creation
Parameter | Description |
---|---|
apiVersion | APIVersion defines the versioned schema of this representation of an object. |
kind | Kind is a string value representing the REST resource this object represents. |
metadata | Standard object's metadata. |
metadata.name | Name must be unique within a namespace. |
metadata.namespace | Namespace defines the space within which each name must be unique. |
roleRef | RoleRef can reference a Role in the current namespace or a ClusterRole in the global namespace. |
roleRef.apiGroup | APIGroup is the group for the resource being referenced |
roleRef.kind | Kind is the type of resource being referenced |
roleRef.name | Name is the name of resource being referenced |
subjects | Subjects holds references to the objects the role applies to. |
subjects.kind | Kind of object being referenced. Values defined by this API group are "User", "Group", and "ServiceAccount". |
subjects.apiGroup | APIGroup holds the API group of the referenced subject. |
subjects.name | Name of the object being referenced. |
Debug Tool Configuration Parameters
Table 3-11 Debug Tool Configuration Parameters
Parameter | Description |
---|---|
extraContainers | Specifies whether the debug container is spawned along with the application container in the pod. |
debugToolContainerMemoryLimit | Indicates the memory assigned for the debug tool container. |
extraContainersVolumesTpl | Specifies the extra container template for the debug tool volume. |
extraContainersVolumesTpl.name | Indicates the name of the volume for debug tool logs storage. |
extraContainersVolumesTpl.emptyDir.medium | Indicates the location where the emptyDir volume is stored. |
extraContainersVolumesTpl.emptyDir.sizeLimit | Indicates the emptyDir volume size. |
command | String array used for container command. |
image | Docker image name |
imagePullPolicy | Image Pull Policy |
name | Name of the container |
resources | Compute Resources required by this container |
resources.limits | Limits describes the maximum amount of compute resources allowed |
resources.requests | Requests describes the minimum amount of compute resources required |
resources.limits.cpu | CPU limits |
resources.limits.memory | Memory limits |
resources.limits.ephemeral-storage | Ephemeral Storage limits |
resources.requests.cpu | CPU requests |
resources.requests.memory | Memory requests |
resources.requests.ephemeral-storage | Ephemeral Storage requests |
securityContext | Security options the container should run with. |
securityContext.allowPrivilegeEscalation | AllowPrivilegeEscalation controls whether a process can gain more privileges than its parent process. This directly controls if the no_new_privs flag will be set on the container process |
securityContext.capabilities | The capabilities to add/drop when running containers. Defaults to the default set of capabilities granted by the container runtime. |
securityContext.capabilities.drop | Removed capabilities |
securityContext.capabilities.add | Added capabilities |
securityContext.runAsUser | The UID to run the entrypoint of the container process. |
volumeMounts.mountPath | Indicates the path for volume mount. |
volumeMounts.name | Indicates the name of the directory for debug tool logs storage. |