3 Using Debug Tool

Overview

The Debug Tool provides the following third-party troubleshooting tools for debugging runtime issues in the lab environment:

  • tcpdump
  • ip
  • netstat
  • curl
  • ping
  • nmap
  • dig

Prerequisites

This section describes the prerequisites for using the Debug Tool.

Note:

  1. Configuration in CNE

    The following configurations must be performed in the Bastion Host.

    1. When BSF is installed on CNE version 23.2.0 or above:

      Note:

      • In CNE version 23.2.0 or above, the default Kyverno policy, disallow-capabilities, does not allow the NET_ADMIN and NET_RAW capabilities that the Debug Tool requires.
      • To run the Debug Tool on CNE 23.2.0 or above, modify the existing Kyverno policy, disallow-capabilities, as follows.
      Adding a Namespace to an Empty Resource
      • Run the following command to verify whether the current disallow-capabilities cluster policy contains any namespaces.

        Example:

        $ kubectl get clusterpolicies disallow-capabilities -oyaml

        Sample output:

        apiVersion: kyverno.io/v1
        kind: ClusterPolicy
        ...
        ...
        spec:
          rules:
          - exclude:
              any:
              - resources: {}
      • If there are no namespaces, then patch the policy using the following command to add <namespace> under resources.

        $ kubectl patch clusterpolicy disallow-capabilities --type=json \
          -p='[{"op": "add", "path": "/spec/rules/0/exclude/any/0/resources", "value": {"namespaces":["<namespace>"]} }]'

        Example:

        $ kubectl patch clusterpolicy disallow-capabilities --type=json \
          -p='[{"op": "add", "path": "/spec/rules/0/exclude/any/0/resources", "value": {"namespaces":["ocbsf"]} }]'

        Sample output:

        apiVersion: kyverno.io/v1
        kind: ClusterPolicy
        ...
        ...
        spec:
          rules:
          - exclude:
              any:
              - resources:
                  namespaces:
                  - ocbsf
      • To remove the namespace added in the previous step, run the following command:

        $ kubectl patch clusterpolicy disallow-capabilities --type=json \
          -p='[{"op": "replace", "path": "/spec/rules/0/exclude/any/0/resources", "value": {} }]'

        Sample output:

        apiVersion: kyverno.io/v1
        kind: ClusterPolicy
        ...
        ...
        spec:
          rules:
          - exclude:
              any:
              - resources: {}
      Adding a Namespace to an Existing Namespace List
      1. Run the following command to verify whether the current disallow-capabilities cluster policy already contains namespaces.

        Example:

        $ kubectl get clusterpolicies disallow-capabilities -oyaml

        Sample output:

        apiVersion: kyverno.io/v1
        kind: ClusterPolicy
        ...
        ...
        spec:
          rules:
          - exclude:
              any:
              - resources:
                  namespaces:
                  - namespace1
                  - namespace2
                  - namespace3
      2. If there are namespaces already added, then patch the policy using the following command to add <namespace> to the existing list:

        $ kubectl patch clusterpolicy disallow-capabilities --type=json \
          -p='[{"op": "add", "path": "/spec/rules/0/exclude/any/0/resources/namespaces/-", "value": "<namespace>" }]'

        Example:

        $ kubectl patch clusterpolicy disallow-capabilities --type=json \
          -p='[{"op": "add", "path": "/spec/rules/0/exclude/any/0/resources/namespaces/-", "value": "ocbsf" }]'

        Sample output:

        apiVersion: kyverno.io/v1
        kind: ClusterPolicy
        ...
        ...
        spec:
          rules:
          - exclude:
              any:
              - resources:
                  namespaces:
                  - namespace1
                  - namespace2
                  - namespace3
                  - ocbsf
      3. To remove the namespace added in the previous step, run the following command:

        $ kubectl patch clusterpolicy disallow-capabilities --type=json \
          -p='[{"op": "remove", "path": "/spec/rules/0/exclude/any/0/resources/namespaces/<index>"}]'

        Example:

        $ kubectl patch clusterpolicy disallow-capabilities --type=json \
          -p='[{"op": "remove", "path": "/spec/rules/0/exclude/any/0/resources/namespaces/3"}]'

        Sample output:

        apiVersion: kyverno.io/v1
        kind: ClusterPolicy
        ...
        ...
        spec:
          rules:
          - exclude:
              any:
              - resources:
                  namespaces:
                  - namespace1
                  - namespace2
                  - namespace3

      Note:

      When removing a namespace, specify its index within the namespaces array. The index starts at 0, so in the example above, ocbsf (the fourth entry) has index 3.
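
      The choice between the two add patches, and the index lookup needed for removal, can be sketched in shell. This is only a minimal sketch and not part of the product: the function names build_ns_patch and ns_index are hypothetical, and the sketch assumes that the exclusion sits at rule 0 and at entry 0 of exclude.any, as in the patch paths above.

```shell
# Hypothetical helpers; the names are illustrative and not part of the Debug Tool.

# build_ns_patch <namespace> [existing namespaces...]
# Prints the JSON patch that adds <namespace> to the policy exclusion.
# With no existing namespaces it creates the list; otherwise it appends to it.
build_ns_patch() {
  ns=$1
  shift
  if [ "$#" -eq 0 ]; then
    printf '[{"op": "add", "path": "/spec/rules/0/exclude/any/0/resources", "value": {"namespaces":["%s"]} }]\n' "$ns"
  else
    printf '[{"op": "add", "path": "/spec/rules/0/exclude/any/0/resources/namespaces/-", "value": "%s" }]\n' "$ns"
  fi
}

# ns_index <namespace> [existing namespaces...]
# Prints the zero-based index of <namespace> in the list.
# Returns 1 and prints nothing if the namespace is not in the list.
ns_index() {
  target=$1
  shift
  i=0
  for candidate in "$@"; do
    if [ "$candidate" = "$target" ]; then
      echo "$i"
      return 0
    fi
    i=$((i + 1))
  done
  return 1
}
```

      One possible way to use the helpers is to read the current list with a jsonpath query and pass it in unquoted, so that each namespace becomes a separate argument:

        $ existing=$(kubectl get clusterpolicy disallow-capabilities -o jsonpath='{.spec.rules[0].exclude.any[0].resources.namespaces[*]}')
        $ kubectl patch clusterpolicy disallow-capabilities --type=json -p="$(build_ns_patch ocbsf $existing)"

      To find the index for the remove patch, query the list again after patching and run ns_index ocbsf $existing.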

    2. When BSF is installed on a CNE version prior to 23.2.0:

      PodSecurityPolicy (PSP) Creation

      The following configurations must be performed in the Bastion Host.

      1. Log in to the Bastion Host.
      2. Create a new PSP by running the following command from the Bastion Host. The parameters readOnlyRootFilesystem, allowPrivilegeEscalation, and allowedCapabilities are required by the debug container.

        Note:

        Other parameters are mandatory for PSP creation and can be customized as per the CNE environment. Default values are recommended.

        $ kubectl apply -f - <<EOF
        apiVersion: policy/v1beta1
        kind: PodSecurityPolicy
        metadata:
          name: debug-tool-psp
        spec:
          readOnlyRootFilesystem: false
          allowPrivilegeEscalation: true
          allowedCapabilities:
          - NET_ADMIN
          - NET_RAW
          fsGroup:
            ranges:
            - max: 65535
              min: 1
            rule: MustRunAs
          runAsUser:
            rule: MustRunAsNonRoot
          seLinux:
            rule: RunAsAny
          supplementalGroups:
            rule: RunAsAny
          volumes:
          - configMap
          - downwardAPI
          - emptyDir
          - persistentVolumeClaim
          - projected
          - secret
        EOF

      Role Creation

      Run the following command to create a role for the PSP:
      $ kubectl apply -f - <<EOF
      apiVersion: rbac.authorization.k8s.io/v1
      kind: Role
      metadata:
        name: debug-tool-role
        namespace: ocbsf
      rules:
      - apiGroups:
        - policy
        resources:
        - podsecuritypolicies
        verbs:
        - use
        resourceNames:
        - debug-tool-psp
      EOF

      RoleBinding Creation

      Run the following command to attach the service account for your NF namespace with the role created for the tool PSP:
      $ kubectl apply -f - <<EOF
      apiVersion: rbac.authorization.k8s.io/v1
      kind: RoleBinding
      metadata:
        name: debug-tool-rolebinding
        namespace: ocbsf
      roleRef:
        apiGroup: rbac.authorization.k8s.io
        kind: Role
        name: debug-tool-role
      subjects:
      - kind: Group
        apiGroup: rbac.authorization.k8s.io
        name: system:serviceaccounts
      EOF

      Refer to Debug Tool Configuration Parameters for parameter details.

  2. Configuration in NF-specific Helm

    The following updates must be made in the custom_values.yaml file.

    1. Log in to the NF server.
    2. Open the custom_values file:
      $ vim <custom_values file>
    3. Under global configuration, add the following:
      
      global:
         extraContainers: ENABLED

      Note:

      • The Debug Tool container starts with the default user ID 7000. To override this default, set the `runAsUser` field; otherwise, omit the field.

        Default value: uid=7000(debugtool) gid=7000(debugtool) groups=7000(debugtool)

      • To customize the container name, replace the 'name' field in the values.yaml file with the following:
        name: {{ printf "%s-tools-%s" (include "getprefix" .) (include "getsuffix" .) | trunc 63 | trimPrefix "-" | trimSuffix "-"  }}
        This ensures that the necessary values are added as a prefix and suffix to the container name.
    4. Under service specific configurations for which debugging is required, add the following:
      
      bsf-management-service:
        #extraContainers: DISABLED
        envMysqlDatabase: ocpm_bsf_1.7.0
        resources:
          limits:
            cpu: 1
            memory: 1Gi
          requests:
            cpu: 0.5
            memory: 1Gi
        minReplicas: 1
      

      Note:

      • At the global level, the extraContainers flag enables or disables injecting extra containers globally, so that all services that use this global value have extra containers enabled or disabled with a single flag.
      • At the service level, the extraContainers flag determines whether to use the extra container configuration from the global level or to enable or disable injecting extra containers for that specific service.

Execution of Debug Tool

Use the following procedure to run the Debug Tool:
  1. Run the following command to retrieve the pod details:
    $ kubectl get pods -n <k8s namespace>

    Example:

    $ kubectl get pods -n ocbsf
  2. Run the following command to enter Debug Tool Container:
    $ kubectl exec -it <pod name> -c <debug_container name> -n <namespace> -- bash
  3. Run the debug tools:
    bash-4.2$ <debug_tools>
    Example:
    bash-4.2$ tcpdump
  4. Copy the output files from container to host:
    $ kubectl cp -c <debug_container name> <pod name>:<file location in container> -n <namespace> <destination location>
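
Steps 2 to 4 can be strung together for a packet capture. The following is a minimal sketch under stated assumptions: debug_capture_cmds is a hypothetical helper that only prints the commands instead of running them, eth0 and /tmp/capture.pcap are taken from the sample outputs in this chapter and may differ in your environment, and tcpdump is passed directly to kubectl exec rather than typed into an interactive shell.

```shell
# Hypothetical helper: prints (does not run) the commands for one capture session.
# Usage: debug_capture_cmds <pod name> <debug_container name> <namespace> [capture file]
debug_capture_cmds() {
  pod=$1
  container=$2
  ns=$3
  # Assumed default path; any writable location in the container can be used.
  capture=${4:-/tmp/capture.pcap}

  # Start the capture inside the debug container (stop it with Ctrl+C).
  printf 'kubectl exec -it %s -c %s -n %s -- tcpdump -i eth0 -w %s\n' \
    "$pod" "$container" "$ns" "$capture"

  # Copy the capture file from the container to the current directory.
  printf 'kubectl cp -c %s %s:%s -n %s ./%s\n' \
    "$container" "$pod" "$capture" "$ns" "${capture##*/}"
}
```

Printing the commands first makes it possible to review the pod, container, and namespace values before anything runs against the cluster.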

Tools Tested in Debug Container

The following debug tools have been tested in the debug container.

tcpdump

Table 3-1 tcpdump

Option tested: -D
Description: Print the list of the network interfaces available on the system and on which tcpdump can capture packets.
Output:
  tcpdump -D
  1. eth0
  2. nflog (Linux netfilter log (NFLOG) interface)
  3. nfqueue (Linux netfilter queue (NFQUEUE) interface)
  4. any (Pseudo-device that captures on all interfaces)
  5. lo [Loopback]
Capabilities: NET_ADMIN, NET_RAW

Option tested: -i
Description: Listen on interface.
Output:
  tcpdump -i eth0
  tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
  listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
  12:10:37.381199 IP cncc-core-ingress-gateway-7ffc49bb7f-2kkhc.46519 > kubernetes.default.svc.cluster.local.https: Flags [P.], seq 1986927241:1986927276, ack 1334332290, win 626, options [nop,nop,TS val 849591834 ecr 849561833], length 35
  12:10:37.381952 IP cncc-core-ingress-gateway-7ffc49bb7f-2kkhc.45868 > kube-dns.kube-system.svc.cluster.local.domain: 62870+ PTR? 1.0.96.10.in-addr.arpa. (40)
Capabilities: NET_ADMIN, NET_RAW

Option tested: -w
Description: Write the raw packets to file rather than parsing and printing them out.
Output:
  tcpdump -w capture.pcap -i eth0
Capabilities: NET_ADMIN, NET_RAW

Option tested: -r
Description: Read packets from file (which was created with the -w option).
Output:
  tcpdump -r capture.pcap
  reading from file /tmp/capture.pcap, link-type EN10MB (Ethernet)
  12:13:07.381019 IP cncc-core-ingress-gateway-7ffc49bb7f-2kkhc.46519 > kubernetes.default.svc.cluster.local.https: Flags [P.], seq 1986927416:1986927451, ack 1334332445, win 626, options [nop,nop,TS val 849741834 ecr 849711834], length 35
  12:13:07.381194 IP kubernetes.default.svc.cluster.local.https > cncc-core-ingress-gateway-7ffc49bb7f-2kkhc.46519: Flags [P.], seq 1:32, ack 35, win 247, options [nop,nop,TS val 849741834 ecr 849741834], length 31
  12:13:07.381207 IP cncc-core-ingress-gateway-7ffc49bb7f-2kkhc.46519 > kubernetes.default.svc.cluster.local.https: Flags [.], ack 32, win 626, options [nop,nop,TS val 849741834 ecr 849741834], length 0
Capabilities: NET_ADMIN, NET_RAW

ip

Table 3-2 ip

Option tested: addr show
Description: Look at protocol addresses.
Output:
  ip addr show
  1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
      link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
      inet 127.0.0.1/8 scope host lo
         valid_lft forever preferred_lft forever
  2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default
      link/ipip 0.0.0.0 brd 0.0.0.0
  4: eth0@if190: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1440 qdisc noqueue state UP group default
      link/ether aa:5a:27:8d:74:6f brd ff:ff:ff:ff:ff:ff link-netnsid 0
      inet 192.168.219.112/32 scope global eth0
         valid_lft forever preferred_lft forever
Capabilities: --

Option tested: route show
Description: List routes.
Output:
  ip route show
  default via 169.254.1.1 dev eth0
  169.254.1.1 dev eth0 scope link
Capabilities: --

Option tested: addrlabel list
Description: List address labels.
Output:
  ip addrlabel list
  prefix ::1/128 label 0
  prefix ::/96 label 3
  prefix ::ffff:0.0.0.0/96 label 4
  prefix 2001::/32 label 6
  prefix 2001:10::/28 label 7
  prefix 3ffe::/16 label 12
  prefix 2002::/16 label 2
  prefix fec0::/10 label 11
  prefix fc00::/7 label 5
  prefix ::/0 label 1
Capabilities: --

netstat

Table 3-3 netstat

Option tested: -a
Description: Show both listening and non-listening (for TCP this means established connections) sockets.
Output:
  netstat -a
  Active Internet connections (servers and established)
  Proto Recv-Q Send-Q Local Address Foreign Address State
  tcp 0 0 0.0.0.0:tproxy 0.0.0.0:* LISTEN
  tcp 0 0 0.0.0.0:websm 0.0.0.0:* LISTEN
  tcp 0 0 cncc-core-ingress:websm 10-178-254-194.ku:47292 TIME_WAIT
  tcp 0 0 cncc-core-ingress:46519 kubernetes.defaul:https ESTABLISHED
  tcp 0 0 cncc-core-ingress:websm 10-178-254-194.ku:47240 TIME_WAIT
  tcp 0 0 cncc-core-ingress:websm 10-178-254-194.ku:47347 TIME_WAIT
  udp 0 0 localhost:59351 localhost:ambit-lm ESTABLISHED
  Active UNIX domain sockets (servers and established)
  Proto RefCnt Flags Type State I-Node Path
  unix 2 [ ] STREAM CONNECTED 576064861
Capabilities: --

Option tested: -l
Description: Show only listening sockets.
Output:
  netstat -l
  Active Internet connections (only servers)
  Proto Recv-Q Send-Q Local Address Foreign Address State
  tcp 0 0 0.0.0.0:tproxy 0.0.0.0:* LISTEN
  tcp 0 0 0.0.0.0:websm 0.0.0.0:* LISTEN
  Active UNIX domain sockets (only servers)
  Proto RefCnt Flags Type State I-Node Path
Capabilities: --

Option tested: -s
Description: Display summary statistics for each protocol.
Output:
  netstat -s
  Ip:
      4070 total packets received
      0 forwarded
      0 incoming packets discarded
      4070 incoming packets delivered
      4315 requests sent out
  Icmp:
      0 ICMP messages received
      0 input ICMP message failed.
      ICMP input histogram:
      2 ICMP messages sent
      0 ICMP messages failed
      ICMP output histogram:
          destination unreachable: 2
Capabilities: --

Option tested: -i
Description: Display a table of all network interfaces.
Output:
  netstat -i
  Kernel Interface table
  Iface MTU RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
  eth0 1440 4131 0 0 0 4355 0 0 0 BMRU
  lo 65536 0 0 0 0 0 0 0 0 LRU
Capabilities: --

jq

Table 3-4 jq

Option tested: <jq filter> [file...]
Description: Use it to slice, filter, map, and transform structured data.
Output:
  Sample JSON file (sample.json):
  {
    "fruit": {
      "name": "apple",
      "color": "green",
      "price": 1.2
    }
  }

  jq '.fruit' sample.json
  {
    "name": "apple",
    "color": "green",
    "price": 1.2
  }
Capabilities: --

Option tested: <jq filter> [file...]
Description: Use comma-separated filters to output several values from the same sample JSON file.
Output:
  Sample JSON file (sample.json):
  {
    "fruit": {
      "name": "apple",
      "color": "green",
      "price": 1.2
    }
  }

  jq '.fruit.color,.fruit.price' sample.json
  "green"
  1.2
Capabilities: --

curl

Table 3-5 curl

Option tested: -o
Description: Write output to <file> instead of stdout.
Output:
  curl -o file.txt http://abc.com/file.txt
Capabilities: --

Option tested: -x
Description: Use the specified HTTP proxy.
Output:
  curl -x proxy.com:8080 -o file.txt http://abc.com/file.txt
Capabilities: --

ping

Table 3-6 ping

Option tested: <ip>
Description: Run a ping test to see whether the target host is reachable or not.
Output:
  ping 10.178.254.194
Capabilities: NET_ADMIN, NET_RAW

Option tested: -c
Description: Stop after sending 'c' number of ECHO_REQUEST packets.
Output:
  ping -c 5 10.178.254.194
Capabilities: NET_ADMIN, NET_RAW

Option tested: -f (with non-zero interval)
Description: Flood ping. For every ECHO_REQUEST sent, a period "." is printed, while for every ECHO_REPLY received, a backspace is printed.
Output:
  ping -f -i 2 10.178.254.194
Capabilities: NET_ADMIN, NET_RAW

nmap

Table 3-7 nmap

Option tested: <ip>
Description: Scan for live hosts, operating systems, packet filters, and open ports running on remote hosts.
Output:
  nmap 10.178.254.194
  Starting Nmap 6.40 ( http://nmap.org ) at 2020-09-29 05:54 UTC
  Nmap scan report for 10-178-254-194.kubernetes.default.svc.cluster.local (10.178.254.194)
  Host is up (0.00046s latency).
  Not shown: 995 closed ports
  PORT STATE SERVICE
  22/tcp open ssh
  179/tcp open bgp
  6666/tcp open irc
  6667/tcp open irc
  30000/tcp open unknown
  Nmap done: 1 IP address (1 host up) scanned in 0.04 seconds
Capabilities: --

Option tested: -v
Description: Increase verbosity level.
Output:
  nmap -v 10.178.254.194
  Starting Nmap 6.40 ( http://nmap.org ) at 2020-09-29 05:55 UTC
  Initiating Ping Scan at 05:55
  Scanning 10.178.254.194 [2 ports]
  Completed Ping Scan at 05:55, 0.00s elapsed (1 total hosts)
  Initiating Parallel DNS resolution of 1 host. at 05:55
  Completed Parallel DNS resolution of 1 host. at 05:55, 0.00s elapsed
  Initiating Connect Scan at 05:55
  Scanning 10-178-254-194.kubernetes.default.svc.cluster.local (10.178.254.194) [1000 ports]
  Discovered open port 22/tcp on 10.178.254.194
  Discovered open port 30000/tcp on 10.178.254.194
  Discovered open port 6667/tcp on 10.178.254.194
  Discovered open port 6666/tcp on 10.178.254.194
  Discovered open port 179/tcp on 10.178.254.194
  Completed Connect Scan at 05:55, 0.02s elapsed (1000 total ports)
  Nmap scan report for 10-178-254-194.kubernetes.default.svc.cluster.local (10.178.254.194)
  Host is up (0.00039s latency).
  Not shown: 995 closed ports
  PORT STATE SERVICE
  22/tcp open ssh
  179/tcp open bgp
  6666/tcp open irc
  6667/tcp open irc
  30000/tcp open unknown

  Read data files from: /usr/bin/../share/nmap
  Nmap done: 1 IP address (1 host up) scanned in 0.04 seconds
Capabilities: --

Option tested: -iL
Description: Scan all the listed IP addresses in a file.
Output:
  Sample file: sample.txt
  nmap -iL sample.txt
  Starting Nmap 6.40 ( http://nmap.org ) at 2020-09-29 05:57 UTC
  Nmap scan report for localhost (127.0.0.1)
  Host is up (0.00036s latency).
  Other addresses for localhost (not scanned): 127.0.0.1
  Not shown: 998 closed ports
  PORT STATE SERVICE
  8081/tcp open blackice-icecap
  9090/tcp open zeus-admin

  Nmap scan report for 10-178-254-194.kubernetes.default.svc.cluster.local (10.178.254.194)
  Host is up (0.00040s latency).
  Not shown: 995 closed ports
  PORT STATE SERVICE
  22/tcp open ssh
  179/tcp open bgp
  6666/tcp open irc
  6667/tcp open irc
  30000/tcp open unknown

  Nmap done: 2 IP addresses (2 hosts up) scanned in 0.06 seconds
Capabilities: --

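
The -iL option reads its targets from a plain-text file. As a minimal sketch, the hypothetical helper below writes such a file with one target per line. The contents of the original sample file are not reproduced in this chapter, so the two targets in the usage line are an assumption inferred from the two hosts that appear in the sample scan output.

```shell
# Hypothetical helper: write one nmap target per line to a list file.
# Usage: write_nmap_targets <file> <target>...
write_nmap_targets() {
  list_file=$1
  shift
  # Refuse to create an empty target list.
  [ "$#" -gt 0 ] || return 1
  printf '%s\n' "$@" > "$list_file"
}
```

For example, write_nmap_targets sample.txt localhost 10.178.254.194 creates a two-line file that can then be passed to nmap -iL sample.txt.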
dig

Table 3-8 dig

Option tested: <ip>
Description: Performs DNS lookups and displays the answers that are returned from the name servers that were queried.
Output:
  dig 10.178.254.194
  Note: The IP should be reachable from inside the container.
Capabilities: --

Option tested: -x
Description: Query DNS reverse lookup.
Output:
  dig -x 10.178.254.194
Capabilities: --

3.1 Configurable Parameters for Debug Tool

This section describes the parameters that users can customize to configure the Debug Tool.

CNE Parameters

Table 3-9 CNE Parameters

Parameter Description
apiVersion APIVersion defines the versioned schema of this representation of an object.
kind Kind is a string value representing the REST resource this object represents.
metadata Standard object's metadata.
metadata.name Name must be unique within a namespace.
spec spec defines the policy enforced.
spec.readOnlyRootFilesystem Controls whether the containers run with a read-only root filesystem (no writable layer).
spec.allowPrivilegeEscalation Gates whether or not a user is allowed to set the security context of a container to allowPrivilegeEscalation=true.
spec.allowedCapabilities Provides a list of capabilities that are allowed to be added to a container.
spec.fsGroup Controls the supplemental group applied to some volumes. RunAsAny allows any fsGroup ID to be specified.
spec.runAsUser Controls which user ID the containers are run with. RunAsAny allows any runAsUser to be specified.
spec.seLinux RunAsAny allows any seLinuxOptions to be specified.
spec.supplementalGroups Controls which group IDs containers add. RunAsAny allows any supplementalGroups to be specified.
spec.volumes Provides a list of allowed volume types. The allowable values correspond to the volume sources that are defined when creating a volume.

Role Creation Parameters

Table 3-10 Role Creation

Parameter Description
apiVersion APIVersion defines the versioned schema of this representation of an object.
kind Kind is a string value representing the REST resource this object represents.
metadata Standard object's metadata.
metadata.name Name must be unique within a namespace.
metadata.namespace Namespace defines the space within which each name must be unique.
rules Rules holds all the PolicyRules for this Role
rules.apiGroups APIGroups is the name of the APIGroup that contains the resources.
rules.resources Resources is a list of resources this rule applies to.
rules.verbs Verbs is a list of Verbs that apply to ALL the ResourceKinds and AttributeRestrictions contained in this rule.
rules.resourceNames ResourceNames is an optional white list of names that the rule applies to.

Table 3-11 Role Binding Creation

Parameter Description
apiVersion APIVersion defines the versioned schema of this representation of an object.
kind Kind is a string value representing the REST resource this object represents.
metadata Standard object's metadata.
metadata.name Name must be unique within a namespace.
metadata.namespace Namespace defines the space within which each name must be unique.
roleRef RoleRef can reference a Role in the current namespace or a ClusterRole in the global namespace.
roleRef.apiGroup APIGroup is the group for the resource being referenced
roleRef.kind Kind is the type of resource being referenced
roleRef.name Name is the name of resource being referenced
subjects Subjects holds references to the objects the role applies to.
subjects.kind Kind of object being referenced. Values defined by this API group are "User", "Group", and "ServiceAccount".
subjects.apiGroup APIGroup holds the API group of the referenced subject.
subjects.name Name of the object being referenced.

Debug Tool Configuration Parameters

Table 3-12 Debug Tool Configuration Parameters

Parameter Description
command String array used for container command.
image Docker image name
imagePullPolicy Image Pull Policy
name Name of the container
resources Compute Resources required by this container
resources.limits Limits describes the maximum amount of compute resources allowed
resources.requests Requests describes the minimum amount of compute resources required
resources.limits.cpu CPU limits
resources.limits.memory Memory limits
resources.limits.ephemeral-storage Ephemeral Storage limits
resources.requests.cpu CPU requests
resources.requests.memory Memory requests
resources.requests.ephemeral-storage Ephemeral Storage requests
securityContext Security options the container should run with.
securityContext.allowPrivilegeEscalation AllowPrivilegeEscalation controls whether a process can gain more privileges than its parent process. This directly controls whether the no_new_privs flag is set on the container process.
securityContext.readOnlyRootFilesystem Whether this container has a read-only root filesystem. Default is false.
securityContext.capabilities The capabilities to add/drop when running containers. Defaults to the default set of capabilities granted by the container runtime.
securityContext.capabilities.drop Removed capabilities
securityContext.capabilities.add Added capabilities
securityContext.runAsUser The UID to run the entrypoint of the container process.
debugToolContainerMemoryLimit Indicates the memory assigned for the debug tool container.
extraContainersVolumesTpl Specifies the extra container template for the debug tool volume.
extraContainersVolumesTpl.name Indicates the name of the volume for debug tool logs storage.
extraContainersVolumesTpl.emptyDir.medium Indicates the location where emptyDir volume is stored.
extraContainersVolumesTpl.emptyDir.sizeLimit Indicates the emptyDir volume size.
volumeMounts.mountPath Indicates the path for volume mount.
volumeMounts.name Indicates the name of the directory for debug tool logs storage.