3 CNE Services
CNE provides the following common services for all the installed applications:
- Runtime Environment - CNE provides a runtime environment in which you can run all the cloud native applications.
- Automation - CNE provides automation solutions to deploy, upgrade, and maintain cloud native applications.
- Security - CNE provides multi-level security measures to protect against malicious attacks.
- Redundancy - CNE provides redundancy or high availability through the specification of anti-affinity rules at the infrastructure level.
- Observability - CNE provides services to capture metrics, logs, and traces for both itself (CNE) and the cloud native applications.
- Maintenance Access - CNE provides a Bastion Host to access the Kubernetes cluster for maintenance purposes.
3.1 Runtime Environment
Note:
CNE 23.4.x supports Kubernetes version 1.27.x.
3.1.1 Kubernetes Cluster
The Kubernetes cluster in CNE contains three controller nodes. Kubernetes uses these controller nodes to coordinate the execution of application workloads across many worker nodes. This configuration allows the cluster to:
- Sustain the loss of one controller node with no impact on the CNE service.
- Tolerate the loss of two controller nodes without losing the cluster configuration data; however, Kubernetes is placed in read-only mode.
When two controller nodes are lost, Kubernetes operates in read-only mode. Application workloads continue to run on the worker nodes, but no changes can be made to the cluster configuration data. As a result, you cannot install new applications or modify the application-specific custom resources that are used to change an application's configuration.
The CNE Kubernetes cluster can contain many worker nodes; a minimum of six is recommended. Kubernetes distributes workloads across all available worker nodes. The Kubernetes placement algorithm attempts to balance workloads across all available worker nodes so that no worker node is overloaded. Kubernetes also attempts to honor the affinity or anti-affinity rules specified by applications.
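For illustration, the following is a minimal sketch of a Deployment that uses a pod anti-affinity rule to spread replicas across worker nodes. All names and the image are hypothetical; actual rules are defined by each application's Helm chart.

```yaml
# Hypothetical Deployment that asks Kubernetes to place each replica
# on a different worker node using pod anti-affinity.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-nf
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-nf
  template:
    metadata:
      labels:
        app: example-nf
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: example-nf
              topologyKey: kubernetes.io/hostname  # at most one replica per node
      containers:
        - name: app
          image: registry.example.com/example-nf:1.0.0  # illustrative image
```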
Additional Kubernetes cluster details are as follows:
- Cluster configuration data, as well as application-specific custom resources, are stored in etcd.
- Workloads in the Kubernetes cluster are run by the containerd runtime.
- NTP time synchronization is maintained by chrony.
3.1.2 Networking
In-cluster Networking
CNE includes the Calico plug-in for networking within the Kubernetes cluster. Calico provides networking and network security for containers and is known for its performance and flexibility.
External Networking
You must configure at least one external network to allow the applications that run in CNE to communicate with applications that are outside of the CNE platform.
You can also configure multiple external networks in CNE to allow specific applications to communicate on networks that are dedicated to specific purposes. For more information on configuring external networks, see Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide.
3.1.3 Storage
Bare Metal Deployment Storage
When CNE is deployed on Bare Metal servers, it constructs a Ceph storage cluster from the disk drives attached to the servers designated as Kubernetes worker nodes. The Ceph cluster is then integrated with Kubernetes so that CNE common services and applications that run on CNE can use the Ceph cluster for persistent data storage.
Virtual Deployment Storage
When CNE is deployed on a virtual infrastructure, it delivers a cloud provider that implements the Kubernetes cloud provider interface. The cloud provider interacts with the Kubernetes cluster and the infrastructure's storage manager to provide storage when applications request it. For example, OpenStack uses the Cinder storage manager and VMware uses the vSphere storage manager.
After Kubernetes has access to persistent storage, CNE creates several Kubernetes StorageClass objects for the common services and applications to use:
- CNE creates a storage class named Standard, which allows applications to store application-specific data. Standard is the default storage class and is used when an application creates a Persistent Volume Claim (PVC) with no StorageClass reference.
- CNE creates one storage class to allow Prometheus to store metrics data.
- CNE creates two storage classes to allow OpenSearch to store log and trace data.
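For example, an application PVC that omits the StorageClass reference is bound using the default Standard class. The names below are illustrative:

```yaml
# Hypothetical PVC; because storageClassName is omitted, the default
# storage class (Standard on CNE) is used.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-app-data   # illustrative name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi        # size is application-specific
```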
3.1.4 Ingress Traffic Delivery
Each application must create a Kubernetes Service of type LoadBalancer to allow clients on an external network to connect to the applications within CNE. Each LoadBalancer service is assigned a service IP from an external network. CNE provides load balancers to ensure that ingress traffic from external clients is distributed evenly across the Kubernetes worker nodes that host the application pods.
For Bare Metal deployments, the top of rack (ToR) switches perform the load balancing function. For virtualized deployments, dedicated virtual machines perform the load balancing function.
For more information on selecting an external network for an NF application, see the installation instructions of the NF.
Note:
CNE does not support NodePort services because they are considered insecure in a production environment.
Ingress traffic distribution
CNE Load Balancers distribute ingress traffic evenly across all Kubernetes worker nodes. Each ingress message is distributed in a round-robin fashion to all in-service Kubernetes worker nodes, regardless of the sender or the destination IP address and port.
Note:
- CNE Load Balancers distribute ingress traffic to Kubernetes worker nodes and not pods. Once an ingress message is delivered to a Kubernetes worker node, Kubernetes selects a pod belonging to the target service and delivers the ingress message to the pod. For more information on how Kubernetes delivers ingress messages to pods in the target service, see Kubernetes Network Model.
- CNE Load Balancers support only the Cluster value for the spec.externalTrafficPolicy field in the Kubernetes service specification and do not support the Local value.
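The following sketch shows what such a LoadBalancer service might look like, with the supported Cluster traffic policy. All names, ports, and labels are illustrative; see the NF installation instructions for the actual service definitions and network selection.

```yaml
# Hypothetical Service exposing an application to external clients.
apiVersion: v1
kind: Service
metadata:
  name: example-nf-sig           # illustrative name
spec:
  type: LoadBalancer             # CNE assigns a service IP from an external network
  externalTrafficPolicy: Cluster # the only value CNE Load Balancers support
  selector:
    app: example-nf              # illustrative pod selector
  ports:
    - name: http2
      port: 8080
      targetPort: 8080
```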
3.1.5 Egress Traffic Delivery
CNE uses the same Load Balancers that distribute ingress traffic to deliver egress requests to servers outside the CNE cluster. CNE Load Balancers also perform Egress Network Address Translation (NAT) on the egress requests to ensure that the source IP field of each egress request contains an IP address that belongs to the external network to which the request is delivered. This ensures that the egress requests are not rejected by security rules on the external networks.
For more information on selecting an external network for an NF application, see the installation instructions of the NF.
3.1.6 DNS Query Services
CoreDNS is used to resolve Domain Name System (DNS) queries for services within the Kubernetes cluster. DNS queries for services outside the Kubernetes cluster are routed to DNS nameservers running on the Bastion Hosts, which in turn use customer-specified DNS servers outside the CNE to resolve these DNS queries.
3.1.6.1 Local DNS
The Local DNS feature is a reconfiguration of CoreDNS to support external hostname resolution. Local DNS allows the pods and services running inside the CNE cluster to connect to hosts running outside the cluster through CoreDNS. That is, when Local DNS is enabled, CNE routes connections to external hosts through CoreDNS rather than through the nameservers on the Bastion Hosts. For information about activating this feature, see Activating Local DNS. Local DNS supports the following DNS record types:
- A Records: allow you to define the hostname and IP address used to locate a service
- SRV Records: allow you to define the location (hostname and port number) of a specific service and how your domain handles the service
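As a generic illustration only (not CNE's actual Local DNS configuration, which is managed through the procedures referenced below), CoreDNS can resolve a static external A record and forward other external queries through a ConfigMap similar to the following. The hostname, IP addresses, and upstream nameserver are placeholders:

```yaml
# Generic CoreDNS ConfigMap sketch with a static A record and an
# upstream forwarder; CNE manages its own CoreDNS configuration.
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        kubernetes cluster.local in-addr.arpa ip6.arpa
        hosts {
            203.0.113.10 db.external.example.com
            fallthrough
        }
        forward . 203.0.113.53
        cache 30
    }
```

Note that the hosts plugin shown here serves only A/AAAA records; SRV records are served from a zone file through the CoreDNS file plugin.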
For additional details about this feature, see Activating and Configuring Local DNS.
3.2 Automation
CNE provides considerable automation throughout the phases of an application's deployment, upgrade, and maintenance cycles.
3.2.1 Deployment
CNE provides the Helm application to automate the deployment of applications into the Kubernetes cluster. The applications must provide a Helm chart to perform automated deployment. For more information about deploying an application in CNE, see the Maintenance Procedures section.
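As a minimal sketch of the chart requirement, every Helm chart carries a Chart.yaml descriptor similar to the following; the chart name and versions are illustrative:

```yaml
# Minimal Chart.yaml for a hypothetical application chart (Helm 3).
apiVersion: v2            # chart API version used by Helm 3
name: example-nf          # illustrative chart name
description: Example NF deployed on CNE through Helm
version: 1.0.0            # chart version (SemVer)
appVersion: "1.0.0"       # illustrative application version
```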
You can deploy CNE automatically through custom scripts provided with the CNE platform. For more information about installing CNE, see Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide.
3.2.2 Upgrade
CNE provides Helm to automate the upgrade of applications in the Kubernetes cluster. The applications must provide a Helm chart to perform an automated software upgrade. For more information about upgrading an application in CNE, see the Maintenance Procedures section.
You can upgrade the CNE platform automatically through the execution of a Continuous Delivery (CD) pipeline. For more information on CNE upgrade instructions, see Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide.
3.2.3 Maintenance
CNE provides a combination of automated and manual procedures for performing maintenance operations on the Kubernetes cluster and common services. For instructions on performing maintenance operations, see the Maintenance Procedures section.
3.3 Security
CNE provides multi-level security measures to protect against malicious attacks.
- All Kubernetes nodes and Bastion Hosts are security-hardened to prevent unauthorized access.
- Maintenance operations on the Kubernetes cluster can only be performed from the Bastion Hosts to prevent Denial of Service (DoS) attacks against the Kubernetes cluster.
- Only the Kubernetes worker nodes can access the Kubernetes controller nodes to protect sensitive cluster data.
- Kyverno provides policies to ensure that malicious applications do not corrupt the Kubernetes controller nodes, worker nodes, or cluster data.
- Bastion Host container registry is secured with TLS and can be accessed from CNE only.
- Kubernetes secrets are encrypted using secretbox before they are stored. This ensures that the Kubernetes secrets are secure and accessible to authorized users only.
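For reference, Kubernetes encryption of secrets at rest with secretbox is expressed through an API server EncryptionConfiguration similar to the following sketch. CNE manages this configuration internally; the key shown is a placeholder:

```yaml
# Generic EncryptionConfiguration using the secretbox provider.
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets                 # encrypt Secret objects at rest
    providers:
      - secretbox:
          keys:
            - name: key1
              secret: <base64-encoded-32-byte-key>  # placeholder
      - identity: {}            # allows reading pre-existing unencrypted data
```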
For more information on security hardening, see Oracle Communications Cloud Native Core, Security Guide.
3.4 Redundancy
This section provides detail about maintaining redundancy in Kubernetes, ingress load balancing, common services, and Bastion Host.
Infrastructure
In virtualized infrastructures, CNE requires the specification of anti-affinity rules to distribute Kubernetes controller and worker nodes across several physical servers. In Bare Metal infrastructures, each Kubernetes controller and worker node is placed on its own physical server.
Kubernetes
For the Kubernetes cluster, CNE runs three controller nodes with an etcd node on each controller node. This configuration allows CNE to:
- Sustain the loss of one controller node with no impact on the service.
- Sustain the loss of two controller nodes without losing the cluster data in the etcd database. However, when two controller nodes are lost, Kubernetes is placed in read-only mode. In read-only mode, the creation of new pods and the deployment of new services are not possible.
- Restore etcd data from backup when all three controller nodes fail. When all three controller nodes fail, all the cluster data in etcd is lost, and it is essential to restore the data from backup to resume operations.
CNE uses internal load balancers to distribute API requests from applications running in worker nodes evenly across all controller nodes.
Common Services
- OpenSearch uses three master nodes, three ingress nodes, and five data nodes by default. The number of data nodes provisioned can be configured as required. For more details, see the "Common Installation Configuration" section in Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide. The system can withstand the failure of one node in each of these groups with no impact on the service.
- Fluentd OpenSearch runs as a DaemonSet, where one Fluentd OpenSearch pod runs on each Kubernetes worker node. While the worker node is up and running, Fluentd OpenSearch sends the logs for that node to OpenSearch.
- Two Prometheus instances run simultaneously to provide redundancy. Each Prometheus instance independently collects and stores metrics from all CNE services and all applications running in the CNE Kubernetes cluster. Automatic deduplication removes the repetitive metric information in Prometheus.
Kubernetes distributes common service pods across Kubernetes worker nodes such that even the loss of one worker node has no impact on the service. Kubernetes automatically reschedules the failed pods to maintain redundancy.
Ingress and Egress Traffic Delivery
- For virtual installations, the Load Balancer VMs (LBVM) perform both ingress and egress traffic delivery.
- Two Load Balancer VMs are run for each configured external network in virtual installations. These LB VMs run in an active/standby configuration. Only one LB VM (the active LB VM) processes the ingress traffic for a given external network.
- The active LB VM is continuously monitored. If the active LB VM fails, the standby LB VM is automatically promoted to be the active LB VM.
- If the virtual infrastructure is able to recover the failed LB VM, the recovered LB VM automatically becomes the standby LB VM, and load balancing redundancy is restored. If the infrastructure is not able to recover the failed LB VM, you must perform the Restoring a Failed Load Balancer fault recovery procedure described in Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide to restore load balancing redundancy.
Bastion Hosts
- There are two Bastion Hosts per CNE.
- Bastion Hosts run in an active/standby configuration.
- Bastion Host health is monitored by an application that runs in the Kubernetes cluster, which instructs the standby Bastion Host to take over when the active Bastion Host fails.
- A DaemonSet running on each Kubernetes worker node ensures that the worker node always retrieves container images and Helm charts from the active Bastion Host.
- Container images, Helm charts, and other files essential to CNE internal operations are synchronized from the active Bastion Host to the standby Bastion Host periodically. This way, when an active Bastion Host fails, the standby appears as an identical replacement upon switchover.
3.5 Observability
CNE provides services to capture metrics, logs, and traces for both itself and the cloud native applications. You can use the observability data for problem detection, troubleshooting, and preventive maintenance. CNE also includes tools to filter, view, and analyze the observability data.
3.5.1 Metrics
Metrics collection
CNE collects metrics from the following sources:
- The node-exporter service collects the hardware and operating system (OS) metrics exposed by the Linux OS on the servers (Bare Metal) or VMs (virtualized) that host the Kubernetes cluster.
- The kube-state-metrics service collects information about the state of all Kubernetes objects. Since Kubernetes uses objects to store information about all nodes, deployments, and pods in the Kubernetes cluster, kube-state-metrics effectively captures metrics on all aspects of the Kubernetes cluster.
- All of the CNE services generate metrics that Prometheus collects and stores.
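CNE ships the Prometheus Operator (see Table 3-1), under which a service typically becomes a scrape target through a ServiceMonitor resource. The following is a hedged sketch with illustrative names; whether a given application registers this way depends on its chart:

```yaml
# Hypothetical ServiceMonitor that tells Prometheus to scrape a service.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-nf-metrics   # illustrative name
spec:
  selector:
    matchLabels:
      app: example-nf        # match the Service that exposes metrics
  endpoints:
    - port: metrics          # named port on the Service
      path: /metrics         # metrics endpoint path
      interval: 30s          # scrape interval
```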
Metrics storage
Prometheus stores the application metrics and CNE metrics in an internal time-series database.
Metrics filtering and viewing
Grafana allows you to view and filter metrics from Prometheus. CNE offers several default Grafana dashboards, which you can clone and customize to suit your requirements. For more information about these default dashboards, see CNE Grafana Dashboards.
For more information about CNE Metrics, see Metrics.
3.5.2 Alerts
CNE uses AlertManager to raise alerts. These alerts notify the user when any of the common services are in an abnormal condition. CNE delivers alerts as Simple Network Management Protocol (SNMP) traps to an external SNMP trap manager. For the definition and description of each CNE alert, see the Alerts section.
Applications deployed on CNE can define their own alerts to inform the user of problems specific to each application. For instructions on loading application alerting rules, see Maintenance Procedures.
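With the Prometheus Operator model, application alerting rules are commonly packaged as a PrometheusRule resource; the following is a minimal sketch with an illustrative metric, threshold, and severity:

```yaml
# Hypothetical PrometheusRule defining one application alert.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-nf-alerts    # illustrative name
spec:
  groups:
    - name: example-nf.rules
      rules:
        - alert: ExampleNfHighErrorRate
          expr: 'rate(http_requests_total{code=~"5.."}[5m]) > 0.05'  # illustrative
          for: 5m
          labels:
            severity: major
          annotations:
            summary: "Example NF 5xx rate above threshold for 5 minutes"
```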
3.5.3 Logs
Logs collection
Fluentd OpenSearch collects logs from the applications installed in CNE and from the CNE services.
Logs storage
CNE stores its logs in Oracle OpenSearch. Each log message is written to an OpenSearch index. The source of a log message determines the index to which the record is written:
- Logs generated by applications in CNE are stored in the current day's application index, named logstash-YYYY.MM.DD.
- Logs generated by CNE services are stored in the current day's CNE index, named occne-logstash-YYYY.MM.DD.
CNE creates new log indices each day. Each day's indices are uniquely named by appending the current date to the index name as a suffix. For example, all application logs generated on November 13, 2021, are written to the logstash-2021.11.13 index. By default, indices are retained for seven days.
Only current-day indices are writable, and they are configured for efficient storage of new log messages. Current-day indices are called "hot" indices and are stored on "hot" data nodes. At the end of each day, new current-day indices are created. At the same time, the previous day's indices are marked read-only, so that no new log messages are written to them, and are compressed for more efficient storage. These previous-day indices are considered "warm" indices and are stored on "warm" data nodes.
Log filtering and viewing
Use Oracle OpenSearch Dashboard to filter and view the logs available in OpenSearch.
3.5.4 Traces
Trace collection
Note:
- Jaeger captures only a small percentage of all application message flows. The default capture rate is 0.01% (1 in 10,000 message flows).
- Traces are not collected from CNE services.
Trace storage
CNE stores its traces in Oracle OpenSearch. The applications deployed in CNE generate traces and store them in the jaeger-trace-YYYY.MM.DD index. CNE creates a new Jaeger index each day. For example, all traces generated by applications on November 13, 2021, are written to the jaeger-trace-2021.11.13 index.
Trace filtering and viewing
Use Oracle OpenSearch Dashboard to filter and view the traces available in Oracle OpenSearch.
3.6 Maintenance Access
To access the Kubernetes cluster for maintenance purposes, CNE provides two Bastion Hosts. Each Bastion Host provides command-line access to several troubleshooting and maintenance tools such as kubectl, Helm, and Jenkins.
3.7 Frequently Used Common Services
This section lists some of the frequently used common services and the versions supported by CNE 23.4.x. You can find the complete list of third-party services in the dependencies TGZ file provided as part of the software delivery package.
Table 3-1 Frequently Used Common Services
Common Service | Supported Version |
---|---|
AlertManager | 0.25.0 |
Grafana | 9.5.3 |
Prometheus | 2.44.0 |
Calico | 3.25.2 |
cert-manager | 1.12.4 |
Containerd | 1.7.5 |
Fluentd - OpenSearch | 1.16.2 |
HAProxy | 2.6.7 |
Helm | 3.12.3 |
Istio | 1.18.2 |
Jaeger | 1.45.0 |
Kubernetes | 1.27.5 |
Kyverno | 1.9.0 |
MetalLB | 0.13.11 |
Oracle OpenSearch | 2.3.0 |
Oracle OpenSearch Dashboard | 2.3.0 |
Prometheus Operator | 0.65.1 |