3 CNE Services

CNE provides the following common services for all the installed applications:

  • Runtime Environment - CNE provides a runtime environment in which you can run all the cloud native applications.
  • Automation - CNE provides automation solutions to deploy, upgrade, and maintain cloud native applications.
  • Security - CNE provides multi-level security measures to protect against malicious attacks.
  • Redundancy - CNE provides redundancy or high availability through the specification of anti-affinity rules at the infrastructure level.
  • Observability - CNE provides services to capture metrics, logs, and traces for both itself (CNE) and the cloud native applications.
  • Maintenance Access - CNE provides a Bastion Host to access the Kubernetes cluster for maintenance purposes.

3.1 Runtime Environment

The primary function of CNE is to provide a runtime environment by using Kubernetes for cloud native applications.

Note:

CNE 25.1.1xx supports Kubernetes version 1.31.x.

3.1.1 Kubernetes Cluster

The Kubernetes cluster in CNE contains three controller nodes. Kubernetes uses these controller nodes to coordinate the execution of application workloads across many worker nodes.

The Kubernetes cluster manages controller node failures as follows:
  • Kubernetes sustains the loss of one controller node with no impact to the CNE service.
  • Kubernetes tolerates the loss of two controller nodes without losing the cluster configuration data. When two controller nodes are lost, application workloads continue to run on the worker nodes. However, no changes can be made to the cluster configuration data. As a result, the system experiences the following constraints:
    • New applications cannot be installed.
    • Changes to application-specific custom resources, which are used to change an application's configuration, cannot be performed.
    • Most kubectl commands, especially those that require interacting with the control plane, do not work. Commands that do not depend on real-time data from the control plane can still work, depending on the situation.

It is recommended to have a minimum of six worker nodes for proper functioning of CNE. The maximum number of worker nodes depends on your requirements. However, CNE allows you to set up a BareMetal deployment with a minimum of three worker nodes, which is suitable only for testing and getting started with CNE. For more information about installing and upgrading CNE with bare minimum resources, see the "Installing BareMetal CNE using Bare Minimum Servers" and "Upgrading BareMetal CNE Deployed using Bare Minimum Servers" sections respectively in Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide.

Kubernetes distributes workloads across all available worker nodes. The Kubernetes placement algorithm attempts to balance workloads so that no worker node is overloaded, and also attempts to honor the affinity or anti-affinity rules specified by applications, as shown in the sketch below.
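
As a minimal sketch (not taken from any specific CNE application), the following hypothetical Deployment fragment shows how an application can express an anti-affinity rule so that Kubernetes schedules its replicas on different worker nodes. The application name, labels, and image are placeholders.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: example-app                # hypothetical application name
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: example-app
      template:
        metadata:
          labels:
            app: example-app
        spec:
          affinity:
            podAntiAffinity:
              # Require the scheduler to place replicas on different worker nodes.
              requiredDuringSchedulingIgnoredDuringExecution:
                - labelSelector:
                    matchLabels:
                      app: example-app
                  topologyKey: kubernetes.io/hostname
          containers:
            - name: example-app
              image: example-registry/example-app:1.0.0   # placeholder image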

Additional Kubernetes cluster details are as follows:

  • Cluster configuration data and application-specific custom resources are stored in etcd.
  • Workloads in the Kubernetes cluster are run by the containerd runtime.
  • NTP time synchronization is maintained by chrony.

3.1.2 Networking

In-cluster Networking

CNE includes the Calico plug-in for networking within the Kubernetes cluster. Calico provides container networking and network policy enforcement, and is known for its performance and flexibility.
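
Calico enforces standard Kubernetes NetworkPolicy resources. As an illustrative sketch only (the policy, namespace, and label names are hypothetical), the following policy allows ingress to pods labeled app: example-app only from pods labeled role: client in the same namespace, and only on TCP port 8080.

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: allow-clients-only         # hypothetical policy name
      namespace: example-namespace     # hypothetical namespace
    spec:
      podSelector:
        matchLabels:
          app: example-app             # pods protected by this policy
      policyTypes:
        - Ingress
      ingress:
        - from:
            - podSelector:
                matchLabels:
                  role: client         # only these pods may connect
          ports:
            - protocol: TCP
              port: 8080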

External Networking

You must configure at least one external network to allow the applications that run in CNE to communicate with applications that are outside of the CNE platform.

You can also configure multiple external networks in CNE to allow specific applications to communicate on networks that are dedicated to specific purposes. For more information on configuring external networks, see Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide.

Support for Floating IPs in OpenStack

Floating IPs are additional public IP addresses that are associated with instances such as controller nodes, worker nodes, Bastion Hosts, and LBVMs. Floating IPs can be quickly reassigned and switched from one instance to another through an API, ensuring high availability and reducing maintenance effort. You can activate the Floating IP feature after installing or upgrading CNE. For information about enabling or disabling the Floating IP feature, see Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide.

3.1.3 Storage

Bare Metal Deployment Storage

When CNE is deployed on Bare Metal servers, it constructs a Ceph storage cluster from the disk drives attached to the servers designated as Kubernetes worker nodes. The Ceph cluster is then integrated with Kubernetes so that CNE common services and applications that run on CNE can use the Ceph cluster for persistent data storage.

Virtual Deployment Storage

When CNE is deployed on a virtual infrastructure, it deploys a cloud provider that implements the Kubernetes cloud provider interface. For more information, see cloud provider interface. The cloud provider interacts with the Kubernetes cluster and the storage manager to provide storage when applications request it. For example, OpenStack supports the Cinder storage manager and VMware supports the vSphere storage manager.

After Kubernetes has access to persistent storage, CNE creates several Kubernetes storage classes for the common services and applications to use:

  • CNE creates a storage class named Standard, which allows applications to store application-specific data. Standard is the default storage class and is used when an application creates a Persistent Volume Claim (PVC) with no StorageClass reference (see the example after this list).
  • CNE creates one storage class to allow Prometheus to store metrics data.
  • CNE creates two storage classes to allow OpenSearch to store log and trace data.
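
For illustration, the following hypothetical Persistent Volume Claim omits the storageClassName field, so it is bound using the default storage class described above. Specifying a storageClassName would select one of the other storage classes instead. The claim name and size are placeholders.

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: example-app-data           # hypothetical claim name
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi                # placeholder size
      # storageClassName is omitted, so the default storage class is used.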

3.1.4 Traffic Segregation

CNE provides the following options to segregate ingress and egress traffic:
  • Load Balancer Virtual Machine (LBVM)
  • Cloud Native Load Balancer (CNLB)

You can choose either of the options to configure and segregate network traffic during the installation. By default, CNE uses LBVM for ingress and egress traffic segregation. If you want to use CNLB for traffic segregation, then you must perform the CNLB-specific predeployment configurations while installing CNE. For more information about CNLB-specific configurations, see Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide.

Traffic Segregation Using Load Balancer Virtual Machine (LBVM)

When this option is selected, CNE uses Load Balancer Virtual Machines (LBVM) to segregate ingress and egress traffic.
Ingress Traffic Distribution:

Each application must create a Kubernetes Service of type LoadBalancer to allow clients on an external network to connect to the applications within CNE. Each LoadBalancer service is assigned a service IP from an external network. CNE provides load balancers to ensure that ingress traffic from external clients is distributed evenly across the Kubernetes worker nodes that host the application pods.
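
As a minimal, hypothetical sketch of such a service (the names, ports, and any annotations of a real NF service are defined by the NF's own Helm chart, not by CNE), an application exposes itself to external clients as follows:

    apiVersion: v1
    kind: Service
    metadata:
      name: example-app-ext            # hypothetical service name
    spec:
      type: LoadBalancer               # requests a service IP from an external network
      selector:
        app: example-app               # pods that receive the ingress traffic
      ports:
        - name: http
          port: 80
          targetPort: 8080
          protocol: TCP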

For BareMetal deployments, the top of rack (ToR) switches perform the load balancing function. For virtualized deployments, dedicated virtual machines perform the load balancing function.

For more information on selecting an external network for an NF application, see the installation instructions in NF-Specific Installation, Upgrade, and Fault Recovery Guide.

Note:

CNE does not support NodePort services because they are considered insecure in a production environment.

Support for externalTrafficPolicy:

CNE supports the externalTrafficPolicy attribute to control traffic distribution. This attribute can be set to one of two values:
  • externalTrafficPolicy=Local: When this attribute is set to Local, the Kubernetes service routes traffic only to the nodes where the endpoint pods are running. For example, if a pod for the service is running on Node A, the external traffic is sent to Node A. This setting is useful when you want to keep traffic local to the nodes running the service pods, for example to reduce latency or to apply node-level configurations.
  • externalTrafficPolicy=Cluster (default value): When this attribute is set to Cluster, external traffic is not restricted to the nodes running the service's pods. The traffic can be directed to any node in the Kubernetes cluster and is then internally forwarded to a node where a pod is running.
This feature does not change the existing functionality. To use it, Network Functions (NFs) must deploy their services with externalTrafficPolicy set to Local and update their Helm charts and manifest files to include this attribute, as shown in the sketch below.
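
As a hedged sketch of what that manifest change might look like (reusing the hypothetical Service from the previous example, not an actual NF manifest), the attribute is added to the Service specification:

    apiVersion: v1
    kind: Service
    metadata:
      name: example-app-ext            # hypothetical service name
    spec:
      type: LoadBalancer
      externalTrafficPolicy: Local     # route external traffic only to nodes running endpoint pods
      selector:
        app: example-app
      ports:
        - name: http
          port: 80
          targetPort: 8080
          protocol: TCP
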
Egress Traffic Delivery:

CNE uses the same load balancers that distribute ingress traffic to deliver egress requests to servers outside the CNE cluster. The CNE load balancers also perform egress Network Address Translation (NAT) on the egress requests to ensure that the source IP field of each egress request contains an IP address that belongs to the external network to which the request is delivered. This ensures that egress requests are not rejected by security rules on the external networks.

For more information on selecting an external network for an NF application, see the installation instructions in NF-Specific Installation, Upgrade, and Fault Recovery Guide.

Traffic Segregation Using Cloud Native Load Balancer (CNLB)

When this option is selected, CNE uses the Cloud Native Load Balancer (CNLB) to manage ingress and egress networks. CNLB is an alternative to the existing LBVM, lb-controller, and egress-controller solutions. You can enable or disable this feature only at the time of a fresh CNE installation. To use CNLB, you must preconfigure the ingress and egress network details in the cnlb.ini file before installing CNE. For more information about enabling and configuring CNLB, see Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide.

3.1.5 DNS Query Services

CoreDNS is used to resolve Domain Name System (DNS) queries for services within the Kubernetes cluster. DNS queries for services outside the Kubernetes cluster are routed to DNS nameservers running on the Bastion Hosts, which in turn use customer-specified DNS servers outside the CNE to resolve these DNS queries.

3.1.5.1 Local DNS

The Local DNS feature is a reconfiguration of CoreDNS to support external hostname resolution. Local DNS allows the pods and services running inside the CNE cluster to connect with those running outside the CNE cluster using CoreDNS. That is, when Local DNS is enabled, CNE routes connections to external hosts through CoreDNS rather than the nameservers on the Bastion Hosts. For information about activating this feature, see Activating Local DNS.

Local DNS provides the Local DNS API, which runs on each Bastion Host, to define external hostnames as custom records. These records are used to identify and locate hosts, services, or NFs outside the CNE cluster. Local DNS supports adding and removing the following records:
  • A records: define the hostname and the IP address used to locate the service
  • SRV records: define the location (hostname and port number) of a specific service and how your domain handles the service
For information about adding and removing these records, see Adding and Removing DNS Records.
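
The Local DNS API and its procedures are described in the references above; the snippet below is only a generic illustration of the kind of A-record mapping such a custom record represents, expressed with the standard CoreDNS hosts plugin. It is not the CNE-managed Local DNS configuration, and the ConfigMap name, domain, hostname, and IP address are hypothetical. An SRV record additionally carries a priority, weight, port, and target host for the service.

    # Generic illustration only; not the CNE-managed Local DNS configuration.
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: local-dns-example          # hypothetical name
      namespace: kube-system
    data:
      example-zone.server: |
        external.example.com:53 {
            hosts {
                # A record: map a hypothetical external hostname to an IP address.
                203.0.113.10 nf1.external.example.com
                fallthrough
            }
        }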

For additional details about this feature, see Activating and Configuring Local DNS.

3.2 Automation

CNE provides considerable automation throughout the phases of an application's deployment, upgrade, and maintenance cycles.

3.2.1 Deployment

CNE provides Helm to automate the deployment of applications into the Kubernetes cluster. Each application must provide a Helm chart to support automated deployment (a minimal sketch follows). For more information about deploying an application in CNE, see the Maintenance Procedures section.
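
As a minimal sketch of what such a chart declares (the chart name, versions, and contents of a real application chart are defined by the application vendor, not by CNE), a Chart.yaml might look like this:

    # Chart.yaml: top-level metadata of a hypothetical application chart
    apiVersion: v2
    name: example-app                  # hypothetical chart name
    description: Example cloud native application packaged for Helm
    type: application
    version: 1.0.0                     # chart version
    appVersion: "1.0.0"                # version of the packaged application

Helm renders the chart's templates and creates the corresponding Kubernetes resources when the chart is installed.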

You can deploy CNE automatically through custom scripts provided with the CNE platform. For more information about installing CNE, see Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide.

3.2.2 Upgrade

CNE provides Helm to automate the upgrade of applications in the Kubernetes cluster. The applications must provide a Helm chart to support an automated software upgrade. For more information about upgrading an application in CNE, see the Maintenance Procedures section.

You can upgrade the CNE platform automatically through the execution of a Continuous Delivery (CD) pipeline. For more information on CNE upgrade instructions, see Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide.

3.2.3 Maintenance

CNE provides a combination of automated and manual procedures for performing maintenance operations on the Kubernetes cluster and common services. For instructions about performing maintenance operations, see the Maintenance Procedures section.

3.3 Security

CNE provides multilevel security measures to protect against malicious attacks.

  • All Kubernetes nodes and Bastion Hosts are security-hardened to prevent unauthorized access.
  • Maintenance operations on the Kubernetes cluster can only be performed from the Bastion Hosts to prevent Denial of Service (DOS) attacks against the Kubernetes cluster.
  • Kubernetes control plane and Kubernetes worker nodes communicate over secure communication channels to protect sensitive cluster data.
  • The Kyverno policy framework is deployed in CNE with baseline policies. This ensures that the containers and resources running in CNE are compliant.
  • Kyverno pod security policies ensure that malicious applications do not corrupt the Kubernetes controller nodes, worker nodes, and cluster data. Because the built-in Kubernetes Pod Security Policy (PSP) is deprecated, Kyverno provides an equivalent set of policies.
  • The Bastion Host container registry is secured with TLS and can be accessed only from within CNE.
  • CNE supports both TLS 1.2 and TLS 1.3 for establishing secure connections. Currently, the minimum supported TLS version for CNE internal and external communication is TLS 1.2. However, in upcoming releases, CNE will support only TLS 1.3 for security and governance compliance.
  • Kubernetes secrets are encrypted with secretbox before they are stored. This ensures that Kubernetes secrets are secure and accessible only to authorized users (see the configuration sketch after this list).
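
As an illustrative sketch only (CNE sets this up during installation; the key name and value below are placeholders, not real key material), encryption of secrets at rest with secretbox is expressed through a Kubernetes EncryptionConfiguration of this general form:

    apiVersion: apiserver.config.k8s.io/v1
    kind: EncryptionConfiguration
    resources:
      - resources:
          - secrets
        providers:
          # Encrypt secrets with the secretbox provider before writing them to etcd.
          - secretbox:
              keys:
                - name: example-key                              # placeholder key name
                  secret: PLACEHOLDER_BASE64_ENCODED_32_BYTE_KEY # placeholder, not real key material
          # Allow reading secrets that were stored before encryption was enabled.
          - identity: {}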

For more information on security hardening, see Oracle Communications Cloud Native Core, Security Guide.

3.4 Redundancy

This section provides details about maintaining redundancy in Kubernetes, ingress load balancing, common services, and Bastion Hosts.

Infrastructure

In virtualized infrastructures, CNE requires the specification of anti-affinity rules to distribute Kubernetes controller and worker nodes across several physical servers. In Bare Metal infrastructures, each Kubernetes controller and worker node is placed on its own physical server.

Kubernetes

For the Kubernetes cluster, CNE runs three controller nodes with an etcd node on each controller node.

Running an etcd node on each controller node allows the cluster to:
  • sustain the loss of one controller node with no impact on the service.
  • sustain the loss of two controller nodes without losing the cluster data in the etcd database. However, when two controller nodes are lost, Kubernetes is placed in read-only mode. In read-only mode, the creation of new pods and the deployment of new services are not possible.
  • restore etcd data from a backup when all three controller nodes fail. In this case, all the cluster data in etcd is lost, and it is essential to restore the data from a backup to resume operations.

CNE uses internal load balancers to distribute API requests from applications running in worker nodes evenly across all controller nodes.

Common Services

  • OpenSearch uses three master nodes, three ingress nodes, and five data nodes by default. The number of data nodes provisioned can be configured as per your requirements. For more details, see the "Common Installation Configuration" section in Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide. The system can withstand the failure of one node in each of these groups with no impact on the service.
  • Fluentd OpenSearch runs as a DaemonSet, where one Fluentd OpenSearch pod runs on each Kubernetes worker node. When the worker node is up and running, Fluentd OpenSearch sends the logs for that node to OpenSearch.
  • Two Prometheus instances are run simultaneously to provide redundancy. Each Prometheus instance independently collects and stores metrics from all CNE services and all applications running in the CNE Kubernetes cluster. Automatic deduplication removes the repetitive metric information in Prometheus.

Note:

The occne-logstash indices, which were used to collect occne-infra logs, are disabled from release 23.1.x onwards. By default, CNE supports the following indices:
  • logstash: consists of 12 primary shards and 1 replica, and rotates daily
  • jaeger-span: consists of 5 primary shards and 1 replica, and rotates daily
  • jaeger-service: consists of 5 primary shards and 1 replica, and rotates daily

Kubernetes distributes common service pods across Kubernetes worker nodes such that even the loss of one worker node has no impact on the service. Kubernetes automatically reschedules the failed pods to maintain redundancy.

Ingress and Egress Traffic Delivery

  • For virtual installations, the Load Balancer VMs (LBVM) perform both ingress and egress traffic delivery.
  • Two Load Balancer VMs are run for each configured external network in virtual installations. These LB VMs run in an active/standby configuration. Only one LB VM (the active LB VM) processes the ingress traffic for a given external network.
  • The active LB VM is continuously monitored. If the active LB VM fails, the standby LB VM is automatically promoted to be the active LB VM.
  • If the virtual infrastructure is able to recover the failed LB VM, the recovered LB VM automatically becomes the standby LB VM, and load balancing redundancy is restored. If the infrastructure is not able to recover the failed LB VM, you must perform the Restoring a Failed Load Balancer fault recovery procedure described in Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide to restore load balancing redundancy.

Bastion Hosts

  • There are two Bastion Hosts per CNE.
  • Bastion Hosts run in an active/standby configuration.
  • Bastion Host health is monitored by an application that runs in the Kubernetes cluster, which instructs the standby Bastion Host to take over when the active Bastion Host fails.
  • A DaemonSet running on each Kubernetes worker node ensures that the worker node always retrieves container images and Helm charts from the active Bastion Host.
  • Container images, Helm charts, and other files essential to CNE internal operations are synchronized from the active Bastion Host to the standby Bastion Host periodically. This way, when an active Bastion Host fails, the standby appears as an identical replacement upon switchover.

3.5 Observability

CNE provides services to capture metrics, logs, and traces for both itself and the cloud native applications. You can use the observability data for problem detection, troubleshooting, and preventive maintenance. CNE also includes tools to filter, view, and analyze the observability data.

3.5.1 Metrics

Metrics collection

Prometheus collects the application metrics and CNE metrics from the following sources:
  • The servers (Bare Metal) or VMs (virtualized) that host the Kubernetes cluster. The node-exporter service collects hardware and Operating System (OS) metrics exposed by the Linux OS.
  • The kube-state-metrics service, which collects information about the state of all Kubernetes objects. Since Kubernetes uses objects to store information about all nodes, deployments, and pods in the Kubernetes cluster, kube-state-metrics effectively captures metrics on all aspects of the Kubernetes cluster.
  • All of the CNE services, which generate metrics that Prometheus collects and stores (a sketch of how a metrics endpoint can be described to Prometheus follows this list).
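
Because CNE deploys the Prometheus Operator (see Table 3-1), a metrics endpoint is typically described to Prometheus through a ServiceMonitor resource. The following is a hypothetical sketch only; the names, labels, port, and scrape interval are placeholders and do not correspond to an actual CNE service.

    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: example-app-metrics        # hypothetical name
      labels:
        release: prometheus            # placeholder; must match the Prometheus Operator's selector
    spec:
      selector:
        matchLabels:
          app: example-app             # selects the Service that exposes the metrics port
      endpoints:
        - port: metrics                # named port on the selected Service
          interval: 30s                # scrape every 30 seconds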

Metrics storage

Prometheus stores the application metrics and CNE metrics in an internal time-series database.

Metrics filtering and viewing

Grafana allows you to view and filter metrics from Prometheus. CNE offers default Grafana dashboards that can be cloned and customized according to your requirements. For more information about these default dashboards, see CNE Grafana Dashboards.

For more information about CNE Metrics, see Metrics.

3.5.2 Alerts

CNE uses AlertManager to raise alerts. These alerts notify the user when any of the CNE common services is in an abnormal condition. CNE delivers alerts and Simple Network Management Protocol (SNMP) traps to an external SNMP trap manager. For the detailed definition and description of each CNE alert, see the Alerts section.

Applications deployed on CNE can define their own alerts to inform the user of problems specific to each application, as sketched below. For instructions on how applications load alerting rules, see the Maintenance Procedures section.
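
With the Prometheus Operator deployed by CNE, application alerting rules are commonly packaged as PrometheusRule resources. The following is a hypothetical sketch; the rule name, metric, expression, and labels are placeholders rather than rules from any actual application.

    apiVersion: monitoring.coreos.com/v1
    kind: PrometheusRule
    metadata:
      name: example-app-alerts         # hypothetical name
      labels:
        release: prometheus            # placeholder; must match the operator's rule selector
    spec:
      groups:
        - name: example-app.rules
          rules:
            - alert: ExampleAppHighErrorRate                      # hypothetical alert
              expr: rate(example_app_errors_total[5m]) > 0.05     # placeholder metric and threshold
              for: 5m
              labels:
                severity: major
              annotations:
                summary: Error rate for example-app has exceeded the threshold for 5 minutes.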

3.5.3 Logs

Logs collection

Fluentd OpenSearch collects logs from the applications installed in CNE and from the CNE services.

Logs storage

CNE stores its logs in Oracle OpenSearch. Each log message is written to an OpenSearch index. The source of the log message determines the index the record is written to, as follows:

  • Logs generated by applications running in CNE are stored in the current day's application index, named logstash-YYYY.MM.DD.
  • Logs generated by CNE services are stored in the current day's CNE index, named occne-logstash-YYYY.MM.DD.

CNE creates new log indices each day. Each day's indices are uniquely named by appending the current date to the index name as a suffix. For example, all application logs generated on November 13, 2021, are written to the logstash-2021.11.13 index. By default, indices are retained for seven days.

Only current-day indices are writable and they are configured for efficient storage of new log messages. Current-day indices are called "hot" indices and stored on "hot" data nodes. At the end of each day, new current-day indices are created. At the same time, the previous day's indices are marked as read-only, so no new log messages are written to them, and compressed, to allow for more efficient storage. These previous-day indices are considered "warm" indices, and are stored on "warm" data nodes.

Log filtering and viewing

Use the Oracle OpenSearch Dashboard to filter and view the logs that are available in OpenSearch.

3.5.4 Traces

Trace collection

Jaeger collects traces from applications deployed in CNE.

Note:

Jaeger captures only a small percentage of all application message flows. The default capture rate is 0.01% (1 in 10,000 message flows).

Traces are not collected from CNE services.

Trace storage

CNE stores its traces in Oracle OpenSearch. The applications deployed in CNE generate traces, which are stored in the jaeger-trace-YYYY.MM.DD index. CNE creates a new Jaeger index each day. For example, all traces generated by applications on November 13, 2021, are written to the jaeger-trace-2021.11.13 index.

Trace filtering and viewing

Use the Oracle OpenSearch Dashboard to filter and view the traces that are available in Oracle OpenSearch.

3.6 Maintenance Access

To access the Kubernetes cluster for maintenance purposes, CNE provides two Bastion Hosts. Each Bastion Host provides command-line access to several troubleshooting and maintenance tools such as kubectl, Helm, and Jenkins.

3.7 Frequently Used Common Services

This section lists some of the frequently used common services and the versions supported by CNE 25.1.1xx. You can find the complete list of third-party services in the dependencies TGZ file provided as part of the software delivery package.

Table 3-1 Frequently Used Common Services

Common Service | Supported Version | Usage
AlertManager | 0.27.0 | Alertmanager is a component that works in conjunction with Prometheus to manage and dispatch alerts. It handles the routing and notification of alerts to various receivers.
Grafana | 9.5.3 | Grafana is a popular open-source platform for monitoring and observability. It provides a user-friendly interface for creating and viewing dashboards based on various data sources.
Prometheus | 2.52.0 | Prometheus is a popular open-source monitoring and alerting toolkit. It collects and stores metrics from various sources and allows for alerting and querying.
Calico | 3.28.1 | Calico provides networking and security for NFs in Kubernetes with scalable, policy-driven connectivity.
cert-manager | 1.12.4 | cert-manager automates certificate management for secure NFs in Kubernetes.
Containerd | 1.7.22 | Containerd manages container lifecycles for running NFs efficiently in Kubernetes.
Fluentd - OpenSearch | 1.17.1 | Fluentd is an open-source data collector that streamlines data collection and consumption, allowing for improved data utilization and comprehension.
HAProxy | 2.4.22 | HAProxy provides load balancing and high availability for 5G NFs, ensuring efficient traffic distribution and reliability.
Helm | 3.16.2 | Helm, a package manager, simplifies deploying and managing network functions (NFs) on Kubernetes with reusable, versioned charts for easy automation and scaling.
Istio | 1.18.2 | Istio extends Kubernetes to establish a programmable, application-aware network. Istio brings standard, universal traffic management, telemetry, and security to complex deployments.
Jaeger | 1.60.0 | Jaeger provides distributed tracing for 5G NFs, enabling performance monitoring and troubleshooting across microservices.
Kubernetes | 1.31.0 | Kubernetes orchestrates scalable, automated network function (NF) deployments for high availability and efficient resource utilization.
Kyverno | 1.12.5 | Kyverno is a Kubernetes policy engine that helps manage and enforce policies for resource configurations within a Kubernetes cluster.
MetalLB | 0.14.4 | MetalLB provides load balancing and external IP management for 5G NFs in Kubernetes environments.
Oracle OpenSearch | 2.11.0 | OpenSearch provides scalable search and analytics for 5G NFs, enabling efficient data exploration and visualization.
Oracle OpenSearch Dashboard | 2.11.0 | OpenSearch Dashboard visualizes and analyzes data for 5G NFs, offering interactive insights and custom reporting.
Prometheus Operator | 0.76.0 | The Prometheus Operator is used for managing Prometheus monitoring systems in Kubernetes. It simplifies the configuration and management of Prometheus instances.
Velero | 1.13.2 | Velero backs up and restores Kubernetes clusters for 5G NFs, ensuring data protection and disaster recovery.
metrics-server | 0.7.2 | Metrics Server is used in Kubernetes for collecting resource usage data from pods and nodes.
snmp-notifier | 1.5.0 | snmp-notifier sends SNMP alerts for 5G NFs, providing real-time notifications for network events.