3 CNE Services
CNE provides the following common services for all the installed applications:
- Runtime Environment - CNE provides a runtime environment in which you can run all the cloud native applications.
- Automation - CNE provides automation solutions to deploy, upgrade, and maintain cloud native applications.
- Security - CNE provides multi-level security measures to protect against malicious attacks.
- Redundancy - CNE provides redundancy or high availability through the specification of anti-affinity rules at the infrastructure level.
- Observability - CNE provides services to capture metrics, logs, and traces for both itself (CNE) and the cloud native applications.
- Maintenance Access - CNE provides a Bastion Host to access the Kubernetes cluster for maintenance purposes.
3.1 Runtime Environment
Note:
CNE 25.1.1xx supports Kubernetes version 1.31.x.
3.1.1 Kubernetes Cluster
The Kubernetes cluster in CNE contains three controller nodes. Kubernetes uses these controller nodes to coordinate the execution of application workloads across many worker nodes.
- Kubernetes sustains the loss of one controller node with no impact on the CNE service.
- Kubernetes tolerates the loss of two controller nodes without losing the cluster configuration data. When two controller nodes are lost, application workloads continue to run on the worker nodes. However, no changes can be made to the cluster configuration data. As a result, the system experiences the following constraints:
  - New applications cannot be installed.
  - Changes to application-specific custom resources, which are used to modify an application's configuration, cannot be performed.
  - Most kubectl commands, especially those that require interacting with the control plane, do not work. Commands that do not depend on real-time data from the control plane may still work, depending on the situation.
It is recommended to have a minimum of six worker nodes for the proper functioning of CNE. The maximum number of worker nodes depends on your requirements. However, CNE allows you to set up a BareMetal deployment with a minimum of three worker nodes, which is suitable only for testing and getting started with CNE. For more information about installing and upgrading CNE with bare minimum resources, see the "Installing BareMetal CNE using Bare Minimum Servers" and "Upgrading BareMetal CNE Deployed using Bare Minimum Servers" sections respectively in Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide.
Kubernetes distributes workloads across all available worker nodes. The Kubernetes placement algorithm attempts to balance workloads so that no worker node is overloaded, and also attempts to honor the affinity or anti-affinity rules specified by applications.
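For illustration, the following is a minimal sketch of a pod anti-affinity rule that an application might include in its Deployment to spread replicas across worker nodes. The application name, labels, and image are hypothetical and not part of CNE:

```yaml
# Hypothetical Deployment fragment with a pod anti-affinity rule.
# Kubernetes prefers to place each "example-app" replica on a different worker node.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: example-app
                topologyKey: kubernetes.io/hostname
      containers:
        - name: example-app
          image: example-app:1.0
```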
Additional Kubernetes cluster details are as follows:
- Cluster configuration data, as well as application-specific custom resources, are stored in etcd.
- Workloads in the Kubernetes cluster are run by the containerd runtime.
- NTP time synchronization is maintained by chrony.
3.1.2 Networking
In-cluster Networking
CNE includes the Calico plug-in for networking within the Kubernetes cluster. Calico provides network security solutions for containers. Calico is known for its performance, flexibility, and power.
External Networking
You must configure at least one external network to allow the applications that run in CNE to communicate with applications that are outside of the CNE platform.
You can also configure multiple external networks in CNE to allow specific applications to communicate on networks that are dedicated to specific purposes. For more information on configuring external networks, see Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide.
Support for Floating IPs in OpenStack
Floating IPs are additional public IP addresses that are associated with instances such as controller nodes, worker nodes, Bastion Hosts, and LBVMs. Floating IPs can be quickly reassigned and switched from one instance to another through an API, thereby ensuring high availability and reducing maintenance. You can activate the Floating IP feature after installing or upgrading CNE. For information about enabling or disabling the Floating IP feature, see Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide.
3.1.3 Storage
Bare Metal Deployment Storage
When CNE is deployed on Bare Metal servers, it constructs a Ceph storage cluster from the disk drives attached to the servers designated as Kubernetes worker nodes. The Ceph cluster is then integrated with Kubernetes so that CNE common services and applications that run on CNE can use the Ceph cluster for persistent data storage.
Virtual Deployment Storage
When CNE is deployed on a virtual infrastructure, it deploys a cloud provider that implements the Kubernetes cloud provider interface. For more information, see the Kubernetes cloud provider interface documentation. The cloud provider interacts with the Kubernetes cluster and the storage manager to provide storage when applications request it. For example, OpenStack supports the Cinder storage manager and VMware supports the vSphere storage manager.
After Kubernetes has access to persistent storage, CNE creates several Kubernetes StorageClass objects for the common services and applications to use:
- CNE creates a storage class named Standard, which allows applications to store application-specific data. Standard is the default storage class and is used when an application creates a Persistent Volume Claim (PVC) with no StorageClass reference (a minimal PVC sketch follows this list).
- CNE creates one storage class to allow Prometheus to store metrics data.
- CNE creates two storage classes to allow OpenSearch to store log and trace data.
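For illustration, the following is a minimal sketch of a Persistent Volume Claim with no StorageClass reference. The claim name and size are hypothetical; because storageClassName is omitted, the default Standard storage class provisions the volume:

```yaml
# Hypothetical PVC; with storageClassName omitted, the default (Standard)
# storage class is used to provision the persistent volume.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-app-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```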
3.1.4 Traffic Segregation
CNE provides the following two options for segregating external traffic:
- Load Balancer Virtual Machine (LBVM)
- Cloud Native Load Balancer (CNLB)
Traffic Segregation Using Load Balancer Virtual Machine (LBVM)
When this option is selected, CNE uses Load Balancer Virtual Machines (LBVM) to segregate ingress and egress traffic. Each application must create a Kubernetes Service of type LoadBalancer to allow clients on an external network to connect with the applications within CNE. Each LoadBalancer service is assigned a service IP from an external network. CNE provides load balancers to ensure that ingress traffic from external clients is distributed evenly across the Kubernetes worker nodes that host the application pods.
For BareMetal deployments, the top of rack (ToR) switches perform the load balancing function. For virtualized deployments, dedicated virtual machines perform the load balancing function.
For more information on selecting an external network for an NF application, see the installation instructions in NF-Specific Installation, Upgrade, and Fault Recovery Guide.
Note:
CNE does not support NodePort services because they are considered insecure in a production environment.
Kubernetes provides the externalTrafficPolicy attribute to control the traffic distribution. This attribute can be configured with two options:
- externalTrafficPolicy=Local: When this attribute is set to Local, the Kubernetes service routes traffic only to the nodes where the endpoint pods are running. For example, if a pod for the service is running on Node A, the external traffic is sent to Node A. This setting is useful when you want to keep traffic local to the nodes running the service pods, potentially reducing latency or supporting specific node-level configurations.
- externalTrafficPolicy=Cluster (default value): When this attribute is set to Cluster, external traffic is not restricted to the nodes running the service's pods. The traffic can be directed to any node in the Kubernetes cluster and is then internally forwarded to the node where the pod is running.
Applications that need to keep traffic local to the nodes running their pods must have externalTrafficPolicy set to Local, and must update their Helm charts and manifest files to include this attribute (see the sketch below).
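The following is a minimal sketch of a LoadBalancer Service with externalTrafficPolicy set to Local. The Service name, selector, and ports are hypothetical and would normally be defined in the application's Helm chart:

```yaml
# Hypothetical LoadBalancer Service; because externalTrafficPolicy is Local,
# external traffic is routed only to nodes running pods that match the selector.
apiVersion: v1
kind: Service
metadata:
  name: example-app-ext
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  selector:
    app: example-app
  ports:
    - name: http
      port: 80
      targetPort: 8080
```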
CNE uses the same load balancers that distribute ingress traffic to deliver egress requests to servers outside the CNE cluster. The CNE load balancers also perform egress Network Address Translation (NAT) on the egress requests to ensure that the source IP field of all egress requests contains an IP address that belongs to the external network to which the egress request is delivered. This ensures that the egress requests are not rejected by security rules on the external networks.
Traffic Segregation Using Cloud Native Load Balancer (CNLB)
When this option is selected, CNE uses Cloud Native Load Balancer (CNLB) to manage ingress and egress networks. CNLB is an alternative to the existing LBVM, lb-controller, and egress-controller solutions. You can enable or disable this feature only at the time of a fresh CNE installation. To use CNLB, you must preconfigure the ingress and egress network details in the cnlb.ini file before installing CNE. For more information about enabling and configuring CNLB, see Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide.
3.1.5 DNS Query Services
CoreDNS is used to resolve Domain Name System (DNS) queries for services within the Kubernetes cluster. DNS queries for services outside the Kubernetes cluster are routed to DNS nameservers running on the Bastion Hosts, which in turn use customer-specified DNS servers outside the CNE to resolve these DNS queries.
3.1.5.1 Local DNS
The Local DNS feature is a reconfiguration of CoreDNS to support external hostname resolution. Local DNS allows the pods and services running inside the CNE cluster to connect with those running outside the CNE cluster using CoreDNS. That is, when Local DNS is enabled, CNE routes connections to external hosts through CoreDNS rather than through the nameservers on the Bastion Hosts. For information about activating this feature, see Activating Local DNS.
Local DNS supports the following DNS record types:
- A records: define the hostname and the IP address used to locate a service.
- SRV records: define the location (hostname and port number) of a specific service and how your domain handles the service.
For additional details about this feature, see Activating and Configuring Local DNS.
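As a conceptual illustration only (the CNE procedures are described in Activating and Configuring Local DNS), an A record for an external host can be served by CoreDNS through its hosts plugin; SRV records typically require zone-file entries served by the CoreDNS file plugin. The ConfigMap name, zone, hostname, and IP address below are hypothetical:

```yaml
# Hypothetical CoreDNS configuration fragment; the hosts plugin answers
# A-record queries for external-db.example.com from inside the cluster.
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns-local-dns-example
  namespace: kube-system
data:
  Corefile: |
    example.com:53 {
        hosts {
            10.75.0.10 external-db.example.com
            fallthrough
        }
        forward . /etc/resolv.conf
    }
```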
3.2 Automation
CNE provides considerable automation throughout the phases of an application's deployment, upgrade, and maintenance cycles.
3.2.1 Deployment
CNE provides the Helm application to automate the deployment of applications into the Kubernetes cluster. The applications must provide a Helm chart to perform automated deployment. For more information about deploying an application in CNE, see the Maintenance Procedures section.
You can deploy CNE automatically through custom scripts provided with the CNE platform. For more information about installing CNE, see Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide.
3.2.2 Upgrade
CNE provides Helm to automate the upgrade of applications in the Kubernetes cluster. The applications must provide a Helm chart to perform an automated software upgrade. For more information about upgrading an application in CNE, see the Maintenance Procedures section.
You can upgrade the CNE platform automatically through the execution of a Continuous Delivery (CD) pipeline. For more information on CNE upgrade instructions, see Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide.
3.2.3 Maintenance
CNE provides a combination of automated and manual procedures for performing maintenance operations on the Kubernetes cluster and common services. For instructions about performing maintenance operations, see the Maintenance Procedures section.
3.3 Security
CNE provides multilevel security measures to protect against malicious attacks.
- All Kubernetes nodes and Bastion Hosts are security-hardened to prevent unauthorized access.
- Maintenance operations on the Kubernetes cluster can only be performed from the Bastion Hosts to prevent Denial of Service (DoS) attacks against the Kubernetes cluster.
- The Kubernetes control plane and Kubernetes worker nodes communicate over secure channels to protect sensitive cluster data.
- The Kyverno policy framework is deployed in CNE with baseline policies. This ensures that the containers and resources running in CNE are compliant.
- Kyverno pod security policies ensure that malicious applications do not corrupt the Kubernetes controller nodes, worker nodes, or cluster data. Kyverno provides a set of policies equivalent to those provided by the built-in Kubernetes Pod Security Policy (PSP), which is deprecated.
- The Bastion Host container registry is secured with TLS and can be accessed only from within CNE.
- CNE supports both TLS 1.2 and TLS 1.3 for establishing secure connections. Currently, the minimum supported TLS version for CNE internal and external communication is TLS 1.2. However, in upcoming releases, CNE will support only TLS 1.3 for security and governance compliance.
- Kubernetes secrets are encrypted with secretbox before they are stored, as illustrated in the sketch after this list. This ensures that Kubernetes secrets are secure and accessible only to authorized users.
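CNE configures this encryption itself. Purely as an illustration of the upstream Kubernetes mechanism, a kube-apiserver EncryptionConfiguration that uses the secretbox provider looks like the following sketch; the key name and key material are placeholders:

```yaml
# Illustrative EncryptionConfiguration; Kubernetes secrets are encrypted
# with the secretbox key before they are written to etcd.
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - secretbox:
          keys:
            - name: key1
              secret: <base64-encoded-32-byte-key>
      - identity: {}
```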
For more information on security hardening, see Oracle Communications Cloud Native Core, Security Guide.
3.4 Redundancy
This section provides detail about maintaining redundancy in Kubernetes, ingress load balancing, common services, and Bastion Host.
Infrastructure
In virtualized infrastructures, CNE requires the specification of anti-affinity rules to distribute Kubernetes controller and worker nodes across several physical servers. In Bare Metal infrastructures, each Kubernetes controller and worker node is placed on its own physical server.
Kubernetes
For the Kubernetes cluster, CNE runs three controller nodes with an etcd node on each controller node. With this configuration, the Kubernetes cluster can:
- Sustain the loss of one controller node with no impact on the service.
- Sustain the loss of two controller nodes without losing the cluster data in the etcd database. However, when two controller nodes are lost, Kubernetes is placed in read-only mode. In read-only mode, the creation of new pods and the deployment of new services are not possible.
- Restore etcd data from a backup when all three controller nodes fail. When all three controller nodes fail, all the cluster data in etcd is lost, and it is essential to restore the data from a backup to resume operations.
CNE uses internal load balancers to distribute API requests from applications running in worker nodes evenly across all controller nodes.
Common Services
- OpenSearch uses three master nodes, three ingest nodes, and five data nodes by default. The number of data nodes provisioned can be configured as required. For more details, see the "Common Installation Configuration" section in Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide. The system can withstand the failure of one node in each of these groups with no impact on the service.
- Fluentd OpenSearch runs as a DaemonSet, where one Fluentd OpenSearch pod runs on each Kubernetes worker node. When the worker node is up and running, Fluentd OpenSearch sends the logs for that node to OpenSearch.
- Two Prometheus instances run simultaneously to provide redundancy. Each Prometheus instance independently collects and stores metrics from all CNE services and all applications running in the CNE Kubernetes cluster. Automatic deduplication removes repetitive metric information in Prometheus.
Note:
The occne-logstash indices, which were used to collect occne-infra logs, are disabled from release 23.1.x onwards. By default, CNE supports the following indices:
- logstash: consists of 12 primary shards and 1 replica, and rotates daily
- jaeger-span: consists of 5 primary shards and 1 replica, and rotates daily
- jaeger-service: consists of 5 primary shards and 1 replica, and rotates daily
Kubernetes distributes common service pods across Kubernetes worker nodes such that the loss of any one worker node has no impact on the service. Kubernetes automatically reschedules the failed pods to maintain redundancy.
Ingress and Egress Traffic Delivery
- For virtual installations, the Load Balancer VMs (LBVM) perform both ingress and egress traffic delivery.
- Two Load Balancer VMs are run for each configured external network in virtual installations. These LB VMs run in an active/standby configuration. Only one LB VM (the active LB VM) processes the ingress traffic for a given external network.
- The active LB VM is continuously monitored. If the active LB VM fails, the standby LB VM is automatically promoted to be the active LB VM.
- If the virtual infrastructure is able to recover the failed LB VM, the recovered LB VM automatically becomes the standby LB VM, and load balancing redundancy is restored. If the infrastructure is not able to recover the failed LB VM, you must perform the Restoring a Failed Load Balancer fault recovery procedure described in Oracle Communications Cloud Native Core, Cloud Native Environment Installation, Upgrade, and Fault Recovery Guide to restore load balancing redundancy.
Bastion Hosts
- There are two Bastion Hosts per CNE.
- Bastion Hosts run in an active/standby configuration.
- Bastion Host health is monitored by an application that runs in the Kubernetes cluster, which instructs the standby Bastion Host to take over when the active Bastion Host fails.
- A DaemonSet running on each Kubernetes worker node ensures that the worker node always retrieves container images and Helm charts from the active Bastion Host.
- Container images, Helm charts, and other files essential to CNE internal operations are synchronized from the active Bastion Host to the standby Bastion Host periodically. This way, when an active Bastion Host fails, the standby appears as an identical replacement upon switchover.
3.5 Observability
CNE provides services to capture metrics, logs, and traces for both itself and the cloud native applications. You can use the observability data for problem detection, troubleshooting, and preventive maintenance. CNE also includes tools to filter, view, and analyze the observability data.
3.5.1 Metrics
Metrics collection
CNE uses Prometheus to collect metrics from the following sources:
- The servers (Bare Metal) or VMs (virtualized) that host the Kubernetes cluster: the node-exporter service collects hardware and operating system (OS) metrics exposed by the Linux OS.
- The kube-state-metrics service collects information about the state of all Kubernetes objects. Since Kubernetes uses objects to store information about all nodes, deployments, and pods in the Kubernetes cluster, kube-state-metrics effectively captures metrics on all aspects of the Kubernetes cluster.
- All of the CNE services generate metrics that Prometheus collects and stores.
Metrics storage
Prometheus stores the application metrics and CNE metrics in an internal time-series database.
Metrics filtering and viewing
Grafana allows you to view and filter metrics from Prometheus. CNE offers some default Grafana dashboards which can be cloned and customized as per your requirement. For more information about these default dashboards, see CNE Grafana Dashboards.
For more information about CNE Metrics, see Metrics.
3.5.2 Alerts
CNE uses AlertManager to raise alerts. These alerts notify the user when any of the CNE common services are in an abnormal condition. CNE delivers alerts and Simple Network Management Protocol (SNMP) traps to an external SNMP trap manager. For the definition and description of each CNE alert, see the Alerts section.
Applications deployed on CNE can define their own alerts to inform the user of problems specific to each application. For instructions about loading application alerting rules, see Maintenance Procedures.
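As an illustration only (the exact loading procedure is described in Maintenance Procedures), an application-defined alerting rule can be expressed as a PrometheusRule resource consumed by the Prometheus Operator included with CNE. The rule name, metric, and threshold below are hypothetical:

```yaml
# Hypothetical application alert rule for the Prometheus Operator.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-app-alerts
  namespace: example-app
spec:
  groups:
    - name: example-app.rules
      rules:
        - alert: ExampleAppHighErrorRate
          expr: rate(http_requests_total{job="example-app",status=~"5.."}[5m]) > 0.05
          for: 5m
          labels:
            severity: major
          annotations:
            summary: "example-app is returning an elevated rate of 5xx responses"
```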
3.5.3 Logs
Logs collection
Fluentd OpenSearch collects logs from the applications installed in CNE as well as from the CNE services.
Logs storage
CNE stores its logs in Oracle OpenSearch. Each log message is written to an OpenSearch index. The source of the log message determines the index to which the record is written:
- Logs generated by applications running in CNE are stored in the current day's application index, named logstash-YYYY.MM.DD.
- Logs generated by CNE services are stored in the current day's CNE index, named occne-logstash-YYYY.MM.DD.
CNE creates new log indices each day. Each day's indices are uniquely named by appending the current date to the index name as a suffix. For example, all application logs generated on November 13, 2021, are written to the logstash-2021.11.13 index. By default, indices are retained for seven days.
Only current-day indices are writable and they are configured for efficient storage of new log messages. Current-day indices are called "hot" indices and stored on "hot" data nodes. At the end of each day, new current-day indices are created. At the same time, the previous day's indices are marked as read-only, so no new log messages are written to them, and compressed, to allow for more efficient storage. These previous-day indices are considered "warm" indices, and are stored on "warm" data nodes.
Log filtering and viewing
Use Oracle OpenSearch Dashboard to filter and view the logs available in OpenSearch.
3.5.4 Traces
Trace collection
Note:
Jaeger captures only a small percentage of all application message flows. The default capture rate is 0.01% (1 in 10,000 message flows). Traces are not collected from CNE services.
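To illustrate the default rate only, a probabilistic sampling strategy of 1 in 10,000 can be expressed as follows. This is a sketch in YAML form; Jaeger typically reads its sampling strategies from a JSON file, and the actual CNE configuration may differ:

```yaml
# Illustrative Jaeger sampling strategy: sample 0.01% (1 in 10,000) of traces.
default_strategy:
  type: probabilistic
  param: 0.0001
```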
Trace storage
CNE stores its traces in Oracle OpenSearch. The applications deployed in CNE generate traces, which are stored in the jaeger-trace-YYYY.MM.DD index. CNE creates a new Jaeger index each day. For example, all traces generated by applications on November 13, 2021, are written to the jaeger-trace-2021.11.13 index.
Trace filtering and viewing
Use Oracle OpenSearch Dashboard to filter and view the traces available in Oracle OpenSearch.
3.6 Maintenance Access
To access the Kubernetes cluster for maintenance purposes, CNE provides two Bastion Hosts. Each Bastion Host provides command-line access to several troubleshooting and maintenance tools such as kubectl, Helm, and Jenkins.
3.7 Frequently Used Common Services
This section lists some of the frequently used common services and the versions supported by CNE 25.1.1xx. You can find the complete list of third-party services in the dependencies TGZ file provided as part of the software delivery package.
Table 3-1 Frequently Used Common Services
Common Service | Supported Version | Usage |
---|---|---|
AlertManager | 0.27.0 | Alertmanager is a component that works in conjunction with Prometheus to manage and dispatch alerts. It handles the routing and notification of alerts to various receivers. |
Grafana | 9.5.3 | Grafana is a popular open-source platform for monitoring and observability. It provides a user-friendly interface for creating and viewing dashboards based on various data sources. |
Prometheus | 2.52.0 | Prometheus is a popular open-source monitoring and alerting toolkit. It collects and stores metrics from various sources and allows for alerting and querying. |
Calico | 3.28.1 | Calico provides networking and security for NFs in Kubernetes with scalable, policy-driven connectivity. |
cert-manager | 1.12.4 | cert-manager automates certificate management for secure NFs in Kubernetes. |
Containerd | 1.7.22 | Containerd manages container lifecycles for running NFs efficiently in Kubernetes. |
Fluentd - OpenSearch | 1.17.1 | Fluentd is an open-source data collector that streamlines data collection and consumption, allowing for improved data utilization and comprehension. |
HAProxy | 2.4.22 | HAProxy provides load balancing and high availability for 5G NFs, ensuring efficient traffic distribution and reliability. |
Helm | 3.16.2 | Helm, a package manager, simplifies deploying and managing network functions (NFs) on Kubernetes with reusable, versioned charts for easy automation and scaling. |
Istio | 1.18.2 | Istio extends Kubernetes to establish a programmable, application-aware network. Istio brings standard, universal traffic management, telemetry, and security to complex deployments. |
Jaeger | 1.60.0 | Jaeger provides distributed tracing for 5G NFs, enabling performance monitoring and troubleshooting across microservices. |
Kubernetes | 1.31.0 | Kubernetes orchestrates scalable, automated network function (NF) deployments for high availability and efficient resource utilization. |
Kyverno | 1.12.5 | Kyverno is a Kubernetes policy engine that helps manage and enforce policies for resource configurations within a Kubernetes cluster. |
MetalLB | 0.14.4 | MetalLB provides load balancing and external IP management for 5G NFs in Kubernetes environments. |
Oracle OpenSearch | 2.11.0 | OpenSearch provides scalable search and analytics for 5G NFs, enabling efficient data exploration and visualization. |
Oracle OpenSearch Dashboard | 2.11.0 | OpenSearch Dashboard visualizes and analyzes data for 5G NFs, offering interactive insights and custom reporting. |
Prometheus Operator | 0.76.0 | The Prometheus Operator is used for managing Prometheus monitoring systems in Kubernetes. It simplifies the configuration and management of Prometheus instances. |
Velero | 1.13.2 | Velero backs up and restores Kubernetes clusters for 5G NFs, ensuring data protection and disaster recovery. |
metrics-server | 0.7.2 | Metrics Server is used in Kubernetes for collecting resource usage data from pods and nodes. |
snmp-notifier | 1.5.0 | snmp-notifier sends SNMP alerts for 5G NFs, providing real-time notifications for network events. |