2 Enable NF Observability

The following are the five essential components necessary to enable NF on OCI:

  • OCI Adaptor
  • OCI Infrastructure Layer
  • User Management
  • OCI Observability and Management
  • CNC Applications (Cloud Native Core Network Functions, cnDBTier, CNC Console)

The following diagram represents the OCI Adaptor on OCI:

Figure 2-1 OCI Adaptor on OCI



Note:

  • OCI Adaptor and CNC Applications (NFs, cnDBTier, and CNC Console) are deployed on a single OKE cluster.
  • The user must have an OCI tenancy and CNC Applications with OCI Adaptor to deploy the CNC NF on OCI.
  • The user can create a new compartment or use an existing compartment to deploy the OCI components.

OCI Adaptor

The OCI Adaptor acts as a channel to transfer information between the application and OCI observability management. It assists CNC NFs in achieving observability and monitoring functions on the public cloud OCI platform.

The OCI Adaptor components include:

  • Fluentd: An open-source data collector used to collect log data.
  • Management Agent: Management Agent, Scrape Target Discovery Container, and Metric Server are the components used to derive metrics.

    This component consists of the following subcomponents:

    • OCI Management Agent
    • Scrape Target Discovery Container
    • Metric Server
  • OpenTelemetry (OTEL) Collector: The component that collects traces.

    For more information about deploying the components, see the "Deploying OCI Adaptor" section in Oracle Communications Cloud Native Core, OCI Deployment Guide.

Fluentd

OCI's customized Fluentd collects log data from the input sources, transforms the logs, and routes the log data to the OCI Logging Analytics (LA) service.

For more information about Fluentd, see the "Introduction" section in Ingest logs to OCI Logging Analytics using Fluentd Documentation.

The terraform scripts create a dynamic group that allows Fluentd to upload logs to the OCI LA service (part of the OCI Observability and Management layer), based on the following matching rule and policy.

For more information about deploying Fluentd, see the "Terraform Scripts for OCI Deployment" section in Oracle Communications Cloud Native Core, OCI Deployment Guide.

Matching Rule:
ALL {instance.compartment.id = '<CNC NF's compartment_ocid>'}
Policy:
Allow dynamic-group DYNAMIC_GROUP_NAME to {LOG_ANALYTICS_LOG_GROUP_UPLOAD_LOGS} in compartment COMPARTMENT_NAME

For more information, see the "Managing Dynamic Groups" section in Oracle Cloud Infrastructure Documentation.
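
As a minimal sketch, the placeholders in the Fluentd matching rule and policy can be filled in programmatically so that the resulting statements can be reviewed before deployment. The function name and the sample OCID, dynamic group name, and compartment name are hypothetical and are not taken from the terraform scripts.

```python
def fluentd_dynamic_group(compartment_ocid: str, group_name: str, compartment_name: str) -> dict:
    """Render the Fluentd matching rule and policy with concrete values."""
    # Doubled braces produce literal { and } in an f-string.
    matching_rule = f"ALL {{instance.compartment.id = '{compartment_ocid}'}}"
    policy = (
        f"Allow dynamic-group {group_name} to "
        "{LOG_ANALYTICS_LOG_GROUP_UPLOAD_LOGS} "
        f"in compartment {compartment_name}"
    )
    return {"matching_rule": matching_rule, "policy": policy}


# Hypothetical sample values, for illustration only.
rendered = fluentd_dynamic_group(
    "ocid1.compartment.oc1..exampleuniqueid", "cnc-fluentd-dg", "cnc-nf-compartment"
)
print(rendered["matching_rule"])
print(rendered["policy"])
```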

Management Agent, Scrape Target Discovery Container, and Metric Server

This section describes the Management Agent, Scrape Target Discovery Container, and Metric Server, which are used to collect metrics.

Management Agent

The Management Agent is the central entity that collects metrics data from different services and sources that need to be monitored and uploads it into the OCI Monitoring Service over the REST interface.

For more information, see the "Management Agents" section in Oracle Cloud Infrastructure Documentation.

The terraform scripts create a dynamic group that allows the Management Agent to upload metrics data to the OCI Monitoring Service, based on the following matching rule and policy.

For more information, see the "Terraform Scripts for OCI Deployment" section in Oracle Communications Cloud Native Core, OCI Deployment Guide.

Matching Rule:
ALL {resource.type='managementagent', resource.compartment.id='<AGENT_COMPARTMENT_OCID>'}
Policy:
Allow dynamic-group DYNAMIC_GROUP_NAME to use metrics in compartment <PROMETHEUS_METRIC_COMPARTMENT_NAME>

Scrape Target Discovery Container

The Management Agent collects metrics data based on the source information provided to it.

The job of Scrape Target Discovery Container is to dynamically update the metrics source information for the Management Agent. Accordingly, the Management Agent can fetch the correct metric data.
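
The internal logic of the Scrape Target Discovery Container is not described in this section. The sketch below only illustrates the general idea of dynamic target discovery: it turns a list of pod records into scrape URLs. It assumes that pods opt in through the commonly used Prometheus-style annotations (prometheus.io/scrape, prometheus.io/port, prometheus.io/path); the pod records and the function name are hypothetical.

```python
def discover_scrape_targets(pods: list[dict]) -> list[str]:
    """Return scrape URLs for running pods that opted in through annotations."""
    targets = []
    for pod in pods:
        annotations = pod.get("annotations", {})
        # Assumption: pods signal that they expose metrics with this annotation.
        if annotations.get("prometheus.io/scrape") != "true":
            continue
        # Only running pods with an IP address and a declared port can be scraped.
        port = annotations.get("prometheus.io/port")
        if pod.get("phase") != "Running" or not pod.get("ip") or not port:
            continue
        path = annotations.get("prometheus.io/path", "/metrics")
        targets.append(f"http://{pod['ip']}:{port}{path}")
    return sorted(targets)


# Hypothetical pod records, for illustration only.
pods = [
    {"name": "nf-0", "ip": "10.0.2.11", "phase": "Running",
     "annotations": {"prometheus.io/scrape": "true", "prometheus.io/port": "9000"}},
    {"name": "nf-1", "ip": "10.0.2.12", "phase": "Pending",
     "annotations": {"prometheus.io/scrape": "true", "prometheus.io/port": "9000"}},
    {"name": "helper", "ip": "10.0.2.13", "phase": "Running", "annotations": {}},
]
targets = discover_scrape_targets(pods)
print(targets)
```

Rerunning such a function whenever pods are added or removed keeps the list of sources current, which is the role the text above assigns to the discovery container.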

Metric Server

The Metric Server is required to collect cAdvisor metrics. cAdvisor provides various insights into containers.

CNC requires these details to monitor the health and performance of the containers running on a given node. For example, container_memory_usage_bytes, container_tasks_state.
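
To make the example metrics more concrete, the following sketch parses a few lines in the Prometheus text exposition format and extracts the samples for one metric. The sample values, the label names (container, pod, state), and the function name are hypothetical, and the parser is deliberately simplified (for example, it does not handle escaped characters in label values).

```python
import re

# Hypothetical scrape output, for illustration only.
SAMPLE = '''\
# TYPE container_memory_usage_bytes gauge
container_memory_usage_bytes{container="nf-app",pod="nf-0"} 52428800
container_memory_usage_bytes{container="nf-db",pod="db-0"} 104857600
container_tasks_state{container="nf-app",pod="nf-0",state="running"} 3
'''

LINE = re.compile(r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)')
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text: str, metric: str) -> list[tuple[dict, float]]:
    """Return (labels, value) pairs for every sample of the given metric."""
    samples = []
    for line in text.splitlines():
        match = LINE.match(line)  # comment lines starting with '#' do not match
        if not match or match.group("name") != metric:
            continue
        labels = dict(LABEL.findall(match.group("labels")))
        samples.append((labels, float(match.group("value"))))
    return samples


memory = parse_samples(SAMPLE, "container_memory_usage_bytes")
for labels, value in memory:
    print(labels["pod"], labels["container"], f"{value / 2**20:.0f} MiB")
```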

OTEL Collector

OpenTelemetry (OTEL) is an observability framework used to capture and export traces. CNC applications use OpenTelemetry to create spans or traces. A trace is a collection of spans connected in a parent-child relationship.

The traces specify how APIs or requests are propagated through the microservices and other components. OpenTelemetry Collector is used to collect traces from the NFs and send them to OCI Application Performance Monitoring (APM).
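
The parent-child relationship between spans can be sketched with a small data model. This is not the OpenTelemetry API; the Span class, the helper functions, and the span names are hypothetical and only show how a flat list of spans forms a trace tree.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    span_id: str
    name: str
    parent_id: Optional[str] = None          # None marks the root span
    children: list["Span"] = field(default_factory=list)


def build_trace(spans: list[Span]) -> Optional[Span]:
    """Link each span to its parent and return the root span of the trace."""
    by_id = {span.span_id: span for span in spans}
    root = None
    for span in spans:
        if span.parent_id is None:
            root = span
        else:
            by_id[span.parent_id].children.append(span)
    return root


def render(span: Span, depth: int = 0) -> list[str]:
    """Return the trace as indented lines, one line per span."""
    lines = ["  " * depth + span.name]
    for child in span.children:
        lines.extend(render(child, depth + 1))
    return lines


# Hypothetical spans for one request passing through several microservices.
spans = [
    Span("a", "ingress-gateway: POST /request"),
    Span("b", "nf-service: handle", parent_id="a"),
    Span("c", "cndbtier: query", parent_id="b"),
    Span("d", "nf-service: notify", parent_id="a"),
]
trace_lines = render(build_trace(spans))
print("\n".join(trace_lines))
```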

For more information, see the "Application Performance Monitoring (APM) (Tracing)" section.

User Management Layer

The user management layer allows you to create user groups and policies in a domain. There can be multiple identity domains in a tenancy:

  • Default domain: Available by default. It should have only a limited number of administrators, who hold administrative privileges at the tenancy level.
  • Non-Default domains: Used for all other application users.

The user needs to set up user groups in a domain.

There are three types of user groups:

  • Tenancy Admin
  • Compartment Admin
  • Non-Admin Users

User or Group Management

The user can set up either of the following user groups to create OCI Infrastructure:

  • Tenancy Admin
  • Compartment Admin

Only Tenancy Admin can install OCI Adaptor using terraform scripts.

A Tenancy Admin can create two user groups. One group is assigned with compartment administrative privileges, and the other is assigned non-administrative privileges.

Dynamic Group Management

Dynamic groups allow OCI compute instances to be grouped so that they can access OCI Platform as a Service (PaaS) services.

For more information, see the "Managing Dynamic Groups" section in Oracle Cloud Infrastructure Documentation.

OCI Observability and Management

The OCI Observability and Management service provides dashboards and explorers for observing the metrics, logs, and traces of CNC Applications. These dashboards and explorers allow you to view, query, and study metrics, logs, and traces.

The Observability and Management module includes:

  • OCI Monitoring Service (Metrics and Alarms)
  • Application Performance Monitoring (APM) (Tracing)
  • Logging Analytics

OCI's observability and management services are integrated with NF applications using the OCI Adaptor.

OCI Monitoring Service (Metrics and Alarms)

Management Agent of OCI Adaptor fetches metrics from the scrape targets and publishes them towards the OCI Monitoring Service.

The following diagram outlines how the Management Agent fetches metrics from the targets and publishes them to the Monitoring Service.

Figure 2-2 OKE Cluster Monitoring with Management Agent



The Oracle Cloud Infrastructure (OCI) Monitoring service actively and passively monitors NF applications and cloud resources using the metrics and alarms features. CNC shares metrics with the OCI Monitoring service. These metrics help render various dashboards on OCI that eventually help monitor the CNC Applications. These metrics also assist in generating alerts for the customer. The management agent collects the metrics data from different services and sources that have to be monitored and uploads it to the OCI Monitoring Service.

For more information, see the "Monitoring" section in Oracle Cloud Infrastructure Documentation.

Note:

To view metrics and create the alarms, write the metrics queries using the Monitoring Query Language (MQL) syntax.

For more information, see the "Monitoring Query Language (MQL) Reference" section in Oracle Cloud Infrastructure Documentation.
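
As a rough illustration of the MQL query shape (metric[interval]{dimensions}.statistic(), optionally followed by a comparison for alarms), the following sketch assembles query strings. The helper name, the dimension name pod, and the threshold value are hypothetical; the dimensions that are actually available depend on how the metrics are published.

```python
from typing import Optional


def mql_query(metric: str, interval: str, statistic: str,
              dimensions: Optional[dict] = None,
              threshold: Optional[int] = None) -> str:
    """Assemble an MQL expression for a metric view or an alarm condition."""
    dims = ""
    if dimensions:
        pairs = ", ".join(f'{key} = "{value}"' for key, value in dimensions.items())
        dims = "{" + pairs + "}"
    query = f"{metric}[{interval}]{dims}.{statistic}()"
    if threshold is not None:
        # A trailing comparison turns the query into an alarm trigger rule.
        query += f" > {threshold}"
    return query


view_query = mql_query("container_memory_usage_bytes", "1m", "mean", {"pod": "nf-0"})
alarm_query = mql_query("container_memory_usage_bytes", "5m", "max", threshold=500000000)
print(view_query)
print(alarm_query)
```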

Viewing the Metrics

The user can view the metrics using the Metrics Explorer option in the OCI Monitoring Service.

To access the metrics explorer:

  1. Log in to OCI Console.

    For more information, see the "Signing In to the OCI Console" section in Oracle Cloud Infrastructure Documentation.

  2. Open the navigation menu and click Observability and Management.
  3. Under Monitoring, click Metrics Explorer. The Metrics Explorer screen appears.
  4. The user can customize the metrics view by choosing the customization options.

For more information, see the "Creating a Basic Query" section in Oracle Cloud Infrastructure Documentation.

Creating the Alarms or Alerts

The user can create the alarms using the OCI Monitoring Service.

For more information, see the "Configuring NF Alerts on OCI" section in Oracle Communications Cloud Native Core, OCI Deployment Guide.

Creating the Alarms

The user can create the alarms in the OCI Monitoring Service.

To create the alarms:
  1. Log in to OCI Console.

    For more information, see the "Signing In to the OCI Console" section in Oracle Cloud Infrastructure Documentation.

  2. Open the navigation menu and click Observability and Management.
  3. Under Monitoring, click Alarm Definitions.
  4. Click Create Alarm. The Create Alarm page opens in Basic mode (the default view).

    The following steps describe how to create an alarm in Basic mode:

    1. Enter a user-friendly name for the alarm. This name is sent as the title for notifications related to this alarm.
    2. For Alarm severity, select the perceived type of response required when the alarm is in the firing state.
    3. For Alarm body, provide user-readable notification content.
  5. The user can customize the alarms by choosing the provided customization options.

For more information, see the "Managing Alarms" section in Oracle Cloud Infrastructure Documentation.

Application Performance Monitoring (APM) (Tracing)

The OCI Application Performance Monitoring (APM) is a service that provides deep visibility into the performance of applications and offers the ability to diagnose issues quickly. APM collects and processes the transaction instance trace data using Application Performance Monitoring data sources, open source tracers, or directly using API. It accepts OpenTelemetry spans and combines them into traces. CNC Applications microservices share spans or traces towards the OCI APM that help troubleshoot the application.

CNC Applications use OpenTelemetry to create spans or traces. A trace is a collection or list of spans connected in a parent-child relationship.

Traces specify how APIs or requests are propagated through the microservices and other components, assisting operations and developers in troubleshooting issues in a customer deployment.

OCI Application Performance Monitoring (APM) provides a comprehensive set of features to monitor applications and diagnose performance issues.

For more information, see the "Application Performance Monitoring" section in Oracle Cloud Infrastructure Documentation.

Figure 2-3 Trace Collection towards OCI APM Data Collector



Monitoring the Traces

The user can monitor the traces using the Trace Explorer option in the Application Performance Monitoring service.

To access the Trace Explorer:
  1. Log in to OCI Console.

    For more information, see the "Signing In to the OCI Console" section in Oracle Cloud Infrastructure Documentation.

  2. Open the navigation menu and click Observability and Management.
  3. Under Application Performance Monitoring, click Trace Explorer.
  4. The Trace Explorer screen appears.
  5. The user can customize the traces view by choosing from the available options.

For more information, see the "Monitor Traces in Trace Explorer" section in Oracle Cloud Infrastructure Documentation.

Logging Analytics

An OKE cluster environment in an OCI tenancy comprises three tiers from which logs are generated:

  • Infrastructure tier, comprising worker nodes and networking resources.
  • Kubernetes platform tier, comprising Kubelet, CoreDNS, and so on.
  • Application tier, comprising applications and DB applications.

Oracle Cloud Infrastructure (OCI) provides two logging services, namely OCI Logging and OCI Logging Analytics.

Oracle Cloud Logging Analytics (LA), a cloud solution in OCI, helps index, aggregate, visualize, search, and monitor all log data from your applications and system infrastructure.

OCI Logging Analytics supports integration or log ingestion by applications using either Management Agent or Fluentd. However, OCI recommends log ingestion using Fluentd.

Figure 2-4 Fluentd Plugin Architecture



CNC shares logs with OCI LA using Fluentd, and OCI LA is used to perform log analysis.

For more information, see the "Logging Analytics" section in Oracle Cloud Infrastructure Documentation.

Visualizing the Logs

The user can view the logs using the Log Explorer option in Logging Analytics.

To access the Log Explorer:
  1. Log in to OCI Console.

    For more information, see the "Signing In to the OCI Console" section in Oracle Cloud Infrastructure Documentation.

  2. Open the navigation menu and click Observability and Management.
  3. Under Logging Analytics, click Log Explorer. The Log Explorer screen appears.
  4. The user can customize the logs view by choosing the Visualizations options.

For more information, see the "Visualize Data Using Charts and Controls" section in Oracle Cloud Infrastructure Documentation.

Infrastructure Layer (IAAC)

Oracle Cloud Infrastructure (OCI) provides high-performance computing capabilities (such as physical hardware instances) and storage capacity in a flexible overlay virtual network that is securely accessible from your on-premise network.

Oracle Cloud Infrastructure (OCI) consists of Compartments, a Network Load balancer, a Bastion Host, a Dynamic Routing Gateway (DRG), a Remote Peering Connection (RPC), a Service Gateway, an Internet Gateway, and an OKE cluster.

OKE Cluster

Oracle Kubernetes Engine (OKE) offers a carrier-grade container orchestration platform (based on Kubernetes) that supports the deployment requirements of cloud-native, containerized applications.

OKE is a fully managed and scalable service designed for deploying containerized applications in the cloud. It ensures high availability and reliability of cloud native applications. OKE utilizes Kubernetes to automate the deployment, scaling, and management of containerized applications.

OKE provides two types of clusters:

  • Basic cluster: Basic clusters support all the core functionalities provided by Kubernetes and Container Engine for Kubernetes, but none of the enhanced features that Container Engine for Kubernetes provides.
  • Enhanced cluster: Enhanced clusters support all the available features, such as virtual nodes, cluster add-on management, workload identity, and additional worker nodes per cluster.

CNC Applications can be deployed on basic or enhanced OKE clusters (through the terraform scripts) with managed nodes. Because the NFs cannot be deployed on virtual nodes and virtual node pools, an enhanced cluster is not required for CNC NF deployment. The user can upgrade a basic cluster to an enhanced cluster at any time, but an enhanced cluster cannot be downgraded to a basic cluster.

For more information, see the "Working with Enhanced Clusters and Basic Clusters" section in Oracle Cloud Infrastructure Documentation.

Compartment

A compartment is a logical resource that helps achieve security isolation and controls resource access. CNC utilizes this resource to group CNC NF and its corresponding OCI resources and assigns users or groups with policies to access these resources, thus isolating access from the rest of the OCI tenancy's users or groups.

When a tenancy is created in Oracle Cloud Infrastructure (OCI), a root compartment is automatically provisioned, which holds all the cloud resources. The root compartment is the top-level folder for organizing and managing resources within the tenancy, similar to a root folder in a file system. Compartments are also tenancy-wide resources used to help organize and isolate cloud resources.

They enable a hierarchical structure for the tenancy resources and allow better management and control of access. The tenancy explorer provides a complete view of all cloud resources in a specific compartment across all regions.

For more information, see the "Viewing All Resources in a Compartment" section in Oracle Cloud Infrastructure Documentation.

When a compartment is created, it is assigned a unique identifier called an Oracle Cloud ID (OCID).

For more information, see the "OCID Resource Identifiers" section in Oracle Cloud Infrastructure Documentation.

The subcompartments inherit access permissions from the compartments higher up in the hierarchy.
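
The inheritance described above can be pictured with a simplified model in which each compartment collects the policies of its ancestors. This sketch does not reproduce how OCI IAM evaluates policies; the class, the compartment names, and the group names in the policy statements are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Compartment:
    name: str
    parent: Optional["Compartment"] = None
    policies: list[str] = field(default_factory=list)

    def effective_policies(self) -> list[str]:
        """Policies attached here plus everything inherited from ancestors."""
        inherited = self.parent.effective_policies() if self.parent else []
        return inherited + self.policies


# Hypothetical hierarchy: root compartment -> CNC compartment -> NF subcompartment.
root = Compartment("root", policies=[
    "Allow group TenancyAdmins to manage all-resources in tenancy",
])
cnc = Compartment("cnc", parent=root, policies=[
    "Allow group CompartmentAdmins to manage all-resources in compartment cnc",
])
nf = Compartment("nf", parent=cnc)

for statement in nf.effective_policies():
    print(statement)
```

Although no policy is attached directly to the nf subcompartment in this example, it still reports both statements defined higher in the hierarchy.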

For more information, see the "Managing Compartments" section in Oracle Cloud Infrastructure Documentation.

Virtual Cloud Network

Virtual Cloud Network (VCN) is a networking service that helps create and manage a customizable and private network in the OCI cloud. A VCN service includes subnets, route tables, and security lists. CNC NF on OCI uses separate subnets for Bastion Service, CLI Server, Worker Nodes, and so on. Security lists define the Ingress and Egress rules for a subnet. Route tables assist in routing to a different network through gateways, such as sending CNC NF metrics data to the OCI Monitoring service.
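
As an illustration of how separate subnets can be carved out of one VCN address range, the following sketch uses Python's standard ipaddress module. The VCN CIDR block, the /24 subnet size, and the subnet names are assumptions chosen for the example and do not reflect the values used by the terraform scripts.

```python
import ipaddress

# Hypothetical VCN address range, for illustration only.
vcn = ipaddress.ip_network("10.0.0.0/16")

# Assign consecutive /24 blocks to the subnets named in the text above.
names = ["bastion", "cli-server", "worker-nodes"]
subnets = dict(zip(names, vcn.subnets(new_prefix=24)))

for name, network in subnets.items():
    print(f"{name}: {network} ({network.num_addresses} addresses)")

# Check which subnet a given (hypothetical) worker node address belongs to.
worker_ip = ipaddress.ip_address("10.0.2.15")
owner = next(name for name, network in subnets.items() if worker_ip in network)
print(owner)
```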

For more information, see the "Networking Overview" section in Oracle Cloud Infrastructure Documentation.

Network Load Balancer

The network load balancer enables creating network load balancers in Virtual Cloud Network (VCN) for reliable traffic balancing. Load balancers can be public or private. Multiple listeners can balance Layer 4 traffic. Both public and private load balancers can route traffic to any backend server within the Virtual Cloud Network (VCN).

To enhance the security of the network load balancer and make it less vulnerable to external threats, you can create a private network load balancer. By doing so, the network load balancer will be assigned a private IP address which will act as the entry point for all incoming traffic.

This private network load balancer can only be accessed within the VCN that includes the host regional subnet, or as per the restrictions set by your security rules.

Note:

CNC NF integration with OCI utilizes the Private Network Load Balancer.

For more information, see the "Introduction to Network Load Balancer" section in Oracle Cloud Infrastructure Documentation.

The OCI Network Load Balancer (NLB) provides flexible automated traffic distribution from one entry point to multiple backend servers in your VCN. It operates at the connection level and load balances incoming client connections to healthy backend servers based on Layer 3 or Layer 4 (IP protocol) data.

CNC uses the following LB annotations to route the incoming request to the appropriate backend servers:

  • oci-network-load-balancer.oraclecloud.com/internal: "false"
  • oci-network-load-balancer.oraclecloud.com/security-list-management-mode: All
  • oci-network-load-balancer.oraclecloud.com/subnet: <your-lb-subnet-ocid>
  • oci.oraclecloud.com/load-balancer-type: nlb
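
The annotations above are set on a Kubernetes Service of type LoadBalancer. The following sketch builds such a Service definition as a Python dictionary and prints it as JSON, a format that kubectl accepts. The annotation values are copied from the list above; the service name, selector, and port numbers are hypothetical.

```python
import json

# Annotation keys and values as listed in this section.
nlb_annotations = {
    "oci-network-load-balancer.oraclecloud.com/internal": "false",
    "oci-network-load-balancer.oraclecloud.com/security-list-management-mode": "All",
    "oci-network-load-balancer.oraclecloud.com/subnet": "<your-lb-subnet-ocid>",
    "oci.oraclecloud.com/load-balancer-type": "nlb",
}


def nlb_service(name: str, selector: dict, port: int, target_port: int) -> dict:
    """Build a Service definition that carries the NLB annotations."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "annotations": dict(nlb_annotations)},
        "spec": {
            "type": "LoadBalancer",
            "selector": selector,
            "ports": [{"protocol": "TCP", "port": port, "targetPort": target_port}],
        },
    }


# Hypothetical service name, selector, and ports.
manifest = nlb_service("nf-ingress", {"app": "nf-ingress"}, 80, 8080)
print(json.dumps(manifest, indent=2))
```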


Dynamic Routing Gateway

A Dynamic Routing Gateway (DRG) acts as a virtual router, providing a path for traffic to flow between your on-premise networks and VCNs and can also route traffic between VCNs. CNC NF uses OCI DRG to send and receive messages to NFs residing on the customer's on-premise network and towards NFs residing in a different OCI Region. Each DRG attachment has an associated route table to route packets entering the DRG to their next hop.
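
The idea of a route table that maps a packet to its next hop can be sketched as a lookup that picks the most specific matching route. The sketch assumes conventional longest-prefix-match selection; the CIDR blocks and attachment names are hypothetical and do not describe an actual DRG configuration.

```python
import ipaddress
from typing import Optional

# Hypothetical route table: (destination CIDR, next-hop attachment).
routes = [
    ("10.0.0.0/16", "vcn-attachment"),
    ("172.16.0.0/12", "on-premises-attachment"),
    ("172.16.5.0/24", "remote-peering-connection"),
]


def next_hop(destination: str) -> Optional[str]:
    """Return the next hop of the most specific route containing the destination."""
    address = ipaddress.ip_address(destination)
    best = None
    for cidr, hop in routes:
        network = ipaddress.ip_network(cidr)
        # Prefer the matching route with the longest prefix.
        if address in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, hop)
    return best[1] if best else None


print(next_hop("172.16.5.9"))
print(next_hop("172.16.9.1"))
print(next_hop("10.0.2.15"))
```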

For more information, see the "Dynamic Routing Gateways" section in Oracle Cloud Infrastructure Documentation.

Bastion Service

Bastions let authorized users connect from specific IP addresses to target resources using Secure Shell (SSH) sessions. When connected, users can interact with the target resource by using any software or protocol supported by SSH. For example, you can use the Remote Desktop Protocol (RDP) to connect to a Windows host, or use Oracle Net Services to connect to a database.

The Bastion Service allows users to reach the OKE cluster through the CLI Server. Using a Bastion Session, the user can SSH into the CLI Server.

Note:

Copy all the CNC software to the CLI Server for installation.

Creating the Bastion Service Session

The Bastion service is deployed as a part of the infrastructure terraform scripts. The user must create a Bastion Service Session to log in to the CLI Server.

To create the Bastion Service Session:

  1. Log in to the OCI Console.

    For more information, see the "Signing In to the OCI Console" section in Oracle Cloud Infrastructure Documentation.

  2. Open the navigation menu, select Identity and Security, and then select Bastion.

    Figure 2-5 Identity and Security



  3. Select the Bastion.

    Figure 2-6 Bastion



  4. Click Create session to create the session.

    Figure 2-7 Create Session



  5. On the Create session page, enter the value for the following fields:
    1. Select Managed SSH session as the Session type.
    2. Enter a Session name of your choice.
    3. Enter opc as the Username.
    4. Select your CLI Server as the Compute instance.
    5. Enter the public key that will be used to log in to the session.
    6. Click Create session.

      Figure 2-8 Create Session



  6. Click the three dots next to the created session and select "Copy SSH command". The copied SSH command connects you directly to the CLI Server.

    Figure 2-9 Copy SSH command



Note:

After deploying the OCI Adaptor, the user must repeat the same steps to create a Bastion Session and copy the SSH command to connect to the CLI Server.

All kubectl commands can be run from the CLI Server.

Note:

The maximum life span of a Bastion Session is three hours. After that, the user must create a new Bastion Session if further access is required.
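
A small sketch of the three-hour limit: given the time at which a session was created, it computes when the session expires and whether a new one is needed. The function names and timestamps are hypothetical; the OCI Console shows the actual session state.

```python
from datetime import datetime, timedelta, timezone

# Maximum life span of a Bastion Session, as stated in the note above.
MAX_SESSION_LIFETIME = timedelta(hours=3)


def session_expiry(created_at: datetime) -> datetime:
    """Return the latest time until which the session can remain active."""
    return created_at + MAX_SESSION_LIFETIME


def needs_new_session(created_at: datetime, now: datetime) -> bool:
    """Return True once the session has reached its maximum life span."""
    return now >= session_expiry(created_at)


# Hypothetical creation time, for illustration only.
created = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
print(session_expiry(created).isoformat())
print(needs_new_session(created, datetime(2024, 1, 1, 11, 59, tzinfo=timezone.utc)))
print(needs_new_session(created, datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)))
```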

For more information, see the "Bastion Overview" section in Oracle Cloud Infrastructure Documentation.

CLI Server

The CLI Server is a private VM that connects to the OKE Cluster. It allows you to perform user operations and run all kubectl commands. The user can log in to the CLI Server only through a Bastion Session. The CLI Server uses OCI's instance principal mechanism to authenticate itself with the OKE cluster.

For more information, see the "Calling Services from an Instance" section in Oracle Cloud Infrastructure Documentation.

The terraform scripts create a dynamic group, which allows the CLI Server to use the Instance Principal Mechanism to authenticate itself with the following matching rule and policy.

Matching Rule:
ALL {instance.compartment.id='COMPARTMENT_ID', tag.cli_server_tag='true'}
Policy:
Allow dynamic-group DYNAMIC_GROUP_NAME to manage cluster-family in compartment COMPARTMENT_NAME

Node Pool

An OCI node pool is a group of nodes within a cluster that have the same configuration, such as a Linux image, number of OCPUs, RAM per node, boot volume, the applicable Kubernetes version, and so on. CNC uses this node pool to create compute for the NFs. The advantage of this node pool is that a new compute is created with the same attributes when it is resized. OKE worker nodes are created on this node pool.

For more information, see the "Node Pool Management" section in Oracle Cloud Infrastructure Documentation.

Service Gateway

The service gateway provides access from a VCN to other OCI services, such as Oracle Cloud Infrastructure (OCI) Object Storage. CNC uses the Service Gateway to share metrics with the OCI Monitoring Service, logs with OCI Logging Analytics, and traces with the OCI APM data collector. The traffic from the VCN to the Oracle service does not travel over the internet; it uses the Oracle network fabric.

For more information, see the "Overview of Service Gateways" section in Oracle Cloud Infrastructure Documentation.