Monitor Kubernetes Log Data with OCI Logging Analytics
A Kubernetes-based environment can be divided into three tiers, each consisting of numerous components that evolve continuously with business needs.
- Infrastructure tier: contains numerous components, including networking resources, compute instances, and the Kubernetes node hosts.
- Kubernetes platform tier: contains the various Kubernetes services, such as network, kubelet service, and DNS, which power the Kubernetes platform.
- Application tier: contains the different technologies, databases, and applications.
Architecture
This architecture shows how you can use Oracle Cloud Infrastructure Logging Analytics to monitor a Kubernetes platform and cloud native applications.
The following diagram is a sample topology of a Kubernetes cluster in a single Oracle Cloud Infrastructure (OCI) region. It shows the infrastructure tier; the second diagram highlights the Kubernetes and application tiers.

Description of the illustration kubernetes-master-worker-nodes.png
The following diagram illustrates Kubernetes monitoring for your on-premises Kubernetes clusters and Oracle Cloud Infrastructure Kubernetes Engine (also known as Kubernetes Engine or OKE) with OCI Logging Analytics. This solution collects various logs from a Kubernetes cluster into OCI Logging Analytics and provides rich analytics on top of the collected logs. You can customize the log collection by modifying the out-of-the-box configuration.

Description of the illustration k8s-oke-monitoring.png
The architecture has the following components:
- Tenancy
A tenancy is a secure and isolated partition that Oracle sets up within Oracle Cloud when you sign up for Oracle Cloud Infrastructure. You can create, organize, and administer your resources in Oracle Cloud within your tenancy. A tenancy is synonymous with a company or organization. Usually, a company will have a single tenancy and reflect its organizational structure within that tenancy. A single tenancy is usually associated with a single subscription, and a single subscription usually only has one tenancy.
- Region
An Oracle Cloud Infrastructure region is a localized geographic area that contains one or more data centers, called availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).
- Compartment
Compartments are cross-regional logical partitions within an Oracle Cloud Infrastructure tenancy. Use compartments to organize, control access, and set usage quotas for your Oracle Cloud resources. In a given compartment, you define policies that control access and set privileges for resources.
- Virtual cloud network (VCN) and subnets
A VCN is a customizable, software-defined network that you set up in an Oracle Cloud Infrastructure region. Like traditional data center networks, VCNs give you control over your network environment. A VCN can have multiple non-overlapping CIDR blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.
- Load balancer
The Oracle Cloud Infrastructure Load Balancing service provides automated traffic distribution from a single entry point to multiple servers in the back end.
- Service gateway
The service gateway provides access from a VCN to other services, such as Oracle Cloud Infrastructure Object Storage. The traffic from the VCN to the Oracle service travels over the Oracle network fabric and does not traverse the internet.
- Logging Analytics
Oracle Cloud Infrastructure (OCI) Logging Analytics is a fully managed regional SaaS service, available in more than 27 regions, that provides collection, indexing, enrichment, query, visualization, and alerting for logs from any IT component running on-premises, in OCI, or in a third-party cloud.
- Logging Analytics Source
Logging Analytics Source is a configuration resource that provides specifications for parsing, extractions, labeling, data masking, and other enrichment to ensure logs are properly ingested and indexed for analysis and monitoring. This architecture uses more than 30 pre-defined sources for Kubernetes services, applications, and objects. These sources are continuously enhanced to provide deeper analytics capabilities.
- Kubernetes System Pods
Kubernetes System Pods are small deployable units of computing that you can create and manage in Kubernetes. A Pod is one or more containers, with shared storage and network resources, and rules for running the containers.
- User Pods
Applications launched on the Kubernetes cluster. All the logs from application pods writing to STDOUT/STDERR are typically available under /var/log/containers/. Applications that have custom log handlers may route their logs differently, but in general the logs are available on the node (through a volume).
- Control Plane Services & Pods
Kubernetes platform Control Plane Services and pods. The Control Plane manages the worker nodes and the Pods in the Kubernetes cluster. The worker nodes run the containerized applications. Every cluster has at least one worker node. The worker node(s) host the Pods that are the components of the application workload.
- Node OS Services
Linux services running on the instance on which Kubernetes is installed. Logs are collected from these OS services.
- Log and Object Collector Pods
Log and Object Collector Pods are made up of replica sets, FluentD, and daemon sets.
- FluentD Collector
FluentD is an open-source data collector that provides a unified logging layer between data sources and backend systems. It allows unified data collection and consumption for building data processing pipelines. This architecture uses a containerized FluentD that runs as a daemon set and a replica set on the Kubernetes cluster. It uses the Logging Analytics FluentD output plugin to upload logs to OCI Logging Analytics.
- Logging Analytics FluentD Plugin
The FluentD output plugin that connects to the OCI Logging Analytics service in your tenancy to upload or ingest logs collected by the FluentD collector.
- Kubernetes Objects
Kubernetes objects are persistent entities in the Kubernetes system. Kubernetes uses these entities to represent the state of your cluster. In this architecture, the following Kubernetes object states are collected as logs for historical analysis and troubleshooting:
- Kubernetes Daemon Set
A Kubernetes DaemonSet is a type of workload that runs on Kubernetes and ensures that all (or some) Nodes run a copy of a Pod. As nodes are added to the cluster, Pods are added to them. As nodes are removed from the cluster, those Pods are garbage collected.
- Kubernetes Replica Set
A Kubernetes ReplicaSet is a type of workload that runs on Kubernetes. It maintains a stable set of replica Pods running at any given time. As such, it is often used to guarantee the availability of a specified number of identical Pods.
- Kubernetes Engine
Oracle Cloud Infrastructure Kubernetes Engine (OCI Kubernetes Engine or OKE) is a fully managed, scalable, and highly available service that you can use to deploy your containerized applications to the cloud. You specify the compute resources that your applications require, and Kubernetes Engine provisions them on Oracle Cloud Infrastructure in an existing tenancy. OKE uses Kubernetes to automate the deployment, scaling, and management of containerized applications across clusters of hosts.
- Service connectors
Service Connector Hub is a cloud message bus platform. You can use it to move data between services in Oracle Cloud Infrastructure. Data is moved using service connectors. A service connector specifies the source service that contains the data to be moved, the tasks to perform on the data, and the target service to which the data must be delivered when the specified tasks are completed. One service connector is provisioned in this architecture to collect network and load-balancer logs.
- OCI Services
Oracle Cloud Infrastructure (OCI) services are a platform of cloud services that enable you to build and run a wide range of applications in a highly-available, consistently high-performance environment.
- Service and Audit Logs
Service and Audit Logs are captured in OCI Logging service. OCI Logging is a highly scalable and fully managed service that is used to access the VCN and Load Balancer service logs through the Service Connector.
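The User Pods entry above notes that STDOUT/STDERR logs typically land under /var/log/containers/. As a rough illustration of what a collector has to do with those files, the following sketch extracts pod metadata from a log file name and splits a log line into its parts. It assumes the common kubelet naming convention (`<pod>_<namespace>_<container>-<container id>.log`) and the CRI log line layout used by runtimes such as containerd (`<timestamp> <stream> <F|P> <message>`); neither is stated in this document, and the function names are hypothetical. It is not how the FluentD collector is implemented.

```python
import posixpath
import re
from typing import Dict, Optional

# Assumption: kubelet names the files in /var/log/containers/ as
#   <pod>_<namespace>_<container>-<64-hex container id>.log
FILENAME_RE = re.compile(
    r"^(?P<pod>[^_]+)_(?P<namespace>[^_]+)_"
    r"(?P<container>.+)-(?P<container_id>[0-9a-f]{64})\.log$"
)

# Assumption: CRI log line layout
#   <RFC 3339 timestamp> <stdout|stderr> <F (full) | P (partial)> <message>
CRI_LINE_RE = re.compile(
    r"^(?P<time>\S+) (?P<stream>stdout|stderr) (?P<tag>[FP]) (?P<message>.*)$"
)


def parse_log_path(path: str) -> Optional[Dict[str, str]]:
    """Return pod, namespace, container, and container_id, or None if no match."""
    match = FILENAME_RE.match(posixpath.basename(path))
    return match.groupdict() if match else None


def parse_cri_line(line: str) -> Optional[Dict[str, str]]:
    """Split one CRI-formatted log line into time, stream, tag, and message."""
    match = CRI_LINE_RE.match(line.rstrip("\n"))
    return match.groupdict() if match else None
```

Metadata recovered this way (namespace, pod, container) is the kind of context that makes logs easier to filter once they are ingested. Runtimes that write a different on-disk format, such as JSON lines, would need a different line parser.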
By default, Kubernetes System Services Logs and Kubernetes object data are collected.
Oracle Cloud Infrastructure Kubernetes Engine has built-in services, each with different responsibilities, that run on one or more nodes in the cluster as either Deployments or DaemonSets.
| Kubernetes System Services | Linux System Services | Kubernetes Control Plane | Kubernetes Objects (Default: every 15 mins) | Custom Application Logs |
|---|---|---|---|---|
| | | | | |
Note:
Kubernetes control plane logs are not covered by the out-of-the-box collection, because these logs are not exposed by OCI Kubernetes Engine (also known as OKE). You can enable control plane log collection for non-OKE Kubernetes clusters.
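The component list describes the collector as running both as a daemon set and as a replica set. The difference between the two workload types can be modeled in a few lines. This is a simplified sketch, not the Kubernetes scheduler: the function names are hypothetical, and the round-robin placement for replicas is an assumption chosen only to keep the example small.

```python
from typing import Dict, List


def daemonset_pods(nodes: List[str], pod_name: str) -> Dict[str, List[str]]:
    """DaemonSet semantics: every node runs exactly one copy of the pod."""
    return {node: [f"{pod_name}-{node}"] for node in nodes}


def replicaset_pods(
    nodes: List[str], pod_name: str, replicas: int
) -> Dict[str, List[str]]:
    """ReplicaSet semantics: a fixed number of identical pods, wherever they fit.

    Placement here is plain round-robin, which is an assumption for
    illustration; the real scheduler weighs resources, affinity, and more.
    """
    placement: Dict[str, List[str]] = {node: [] for node in nodes}
    if not nodes:
        return placement
    for index in range(replicas):
        placement[nodes[index % len(nodes)]].append(f"{pod_name}-{index}")
    return placement


def total_pods(placement: Dict[str, List[str]]) -> int:
    """Count the pods across all nodes in a placement."""
    return sum(len(pods) for pods in placement.values())
```

Adding a node changes the daemon set pod count but not the replica set pod count. That property is presumably why node-local log files are read by a daemon set, while cluster-wide object state, which does not need to be read on every node, can be collected by a small fixed number of replicas.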
Recommendations
Use the following recommendations as a starting point. Your requirements might differ from the architecture described here.
- Log Groups
Define multiple log groups to grant write access to different teams and avoid sharing sensitive data. Log groups can be organized by, for example, Oracle E-Business Suite, database, OCI infrastructure, and host logs.
- Cost Management
The Oracle Cloud Infrastructure (OCI) Logging Analytics service is charged based on the volume of data in active and archival storage. To allow troubleshooting of day-to-day issues and to benefit from anomaly detection, pattern detection, and other ML capabilities, we recommend using an active storage period of 90 days and moving logs older than 90 days to archival storage. Logs in archival storage can be recalled quickly on demand.
- FluentD Multi-worker
Configure FluentD in multi-worker mode for time-sensitive logs.
- Custom Application Logs
This solution automatically captures all the logs generated by applications running in a Kubernetes cluster. By default, these logs are mapped to the Kubernetes Generic Container Logs log source. Create application-specific parsers, sources, and enrichment in Oracle Cloud Infrastructure Logging Analytics to extract the required fields and attach problem labels to logs.
- Authentication
This architecture supports instance principal and Oracle Cloud Infrastructure config file based authentication. Instance principal based authentication is recommended for Oracle Cloud Infrastructure Kubernetes Engine (OKE).
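The Cost Management recommendation above splits retention into an active period and an archival period. A back-of-the-envelope estimate of the steady-state volume in each tier can help with sizing. The sketch below assumes a constant daily ingest volume and ignores anything the service may do to the stored size; the function name and the example figures are hypothetical, and no pricing is implied.

```python
from typing import Tuple


def steady_state_storage_gb(
    daily_ingest_gb: float,
    active_days: int = 90,
    total_retention_days: int = 365,
) -> Tuple[float, float]:
    """Return (active_gb, archive_gb) once the retention windows are full.

    Assumes constant daily ingest. Logs stay in active storage for
    `active_days`, then sit in archival storage until `total_retention_days`.
    """
    if daily_ingest_gb < 0:
        raise ValueError("daily_ingest_gb must be non-negative")
    if not 0 <= active_days <= total_retention_days:
        raise ValueError("active_days must be between 0 and total_retention_days")
    active_gb = daily_ingest_gb * active_days
    archive_gb = daily_ingest_gb * (total_retention_days - active_days)
    return active_gb, archive_gb


# Hypothetical example: 50 GB/day, 90 days active, 365 days total retention.
active, archive = steady_state_storage_gb(50.0, 90, 365)
print(f"active: {active} GB, archive: {archive} GB")
```

With these hypothetical inputs, the active tier holds 50 × 90 = 4500 GB and the archival tier holds 50 × 275 = 13750 GB. Shortening the active period moves volume from the first figure to the second without changing the total.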
Considerations
Consider the following points when deploying this reference architecture.
- Performance
Query performance depends on the time range and the number of operations, such as filters and group-by. For better query performance, enrich logs with specific labels and fields at the time of ingestion. Treat this as part of continuous improvement for IT operations.
- Security and role-based access control (RBAC)
Customize Log Source definitions to filter any PII data and enable geolocation enrichment.
- Availability
Oracle Cloud Infrastructure Logging Analytics is a fully managed, highly available SaaS service.
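The security consideration above recommends filtering PII through log source definitions. The idea behind such masking can be illustrated with a small regular-expression sketch. This is not the Logging Analytics masking feature or its API: the patterns, placeholder tokens, and function name are hypothetical, and real PII detection needs patterns tuned to your own data.

```python
import re

# Hypothetical patterns for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # US-style social security number


def mask_pii(line: str) -> str:
    """Replace email addresses and SSN-like tokens with fixed placeholders.

    IP addresses are deliberately left untouched so that a later
    geolocation enrichment step could still use them (an assumption
    about how the two recommendations might be combined).
    """
    line = EMAIL_RE.sub("<email>", line)
    return SSN_RE.sub("<ssn>", line)


print(mask_pii("login by alice@example.com from 10.0.0.5, ssn 123-45-6789"))
```

Masking with fixed placeholders keeps the line structure intact, so parsers that extract other fields from the same line continue to work.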
Deploy
The Kubernetes manifests and Helm charts for deploying the Oracle Cloud Infrastructure Logging Analytics DaemonSets and ReplicaSets are available in GitHub.
- Go to GitHub.
- Clone or download the repository to your local computer.
- Follow the instructions in the README document.
Explore More
- Set up a Kubernetes cluster for deploying containerized applications on Oracle Cloud (solution playbook)
- Well-architected framework for Oracle Cloud Infrastructure
- Oracle Cloud Infrastructure Logging Analytics YouTube Channel
- Oracle Cloud Infrastructure Logging Analytics Documentation
- Analyze Sample Logs with OCI Logging Analytics Hands-on Lab
- Oracle Cloud Infrastructure Logging Analytics blogs