Use Fluentd to Ingest Oracle Cloud Infrastructure Logs into Elastic Cloud

Introduction

Oracle Cloud Infrastructure (OCI) provides comprehensive logging capabilities, but integrating these logs with external systems like Elastic requires a robust log shipper. There are many popular open-source data collectors that enable you to unify data collection and consumption.

For more information, see Ingest Oracle Cloud Infrastructure Logs into Third-Party SIEM Platforms using Log Shippers and Send Oracle Cloud Infrastructure Logs to New Relic using Vector to determine which solution best fits your use case.

In this tutorial, we will walk through the detailed tasks to use Fluentd as a log shipper to ingest OCI logs into Elastic Cloud.

Fluentd

Fluentd is a robust, open-source data collector developed by Treasure Data and now part of CNCF, designed to streamline log data collection, transformation, and distribution across various systems. It acts as a unified logging layer that gathers logs from diverse sources, processes them using parser and filter plugins, and forwards them to destinations like Elastic, Kafka, or cloud storage. Fluentd can be deployed as a log shipper on a compute instance to capture logs from OCI Logging service and deliver them to Elastic for enhanced monitoring and analysis.

Once Fluentd forwards logs to Elastic, the real power of log data is unlocked. Elasticsearch indexes and stores the log data in a structured format, enabling powerful search, analysis, and visualization capabilities. By leveraging Elastic’s full-text search engine, users can query and aggregate logs in real time, uncover patterns, identify anomalies, and generate actionable insights. Additionally, with tools like Kibana, logs can be visualized through interactive dashboards, transforming raw log data into intuitive visual representations that aid in monitoring application performance, detecting security threats, and troubleshooting operational issues.

Let us look at the high-level representation of the solution architecture as shown in the following image.

(Solution architecture: OCI Logging → OCI Connector Hub → OCI Streaming → Fluentd on an OCI Compute instance → Elastic Cloud)

Objectives

  • Use Fluentd as a log shipper to ingest OCI logs into Elastic Cloud.

Prerequisites

  • Access to an OCI tenancy with permissions to manage OCI Logging, OCI Streaming, OCI Connector Hub, and OCI Compute resources.

  • An Elastic Cloud deployment and the password for the elastic user.

Task 1: Prepare OCI for Log Streaming

  1. Enable logs in OCI Logging.

    For this tutorial, we will use Audit logs. You can also enable service or custom logs based on your use case. For more information, see Logging Overview.

  2. Create a Stream.

    Before Fluentd can start shipping logs, the data needs a consistent source. In OCI, that source is a Kafka-compatible stream. Think of the stream as a centralized data pipeline for logs: every log event generated within OCI, from compute instances to networking services, can be directed to it. This not only consolidates log data but also gives Fluentd a single endpoint to pull data from.

    1. To create a stream, see Creating a Stream.


    2. Navigate to the Stream Pool and note down the Stream Name, the Bootstrap Servers, the username from the stream pool's Kafka Connection Settings, and the auth token generated for the user. For more information, see auth token.

      We will need all of these values in our Fluentd configuration file. You can also gather them with the OCI CLI, as sketched at the end of this task.


  3. Create an OCI Connector Hub.

    OCI Connector Hub acts as the orchestrator, routing logs from various services to the stream. With OCI Connector Hub, you can define connectors that move logs from the OCI Logging service (Audit logs, Service logs, and Custom logs) and direct them to the stream. To create a connector hub, enter the following information.

    • Source: Select Logging.
    • Destination: Select Streaming (select the stream created in step 2).
    • Select Create policies automatically to generate required OCI IAM policies.


    For more information, see Creating a Connector with a Logging Source.
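
If you prefer to script this task instead of using the OCI Console, the stream, the stream pool's Kafka connection settings, and the auth token from step 2 can also be managed with the OCI CLI. The following is a minimal sketch, assuming the OCI CLI is installed and configured; all names and OCIDs are placeholders, and the exact parameters may vary by CLI version.

    # Create a stream in an existing stream pool (name, partition count, and OCID are placeholders).
    oci streaming admin stream create \
      --name oci-logs-stream \
      --partitions 1 \
      --stream-pool-id ocid1.streampool.oc1..example

    # Inspect the stream pool to find its Kafka connection settings (bootstrap servers).
    oci streaming admin stream-pool get \
      --stream-pool-id ocid1.streampool.oc1..example

    # Generate an auth token for the user that Fluentd will authenticate as.
    oci iam auth-token create \
      --user-id ocid1.user.oc1..example \
      --description "fluentd-stream-access"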

Task 2: Install and Configure Fluentd

With the stream up and running, it is time to set up Fluentd. Here, the objective is to deploy Fluentd on an OCI Compute instance and configure it to consume logs from the stream.

Why an OCI Compute instance? Think of it as the intermediary that bridges the gap between OCI logs and Elastic Cloud. It is where Fluentd will run, ingest data from the stream, and relay it to Elastic.

SSH into the instance and install Fluentd using the fluent-package install script from Treasure Data.

curl -fsSL https://toolbelt.treasuredata.com/sh/install-redhat-fluent-package5-lts.sh | sh

Verify the version to confirm that Fluentd is installed.

fluentd --version

Note: If you are spinning up the compute instance in OCI, make sure the Custom Logs add-on is disabled, because its Fluentd-based agent can conflict with this installation.
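
Optionally, enable Fluentd to start on every reboot and double-check that no other log collector is active on the host. A minimal sketch; the unified-monitoring-agent service name is an assumption and may differ on your image.

    # Start Fluentd now and on every reboot.
    sudo systemctl enable --now fluentd.service

    # Optional: confirm that the OCI agent's own log collector is not running on this host.
    # The service name below is an assumption; adjust it if your image uses a different one.
    sudo systemctl status unified-monitoring-agent.service --no-pager || true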

Task 3: Install the Kafka and Elasticsearch Plugins for Fluentd

Now that Fluentd is ready, it needs to be equipped with plugins. In this architecture, Fluentd acts as both a consumer of stream data and a forwarder to Elasticsearch. This requires the installation of two key plugins:

  • fluent-plugin-kafka: an input plugin that lets Fluentd consume messages from the Kafka-compatible OCI stream.

  • fluent-plugin-elasticsearch: an output plugin that forwards the consumed logs to Elasticsearch or Elastic Cloud.

Run the following command to install both plugins.

fluent-gem install fluent-plugin-kafka fluent-plugin-elasticsearch
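
To confirm that both plugins landed in Fluentd's bundled Ruby environment, list the installed gems and filter for them, for example:

    fluent-gem list | grep -E 'fluent-plugin-(kafka|elasticsearch)'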

Task 4: Verify the Output Logs using stdout

Before forwarding logs to Elasticsearch, it is good practice to verify that the log ingestion flow is working. This is where the stdout output plugin comes in: it lets you confirm that data is flowing correctly from the stream before sending it to Elastic.

  1. To implement this, update the Fluentd configuration located at /etc/fluent/fluentd.conf.

    <source>
      @type kafka_group
      brokers <stream_endpoint>:9092
      topics <stream_topic>
      format json
      username <username>
      password <password>
      ssl_ca_cert /etc/fluent/kafka_chain.pem
      sasl_over_ssl true
      consumer_group fluentd-group
      <parse>
        @type json
      </parse>
    </source>

    <match **>
      @type stdout
    </match>
    
  2. Replace <stream_endpoint> and <stream_topic> with the Bootstrap Server and Stream Name respectively. Also replace <username> and <password> with the details from the Kafka Connection Settings in OCI collected in Task 1.2. The username must also include the tenancy, domain, and stream pool OCID, in the format <tenancy_name>/<domain_name>/<username>/ocid1.streampool.oc1.##############.

    Note:

    • The ssl_ca_cert file should contain the full certificate chain in PEM format, including the server certificate (the OCI Streaming certificate), the intermediate certificate, and the root certificate.

    • To establish a trusted TLS connection with OCI Streaming, begin by extracting the server and intermediate certificates using the openssl command: openssl s_client -showcerts -connect cell-1.streaming.us-ashburn-1.oci.oraclecloud.com:9092 -servername cell-1.streaming.us-ashburn-1.oci.oraclecloud.com < /dev/null | sed -n '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/p' > kafka_chain.pem.

    • This saves the certificate chain to a file named kafka_chain.pem. Next, download the DigiCert Global Root G2 certificate (the trusted root certificate used by OCI Streaming) from the DigiCert Root Certificates page in PEM format and save it as root.pem. Finally, append the root certificate to your chain file using the cat root.pem >> kafka_chain.pem command.

    • This results in a complete certificate chain in kafka_chain.pem, ready to be used by TLS clients for secure connectivity with OCI Streaming.

  3. Run the following commands to restart Fluentd and monitor the output.

    $ sudo systemctl restart fluentd.service
    $ sudo systemctl status fluentd.service
    $ sudo cat /var/log/fluent/fluentd.log
    

If everything is working, logs from the stream will start appearing in the Fluentd logs. This ensures that the data pipeline is functioning as expected before moving forward.
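
If logs do not appear, two quick checks usually narrow the problem down: validate the configuration file without starting the full pipeline, and confirm that the certificate chain file contains the expected certificates. A minimal sketch, assuming the default paths used above:

    # Parse /etc/fluent/fluentd.conf and exit without launching the pipeline.
    sudo fluentd --dry-run -c /etc/fluent/fluentd.conf

    # Print the subject and issuer of every certificate in the chain file.
    openssl crl2pkcs7 -nocrl -certfile /etc/fluent/kafka_chain.pem | openssl pkcs7 -print_certs -noout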

Task 5: Forward Logs to Elastic Cloud

With the pipeline verified, it is time to reconfigure Fluentd to forward logs to Elastic Cloud. This task transforms Fluentd from a simple log consumer to a full-fledged log shipper.

  1. Update the configuration to include the Elastic output plugin.

    <source>
      @type kafka_group
      brokers <stream_endpoint>:9092
      topics <stream_topic>
      format json
      username <tenancy_name>/<domain_name>/<username>/ocid1.streampool.oc1.iad.##########
      password <password>
      ssl_ca_cert /etc/fluent/kafka_chain.pem
      sasl_over_ssl true
      consumer_group fluentd-group
    </source>
    
    <match **>
      @type elasticsearch
      cloud_id ###########
      cloud_auth 'elastic:##########'
      logstash_prefix fluentd
      logstash_format true
      index_name fluentd
    </match>
    

    Note: The Cloud ID is a unique ID which gets assigned to your hosted Elasticsearch cluster on Elastic Cloud. All deployments automatically get a Cloud ID. To find your Cloud ID and password for the elastic user, see Find your Cloud ID.

  2. Restart Fluentd to apply the changes.

    sudo systemctl restart fluentd.service
    

Task 6: Validate and Unlock Insights in Elasticsearch

Once the logs are streaming into Elastic successfully, the data is indexed and structured for efficient querying. Elastic’s full-text search engine enables you to search, aggregate, and visualize data in real time.
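
A quick way to confirm this is to list the indices that the logstash_format and logstash_prefix settings create (daily fluentd-* indices). The following is a sketch only; the Elasticsearch endpoint and port are placeholders taken from your Elastic Cloud deployment page.

    # List the fluentd-* indices in the Elastic Cloud deployment.
    # Replace the endpoint and password with your deployment's values.
    curl -s -u "elastic:<password>" "https://<elasticsearch_endpoint>:9243/_cat/indices/fluentd-*?v"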

It is essential to create a data view. A data view serves as a structured layer that organizes the log data and enables you to seamlessly explore and extract valuable insights. For more information, see Data views.
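
Data views are typically created from the Kibana UI, but this step can also be scripted against Kibana's data views API. The following is a sketch only; the Kibana endpoint is a placeholder, and the API path shown is the one exposed by recent Kibana 8.x releases.

    # Create a data view that matches the daily fluentd-* indices.
    # Replace the Kibana endpoint and password with your deployment's values.
    curl -s -X POST "https://<kibana_endpoint>:9243/api/data_views/data_view" \
      -u "elastic:<password>" \
      -H "kbn-xsrf: true" \
      -H "Content-Type: application/json" \
      -d '{"data_view": {"title": "fluentd-*", "name": "OCI logs via Fluentd", "timeFieldName": "@timestamp"}}'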


Acknowledgments

More Learning Resources

Explore other labs on docs.oracle.com/learn or access more free learning content on the Oracle Learning YouTube channel. Additionally, visit education.oracle.com/learning-explorer to become an Oracle Learning Explorer.

For product documentation, visit Oracle Help Center.