Aggregate logs using OCI Search Service with OpenSearch

OCI Search Service with OpenSearch, Oracle's OpenSearch managed service, enables you to search large datasets and display results in milliseconds by leveraging a powerful index. OpenSearch is a community-driven, open-source search and analytics suite derived from Apache 2.0 licensed Elasticsearch 7.10.2 and Kibana 7.10.2.

OCI Search Service with OpenSearch consists of:
  • a search engine daemon - OpenSearch
  • a visualization and user interface - OpenSearch Dashboards

With OCI Search Service with OpenSearch, you can leverage the same Elasticsearch and Kibana APIs, allowing for simplified migrations of existing stacks.

What can OCI Search Service with OpenSearch help with?
  • Analyze your logs
    • Store, investigate, and connect event data to find and fix issues as fast as possible and enhance application performance.
    • Seamlessly expand your cluster size for seasonal fluctuations in event data.
    For example: A travel incentive company can use OCI Search Service with OpenSearch to analyze call volumes across multiple providers to quickly target and resolve issues so their customer requests reach the intended team.
  • Application search

    Datasets grow constantly, and your applications, websites, and large data repositories need a fast, customized search experience. OCI Search Service with OpenSearch can return results fastest for the data or pages that are requested most often or most recently.

    For example: A photo archiving business could render picture results faster for those images that have been requested recently while moving older and less accessed images to warm storage to keep the indexing as fast as possible.

Architecture

This reference architecture shows a simple use case for server logging, processing and consolidation.

The following diagram illustrates this reference architecture.



[Architecture diagram: oci-opensearch-log-analytics-arch-oracle]

The diagram above displays a simplified high-availability application environment on OCI, with a focus on two virtual machine instances behind a load balancer. The instances exist in two distinct availability domains.

Each virtual machine instance uses Filebeat for forwarding logs. Filebeat is a lightweight agent, installed on the virtual machine instances, which monitors the log files and locations specified, collects log events, and forwards them to Logstash.

Logstash, the data processing pipeline for log parsing and transformation, runs in instances placed in two distinct availability domains. Logstash, in turn, sends the transformed logs to OCI Search Service with OpenSearch.
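
The parse-and-forward step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Logstash is implemented: the log layout, the field names, the host name app-1, and the index name app-logs are all hypothetical. The output follows the newline-delimited JSON format accepted by the _bulk API that OpenSearch shares with Elasticsearch 7.10.

```python
import json
import re

# Hypothetical log layout: client, [timestamp], "METHOD path", status code.
LINE_PATTERN = re.compile(
    r'(?P<client>\S+) \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)" (?P<status>\d{3})'
)


def parse_line(line, host):
    """Turn one raw log line into a structured event, or None if it does not match."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    event = match.groupdict()
    event["status"] = int(event["status"])  # store the status as a number
    event["host"] = host                    # record which instance produced the line
    return event


def to_bulk_body(events, index):
    """Build a _bulk request body: one action line, then one document line, per event."""
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(event))
    return "\n".join(lines) + "\n"  # the bulk format requires a trailing newline


raw = '10.0.1.15 [12/Mar/2023:10:15:32 +0000] "GET /checkout" 502'
event = parse_line(raw, host="app-1")
body = to_bulk_body([event], index="app-logs")
```

Lines that do not match the pattern return None, so a real pipeline would need to decide whether to drop them or forward them unparsed.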

OCI Search Service with OpenSearch is a regional managed service, with built-in redundancy. OCI Search Service with OpenSearch features private endpoints for OpenSearch and OpenSearch Dashboards, so your traffic does not traverse the internet.

To connect to OCI Search Service with OpenSearch from a local machine, set up a Bastion host in a public subnet and add a route rule with the internet gateway as its target. You can use the Bastion host for port forwarding (also known as SSH tunneling), which creates a secure connection between a port of your choice on your machine and the destination port of the desired OpenSearch or OpenSearch Dashboards private endpoint:
  • 9200 for OpenSearch
  • 5601 for OpenSearch Dashboards
The application instances, which are present in a private subnet, can also be accessed via the Bastion host.
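
As a rough sketch of the port forwarding described above, the following Python helper assembles an ssh command that uses the standard -L option to forward two local ports through the Bastion host. The IP addresses are placeholders, and the opc user name is an assumption (it is the default on Oracle Linux images); substitute the values from your own environment.

```python
import shlex


def build_tunnel_command(bastion, opensearch_host, dashboards_host,
                         user="opc", local_api_port=9200,
                         local_dashboards_port=5601):
    """Return an ssh argument list that forwards two local ports via the bastion."""
    return [
        "ssh", "-N",  # -N: open the tunnel only, do not run a remote command
        "-L", f"{local_api_port}:{opensearch_host}:9200",         # OpenSearch API
        "-L", f"{local_dashboards_port}:{dashboards_host}:5601",  # OpenSearch Dashboards
        f"{user}@{bastion}",
    ]


# Placeholder addresses: a public bastion IP and two private endpoint IPs.
command = build_tunnel_command("203.0.113.10", "10.0.2.20", "10.0.2.21")
print(shlex.join(command))
```

While the tunnel is open, requests sent to the chosen local ports are carried over SSH to the private endpoints.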

The architecture has the following components:

  • Tenancy

    A tenancy is a secure and isolated partition that Oracle sets up within Oracle Cloud when you sign up for Oracle Cloud Infrastructure. You can create, organize, and administer your resources in Oracle Cloud within your tenancy. A tenancy is synonymous with a company or organization. Usually, a company will have a single tenancy and reflect its organizational structure within that tenancy. A single tenancy is usually associated with a single subscription, and a single subscription usually only has one tenancy.

  • Policies

    An Oracle Cloud Infrastructure Identity and Access Management policy specifies who can access which resources, and how. Access is granted at the group and compartment level, which means you can write a policy that gives a group a specific type of access within a specific compartment, or to the tenancy.

  • Compartments

    Compartments are cross-region logical partitions within an Oracle Cloud Infrastructure tenancy. Use compartments to organize your resources in Oracle Cloud, control access to the resources, and set usage quotas. To control access to the resources in a given compartment, you define policies that specify who can access the resources and what actions they can perform.

  • Virtual cloud network

    One of your first steps in OCI is to set up a virtual cloud network (VCN) for your cloud resources. A VCN is a software-defined network that you set up in an OCI region. VCNs can be segmented into subnets, which can be specific to a region or to an availability domain. Both region-specific and availability domain-specific subnets can coexist in the same VCN. A subnet can be public or private.

  • Availability domains

    Availability domains are standalone, independent data centers within a region. The physical resources in each availability domain are isolated from the resources in the other availability domains, which provides fault tolerance. Availability domains don’t share infrastructure such as power or cooling, or the internal availability domain network. So, a failure at one availability domain is unlikely to affect the other availability domains in the region.

  • Bastion host

    The Bastion host is a compute instance that serves as a secure, controlled entry point to the topology from outside the cloud. It enables you to access sensitive resources placed in private networks that can't be accessed directly from outside the cloud. The topology has a single, known entry point that you can monitor and audit regularly, so you can avoid exposing the more sensitive components of the topology without compromising access to them.

  • Load balancer

    The Oracle Cloud Infrastructure Load Balancing service provides automated traffic distribution from a single entry point to multiple servers in the back end.

  • Compute instances

    Oracle Cloud Infrastructure Compute lets you provision and manage compute hosts. You can launch compute instances with shapes that meet your resource requirements (CPU, memory, network bandwidth, and storage). After creating a compute instance, you can access it securely, restart it, attach and detach volumes, and terminate it when you don't need it.

  • Internet gateway

    The internet gateway allows traffic between the public subnets in a VCN and the public internet.

  • Dynamic routing gateway (DRG)

    The DRG is a virtual router that provides a path for private network traffic between VCNs in the same region, and between a VCN and a network outside the region, such as a VCN in another Oracle Cloud Infrastructure region, an on-premises network, or a network in another cloud provider.

  • Vault

    Oracle Cloud Infrastructure Vault enables you to centrally manage the encryption keys that protect your data and the secret credentials that you use to secure access to your resources in the cloud.

  • Site-to-Site VPN

    Site-to-Site VPN provides IPSec VPN connectivity between your on-premises network and VCNs in Oracle Cloud Infrastructure. The IPSec protocol suite encrypts IP traffic before the packets are transferred from the source to the destination and decrypts the traffic when it arrives.

  • FastConnect

    Oracle Cloud Infrastructure FastConnect provides an easy way to create a dedicated, private connection between your data center and Oracle Cloud Infrastructure. FastConnect provides higher-bandwidth options and a more reliable networking experience when compared with internet-based connections.

  • Web Application Firewall (WAF)

    Oracle Cloud Infrastructure Web Application Firewall (WAF) is a payment card industry (PCI) compliant, region-based and edge enforcement service that is attached to an enforcement point, such as a load balancer or a web application domain name. WAF protects applications from malicious and unwanted internet traffic. WAF can protect any internet-facing endpoint, providing consistent rule enforcement across a customer's applications.
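
To make the Policies component above more concrete, here is a minimal sketch that assembles a policy statement in the general form "Allow group ... to ... in compartment ...". The group name LogAnalysts and the compartment name logging are hypothetical, and the resource type opensearch-family is an assumption that you should confirm against the service's policy documentation.

```python
# The four permission verbs used in OCI policy statements, from least to most access.
VERBS = ("inspect", "read", "use", "manage")


def policy_statement(group, verb, resource_type, compartment):
    """Format a policy statement granting a group access within a compartment."""
    if verb not in VERBS:
        raise ValueError(f"unknown verb: {verb!r}")
    return (f"Allow group {group} to {verb} {resource_type} "
            f"in compartment {compartment}")


# Hypothetical group and compartment; the resource type is an assumption.
statement = policy_statement("LogAnalysts", "manage", "opensearch-family", "logging")
print(statement)
```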

Recommendations

Use the following recommendations as a starting point. Your requirements might differ from the architecture described here.
  • VCN

    When you create a VCN, determine the number of CIDR blocks required and the size of each block based on the number of resources that you plan to attach to subnets in the VCN. Use CIDR blocks that are within the standard private IP address space.

    Select CIDR blocks that don't overlap with any other network (in Oracle Cloud Infrastructure, your on-premises data center, or another cloud provider) to which you intend to set up private connections.

    After you create a VCN, you can change, add, and remove its CIDR blocks.

    When you design the subnets, consider your traffic flow and security requirements. Attach all the resources within a specific tier or role to the same subnet, which can serve as a security boundary.

  • Security

    Use policies to restrict who can access your company's OCI resources and how they can access them. Specific policies are required to create an OCI Search Service with OpenSearch cluster. Use Vault for additional protection of your keys, certificates, and secrets.

    The Networking service offers two virtual firewall features that use security rules to control traffic at the packet level: security lists and network security groups (NSG). An NSG consists of a set of ingress and egress security rules that apply only to a set of VNICs of your choice in a single VCN. For example, you can choose all the compute instances that act as web servers in the web tier of a multi-tier application in your VCN.

    NSG security rules function the same as security list rules. However, for an NSG security rule's source or destination, you can specify an NSG instead of a CIDR block. So, you can easily write security rules to control traffic between two NSGs in the same VCN or traffic within a single NSG. When you create a database system, you can specify one or more NSGs. You can also update an existing database system to use one or more NSGs.

  • Compute

    Choose shapes with the appropriate combination of OCPUs and memory, and provision local NVMe and/or block storage as needed for each instance.
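
The VCN recommendation above (use private address space and avoid overlapping CIDR blocks) can be checked before you create anything. The sketch below uses Python's standard ipaddress module and assumes that "standard private IP address space" means the RFC 1918 ranges; the example networks are made up.

```python
import ipaddress

# RFC 1918 private ranges, assumed here to be the "standard private IP address space".
RFC1918_BLOCKS = [
    ipaddress.ip_network(block)
    for block in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def check_cidr(candidate, existing):
    """Return a list of problems found with a candidate VCN CIDR block."""
    network = ipaddress.ip_network(candidate)
    problems = []
    if not any(network.subnet_of(block) for block in RFC1918_BLOCKS):
        problems.append(f"{network} is outside the private address space")
    for other in existing:
        other_network = ipaddress.ip_network(other)
        if network.overlaps(other_network):
            problems.append(f"{network} overlaps {other_network}")
    return problems


# Hypothetical networks already in use (for example, on-premises and another VCN).
existing = ["10.0.0.0/16", "192.168.0.0/24"]
print(check_cidr("10.0.128.0/20", existing))  # overlaps 10.0.0.0/16
print(check_cidr("172.16.0.0/16", existing))  # no problems found
```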

Considerations

When implementing this architecture, consider your requirements for the following parameters:

  • Logging

    In this reference architecture, we use Logstash for log parsing and transformation.

    Filebeat can also send logs directly to OpenSearch. This approach can work when there is a single kind of log, or when the logs are very uniform. Filebeat lacks advanced filtering and transformation capabilities, which makes it difficult to aggregate logs of different kinds when their formats differ.

  • Instances

    In this reference architecture, we use dedicated instances for Logstash. As an alternative, Logstash can run on each source instance or server, at the cost of higher resource consumption on those instances or servers.

    For high availability of Logstash, consider using multiple instances, spread across fault domains or availability domains. Filebeat can load balance between Logstash instances, without a separate load balancer.

  • Persistent queues

    Consider configuring persistent queues for Logstash. A Logstash persistent queue helps protect against data loss during abnormal termination by storing the in-flight message queue to disk.
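
The idea behind a persistent queue can be illustrated with a small conceptual sketch. This is not Logstash's implementation (in Logstash, the feature is enabled through the queue.type setting in logstash.yml); the class and file names below are hypothetical. Each event is written and synced to disk before it is processed, so events that were still in flight can be recovered after an abnormal termination.

```python
import json
import os
import tempfile


class DiskBackedQueue:
    """Append each event to a file before processing so it survives a crash."""

    def __init__(self, path):
        self.path = path

    def put(self, event):
        """Write the event to disk and force it out of the OS buffers."""
        with open(self.path, "a", encoding="utf-8") as handle:
            handle.write(json.dumps(event) + "\n")
            handle.flush()
            os.fsync(handle.fileno())

    def pending(self):
        """Return every event that was queued but not yet acknowledged."""
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as handle:
            return [json.loads(line) for line in handle if line.strip()]

    def acknowledge_all(self):
        """Discard queued events once they have been delivered downstream."""
        open(self.path, "w", encoding="utf-8").close()


path = os.path.join(tempfile.mkdtemp(), "queue.ndjson")
queue = DiskBackedQueue(path)
queue.put({"message": "disk full"})

# Simulate a restart: a new object reading the same file recovers the event.
recovered = DiskBackedQueue(path).pending()
```

Syncing every event to disk trades throughput for durability, which is the same trade-off to weigh when enabling persistent queues.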

Acknowledgments

  • Author: Nuno Goncalves
  • Contributors: Jordan Oliver, Hassan Ajan, Mark de Visser, Samuel Herman, Anupama Pundpal