OCI Search Service with OpenSearch, Oracle's managed OpenSearch service, enables you to search large datasets and display results in milliseconds by leveraging a powerful index. OpenSearch is a community-driven, open-source search and analytics suite derived from Apache 2.0 licensed Elasticsearch 7.10.2 and Kibana 7.10.2. It consists of the following:
- a search engine daemon - OpenSearch
- a visualization and user interface - OpenSearch Dashboards
Common use cases include the following:
- Analyze your logs
- Store, investigate, and connect event data to find and fix issues as fast as possible and enhance application performance.
- Seamlessly expand your cluster size for seasonal fluctuations in event data.
- Application search
Datasets constantly increase in size, so your applications, websites, and large data repositories need a fast, customized search experience. OCI Search Service with OpenSearch can return results faster for data or pages that are requested frequently or that were requested recently.
For example, a photo archiving business could render results faster for images that have been requested recently, while moving older and less frequently accessed images to warm storage to keep indexing as fast as possible.
This reference architecture shows a simple use case for server logging, processing and consolidation.
The following diagram illustrates this reference architecture.
The diagram displays a simplified high-availability application environment on OCI, with a focus on two virtual machine instances behind a load balancer. The instances are placed in two distinct availability domains.
Each virtual machine instance uses Filebeat for forwarding logs. Filebeat is a lightweight agent, installed on the virtual machine instances, which monitors the log files and locations specified, collects log events, and forwards them to Logstash.
Logstash, the data processing pipeline for log parsing and transformation, runs in instances placed in two distinct availability domains. Logstash, in turn, sends the transformed logs to OCI Search Service with OpenSearch.
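To make the last hop of that pipeline more concrete, the following minimal sketch shows how a batch of transformed log events could be serialized for the OpenSearch `_bulk` API, which accepts newline-delimited JSON (one action line followed by one document line). The index name `app-logs` and the event fields are assumptions chosen for illustration. They are not part of this architecture, and Logstash's own OpenSearch output plugin handles this step for you.

```python
import json


def build_bulk_body(index_name, events):
    """Serialize events into the newline-delimited JSON body used by _bulk."""
    lines = []
    for event in events:
        # Action line: index the next document into index_name.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself.
        lines.append(json.dumps(event))
    # The bulk body must end with a newline character.
    return "\n".join(lines) + "\n"


# Hypothetical transformed events, as a log pipeline might emit them.
events = [
    {"host": "app-1", "level": "ERROR", "message": "upstream timed out"},
    {"host": "app-2", "level": "INFO", "message": "request served"},
]

body = build_bulk_body("app-logs", events)
print(body)
```

The resulting string would be sent in a POST request to the `_bulk` path of the OpenSearch endpoint with the `Content-Type: application/x-ndjson` header.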
OCI Search Service with OpenSearch is a regional managed service with built-in redundancy. It features private endpoints for OpenSearch and OpenSearch Dashboards, so your traffic does not traverse the internet. The endpoints use the following ports:
- 9200 for OpenSearch
- 5601 for OpenSearch Dashboards
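As a quick sanity check of these ports, the sketch below builds the two endpoint URLs and offers a helper that tests whether a TCP port accepts connections. The host name `search.example.internal` is a placeholder, and the `https` scheme is an assumption. Because the endpoints are private, such a check only makes sense from a host inside the VCN (or one connected to it), for example the bastion host.

```python
import socket

OPENSEARCH_PORT = 9200   # OpenSearch API
DASHBOARDS_PORT = 5601   # OpenSearch Dashboards


def endpoint_urls(private_host):
    """Return the API and Dashboards URLs for a private endpoint host."""
    return {
        "opensearch": f"https://{private_host}:{OPENSEARCH_PORT}",
        "dashboards": f"https://{private_host}:{DASHBOARDS_PORT}",
    }


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Placeholder host name; replace it with your cluster's private endpoint.
urls = endpoint_urls("search.example.internal")
print(urls["opensearch"])
print(urls["dashboards"])
```

Calling `port_is_reachable(host, OPENSEARCH_PORT)` from a Logstash instance is one way to confirm that security rules allow the traffic before you configure the pipeline.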
The architecture has the following components:
- Tenancy
A tenancy is a secure and isolated partition that Oracle sets up within Oracle Cloud when you sign up for Oracle Cloud Infrastructure. You can create, organize, and administer your resources in Oracle Cloud within your tenancy. A tenancy is synonymous with a company or organization. Usually, a company will have a single tenancy and reflect its organizational structure within that tenancy. A single tenancy is usually associated with a single subscription, and a single subscription usually only has one tenancy.
- Policy
An Oracle Cloud Infrastructure Identity and Access Management policy specifies who can access which resources, and how. Access is granted at the group and compartment level, which means you can write a policy that gives a group a specific type of access within a specific compartment, or to the tenancy.
- Compartments
Compartments are cross-region logical partitions within an Oracle Cloud Infrastructure tenancy. Use compartments to organize your resources in Oracle Cloud, control access to the resources, and set usage quotas. To control access to the resources in a given compartment, you define policies that specify who can access the resources and what actions they can perform.
- Virtual cloud network
One of your first steps in OCI is to set up a virtual cloud network (VCN) for your cloud resources. A VCN is a software-defined network that you set up in an OCI region. VCNs can be segmented into subnets, which can be specific to a region or to an availability domain. Both region-specific and availability domain-specific subnets can coexist in the same VCN. A subnet can be public or private.
- Availability domains
Availability domains are standalone, independent data centers within a region. The physical resources in each availability domain are isolated from the resources in the other availability domains, which provides fault tolerance. Availability domains don’t share infrastructure such as power or cooling, or the internal availability domain network. So, a failure at one availability domain is unlikely to affect the other availability domains in the region.
- Bastion host
The Bastion host is a compute instance that serves as a secure, controlled entry point to the topology from outside the cloud. It enables you to access sensitive resources placed in private networks that can't be accessed directly from outside the cloud. The topology has a single, known entry point that you can monitor and audit regularly, so you can avoid exposing the more sensitive components of the topology without compromising access to them.
- Load balancer
The Oracle Cloud Infrastructure Load Balancing service provides automated traffic distribution from a single entry point to multiple servers in the back end.
- Compute instances
Oracle Cloud Infrastructure Compute lets you provision and manage compute hosts. You can launch compute instances with shapes that meet your resource requirements (CPU, memory, network bandwidth, and storage). After creating a compute instance, you can access it securely, restart it, attach and detach volumes, and terminate it when you don't need it.
- Internet gateway
The internet gateway allows traffic between the public subnets in a VCN and the public internet.
- Dynamic routing gateway (DRG)
The DRG is a virtual router that provides a path for private network traffic between VCNs in the same region, and between a VCN and a network outside the region, such as a VCN in another Oracle Cloud Infrastructure region, an on-premises network, or a network in another cloud provider.
- Vault
Oracle Cloud Infrastructure Vault enables you to centrally manage the encryption keys that protect your data and the secret credentials that you use to secure access to your resources in the cloud.
- Site-to-Site VPN
Site-to-Site VPN provides IPSec VPN connectivity between your on-premises network and VCNs in Oracle Cloud Infrastructure. The IPSec protocol suite encrypts IP traffic before the packets are transferred from the source to the destination and decrypts the traffic when it arrives.
- FastConnect
Oracle Cloud Infrastructure FastConnect provides an easy way to create a dedicated, private connection between your data center and Oracle Cloud Infrastructure. FastConnect provides higher-bandwidth options and a more reliable networking experience when compared with internet-based connections.
- Web Application Firewall (WAF)
Oracle Cloud Infrastructure Web Application Firewall (WAF) is a payment card industry (PCI) compliant, region-based and edge enforcement service that is attached to an enforcement point, such as a load balancer or a web application domain name. WAF protects applications from malicious and unwanted internet traffic. WAF can protect any internet-facing endpoint, providing consistent rule enforcement across a customer's applications.
When you create a VCN, determine the number of CIDR blocks required and the size of each block based on the number of resources that you plan to attach to subnets in the VCN. Use CIDR blocks that are within the standard private IP address space.
Select CIDR blocks that don't overlap with any other network (in Oracle Cloud Infrastructure, your on-premises data center, or another cloud provider) to which you intend to set up private connections.
After you create a VCN, you can change, add, and remove its CIDR blocks.
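The non-overlap recommendation above can be checked before you create the VCN. The following sketch uses Python's standard `ipaddress` module. The CIDR values are made-up examples and do not come from this architecture.

```python
import ipaddress


def find_overlaps(candidate, existing):
    """Return the CIDR blocks in `existing` that overlap `candidate`."""
    candidate_net = ipaddress.ip_network(candidate)
    return [
        cidr for cidr in existing
        if candidate_net.overlaps(ipaddress.ip_network(cidr))
    ]


# Hypothetical networks already in use (other VCNs, on-premises, other clouds).
existing_networks = ["10.0.0.0/16", "192.168.10.0/24"]

print(find_overlaps("10.0.128.0/20", existing_networks))  # ['10.0.0.0/16']
print(find_overlaps("10.1.0.0/16", existing_networks))    # []

# Confirm that the chosen block lies in private IP address space.
print(ipaddress.ip_network("10.1.0.0/16").is_private)     # True
```

An empty result means the candidate block does not collide with any network in the list, so it is safe to use for private connections to those networks.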
When you design the subnets, consider your traffic flow and security requirements. Attach all the resources within a specific tier or role to the same subnet, which can serve as a security boundary.
Use policies to restrict who can access your company's OCI resources and how they can access them. Specific policies are required to create an OCI Search Service with OpenSearch cluster. Use Vault for additional protection of your keys, certificates, and secrets.
The Networking service offers two virtual firewall features that use security rules to control traffic at the packet level: security lists and network security groups (NSG). An NSG consists of a set of ingress and egress security rules that apply only to a set of VNICs of your choice in a single VCN. For example, you can choose all the compute instances that act as web servers in the web tier of a multi-tier application in your VCN.
NSG security rules function the same as security list rules. However, for an NSG security rule's source or destination, you can specify an NSG instead of a CIDR block. So, you can easily write security rules to control traffic between two NSGs in the same VCN or traffic within a single NSG. When you create a database system, you can specify one or more NSGs. You can also update an existing database system to use one or more NSGs.
Choose shapes with the appropriate OCPUs and memory combination, and provision local NVMe and/or block storage according to need, for each instance.
When implementing this architecture, consider your requirements for the following parameters:
In this reference architecture, we use Logstash for log parsing and transformation.
Filebeat can also send logs directly to OpenSearch. This approach can work when you have a single kind of log, or very uniform logs. Filebeat lacks advanced log filtering and transformation capabilities, which makes it difficult to aggregate logs of different kinds when their formats differ.
In this reference architecture, we use dedicated instances for Logstash. As an alternative, Logstash can run on each source instance or server. This results in higher resource consumption on the source instances or servers.
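To illustrate what parsing and transformation means when log formats differ, the following sketch normalizes two different line formats into one common structure. It is plain Python, not Logstash configuration. The two formats, the regular expressions, and the output field names are assumptions chosen for the example. In Logstash, filter plugins play this role.

```python
import re

# Assumed format 1: web server access log (common log format style).
ACCESS_RE = re.compile(
    r'^(?P<client>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)
# Assumed format 2: application log, "<timestamp> <LEVEL> <message>".
APP_RE = re.compile(r'^(?P<timestamp>\S+) (?P<level>[A-Z]+) (?P<message>.*)$')


def normalize(line):
    """Map a raw log line of either format onto one common schema."""
    match = ACCESS_RE.match(line)
    if match:
        # Derive a level from the HTTP status so both kinds share a field.
        level = "ERROR" if match["status"].startswith("5") else "INFO"
        message = f'{match["method"]} {match["path"]} -> {match["status"]}'
        return {"kind": "access", "timestamp": match["timestamp"],
                "level": level, "message": message}
    match = APP_RE.match(line)
    if match:
        return {"kind": "app", "timestamp": match["timestamp"],
                "level": match["level"], "message": match["message"]}
    # Keep unparsed lines instead of dropping them.
    return {"kind": "unparsed", "timestamp": None, "level": None,
            "message": line}


access = normalize('203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] '
                   '"GET /photos/42 HTTP/1.1" 500 1024')
app = normalize("2024-10-10T13:55:37Z WARN cache miss ratio above threshold")
print(access)
print(app)
```

Once both kinds of events share the same fields, they can be stored in one index and queried together, which is the aggregation that is hard to achieve with Filebeat alone.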
For high availability of Logstash, consider using multiple instances, spread across fault domains or availability domains. Filebeat can load balance between Logstash instances, without a separate load balancer.
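As a rough illustration of that load balancing, the sketch below renders the Logstash output section of a `filebeat.yml` file with several hosts and the `loadbalance` option enabled. The host names are placeholders, and port 5044 is the port conventionally used for the Beats input in Logstash. Both are assumptions, so adjust them to your environment.

```python
def filebeat_logstash_output(hosts, port=5044):
    """Render a filebeat.yml output section that balances across hosts."""
    lines = ["output.logstash:", "  hosts:"]
    for host in hosts:
        lines.append(f'    - "{host}:{port}"')
    # Without loadbalance, Filebeat sends to one host and uses the
    # others only for failover.
    lines.append("  loadbalance: true")
    return "\n".join(lines) + "\n"


# Placeholder names for Logstash instances in two availability domains.
config = filebeat_logstash_output(
    ["logstash-ad1.example.internal", "logstash-ad2.example.internal"]
)
print(config)
```

Generating the fragment from a host list keeps the Filebeat configuration on every source instance consistent when you add or remove Logstash instances.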
- Persistent queues
Consider configuring persistent queues for Logstash. A Logstash persistent queue helps protect against data loss during abnormal termination by storing the in-flight message queue to disk.
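In Logstash, persistent queues are enabled through settings in `logstash.yml`, such as `queue.type: persisted`. The following toy sketch only illustrates the underlying idea: events are written to disk before they are processed, so a restarted process can pick up the unacknowledged ones. It is not Logstash's implementation, and the class name and file names are hypothetical.

```python
import json
import os
import tempfile


class DiskQueue:
    """Toy disk-backed queue: a journal file plus an acknowledged count."""

    def __init__(self, directory):
        self.journal = os.path.join(directory, "journal.ndjson")
        self.ack_file = os.path.join(directory, "acked")

    def push(self, event):
        # Append and force the write to disk before returning.
        with open(self.journal, "a", encoding="utf-8") as f:
            f.write(json.dumps(event) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def _acked(self):
        try:
            with open(self.ack_file, encoding="utf-8") as f:
                return int(f.read() or 0)
        except FileNotFoundError:
            return 0

    def pending(self):
        """Return events that were pushed but not yet acknowledged."""
        if not os.path.exists(self.journal):
            return []
        with open(self.journal, encoding="utf-8") as f:
            events = [json.loads(line) for line in f if line.strip()]
        return events[self._acked():]

    def ack(self, count):
        """Mark the next `count` pending events as delivered."""
        total = self._acked() + count
        with open(self.ack_file, "w", encoding="utf-8") as f:
            f.write(str(total))


queue_dir = tempfile.mkdtemp()
queue = DiskQueue(queue_dir)
for n in range(3):
    queue.push({"seq": n})
queue.ack(1)  # the first event was delivered downstream

# Simulate an abnormal termination: a new process reopens the directory.
recovered = DiskQueue(queue_dir).pending()
print(recovered)  # [{'seq': 1}, {'seq': 2}]
```

The two events that were in flight when the first object was discarded are still available after the simulated restart, which is the protection a persistent queue provides.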
Review these additional resources to learn more about the features of this architecture: