OpenSearch Searchable Snapshots
Improve the efficiency of your clusters by querying data directly from cost-effective, scalable remote storage.
The OpenSearch searchable snapshots feature lets you mount index snapshots that are stored in Object Storage buckets as live, searchable indexes. Instead of restoring data to primary nodes, searchable snapshots maintain a lightweight local cache of frequently queried segments. This enables your cluster to query data directly from cost-effective, scalable remote storage while keeping performance high.
Searchable snapshots provide the following benefits:
- Cost efficiency: You can offload older or infrequently accessed data to remote storage, reducing the need for expensive high-performance storage in your primary cluster.
- Simplified management: You can automatically manage historical data with policies that transition indexes from hot to cold storage without downtime.
- Improved scalability: You can leverage the virtually unlimited capacity of Object Storage while still providing near-real-time search capabilities through caching on dedicated search nodes.
- Optimized performance: By caching only the most often accessed parts of the data, searchable snapshots maintain fast query response times even when most of the data is stored remotely.
By combining dedicated search nodes and an Index State Management (ISM) policy, you can achieve cost efficiency, search performance, and automation for your organization.
The following sections describe how to configure and use searchable snapshots in OpenSearch.
Creating an OpenSearch Cluster with Dedicated Search Nodes
In Search with OpenSearch, search nodes handle searchable snapshots. For example, they cache data locally so that searches can be served from disk/memory rather than from object storage alone. When you create an OpenSearch cluster, you must specify the search node configuration, including:
- The shape used by the search node.
- The number of data nodes you want to specify as search nodes.
- The number of search node OCPUs.
- The amount of memory for the search nodes.
- The amount of storage for the search nodes.
All search nodes must have a host type of FLEX.
For more information on configuring search nodes within a cluster, see Creating an OpenSearch Cluster.
Deciding the Search Node Count
We recommend that you specify between 20% and 40% of your data nodes as search nodes. The following table shows examples of minimum and maximum ranges based on your data node count:
Data Nodes | Minimum Search Nodes | Maximum Search Nodes |
---|---|---|
1-2 | 1 | 1 |
3-5 | 1 | 2 |
6-7 | 2 | 3 |
8-10 | 2 | 4 |
11-12 | 3 | 5 |
13-15 | 3 | 6 |
50 | 10 | 20 |
100 | 20 | 40 |
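The ranges in the table are consistent with taking 20% and 40% of the data node count and rounding each result up to the next whole number. The following Python sketch reproduces the table under that assumption. The function name `search_node_range` is hypothetical, and the round-up rule is inferred from the table rows rather than stated by the service.

```python
def search_node_range(data_nodes: int) -> tuple[int, int]:
    """Return the (minimum, maximum) recommended search node count.

    Assumption: 20% and 40% of the data node count, each rounded up.
    Integer arithmetic is used to avoid floating-point rounding surprises.
    """
    if data_nodes < 1:
        raise ValueError("data_nodes must be at least 1")
    minimum = (data_nodes + 4) // 5      # ceil(0.20 * data_nodes)
    maximum = (2 * data_nodes + 4) // 5  # ceil(0.40 * data_nodes)
    return minimum, maximum


# Print the range for a few of the data node counts listed in the table.
for count in (2, 5, 7, 10, 12, 15, 50, 100):
    print(count, search_node_range(count))
```

For data node counts that the table does not list, treat the result as a starting point and adjust it to your query load and cache size.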
Registering an Object Storage Repository for Snapshots
To use searchable snapshots, you need a snapshot repository pointing to Object Storage. In OpenSearch, you typically register the repository using an S3-compatible plugin endpoint.
The following example shows how you might register a repository. Adjust the parameter values for your Object Storage configuration:
PUT _snapshot/my_snapshot_repository
{
  "type": "oci",
  "settings": {
    "client": "default",
    "endpoint": "<region-endpoint>",
    "bucket": "<object-storage-bucket>",
    "namespace": "<object-storage-namespace>",
    "authType": "RESOURCE_PRINCIPAL",
    "bucket_compartment_id": "<bucket_compartment_ocid>"
  }
}
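If you script cluster setup instead of using the console, the same request can be assembled in Python. The following is a minimal sketch that uses only the standard library. The helper names `build_repository_body` and `build_register_request`, and the base URL shown in the usage note, are hypothetical. The sketch only builds the request and does not send it, because authentication depends on how your cluster is secured.

```python
import json
import urllib.request


def build_repository_body(endpoint: str, bucket: str, namespace: str,
                          compartment_id: str) -> dict:
    """Return the repository registration body shown in the example above."""
    return {
        "type": "oci",
        "settings": {
            "client": "default",
            "endpoint": endpoint,
            "bucket": bucket,
            "namespace": namespace,
            "authType": "RESOURCE_PRINCIPAL",
            "bucket_compartment_id": compartment_id,
        },
    }


def build_register_request(base_url: str, repository_name: str,
                           body: dict) -> urllib.request.Request:
    """Build (but do not send) the PUT _snapshot/<repository_name> request."""
    url = f"{base_url.rstrip('/')}/_snapshot/{repository_name}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

To send the request, pass the result to `urllib.request.urlopen` after adding whatever credentials your cluster requires, for example `build_register_request("https://<cluster-endpoint>:9200", "my_snapshot_repository", body)`.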
Defining an ISM Policy for Automatic Searchable Snapshots
You can use an Index State Management (ISM) policy to convert indexes to searchable snapshots automatically. The policy in this section does the following:
- Waits 90 days in the hot state.
- Takes a snapshot of the index.
- Converts the index to a remote searchable snapshot (using the dedicated search nodes you configured).
The following example shows an ISM policy with a 90-day trigger:
PUT _plugins/_ism/policies/searchable_snapshots_policy
{
  "policy": {
    "description": "Policy to snapshot and convert index to remote searchable snapshot after 90 days",
    "default_state": "hot_state",
    "states": [
      {
        "name": "hot_state",
        "actions": [],
        "transitions": [
          {
            "state_name": "snapshot_state",
            "conditions": {
              "min_index_age": "90d"
            }
          }
        ]
      },
      {
        "name": "snapshot_state",
        "actions": [
          {
            "snapshot": {
              "repository": "my_snapshot_repository",
              "snapshot": "{{ctx.index}}"
            }
          }
        ],
        "transitions": [
          {
            "state_name": "cold_state",
            "conditions": {}
          }
        ]
      },
      {
        "name": "cold_state",
        "actions": [
          {
            "convert_index_to_remote": {
              "repository": "my_snapshot_repository"
            }
          }
        ],
        "transitions": []
      }
    ]
  }
}
Here, the index stays in the hot state for 90 days. After 90 days, ISM takes a snapshot of the index and then transitions it to the cold state, where the index is converted into a remote searchable snapshot. The index now resides primarily in Object Storage, but can still be searched using the search nodes.
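If you need the same policy with a different retention period or repository, you can generate the policy body instead of editing the JSON by hand. The following Python sketch builds the three-state policy described above. The function name `build_searchable_snapshot_policy` and its parameters are hypothetical, and the state and action names are copied from the example rather than from a full ISM reference.

```python
import json


def build_searchable_snapshot_policy(repository: str,
                                     min_index_age: str = "90d") -> dict:
    """Build the hot -> snapshot -> cold policy body from the example above."""
    return {
        "policy": {
            # The description embeds the raw age value, for example "90d".
            "description": (
                "Policy to snapshot and convert index to remote "
                f"searchable snapshot after {min_index_age}"
            ),
            "default_state": "hot_state",
            "states": [
                {
                    # Wait in the hot state until the index is old enough.
                    "name": "hot_state",
                    "actions": [],
                    "transitions": [{
                        "state_name": "snapshot_state",
                        "conditions": {"min_index_age": min_index_age},
                    }],
                },
                {
                    # Take a snapshot named after the index, then move on.
                    "name": "snapshot_state",
                    "actions": [{
                        "snapshot": {
                            "repository": repository,
                            "snapshot": "{{ctx.index}}",
                        }
                    }],
                    "transitions": [{"state_name": "cold_state", "conditions": {}}],
                },
                {
                    # Convert the index to a remote searchable snapshot.
                    "name": "cold_state",
                    "actions": [{"convert_index_to_remote": {"repository": repository}}],
                    "transitions": [],
                },
            ],
        }
    }


# Print a 30-day variant that can be pasted into a PUT request body.
print(json.dumps(build_searchable_snapshot_policy("my_snapshot_repository", "30d"), indent=2))
```

Calling the function with the default `min_index_age` of `"90d"` yields the same states, actions, and transitions as the example policy.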
Applying the Policy to an Index
After the policy is created, you can apply it to any new or existing index by updating that index's settings as shown in the following example:
PUT my_index/_settings
{
  "opendistro.index_state_management.policy_id": "searchable_snapshots_policy"
}
Alternatively, you can create an index template so that the ISM policy is applied to every index that matches the template's index patterns:
PUT _index_template/<template_name>
{
  "index_patterns": [
    "index_name-*"
  ],
  "template": {
    "settings": {
      "opendistro.index_state_management.policy_id": "policy_id"
    }
  }
}
To verify that the policy was applied, check the index settings:
GET my_index/_settings
Validation and Monitoring
Regularly monitor the OpenSearch cluster to ensure your search nodes are active. Check your Object Storage repository to confirm snapshots are being created.
To verify the ISM transitions, run the following command to review the current state of your index:
GET _plugins/_ism/explain/my_index
After 90 days, the index should transition from the hot state, through the snapshot state, to the cold state.
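To check the state from a script, you can read the state name out of the explain response. The following Python sketch assumes that the response is a JSON object keyed by index name and that each entry carries a `state` object with a `name` field. That layout is an assumption about the explain output, so confirm it against a real response from your cluster. The function name `current_ism_state` and the sample response are hypothetical.

```python
from typing import Optional


def current_ism_state(explain_response: dict, index_name: str) -> Optional[str]:
    """Return the ISM state name for index_name, or None if it is unavailable.

    Assumed response layout: {"<index>": {"state": {"name": "<state>"}, ...}}
    """
    entry = explain_response.get(index_name)
    if not isinstance(entry, dict):
        return None
    state = entry.get("state")
    if not isinstance(state, dict):
        return None
    return state.get("name")


# Hypothetical, abbreviated response used only for illustration.
sample = {
    "my_index": {
        "policy_id": "searchable_snapshots_policy",
        "state": {"name": "cold_state"},
    }
}
print(current_ism_state(sample, "my_index"))
```

A result of `None` means that the index is missing from the response or has no state yet, which can happen shortly after a policy is attached.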