Elasticsearch Concepts and Terminology

The Elasticsearch search engine uses the following concepts and terminology.

Descriptions of the concepts and terminology listed in the following table are from the Elasticsearch website (www.elastic.co). Refer to Elasticsearch Reference [6.1], Basic Concepts.

Elasticsearch Terminology

Description

Cluster

“A cluster is a collection of one or more nodes (servers) that together holds your entire data and provides federated indexing and search capabilities across all nodes. A cluster is identified by a unique name which by default is ‘elasticsearch’. This name is important because a node can only be part of a cluster if the node is set up to join the cluster by its name.”

“Make sure that you don’t reuse the same cluster names in different environments, otherwise you might end up with nodes joining the wrong cluster. For instance you could use logging-dev, logging-stage, and logging-prod for the development, staging, and production clusters.”

In a PeopleSoft implementation, you specify the cluster name when you install Elasticsearch. To change the name later, edit the cluster.name setting in the elasticsearch.yml configuration file.
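As an illustration of the edit described above, the following Python sketch rewrites the cluster.name entry in the text of an elasticsearch.yml file. The helper name set_cluster_name and the sample file content are assumptions made for this example and are not part of PeopleSoft or Elasticsearch. The sketch only handles simple one-line entries, and a node typically must be restarted before a changed cluster name takes effect.

```python
import re

# An active entry, for example "cluster.name: logging-dev".
ACTIVE = re.compile(r"^[ \t]*cluster\.name[ \t]*:.*$", re.MULTILINE)
# A commented-out entry, for example "#cluster.name: my-application".
COMMENTED = re.compile(r"^[ \t]*#[ \t]*cluster\.name[ \t]*:.*$", re.MULTILINE)


def set_cluster_name(yml_text: str, name: str) -> str:
    """Return yml_text with cluster.name set to name.

    Hypothetical helper: replaces the first active entry if one exists,
    otherwise the first commented-out entry, otherwise appends a new line.
    """
    line = f"cluster.name: {name}"
    for pattern in (ACTIVE, COMMENTED):
        if pattern.search(yml_text):
            # A function replacement avoids backslash processing in `line`.
            return pattern.sub(lambda _match: line, yml_text, count=1)
    separator = "" if (not yml_text or yml_text.endswith("\n")) else "\n"
    return f"{yml_text}{separator}{line}\n"


# Sample content is illustrative only.
sample = "#cluster.name: my-application\nnode.name: node-1\n"
print(set_cluster_name(sample, "logging-dev"))
```

Using a distinct name for each environment, as the quoted guidance recommends, keeps a development node from joining a production cluster by accident.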

Node

“A node is a single server that is part of your cluster, stores your data, and participates in the cluster’s indexing and search capabilities. Just like a cluster, a node is identified by a name. You can define any node name you want if you do not want the default. This name is important for administration purposes where you want to identify which servers in your network correspond to which nodes in your Elasticsearch cluster.”

Index

“An index is a collection of documents that have somewhat similar characteristics. In a single cluster, you can define as many indexes as you want.”

An index is roughly equivalent to a database in a relational database system.

In a PeopleSoft implementation, by default, each search definition or search category is deployed as an individual index.

Type

A type is the Elasticsearch meta object in which the mapping for an index is stored.

In a PeopleSoft implementation, each search definition corresponds to a type in Elasticsearch.

Alias

An alias is a reference to an Elasticsearch index. A single alias can be mapped to more than one index.

In a PeopleSoft implementation, a search category is mapped as an alias on the Elasticsearch server.
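To show how one alias can reference several indexes, the following Python sketch builds the JSON request body that the Elasticsearch _aliases API accepts (a list of add actions). The function name alias_actions and the index and alias names are hypothetical. The sketch does not show how PeopleSoft itself creates aliases; it only builds the body and does not send it to a server.

```python
import json


def alias_actions(alias: str, indexes: list) -> dict:
    """Build an _aliases request body that maps one alias to each index given."""
    if not indexes:
        raise ValueError("an alias must reference at least one index")
    return {"actions": [{"add": {"index": name, "alias": alias}} for name in indexes]}


# Hypothetical names: one search category alias over two search definition indexes.
body = alias_actions("example_category", ["example_definition_a", "example_definition_b"])
print(json.dumps(body, indent=2))
```

A search request that targets the alias is then run against every index the alias references.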

Document

“A document is a basic unit of information that can be indexed. This document is expressed in JavaScript Object Notation (JSON) format.”

A connected query returns parent and child rows. In a PeopleSoft implementation, each row returned from the main (parent) query corresponds to a document in Elasticsearch. The child rows are attached to their parent row and sent as part of that one document.
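The following Python sketch illustrates the idea of one document per parent row, with the child rows nested inside it. The field names, the PHONES key, and the build_document helper are invented for this example. The actual document layout that PeopleSoft generates may differ.

```python
import json


def build_document(parent_row: dict, child_rows: list, child_field: str) -> dict:
    """Combine one parent row and its child rows into a single JSON-ready document."""
    if child_field in parent_row:
        raise ValueError(f"parent row already has a field named {child_field!r}")
    document = dict(parent_row)  # copy so the caller's row is not modified
    document[child_field] = [dict(row) for row in child_rows]
    return document


# Hypothetical rows from a parent query and its child query.
parent = {"EMPLID": "K0001", "NAME": "Example Person"}
children = [
    {"PHONE_TYPE": "HOME", "PHONE": "555-0100"},
    {"PHONE_TYPE": "WORK", "PHONE": "555-0101"},
]
print(json.dumps(build_document(parent, children, "PHONES"), indent=2))
```

Because the child rows travel inside the parent document, a match on child data returns the parent document as a single search result.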

Shards and Replicas

“Elasticsearch provides the ability to subdivide your index into multiple pieces called shards. When you create an index, you can simply define the number of shards that you want. Each shard is in itself a fully-functional and independent ‘index’ that can be hosted on any node in the cluster.

“Elasticsearch allows you to make one or more copies of your index’s shards into what are called replica shards, or replicas for short.

“After the index is created, you may change the number of replicas dynamically anytime but you cannot change the number of shards after-the-fact.”
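The difference between the two settings can be sketched in Python as follows. The code builds the JSON bodies that Elasticsearch accepts when an index is created (number_of_shards and number_of_replicas under settings) and when the replica count is updated later through the index _settings endpoint. The function names and the example counts are assumptions for illustration. Nothing is sent to a server.

```python
import json


def create_index_body(shards: int, replicas: int) -> dict:
    """Settings supplied when the index is created; the shard count is fixed here."""
    if shards < 1:
        raise ValueError("an index needs at least one primary shard")
    if replicas < 0:
        raise ValueError("the replica count cannot be negative")
    return {"settings": {"number_of_shards": shards, "number_of_replicas": replicas}}


def update_replicas_body(replicas: int) -> dict:
    """Settings update for an existing index; only the replica count is changed."""
    if replicas < 0:
        raise ValueError("the replica count cannot be negative")
    return {"index": {"number_of_replicas": replicas}}


def total_shard_copies(shards: int, replicas: int) -> int:
    """Each primary shard has `replicas` copies, so the total is shards * (1 + replicas)."""
    return shards * (1 + replicas)


print(json.dumps(create_index_body(5, 1)))
print(json.dumps(update_replicas_body(2)))
print(total_shard_copies(5, 2))
```

The update body deliberately has no shard setting, which mirrors the rule quoted above: the number of replicas can change after the index exists, but the number of shards cannot.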