You can define several data domain profiles, each suited to a
typical use case.
Before you define a data domain profile, it is
useful to know the following about the Endeca Server cluster and its data
domains:
- Determining query load distribution in the data domain.
The Dgraph nodes in the data domain
cluster can process two types of queries — read queries and updating queries.
Read queries can be processed by any Dgraph node (follower or leader node);
they represent responses to Guided Navigation requests or search requests from
the end users. Updating queries represent data updates to the index, changes in
the records schema, or changes in the Dgraph configuration. Updating requests
must be processed only by the leader node and cannot be processed by follower
nodes.
In the data domain profile, you can decide whether your data domain
requires a leader node dedicated to handling updates, or a leader node that
shares the regular query load with follower nodes.
For example, if your updating requests are frequent and large, it is
recommended that you dedicate the leader node to processing only updating
requests, so that the follower nodes process all read query requests.
Conversely, if the index for the data domain is rarely updated, you can
configure the leader node to share the regular non-updating query load with
the other nodes, in addition to processing updates. The data domain profile
allows you to specify this option.
- Determining the desired number of query processing nodes in the data domain.
When creating data
domain profiles, you can specify the number of follower nodes (these are the
nodes that process only read queries, as opposed to the leader node that can
process both queries and updates). The number of follower nodes you need
depends on the usage patterns for the end users of the data domain. A data
domain cluster may need more query processing Dgraph nodes if the number of end
users is high, and they issue a high number of queries, often concurrently.
Keep in mind that you can create a data domain with a certain
number of Dgraph nodes only if the Endeca Server cluster has a sufficient
number of Endeca Server nodes. For each data domain, the Endeca Server
creates at most one Dgraph node on any given Endeca Server instance. In
other words, if you want to create a five-node data domain, the Endeca Server
cluster hosting it must have at least five Endeca Server nodes.
- Determining the allocation of processing hardware resources in the Endeca
Server cluster.
When you create a new data domain, the Endeca Server cluster allocates CPU
resources from its servers to the data domain, based on the number of threads
that the data domain profile specifies for each data domain node.
When defining a data domain profile, you can choose which of the
following hardware utilization patterns the Endeca Server cluster should
use:
- Dedicate 100% of its nodes' capacity to one hosted data domain.
- Share its capacity with other data domains, while remaining within its
total capacity.
- Oversubscribe: start multiple Dgraph nodes (for different data domains)
on its Endeca Server nodes, where the total number of CPU threads requested by
the data domains may exceed the total CPU capacity available on each Endeca
Server node.
A data domain profile relies on the characteristics defined in the
Endeca Server node profile. In particular, the characteristics in the node
profile determine the potential limit on the number of dedicated data domains
that could be hosted on the node (dedicated data domains are those for which
the Endeca Server nodes dedicate 100% of their capacity).
- Determining whether the data domain should be read-only.
When defining a data domain profile, you can specify whether the data domain
should be created as read-only. This is useful in a development environment
or for demonstration purposes. For example, you can export an existing data
domain and then import its index using a read-only data domain profile. The
imported data domain then has an index with the same data, but all of its
Dgraph nodes are read-only (follower nodes), which prevents end users from
modifying its configuration or index in any way.
Note that when you initially create a new data domain that is empty
of source data, its profile should not be configured as read-only, because its
index needs to be populated with data.
You define these characteristics when configuring data domain profiles.
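
The characteristics above can be summarized in a small model. The following
Python sketch is illustrative only: the DataDomainProfile class, its field
names, and the can_host function are hypothetical and are not part of the
Endeca Server API. The sketch assumes that a read-only data domain consists of
follower nodes only, and it reduces resource allocation to a count of free CPU
threads per Endeca Server node, which is a simplification of the real
allocation logic.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DataDomainProfile:
    """Hypothetical model of the profile characteristics described above."""

    name: str
    num_followers: int          # follower nodes (process read queries only)
    leader_handles_reads: bool  # False means the leader is dedicated to updates
    threads_per_node: int       # CPU threads requested for each Dgraph node
    oversubscribe: bool         # may requests exceed a server's CPU capacity?
    read_only: bool = False     # assumption: read-only means followers only

    def dgraph_node_count(self) -> int:
        # One leader plus the followers, unless the domain is read-only.
        return self.num_followers + (0 if self.read_only else 1)

    def query_node_count(self) -> int:
        # Nodes that serve read queries: the followers, plus the leader
        # when it shares the regular query load.
        leader_serves = (not self.read_only) and self.leader_handles_reads
        return self.num_followers + (1 if leader_serves else 0)


def can_host(profile: DataDomainProfile, free_threads: list[int]) -> bool:
    """Check whether a cluster could place every Dgraph node of the domain.

    free_threads lists the unallocated CPU threads on each Endeca Server
    node. At most one Dgraph node per data domain is placed on a server.
    """
    if profile.oversubscribe:
        # Simplification: with oversubscription, any server is eligible.
        eligible = len(free_threads)
    else:
        eligible = sum(1 for free in free_threads
                       if free >= profile.threads_per_node)
    return eligible >= profile.dgraph_node_count()


# Example: frequent, large updates, so the leader is dedicated to updates.
cluster = [8, 8, 4, 2, 0]  # free CPU threads on five Endeca Server nodes
heavy_updates = DataDomainProfile(
    name="heavy-updates",
    num_followers=2,
    leader_handles_reads=False,
    threads_per_node=4,
    oversubscribe=False,
)
print(heavy_updates.dgraph_node_count())  # 3
print(heavy_updates.query_node_count())   # 2
print(can_host(heavy_updates, cluster))   # True
```

In this sketch, a five-node data domain can never be hosted on fewer than
five Endeca Server nodes, regardless of the oversubscription setting, because
each server hosts at most one Dgraph node for a given data domain.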