Endeca Server query performance depends on many characteristics of your specific deployment, such as query workload, query complexity, data domain configuration, the characteristics of the loaded records, and the size of the data domain's index. Because of these dependencies, you must perform hardware sizing before deployment to assess the memory consumption and other hardware needs of your deployment.
One of the inputs to sizing is the projected memory consumption of the Endeca Server. The following sections in this topic describe different aspects of memory consumption by the Endeca Server:
Summary: Initial memory consumption by the Dgraph is neither an indicative nor a predictive estimate of the number of data domains that can be provisioned.
When the Endeca Server Java application is started, its Dgraph process initially claims a significant amount of RAM on the system for its use. The Dgraph process also allocates considerable amounts of virtual memory while ingesting data or executing complex queries. You can observe this by running operating system diagnostic tools. However, the Dgraph's usage of physical memory also depends on the demands of other processes on the system. When other memory-intensive processes, including other Dgraph processes, are present in the operating system, the Dgraph quickly releases a significant portion of its physical memory. Without such pressure, it may retain the physical memory indefinitely. This is expected behavior.
To conclude, measurements of physical memory usage on a machine with few data domains are not a predictive estimate of the memory requirements for a larger number of data domains. Do not rely on these measurements to predict how much memory the Dgraph actually requires to support multiple configured data domains.
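As an illustration of the distinction between virtual and physical memory described above, the following Python sketch reads both values for a process on Linux. It is a minimal example under stated assumptions: it relies on the Linux `/proc/<pid>/status` file and its `VmSize` (virtual) and `VmRSS` (resident, that is, physical) fields, and the function names `parse_status` and `read_memory_kb` are hypothetical helpers, not part of Endeca Server.

```python
from pathlib import Path


def parse_status(text: str) -> dict:
    """Extract VmSize and VmRSS (both in kB) from /proc/<pid>/status text."""
    wanted = {"VmSize", "VmRSS"}
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key in wanted:
            # The value looks like "  8388608 kB"; keep only the number.
            result[key] = int(value.split()[0])
    return result


def read_memory_kb(pid: int) -> dict:
    """Read virtual and resident memory for a running process (Linux only)."""
    return parse_status(Path(f"/proc/{pid}/status").read_text())
```

Calling `read_memory_kb` with the process ID of a Dgraph process at different times would show the resident size changing with memory pressure from other processes, which is why a single reading is not a reliable basis for sizing.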
Configure the Dgraph cache size to be large enough to allow Endeca Server to operate smoothly under normal query load. You configure the Dgraph cache size in the data domain profile, using the --compute-cache-size parameter of endeca-cmd. For more information, see Data domain profile operations.
While the Dgraph typically operates within the limits of its configured Dgraph cache size, the cache can become over-subscribed for short periods of time. During such periods, the Dgraph may use up to 1.5 times its configured cache size. Note that Endeca Server is not expected to exceed its configured Dgraph cache size routinely. When cache usage reaches the 1.5 times threshold, the Dgraph starts to more aggressively evict processes that consume its cache, so that cache memory usage can be reduced to its configured limit.
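The following Python sketch works through the arithmetic of the over-subscription threshold. It assumes that "1.5 times" means 1.5 multiplied by the configured cache size, as the threshold wording suggests. The function names and the 4096 MB example value are hypothetical and serve only as an illustration.

```python
# Factor taken from the text above; interpreted as a multiple of the
# configured cache size (an assumption).
OVERSUBSCRIPTION_FACTOR = 1.5


def peak_cache_mb(configured_cache_mb: float) -> float:
    """Upper bound on cache usage during short over-subscribed periods."""
    if configured_cache_mb < 0:
        raise ValueError("cache size must be non-negative")
    return configured_cache_mb * OVERSUBSCRIPTION_FACTOR


def peak_headroom_mb(configured_cache_mb: float) -> float:
    """Extra memory, beyond the configured cache, that a peak may need."""
    return peak_cache_mb(configured_cache_mb) - configured_cache_mb


if __name__ == "__main__":
    configured = 4096  # hypothetical configured cache size in MB
    print(f"peak: {peak_cache_mb(configured)} MB, "
          f"headroom: {peak_headroom_mb(configured)} MB")
```

Under these assumptions, a data domain with a 4096 MB configured cache may briefly use up to 6144 MB of cache, so a sizing estimate might leave about 2048 MB of headroom for that data domain.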
Data Enrichment plugins (used in Studio as Enrichments) require additional memory on each machine hosting Endeca Server.
If you plan to use Data Enrichment plugins (such as term extraction) in Studio, consider adding about 10 GB of memory for each instance of a Data Enrichment plugin that is expected to run concurrently in the data domain. In other words, if users in the data domain plan to run term extraction, provision this additional memory for each such concurrent process on every Endeca Server machine hosting the data domain.
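The guideline above can be expressed as a short calculation. The Python sketch below is illustrative only: the 10 GB figure is the approximate value given in the text, and the function name and the example count of three concurrent instances are hypothetical.

```python
# Approximate per-instance figure from the guideline above.
ENRICHMENT_GB_PER_INSTANCE = 10


def extra_enrichment_memory_gb(concurrent_instances: int) -> int:
    """Additional memory (GB) to provision on EACH machine hosting the data domain."""
    if concurrent_instances < 0:
        raise ValueError("instance count must be non-negative")
    return concurrent_instances * ENRICHMENT_GB_PER_INSTANCE


if __name__ == "__main__":
    # Hypothetical: three term extraction runs expected at the same time.
    print(extra_enrichment_memory_gb(3), "GB per hosting machine")
```

Because the memory is needed on every machine hosting the data domain, the result applies to each machine separately and is not divided across the machines.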