This section defines a number of terms that you need to understand when reading this guide.
The following terms are used to discuss the performance of the MDEX Engine:
Throughput is the number of requests processed by the MDEX Engine per unit of time. In this guide, unless otherwise specified, it is expressed as query operations per second (ops/sec). Throughput is measured with the performance tool Eneperf using an MDEX Engine request log.
Dgraph sustained throughput is the measure of query capacity, that is, the maximum number of requests that can be consistently processed by the MDEX Engine per second.
Latency is the time it takes for the MDEX Engine to return a response to a query, typically measured in milliseconds.
Maximum latency is the time it takes for the slowest query to be returned by the MDEX Engine.
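The throughput and latency definitions above can be illustrated with a small calculation. The following is a minimal sketch, not the method used by Eneperf: the Request record, its field names, and the sample values are all hypothetical, and real measurements would come from an MDEX Engine request log.

```python
from dataclasses import dataclass


@dataclass
class Request:
    # Hypothetical record of one operation; not an MDEX Engine log format.
    start_s: float     # time the request was issued, in seconds
    latency_ms: float  # time taken to return the response, in milliseconds


def summarize(requests):
    """Return (throughput in ops/sec, average latency in ms, maximum latency in ms)."""
    if not requests:
        raise ValueError("no requests to summarize")
    first_start = min(r.start_s for r in requests)
    last_end = max(r.start_s + r.latency_ms / 1000.0 for r in requests)
    elapsed_s = last_end - first_start
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    throughput = len(requests) / elapsed_s
    average_ms = sum(r.latency_ms for r in requests) / len(requests)
    maximum_ms = max(r.latency_ms for r in requests)
    return throughput, average_ms, maximum_ms


# Illustrative sample: four requests completed over a two-second window.
sample = [
    Request(0.0, 100.0),
    Request(0.5, 200.0),
    Request(1.0, 1000.0),
    Request(1.5, 300.0),
]
throughput, average_ms, maximum_ms = summarize(sample)
print(throughput, average_ms, maximum_ms)
```

In this sample, a single slow query sets the maximum latency at 1000 ms even though the average is 400 ms, which is why the two figures are tracked separately.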
Note
Although latency and throughput are related, they cannot be directly derived from each other. The inverse of the average latency is a lower bound on the maximum throughput. For example, if the average latency for a shopper in a supermarket checkout line is five minutes, we know that the checkout throughput of the store must be at least 0.2 shoppers per minute. In addition, latency and throughput are tied together by concurrency. Using the same example, the real maximum throughput may be 10 shoppers per minute because there are many checkout lanes.
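The supermarket example can be worked through numerically. This sketch assumes the standard queueing relationship known as Little's Law (throughput equals concurrency divided by average latency), which the note does not name explicitly. The figure of 50 shoppers being served at once is an assumption chosen to reproduce the 10 shoppers per minute in the example.

```python
def throughput_lower_bound(avg_latency):
    """Lower bound on maximum throughput: one request in flight at a time."""
    return 1.0 / avg_latency


def throughput_with_concurrency(avg_latency, concurrency):
    """Little's Law rearranged: throughput = concurrency / average latency."""
    return concurrency / avg_latency


avg_latency_min = 5.0      # average checkout latency in minutes (from the note)
shoppers_in_service = 50   # assumed concurrency, not stated in the note

print(throughput_lower_bound(avg_latency_min))
print(throughput_with_concurrency(avg_latency_min, shoppers_in_service))
```

The same latency yields very different throughput depending on how many requests are processed concurrently, which is why neither metric can be derived from the other alone.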
An operation is defined as a single request to the MDEX Engine.
Such a request may have one of the following types:
Memory bandwidth is the rate at which data can be read from or stored in memory by a processor. It is measured in bytes per second. In relation to MDEX Engine performance, you may be interested in the memory bandwidth that a system can sustain while running a Dgraph or multiple Dgraphs.
The virtual process size (or address space) for the Dgraph is the total amount of virtual memory allocated by the operating system to the MDEX Engine process at any point in time. This includes the Dgraph code, the MDEX Engine data as represented on disk, the Dgraph cache and any temporary work space.
Resident set size (RSS) is the amount of physical memory currently allocated and used by the MDEX Engine process. As the MDEX Engine process runs, the active executable code and data are brought into RAM, becoming part of the RSS for the MDEX Engine.
You can view the resident set size of a process on Linux by using the ps -o pid,ucomm,rss command, or by using the top program, which reports the RSS.
The working set size (WSS) of the MDEX Engine process is the amount of physical memory needed for those parts of the process that have been most recently and frequently accessed. In other words, the Dgraph WSS is the amount of memory a Dgraph process is consuming now and that is needed to avoid paging.
The WSS of the Dgraph process directly affects RAM usage. As the working set increases, the Dgraph process memory demand increases. With a larger WSS, a process needs more memory to run with acceptable performance.
You cannot measure the WSS directly, but you can make inferences about it by measuring the resident set size and observing performance; performance tends to degrade when the RSS cannot grow to equal the WSS.
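Because the RSS is the quantity you can actually observe, it can help to capture it in a form that is easy to record alongside performance measurements. The sketch below parses text in the layout produced by ps -o pid,ucomm,rss on Linux, where ps reports RSS in kilobytes. The sample output, including the process ID and memory figures, is invented for illustration.

```python
def parse_ps_rss(ps_output):
    """Parse `ps -o pid,ucomm,rss` output into {pid: (command, rss_kib)}."""
    result = {}
    lines = ps_output.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue  # ignore malformed rows
        pid = int(fields[0])
        rss_kib = int(fields[-1])
        command = " ".join(fields[1:-1])
        result[pid] = (command, rss_kib)
    return result


def kib_to_mib(kib):
    """Convert kilobytes as reported by ps into mebibytes."""
    return kib / 1024.0


# Hypothetical captured output; real values come from running ps on the host.
sample_output = """
  PID COMMAND           RSS
 4321 dgraph        2097152
 4400 bash             3456
"""

parsed = parse_ps_rss(sample_output)
command, rss_kib = parsed[4321]
print(command, kib_to_mib(rss_kib), "MiB")
```

Sampling the RSS this way at intervals while a load test runs lets you see whether it is still growing or has levelled off when performance begins to degrade.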
The Dgraph cache is an area of memory set aside for dynamically saving the partial and complete results of processing queries.
Warming is the process during which the MDEX Engine performance gradually increases to a steady state. A gradual increase in performance takes place either as the MDEX Engine starts up and processes queries or following a partial update.
Utilization is the percentage of the total capacity of a resource that is actually being used.
The number of concurrent users is the number of site users engaging the MDEX Engine at any given time. When planning for Dgraph capacity based on the number of concurrent users, remember that users do not issue queries continuously. Typically, a user takes time to think after making one query before making the next one.
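The effect of think time on capacity planning can be estimated with a short calculation. This sketch assumes the interactive response time law from queueing theory (query rate equals users divided by the sum of think time and latency), which this guide does not state. The user count, think time, latency, and sustained throughput figures are hypothetical. The final step treats sustained throughput as the resource whose utilization is being measured.

```python
def required_throughput(concurrent_users, think_time_s, latency_s):
    """Estimated ops/sec generated by users who pause between queries."""
    cycle_s = think_time_s + latency_s  # one full query-then-think cycle
    if cycle_s <= 0:
        raise ValueError("think time plus latency must be positive")
    return concurrent_users / cycle_s


def utilization_percent(used, capacity):
    """Percentage of total capacity that is actually being used."""
    return 100.0 * used / capacity


users = 1000            # hypothetical number of concurrent users
think_time_s = 19.5     # hypothetical average pause between queries
latency_s = 0.5         # hypothetical average query latency
sustained_ops = 200.0   # hypothetical Dgraph sustained throughput

needed_ops = required_throughput(users, think_time_s, latency_s)
print(needed_ops, utilization_percent(needed_ops, sustained_ops))
```

Under these assumptions, 1000 concurrent users generate far fewer than 1000 queries per second, because each user spends most of the time thinking rather than querying.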