This section describes operating system-level tuning changes that you can make on the server running the MDEX Engine to optimize its performance.
If you experience poor performance on Intel Xeon processor-based servers running Windows Server 2008, Oracle recommends changing the default BIOS setting for power management from "Dynamic" mode to "Static High Performance" mode.
The BIOS has a mode setting that controls the power regulator. In the default "Dynamic" mode, the system attempts to balance high performance with power savings. Setting the regulator to "Static High Performance" mode forces the system to always favor performance.
This issue has been observed only on some Xeon-based servers.
This topic discusses performance expectations of MDEX Engine deployments on VMware (all supported versions) and provides recommendations for such deployments.
Virtualizing Guided Search deployments on VMware is motivated by the cost reduction typically associated with server consolidation, as well as by the reduction in staffing costs associated with simplified server administration and maintenance.
See the "Supported operating systems" section of the MDEX Engine Installation Guide for supported guest operating systems.
Overall, for server-level performance, the average and sustained throughput decrease in a VM environment, while the latency and the warmup time increase.
For an MDEX Engine deployment in which the Dgraph is configured with four threads and the MDEX Engine is assumed to run at full capacity, expect a 10-30% performance overhead on VMware compared with a non-VM deployment. Indexing is also expected to incur a 10-30% overhead relative to a non-VM deployment. In some deployments, depending on your hardware, storage, and implementation strategy, the performance overhead can reach 50%.
This overhead manifests as a decrease in sustained throughput, an increase in average latency, an increase in the time it takes the Dgraph to reach 80% of its expected level of throughput, and an increase in the latency of the longest query (defined so that 99% of queries perform better than this query).
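For capacity planning, the overhead range can be turned into a rough throughput estimate. The following is a minimal sketch, not an Oracle-provided tool: the function name and the baseline figure are hypothetical, and it assumes that an overhead of N% can be modeled as an N% drop in sustained throughput, which is a simplification.

```python
def expected_vm_throughput(baseline_qps, overhead_low_pct=10, overhead_high_pct=30):
    """Return (worst_case, best_case) throughput on a VM.

    Assumption: an overhead of N% is modeled as an N% reduction of the
    sustained throughput measured on the non-VM deployment.
    """
    if not 0 <= overhead_low_pct <= overhead_high_pct < 100:
        raise ValueError("percentages must satisfy 0 <= low <= high < 100")
    best_case = baseline_qps * (100 - overhead_low_pct) / 100
    worst_case = baseline_qps * (100 - overhead_high_pct) / 100
    return worst_case, best_case


# Hypothetical baseline of 200 queries per second on a non-VM server.
typical = expected_vm_throughput(200)
# Pessimistic case: overhead of up to 50%.
pessimistic = expected_vm_throughput(200, overhead_high_pct=50)
```

An estimate of this kind is a starting point for sizing only; measure the actual overhead with your own data set and query load.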
The performance risk associated with virtualizing the MDEX Engine is directly related to the performance and scalability requirements of your application. While Oracle supports virtualization, customers interested in virtualizing HPC (high-performance computing) applications should analyze the risk associated with such projects and seek IT support with strong virtualization skills and experience. Oracle believes that virtualization of the MDEX Engine on VMware is most appropriate at smaller data scale.
Oracle recommends the following practices to ensure adequate performance on VMware:
Implement vendor best practices for tuning performance of network and storage in a VM environment. For example, be aware of the limitation of four virtual CPUs per virtual machine.
Be aware of the virtualization performance tax. The performance overhead, or "tax", of virtualizing the MDEX Engine varies by data set and by performance metric. When a deployment is properly configured and sized, the performance overhead is generally about 10%-30%. Oracle expects that the virtualization performance tax will exceed the range of 10%-30% and may reach up to 50% in the following situations:
Improperly configured or improperly sized deployments. Adequate memory allocation is especially important. Plan for additional memory and storage requirements due to index replication.
Write-heavy workloads. In particular, the following Guided Search configurations are susceptible: (1) deployments where Dgidx and Forge are used heavily, and (2) Dgraphs under extensive and sustained partial update load.
Rely on a robust deployment architecture. Most of the initial performance problems associated with deploying on VMware are caused by misconfiguration or inadequate system resources.
The approach to disk storage can be a significant factor in performance. Both locally-attached storage and network-attached storage solutions are supported. To ensure adequate performance, pay special attention to testing and tuning the bandwidth and latency of your storage solution with VMware. Consult your storage manufacturer's documentation for information on tuning your storage configuration for VMware.
Expect that lower throughput will lead to longer warmup periods.
Plan for a lower ratio of query threads to update threads for applications that rely on frequent partial updates. This planning is needed in such implementations because each Dgraph is limited to four threads by the virtual machine limit of four virtual CPUs. On non-VM platforms, a Dgraph can be configured with significantly more threads, improving the ratio of query threads to update threads during partial update processing.
This section lists recommended tuning changes on RHEL 4 and RHEL 5 configurations for the MDEX Engine.
Starting with the MDEX Engine version 6.0, the MDEX Engine takes advantage of the readahead function.
Readahead is a technique employed by the Linux kernel that can improve file reading performance. If the kernel assumes that a particular file is being read sequentially, it attempts to read subsequent blocks from the file into memory before the application requests them. Readahead can increase the system's throughput because the reading application does not have to wait as long for its subsequent requests, which are served from cache in RAM rather than from disk. However, in some cases the readahead setting generates unnecessary I/O operations and occupies memory pages that are needed for other purposes. Therefore, tuning readahead for best performance is recommended.
You can tune readahead for optimum performance based on the settings recommended by Oracle.
Oracle recommends setting the read_ahead_kb kernel parameter to 64 kilobytes on all Linux machines (RHEL 5). This setting controls how much extra data the operating system reads from disk when performing I/O operations.
Reducing this value from the default typically increases sustained throughput for the MDEX Engine while also increasing its warmup time. Warmup is defined as initial performance of the MDEX Engine after startup (throughput and query latency), until the sustained level of performance is reached. Therefore, if you decide to tune this parameter, choose a value to balance these concerns.
Reducing read_ahead_kb has a noticeable effect and increases throughput for the MDEX Engine only in cases where a large data set may not fit into the MDEX Engine memory.
In cases when the index fits into memory, reducing read_ahead_kb from its default has no noticeable effect on the MDEX Engine performance.
When operating the MDEX Engine on a large data set that is running out of memory, consider adding more memory in addition to tuning read_ahead_kb to improve performance.
Setting read_ahead_kb to 64 kilobytes is a reasonable choice for most applications running on Linux.
To tune the read_ahead_kb kernel parameter on RHEL 5:
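A minimal sketch of this adjustment, assuming the parameter is exposed at the standard sysfs location /sys/block/<device>/queue/read_ahead_kb. The function name, the sysfs_root parameter, and the device name "sda" are hypothetical and used for illustration only. Writing to sysfs requires root privileges.

```python
from pathlib import Path


def set_read_ahead_kb(device, kilobytes=64, sysfs_root="/sys"):
    """Set the readahead size (in KB) for a block device.

    Returns the value read back from the parameter file so the caller
    can confirm that the kernel accepted it.
    """
    if kilobytes < 0:
        raise ValueError("kilobytes must be non-negative")
    param = Path(sysfs_root) / "block" / device / "queue" / "read_ahead_kb"
    if not param.exists():
        raise FileNotFoundError(f"no readahead parameter at {param}")
    param.write_text(f"{kilobytes}\n")
    return int(param.read_text().strip())


# Example call (as root); "sda" is a placeholder device name:
#   set_read_ahead_kb("sda")
```

A value written this way does not survive a reboot, so the call would need to be repeated at startup, for example from a boot-time script.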
Oracle recommends changing the default I/O scheduler that the Linux kernel uses from CFQ to DEADLINE.
This change dramatically improves the performance of Guided Search applications with large data sets in cases where both the amount of physical memory available to the MDEX Engine and disk I/O are limited. This recommendation applies to Guided Search implementations on both RAID disk arrays and individual disks.
To adjust the I/O scheduler on a device:
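A minimal sketch of this adjustment, assuming the scheduler is selected through the standard sysfs file /sys/block/<device>/queue/scheduler, where the kernel lists the available schedulers and marks the active one with square brackets. The function names, the sysfs_root parameter, and the device name "sda" are hypothetical. Writing to sysfs requires root privileges.

```python
from pathlib import Path


def active_scheduler(scheduler_text):
    """Return the scheduler marked with [brackets], or None if none is marked."""
    for token in scheduler_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def set_io_scheduler(device, scheduler="deadline", sysfs_root="/sys"):
    """Switch the I/O scheduler for a device; return the previously active one."""
    path = Path(sysfs_root) / "block" / device / "queue" / "scheduler"
    previous = active_scheduler(path.read_text())
    path.write_text(scheduler + "\n")
    return previous


# Example call (as root); "sda" is a placeholder device name:
#   set_io_scheduler("sda")   # returns "cfq" if CFQ was active
```

Returning the previous scheduler makes it easy to log the change or restore the original setting. As with other sysfs writes, the setting is lost on reboot unless it is reapplied at startup.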
Oracle recommends disabling the swap token timeout by setting it to zero. The swap token is a mechanism in Linux that allows some processes to make progress when the total working set size of all processes exceeds the size of physical RAM.
In situations where only one process is active, and the virtual memory size of that process approaches or exceeds the size of the available RAM, enabling the swap token negatively affects performance. In the context of the Dgraph, this can happen if the physical server is dedicated exclusively to running the MDEX Engine, and the index size approaches or exceeds the size of the available RAM.
Oracle recommends disabling the swap token for those MDEX Engine configurations running on Linux that serve large data sets and are memory- and disk-bound.
If you choose not to disable the swap token, and experience erratic Dgraph performance, you may wish to examine the system to determine whether the swap token is causing problems. The swap token can cause "direct steal" operations.
To measure "direct steal" operations, check the contents of /proc/vmstat, adding the pgsteal_dma32 and pgsteal_normal values and subtracting kswapd_steal.
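The calculation can be sketched as follows. The function name is hypothetical, and treating a counter that is missing from /proc/vmstat as zero is an assumption made for illustration; the counter names are the ones given above.

```python
def direct_steal_count(vmstat_text):
    """Compute pgsteal_dma32 + pgsteal_normal - kswapd_steal.

    vmstat_text is the content of /proc/vmstat: one "name value" pair
    per line. Counters that are absent are treated as 0 (an assumption).
    """
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return (
        counters.get("pgsteal_dma32", 0)
        + counters.get("pgsteal_normal", 0)
        - counters.get("kswapd_steal", 0)
    )


# Example call on a live system:
#   with open("/proc/vmstat") as f:
#       print(direct_steal_count(f.read()))
```

Because the counters in /proc/vmstat are cumulative since boot, taking two samples some time apart and comparing the results shows whether direct steal operations are occurring during the period of erratic performance.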
Note: Oracle recommends that you disable the swap token explicitly for the MDEX Engine disk devices even though you can obtain a patch for the Linux kernel that disables it.
To disable the swap token timeout on RHEL 5:
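A minimal sketch of this change, assuming the timeout is exposed as the vm.swap_token_timeout kernel parameter at /proc/sys/vm/swap_token_timeout, as on the kernel generation shipped with RHEL 5. Verify that the file exists on your system before relying on it, because later kernels do not provide it. The function name and the proc_root parameter are hypothetical. Writing the value requires root privileges.

```python
from pathlib import Path


def disable_swap_token(proc_root="/proc"):
    """Set the swap token timeout to 0 and return the value read back.

    Assumption: the parameter lives at <proc_root>/sys/vm/swap_token_timeout.
    """
    path = Path(proc_root) / "sys" / "vm" / "swap_token_timeout"
    if not path.exists():
        raise FileNotFoundError(f"swap token timeout not available at {path}")
    path.write_text("0\n")
    return int(path.read_text().strip())


# Example call (as root):
#   disable_swap_token()
```

The explicit existence check turns a missing parameter into a clear error instead of silently creating an unrelated file. A value written this way is lost on reboot unless it is reapplied at startup.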