Oracle Commerce Guided Search - Dgraph performance issues

Dgraph performance issues

This section discusses locating and addressing Dgraph performance issues.

Improving the speed of Dgraph startup

Starting with version 6.1.x of the MDEX Engine, Web services are loaded by default at startup. For this reason, Dgraph startup takes slightly longer than it did in version 6.0.1. Dgraph startup is still typically faster than in Endeca IAP 5.1.

In most cases this increase in startup time is not an issue. However, if you find the startup time a problem and you are not planning to use Web services, you can turn off Web services and thus avoid the startup penalty. To do this, start the Dgraph with the --disable_web_services flag. (This flag is particularly useful during development, when you might be starting and stopping the Dgraph frequently.)

Note

When Web services are disabled, every process that writes to the Dgraph fails. This includes Workbench features such as thesaurus entries, automatic phrases, keyword redirects, and stopwords.

Tips for troubleshooting long processing time

You can use the Request Log Analyzer, installed with the MDEX Engine, to determine whether the performance bottleneck is caused by the Dgraph by comparing the statistics for “Engine-Only Processing Time” with “Round-Trip Response Time”.
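The comparison described above can be reduced to a single ratio. The following is a minimal sketch, not part of the Request Log Analyzer: the function name `engine_share` and the sample values are hypothetical, and it assumes you have copied the two averages by hand from a report.

```python
def engine_share(engine_only_ms: float, round_trip_ms: float) -> float:
    """Return the fraction of round-trip time spent inside the Dgraph."""
    if round_trip_ms <= 0:
        raise ValueError("round_trip_ms must be positive")
    return engine_only_ms / round_trip_ms


# Hypothetical averages, for illustration only (not real measurements).
share = engine_share(engine_only_ms=180.0, round_trip_ms=200.0)
print(f"Engine share of round-trip time: {share:.0%}")
```

A share close to 100% suggests that the Dgraph itself is the bottleneck; a low share points toward the network or the application tier instead.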

If "Engine-Only Processing Time" as returned by the Request Log Analyzer tool is long, look further into specific query features to identify possible causes of the problem. This list identifies which problems you may want to isolate first:

Is the long processing time for the Engine caused by limitations of hardware resources? Identify whether long query time is caused by CPU, memory, or disk I/O utilization.
Is a high number of records being returned by the MDEX Engine? Identify how many records are being returned per query by looking for large nbins values in queries as reported by the Request Log Analyzer. This value indicates the maximum number of records that can be returned by the query. A high value can be expensive to compute and can degrade performance. Consider implementing paging control methods. For information on using paging control methods, see the MDEX Engine Developer's Guide.
Are all dimension refinements (dimension values) exposed for navigation? That is, examine whether your queries are spending most of their time in refinement computation. Identify whether all dimension refinements are exposed by looking for allgroups=1 in the Dgraph request log (request URL parameter) or in Request Log Analyzer reports.
This setting corresponds to the NavAllRefinements value of the ENEQuery method.
If the allgroups=1 setting is present in the URL, review this configuration setting for your application to decide whether it is necessary. Exposing all refinements for navigation can decrease performance because the MDEX Engine has to examine every dimension value in each dimension and determine whether that dimension value is a valid refinement given the current navigation state. Exposing all dimension refinements for navigation is not recommended.
For dimensions with many dimension values, Oracle recommends introducing a hierarchy (for example, a sift dimension hierarchy for automatically generated dimensions), so that the MDEX Engine has fewer dimension values to consider at one time.
Are your longest queries similar? Check the longest queries for similarities, such as whether they all use the same search interface with relevance ranking, wildcard search, or record filters. See the sections in this guide about tuning performance of each of these features.
Is record search being used? Identify whether any queries use record search by looking for “attrs=search_interface_name” in the query. Its presence indicates a record search, which means that potentially expensive relevance ranking modules may be contributing to high computation time.
Which relevance ranking strategies are being used? Check the app_prefix.relrank_strategies.xml file for the presence of Exact, Phrase and Proximity ranking modules and test the same query with these modules removed.
Is sorting enabled for properties or dimensions? Identify whether sorting with sort keys is enabled, for which properties and dimensions it is used, and whether it is needed. The first time a sort key is issued to a Dgraph after startup, the key must be computed, which can slow down performance. To isolate this problem, test the query in the staging environment with the sort key removed. If you confirm that sort keys are the issue, consider including them in a representative batch of queries used to warm up the Dgraph after startup. The sorts are then cached, and subsequent queries that use them are faster.

Note
Also, determine whether sorting for properties and dimensions is necessary. In particular, it is not necessary to flag all sortable properties as sort keys in the project. Doing so is often a performance problem in itself.
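Several of the checks in the list above amount to looking for specific parameters in request URLs. The sketch below shows one way to automate that for a single URL. It assumes you have already extracted the request URL from the Dgraph request log (the log line layout is not covered here). The function name `flag_request`, the threshold `nbins_limit`, and the sample URL are hypothetical; only the parameter names nbins, allgroups, and attrs come from this section.

```python
from urllib.parse import urlsplit, parse_qs


def flag_request(url: str, nbins_limit: int = 100) -> list[str]:
    """Return a list of warnings for one Dgraph request URL."""
    params = parse_qs(urlsplit(url).query)
    warnings = []

    # Large nbins: many records may be returned in one query.
    nbins = params.get("nbins", [""])[0]
    if nbins.isdigit() and int(nbins) > nbins_limit:  # limit is an assumption
        warnings.append(f"large nbins value ({nbins})")

    # allgroups=1: all dimension refinements are exposed.
    if params.get("allgroups", [""])[0] == "1":
        warnings.append("allgroups=1 exposes all refinements")

    # attrs present: a record search is being used.
    if "attrs" in params:
        warnings.append("record search in use (attrs present)")

    return warnings


# Hypothetical request URL, for illustration only.
for warning in flag_request("/graph?node=0&nbins=500&allgroups=1&attrs=All+shoes"):
    print(warning)
```

Running this over the slowest requests makes it easier to see whether they share one of these features.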

Warming performance vs. steady state performance

When a Dgraph starts, its performance will gradually increase until it reaches a steady state. This process is known as Dgraph warming.

It is important to distinguish between the warming performance of the Dgraph and the steady state performance. Many of the techniques discussed in this guide address either one or the other, while others address both types of performance diagnostics and optimization.

The following considerations apply specifically to diagnosing and optimizing the warming performance of the Dgraph:

Disk I/O problems can sometimes cause slow warming.
It is helpful to run a Dgraph warming script at startup. For example, you can use a request log of characteristic queries played against the Dgraph to help warm it to a steady state.
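A warming script of the kind described above can be quite small. The following is a minimal sketch, assuming you have a list of characteristic request paths (for example, extracted from a request log) and that the Dgraph answers plain HTTP GET requests. The function name `warm_dgraph`, the host, and the port are hypothetical.

```python
from urllib.request import urlopen


def warm_dgraph(base_url, request_paths, fetch=None):
    """Replay request paths against a Dgraph; return the number issued."""
    if fetch is None:
        def fetch(url):
            # Read and discard the response; only the side effect matters.
            with urlopen(url, timeout=30) as response:
                response.read()

    issued = 0
    for path in request_paths:
        path = path.strip()
        if not path:
            continue  # skip blank lines
        try:
            fetch(base_url.rstrip("/") + "/" + path.lstrip("/"))
            issued += 1
        except OSError as err:  # covers URLError, HTTPError, and timeouts
            print(f"warm-up request failed: {path}: {err}")
    return issued


# Hypothetical usage: one request path per line in a local file.
# with open("warmup_queries.txt") as f:
#     warm_dgraph("http://localhost:8000", f)
```

The optional `fetch` argument exists only so that the replay logic can be exercised without a running Dgraph.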

About planning for peak Dgraph load

It is important that you plan your capacity to handle peak load. Sustained load above the projected peak load results in requests being queued for a long time. The system cannot keep up, and as a result, site performance (in particular latency) degrades.

About tuning the number of threads

Standard system diagnostic tools can tell you how busy CPUs on the machine are. If performance is poor and the CPUs are not very busy, try to increase the number of threads.

Starting with MDEX Engine version 6.0, the Dgraph runs in multithreaded mode by default, with the --threads setting set to 1.

If increasing the number of threads does not help, one of the following is happening:

You are using too many threads in one process. This is unlikely unless you exceed four threads, in which case consider using multiple Dgraphs.
You have an I/O problem.
There is an underlying network problem that needs to be investigated.

Multithreaded Dgraphs on machines with multithreaded processors

Processor multithreading is a feature that allows a single microprocessor to appear as two or more separate processors to the operating system and the application programs that use it.

Hyperthreading is a feature of Intel® Xeon® processors, as well as of Pentium 4® processors that support this technology.

Similarly, SPARC® Chip Multithreading (CMT) processors provide the technology for processor multithreading.

If your machine features hyperthreading or CMT, adding threads to your Dgraph can improve peak throughput by up to 30% per processor.

Multiple Dgraphs on one machine vs. multithreaded Dgraphs

You can run more than one Dgraph on a single machine, add threads to a single Dgraph, or run several Dgraphs with several threads enabled for each. Depending on your application, one choice might be better than the others.

The following use cases describe these choices:

In most cases, the following recommendation applies: Dgraphs with a large memory footprint, especially in search-intensive applications, should be run in multithreaded mode with the number of threads greater than one for best performance.
For example, suppose you have a four-processor 16GB machine and a 3GB Dgraph. You could run four identical separate Dgraphs. A better alternative is to run one four-threaded Dgraph and thus reap the benefits of having more disk cache.
By running with more than one thread, I/O and computation can be overlapped. Although the time to process an individual request isn’t improved (and can actually increase slightly due to contention for shared resources), overall throughput is significantly boosted.
Likewise, in many cases it is appropriate to run two or more Dgraphs on one machine, each with several threads. Two four-threaded Dgraphs on one machine is an especially common configuration. The trade-off between thread contention and memory depends on the memory footprint that you estimate is needed for each Dgraph and the amount of memory available on the machine that will host multiple Dgraphs.
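The disk cache argument in the four-processor example above comes down to simple arithmetic. The sketch below works it through using the figures from that example. It is a deliberate simplification: it assumes each Dgraph process holds its full footprint in memory and ignores memory used by the operating system and other processes. The function name is hypothetical.

```python
def disk_cache_headroom_gb(machine_ram_gb, dgraph_footprint_gb, dgraph_count):
    """RAM left for the OS disk cache after all Dgraph processes are loaded."""
    return machine_ram_gb - dgraph_footprint_gb * dgraph_count


# Figures from the example: a 16GB machine and a 3GB Dgraph.
four_single = disk_cache_headroom_gb(16, 3, 4)  # four separate Dgraphs
one_multi = disk_cache_headroom_gb(16, 3, 1)    # one four-threaded Dgraph

print(four_single, one_multi)  # prints "4 13"
```

Under these assumptions, the single four-threaded Dgraph leaves roughly three times as much memory available for disk cache as four separate Dgraphs do.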

Disk access recommendations for optimizing performance

To optimize disk access performance, consider the following recommendations.

Use a dedicated storage device with low latency and high I/O operations per second for all your indexes and files. Locally attached storage with a RAID controller is preferred. Where that is not possible, a SAN over Fibre Channel typically provides strong performance, assuming it has been configured correctly.
If you are using an array controller, Oracle recommends a striped disk configuration, such as RAID 5/6 or RAID 0+1, that ensures fault tolerance without requiring fully redundant disks.
Do not use disks accessed over NFS or other network file system protocols. They are known to slow down performance.
Ensure that the log files are saved locally. Turning off verbose mode, which prints information about each request to stdout, can sometimes help performance.
Ensure that you have a fast disk subsystem and plenty of memory available for disk cache managed by the operating system, since the Dgraph keeps its various text search indices on disk, including search and navigation indexes.

CPU recommendations for optimizing performance

Use the following recommendations to optimize CPU performance.

If the CPU is under-utilized, increase the number of threads for the Dgraph.
If the CPU is over-utilized and you are not satisfied with throughput, investigate which activities make it busy. Add machines or make the queries less taxing by tuning individual features.
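The two recommendations above can be expressed as a small decision helper. This is only a sketch: the thresholds `low` and `high` are assumptions chosen for illustration, not values from this guide, and the function name is hypothetical. CPU utilization is passed in as a fraction between 0 and 1, as measured by your own system diagnostic tools.

```python
def cpu_tuning_hint(cpu_utilization, throughput_ok, low=0.5, high=0.9):
    """Map a CPU utilization reading to one of the recommendations above."""
    if cpu_utilization < low:  # under-utilized (threshold is an assumption)
        return "increase the number of Dgraph threads"
    if cpu_utilization > high and not throughput_ok:  # over-utilized
        return "find what keeps the CPU busy, then add machines or tune query features"
    return "no CPU change suggested"


print(cpu_tuning_hint(0.30, throughput_ok=False))
print(cpu_tuning_hint(0.95, throughput_ok=False))
```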

Related links

Dgraph Analysis and Tuning

I/O recommendations for optimizing performance

If you are testing the Dgraph maximum throughput using Eneperf with an adequate number of connections and the CPU is still not fully utilized, I/O could be the problem, especially if your application is search intensive but light on other features.

There is no absolute threshold that indicates that an application is I/O bound, but typical symptoms include very high numbers of I/O hits per second or KB per second. If I/O is below the specifications for the hardware, it is less likely to be a problem. In some cases, it is even possible to go beyond a device’s theoretical maximum because of disk caching.

To determine the level of I/O activity, use the following tools:

On Solaris, run iostat -2
On Linux, run sar -b
On Windows, do the following:
On the Task Manager, open the Processes tab.
From the menus, select View → Select Columns.
Check I/O Reads, I/O Read Bytes, I/O Writes, and I/O Write Bytes. These options enable new columns in the Processes pane that provide similar information to sar -b on UNIX.
