Slow performance is often diagnosed by running a query load against the
front-end application. In that case, the front-end application itself, or the
configuration of its application server, may be the reason for the poor
performance.
Alternatively, the
network may be the problem, although this is less likely. (In the case of a
Dgraph, unlike an Agraph, it is unusual for the network to be the bottleneck.)
To identify whether the network is a performance issue:
- Compare Eneperf performance
on the local host and a remote host. First, run Eneperf against the Dgraph on
the Dgraph machine. Next, run the same Eneperf against the same Dgraph, but
from the front-end machine (if possible), or somewhere on the other side of the
network. If the difference is negligible, the network is not a problem. If
Eneperf across the network is slow, you need to consider both the network
itself and the application configuration.
- Alternatively, you can run
the Cheetah tool and compare the “Round-Trip Response Time” with the
“Engine-Only Processing Time”. If the “Round-Trip Response Time” is long but
the “Engine-Only Processing Time” is short, this can indicate a network problem
or a misconfigured application server for the front-end application.
- Measure network performance
using Netperf, a freely available tool that can be used to measure bandwidth.
Alternatively, you can FTP some large files across the network link. If these
tools show poor throughput across the network, this can indicate a network
hardware problem such as a failing network interface card (NIC) or cable.
- In addition, check Eneperf
statistics, the Dgraph request logs, or the Dgraph Stats page to see how much
data is being transmitted back from the Dgraph on an average request. A large
average result page size can saturate the network.
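The comparisons in the list above come down to simple arithmetic. The following is a minimal sketch in Python, assuming you have already read the timings and sizes from Cheetah, Eneperf, or the Dgraph request logs by hand. The function names and the sample numbers are hypothetical and are not part of any Endeca tool.

```python
def time_outside_engine(round_trip_ms: float, engine_only_ms: float) -> float:
    """Fraction of the round-trip time that is spent outside the engine
    (network transfer plus front-end and application server overhead)."""
    if round_trip_ms <= 0:
        raise ValueError("round_trip_ms must be positive")
    return (round_trip_ms - engine_only_ms) / round_trip_ms


def link_utilization(avg_response_bytes: float,
                     requests_per_sec: float,
                     link_mbps: float) -> float:
    """Fraction of the link capacity consumed by response traffic alone.

    link_mbps is the nominal link speed in megabits per second
    (100 for Fast Ethernet, 1000 for Gigabit Ethernet).
    """
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    bits_per_sec = avg_response_bytes * 8 * requests_per_sec
    return bits_per_sec / (link_mbps * 1_000_000)


# Hypothetical numbers: 250 KB average response at 40 requests per second.
fast_ethernet = link_utilization(250_000, 40, 100)    # about 0.8 of the link
gigabit = link_utilization(250_000, 40, 1000)         # about 0.08 of the link
outside = time_outside_engine(round_trip_ms=120.0, engine_only_ms=30.0)
```

A value from `link_utilization` that approaches 1.0 suggests that response size alone can saturate the link. A high value from `time_outside_engine` only says that the time is being spent somewhere other than the engine; it does not distinguish the network from the application server.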
If your application appears to be moving too much data, you probably need
to change its configuration. To determine whether changes are needed, consider
the following:
- Is all of the data actually
being used by the application? In other words, does the MDEX Engine return
record fields that are then ignored by the front-end application? This is an
especially serious problem with large documents.
- Is your application
returning unnecessary fields with the Select feature (described in “Controlling
Record Values with the Select Feature” in the
Endeca Advanced Development Guide)?
- Is your application
returning navigation pages that are too large? (Navigation pages are result
list pages, as opposed to record detail pages.) If the application returns a
lot of detailed information in the result list pages, consider reserving the
details for a click-through and reducing the size of the result list pages your
application returns on initial requests.
- Is your application
returning large numbers of records without using the bulk record API (described
in “Bulk Export of Records” in the
Endeca Advanced Development Guide)?
- Is the network saturated?
If so, upgrade to Gigabit Ethernet and confirm the transmission speed that is
actually in use. Ensure there is ample network bandwidth between the front-end
application and the Dgraph. To identify Gigabit Ethernet transmission speeds,
work with your network administrator.
- How are the NICs
configured? Ensure that NIC duplex settings match between the Dgraph host and
the web application client host and that both are set to full duplex. A
mismatch can cause latency issues.
- Could large response sizes
returned by the Dgraph be saturating the network? Use Cheetah analysis to
confirm large response sizes returned by the Dgraph, which can be caused by
the query features you use. The way certain features are used can cause slow
processing time and also saturate the network.
- Do you have queries waiting
in the Dgraph queue to be processed? Check the "Threading/Queuing Information"
summary in Cheetah for the number of requests experiencing queue issues and the
number of HTTP 408 (Request Timeout) errors. Review the Dgraph setting for the
number of worker threads and consider increasing it if it is set to 1. Queuing
can also be caused by spikes in traffic.
- Does the front-end
application process the responses returned by the Dgraph quickly enough? Check
CPU, memory, and disk I/O utilization on the front-end application server.
Confirm that the application server is properly tuned and that the Dgraph is
not returning overly large responses.
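The NIC speed and duplex checks from the list above can be scripted. The following is a minimal sketch in Python, assuming both hosts run Linux and that you have captured the output of `ethtool <interface>` on each one. The sample text is abbreviated and illustrative, and the function names are hypothetical. On other platforms, use whatever your network administrator recommends.

```python
import re

# Abbreviated, illustrative ethtool output for two hosts (an assumption,
# not captured from a real Dgraph deployment).
SAMPLE_DGRAPH_HOST = """Settings for eth0:
    Speed: 1000Mb/s
    Duplex: Full
"""
SAMPLE_APP_HOST = """Settings for eth0:
    Speed: 100Mb/s
    Duplex: Half
"""


def parse_link_settings(ethtool_output: str) -> dict:
    """Extract the link speed (Mb/s) and duplex mode from ethtool output.
    A value is None when the corresponding line is missing or unparsable."""
    speed = re.search(r"^\s*Speed:\s*(\d+)Mb/s", ethtool_output, re.MULTILINE)
    duplex = re.search(r"^\s*Duplex:\s*(\w+)", ethtool_output, re.MULTILINE)
    return {
        "speed_mbps": int(speed.group(1)) if speed else None,
        "duplex": duplex.group(1).lower() if duplex else None,
    }


def link_problems(dgraph_host: dict, app_host: dict) -> list:
    """List settings that conflict with the guidance above: both hosts at
    full duplex and running at Gigabit Ethernet speed."""
    problems = []
    for name, settings in (("Dgraph host", dgraph_host),
                           ("application host", app_host)):
        if settings["duplex"] != "full":
            problems.append(f"{name}: duplex is {settings['duplex']}, not full")
        speed = settings["speed_mbps"]
        if speed is None or speed < 1000:
            problems.append(f"{name}: link speed is {speed} Mb/s, below Gigabit")
    return problems


for problem in link_problems(parse_link_settings(SAMPLE_DGRAPH_HOST),
                             parse_link_settings(SAMPLE_APP_HOST)):
    print(problem)
```

The sketch only reports what each host has negotiated. A switch port that is forced to a different duplex mode can still cause a mismatch, so confirm the switch side with your network administrator.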