This section discusses the performance implications of some aspects of Endeca record configuration.

Aggregated records are not necessarily an expensive feature in the MDEX Engine. However, use them only when necessary, because they add organizational and implementation complexity to the application (particularly if the rollup key is different from the display information).

Using aggregated records slows down sorting and paging.

Note also that dynamic statistics on regular and aggregated records (controlled with the --stat-abins Dgraph flag) are expensive computations for the MDEX Engine. See the topic in this section for more details.

You can use the Np parameter to specify the number of records to be returned with each aggregated record. For example, Np=1 means that a single representative record is returned with each aggregated record, and Np=2 brings back all records in each aggregated record.
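As an illustration of where Np sits in a request, the following minimal sketch builds a query string for an aggregated record query. It assumes that N carries the navigation state and Nu names the rollup key, alongside the Np parameter described above. The helper name aggregated_query_string and the property name P_Winery are hypothetical and used only for this example.

```python
from urllib.parse import urlencode


def aggregated_query_string(rollup_key, records_per_aggregate, navigation="0"):
    """Build a query string for an aggregated record query (illustrative only).

    Assumptions: N is the navigation state and Nu is the rollup key parameter.
    Only the two Np values discussed in this section are accepted here.
    """
    if records_per_aggregate not in (1, 2):
        raise ValueError("this sketch only handles Np=1 and Np=2")
    return urlencode(
        {"N": navigation, "Nu": rollup_key, "Np": records_per_aggregate}
    )


# One representative record per aggregated record.
print(aggregated_query_string("P_Winery", 1))  # N=0&Nu=P_Winery&Np=1
# All base records for each aggregated record.
print(aggregated_query_string("P_Winery", 2))  # N=0&Nu=P_Winery&Np=2
```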

Using Np=2 may adversely affect performance, because it causes the MDEX Engine to serialize more records for each query. The degree to which performance is affected is proportional to the number of base records in each aggregated record that is returned.

In most cases, it is not recommended to bring back all the records of every aggregated record with Np=2 in each query, because serializing the result can be expensive for the MDEX Engine. However, Np=2 can be useful in some cases, especially when each aggregated record contains only a few records.

For example, if each aggregated record contains only 2 records, record serialization with Np=2 takes only twice as long as with Np=1. If, however, each aggregated record has 100 records associated with it, record serialization is 100 times more expensive with Np=2 than with Np=1.
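The arithmetic above can be sketched as a simple record count. This is a rough model rather than a description of MDEX Engine internals: it assumes serialization cost scales linearly with the number of records written, and the function names are hypothetical.

```python
def records_serialized(aggregate_sizes, np_value):
    """Count the records written for one result page (rough model).

    aggregate_sizes: number of base records in each returned aggregated record.
    np_value: 1 returns one representative per aggregate, 2 returns all records.
    """
    if np_value == 1:
        return len(aggregate_sizes)
    if np_value == 2:
        return sum(aggregate_sizes)
    raise ValueError("this sketch only handles Np=1 and Np=2")


def relative_serialization_cost(aggregate_sizes):
    """Ratio of records serialized with Np=2 to records serialized with Np=1."""
    return records_serialized(aggregate_sizes, 2) / records_serialized(aggregate_sizes, 1)


# Ten aggregated records on the page, 2 base records each.
print(relative_serialization_cost([2] * 10))    # 2.0
# Ten aggregated records on the page, 100 base records each.
print(relative_serialization_cost([100] * 10))  # 100.0
```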

Record serialization typically accounts for a large portion of query processing time only in very low latency applications or when very large numbers of records are returned.

Note also that in many cases, a 100-fold increase in record serialization time is barely noticeable, because serialization is such a small part of the total query time. You can examine the Prefetching horizontal records statistics in the Hotspot Analysis section of the Stats page to determine whether your performance issue is due to returning many records.
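To see why a large increase in serialization time can go unnoticed, consider a back-of-the-envelope estimate. It assumes that only the serialization share of the query time grows and that everything else stays constant. The fractions used here are invented for illustration and are not measurements; the function name is hypothetical.

```python
def overall_slowdown(serialization_fraction, serialization_factor):
    """Estimate the total query time multiplier.

    serialization_fraction: share of query time spent serializing records
        with Np=1 (between 0 and 1).
    serialization_factor: how many times more expensive serialization becomes.
    """
    if not 0.0 <= serialization_fraction <= 1.0:
        raise ValueError("serialization_fraction must be between 0 and 1")
    unchanged = 1.0 - serialization_fraction
    return unchanged + serialization_fraction * serialization_factor


# Serialization is 0.1% of query time: a 100-fold increase adds roughly 10%.
print(round(overall_slowdown(0.001, 100), 3))
# Serialization is half of query time (for example, a small, fully cached
# data set): the same 100-fold increase dominates the query.
print(overall_slowdown(0.5, 100))  # 50.5
```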

For example, suppose you have a very small data set with queries served almost entirely from the cache, so that most of the computation done by the Dgraph for each query consists of assembling the records to be returned. In this case, the negative effect of Np=2 on performance shows up as a very large value for the Prefetching horizontal records statistics, which indicates that Np=2 should not be used.

