Data Domain profile parameters

The endeca-cmd put-dd-profile command, or the Cluster Web Service putDataDomainProfile operation, lets you configure a data domain profile and its parameters. This topic lists these parameters and describes them.

Each data domain profile you create in the Oracle Endeca Server using the Cluster Web Service has the following parameters:

Note: The following table lists the parameters as they appear in the Cluster Web Service WSDL. If you send Cluster Web Service requests directly, use this format. The endeca-cmd utility calls the Cluster Web Service and lets you specify the same characteristics in a slightly different format. For example, the allowQueriesOnLeader parameter of the Cluster Web Service is equivalent to the --query-leader option of the endeca-cmd put-dd-profile command. Where an equivalent exists, both formats are included in the table.

If you would like to use default values for any of the parameters, depending on the tool you use for sending web service requests, you may or may not need to specify the default values explicitly. For example, soapUI fills in the default if it is not specified, by looking it up in the web service's WSDL. However, other clients may not fill in the default if you do not specify the parameter and its value explicitly.
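Because some clients do not fill in defaults, one option is to merge the documented defaults into every request before sending it. The following Python sketch uses a subset of the defaults listed in the table below. The DOCUMENTED_DEFAULTS dictionary and the with_defaults helper are illustrative names and are not part of any Endeca API.

```python
# Documented defaults for a subset of data domain profile parameters.
# The dictionary and the helper are illustrative, not part of the Endeca API.
DOCUMENTED_DEFAULTS = {
    "allowQueriesOnLeader": True,
    "numFollowers": 0,
    "readOnly": False,
    "allowOversubscribe": True,
    "numComputeThreads": 4,
    "computeCacheSizeMB": 0,
    "startupTimeoutSeconds": 600,
    "shutdownTimeoutSeconds": 30,
    "sessionIdType": "header",
    "sessionIdKey": "X-Endeca-Session-ID",
    "autoIdle": False,
    "idleTimeoutMinutes": 15,
}


def with_defaults(overrides):
    """Return the documented defaults with explicit overrides applied on top."""
    profile = dict(DOCUMENTED_DEFAULTS)
    profile.update(overrides)
    return profile


profile = with_defaults({"name": "test-profile", "numFollowers": 2})
print(profile["numFollowers"], profile["startupTimeoutSeconds"])
```

The resulting dictionary states every value explicitly, so the request does not depend on the client looking defaults up in the WSDL.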

Parameter Type Description
name string Name of the data domain profile. The endeca-cmd command does not have an equivalent option; the name is specified to endeca-cmd on the command line.
description string Description of the data domain profile. The equivalent command using this parameter in endeca-cmd is:
endeca-cmd put-dd-profile --description

If you use endeca-cmd and the description contains spaces, enclose it in double quotes.

allowQueriesOnLeader boolean If set to true, indicates that the leader node should handle read (non-updating) queries in addition to handling updating queries. The default is true. If set to false, indicates that the leader node should be dedicated to handling updating queries only. The equivalent command using this parameter in endeca-cmd is:
endeca-cmd put-dd-profile --query-leader true
Note: Non-updating queries represent read requests to the index. Updating queries change the index or other configuration settings in the Dgraph process; they represent write requests to the index.
numFollowers integer Specifies the desired number of follower nodes to handle non-updating query load. The default is 0 (in this case, the data domain cluster consists of one leader node). If you set allowQueriesOnLeader to false, numFollowers must be equal to or greater than one.
Note: Each Endeca Server instance can host only one Dgraph node for each data domain. Therefore, you must have a sufficient number of Endeca Server nodes before adding follower nodes to a data domain. For example, you cannot create a four-node data domain in an Endeca Server cluster with only three Endeca Server nodes.
The equivalent command with this parameter is:
endeca-cmd put-dd-profile --num-followers 2
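The two numFollowers constraints described above (at least one follower when the leader does not take queries, and one Endeca Server node per Dgraph node) can be checked before a profile is used. This is a minimal Python sketch of that check; the function names are hypothetical and the Endeca Server performs its own validation.

```python
def required_server_nodes(num_followers):
    """One leader plus the followers, each hosted on its own Endeca Server node."""
    return 1 + num_followers


def check_topology(allow_queries_on_leader, num_followers, server_nodes):
    """Return a list of problems with the requested data domain topology."""
    problems = []
    if not allow_queries_on_leader and num_followers < 1:
        problems.append(
            "numFollowers must be >= 1 when allowQueriesOnLeader is false"
        )
    needed = required_server_nodes(num_followers)
    if needed > server_nodes:
        problems.append(
            f"need {needed} Endeca Server nodes, only {server_nodes} available"
        )
    return problems


# A four-node data domain (1 leader + 3 followers) on a three-node cluster.
print(check_topology(True, 3, 3))
# A leader dedicated to updates with no followers to serve queries.
print(check_topology(False, 0, 3))
```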
readOnly boolean Indicates whether the Dgraph nodes in the data domain should be read-only. The default is false.
Note: If you set allowQueriesOnLeader to true, readOnly must be false.
The equivalent command with this parameter is:
endeca-cmd put-dd-profile --read-only false
allowOversubscribe boolean If set to true, the Endeca Server Cluster can exceed its capacity while sharing nodes hosting this data domain with other data domains. The default is true. The setting for this flag works together with numComputeThreads.

When this setting is true, the data domain cluster nodes may still use less capacity than is available on the Endeca Server node. However, this flag allows the Dgraph nodes to compete for processing threads configured in the data domain profile, even if the Endeca Server node may not have these threads available. The thread allocation in this case is handled by the operating system. Allowing the node to oversubscribe can be useful if you are setting data domain profiles for the development environment and would like to conserve hardware resources.

If a data domain cluster is allowed to oversubscribe, the Endeca Server cluster may host the nodes for this data domain on a node that does not have enough capacity. For example, an Endeca Server node with 12 CPUs can host 7 Dgraph nodes (for 7 different data domain clusters), each configured with --num-compute-threads 2 and --oversubscribe true. Together these Dgraph nodes request 14 threads on 12 CPUs, so the Endeca Server node in this configuration is considered "oversubscribed".

If set to false, the nodes in the Endeca Server cluster hosting this data domain can share their resources only within their capacity. For example, if the Endeca Server node has 16 CPUs and is hosting 2 Dgraph nodes each configured with 8 threads, it is sharing its capacity equally between two data domain nodes, utilizing 100% of its CPU, but not oversubscribing. A data domain profile that is not allowed to oversubscribe is useful when setting up a data domain with high hardware-utilization requirements in a production environment. The equivalent command with this parameter is:
endeca-cmd put-dd-profile --oversubscribe true
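The two capacity examples above come down to comparing the threads requested by the hosted Dgraph nodes with the CPUs on the Endeca Server node. The following Python sketch reproduces that arithmetic. It assumes capacity is measured as one thread per CPU, as the examples suggest; the function names are hypothetical and the real placement logic in the Endeca Server may consider more than this.

```python
def is_oversubscribed(node_cpus, threads_per_dgraph):
    """True if the hosted Dgraph nodes request more threads than the node has CPUs."""
    return sum(threads_per_dgraph) > node_cpus


def can_place(node_cpus, hosted_threads, new_threads, allow_oversubscribe):
    """Simplified placement rule: without oversubscription, stay within capacity."""
    if allow_oversubscribe:
        return True
    return sum(hosted_threads) + new_threads <= node_cpus


# 7 Dgraph nodes with 2 threads each on a 12-CPU node: 14 threads requested.
print(is_oversubscribed(12, [2] * 7))
# 2 Dgraph nodes with 8 threads each on a 16-CPU node: exactly at capacity.
print(is_oversubscribed(16, [8, 8]))
# A third 8-thread Dgraph node fits only if oversubscription is allowed.
print(can_place(16, [8, 8], 8, allow_oversubscribe=False))
```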
numComputeThreads integer The number of threads to allocate for processing requests on each Dgraph node serving a data domain using this profile. The number of threads should be equal to or greater than 4. The default is 4. The equivalent command with this parameter is:
endeca-cmd put-dd-profile --num-compute-threads 4
computeCacheSizeMB integer The amount of RAM, in MB, to allocate to the result cache on each Dgraph node in the data domain. The default is 0, which is interpreted as follows:
When the value is 0, the Dgraph cache size is computed as 10% of the amount of RAM available on the Endeca Server node hosting the Dgraph node. The equivalent command with this parameter is similar to this example, which sets the Dgraph cache size to 1 MB:
endeca-cmd put-dd-profile --compute-cache-size 1
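The handling of the 0 default can be written as a small calculation. This Python sketch assumes the node's available RAM is known in MB and that the 10% figure is rounded down to a whole number of MB; both the function name and the rounding are assumptions, not documented behavior.

```python
def effective_cache_mb(compute_cache_size_mb, node_ram_mb):
    """Return the cache size in MB that a computeCacheSizeMB setting resolves to."""
    if compute_cache_size_mb == 0:
        # 0 means "use the default": 10% of the RAM available on the hosting node.
        return node_ram_mb // 10
    return compute_cache_size_mb


print(effective_cache_mb(0, 64000))   # default: 10% of 64000 MB
print(effective_cache_mb(1, 64000))   # explicit value, as in --compute-cache-size 1
```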
startupTimeoutSeconds integer The maximum time, in seconds, allowed for the Dgraph nodes in the data domain to start up. The default is 600 seconds. The equivalent command with this parameter is:
endeca-cmd put-dd-profile --startup-timeout <num_sec>
shutdownTimeoutSeconds integer The maximum time, in seconds, allowed for the Dgraph nodes to shut down gracefully. The default is 30 seconds. The equivalent command with this parameter is:
endeca-cmd put-dd-profile --shutdown-timeout <num_sec>
enableAncestorCounts boolean Optional. If enabled, computes counts for root managed attribute values and any intermediate managed attribute value selections. If the flag is not specified, or if it is specified, but the value is not specified, the default is false. In this case, the Dgraph only computes refinement counts for actual managed attribute values. It does not compute counts for root managed attribute values, or for any intermediate managed attribute value selections. The equivalent command with this parameter is:
endeca-cmd put-dd-profile --ancestor-counts false
backlogTimeout integer Optional. The maximum number of seconds that a query is allowed to spend waiting in the processing queue before the Dgraph responds with a timeout message. If the flag is specified, but the value is not specified, the default value is 0 seconds.
minRefinementSamples integer Optional. The minimum number of records to sample during refinement computation. If the flag is specified, but the value is not specified, the default is 0. For most applications, larger values reduce performance without improving dynamic refinement ranking quality. For some applications with extremely large, non-hierarchical managed attributes (if they cannot be avoided), larger values can meaningfully improve dynamic refinement ranking quality with minor performance cost. The equivalent command with this parameter is:
endeca-cmd put-dd-profile --refinement-sampling-min 0
enableExactImplicits boolean Optional. Disables approximate computation of implicit refinements. Use of this option is not recommended. If this option is not enabled (this is the default), managed attribute values without full coverage of the current result record set may sometimes be returned as implicit refinements, although the probability of such "false" implicit refinements is minuscule. The equivalent command with this parameter is:
endeca-cmd put-dd-profile --implicit-exact false
numImplicitSamples integer Optional. Sets the maximum number of records to sample when computing implicit refinements. This is a performance tuning parameter. If the flag is specified, but the value is not specified, the default value is 1024. The equivalent command with this parameter is:
endeca-cmd put-dd-profile --implicit-sample 1024
netTimeoutSeconds integer Optional. Specifies the maximum number of seconds the Dgraph waits for the client to download data from queries across the network. If the flag is specified, but the value is not specified, the default network timeout value is 30 seconds. The equivalent command with this parameter is:
endeca-cmd put-dd-profile --net-timeout 30
maxSearchTerms integer Optional. Specifies the maximum number of terms for text search. If the flag is specified, but the value is not specified, the default value is 10. The equivalent command with this parameter is:
endeca-cmd put-dd-profile --search-max 10
searchCharLimit integer Optional. Sets the maximum length (in characters) of a search term for record and value searches. The default is 132 characters. Any term exceeding this length will not be indexed, and thus will not be found in record and value searches. The equivalent command with this parameter is:
endeca-cmd put-dd-profile --search-char-limit 132
snippetCutoff integer Optional. Limits the number of words in an attribute that the Dgraph evaluates to identify the snippet. If a match is not found within <num> words, the Dgraph does not return a snippet, even if a match occurs later in the attribute value. If the flag is specified, but <num> is not specified, the default is 500. The equivalent command with this parameter is:
endeca-cmd put-dd-profile --snippet-cutoff 500
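The effect of the cutoff can be illustrated with a simplified search over the first words of an attribute value. This Python sketch is not the Dgraph's snippeting algorithm; the function name, the whitespace tokenization, and the context parameter are assumptions made only to show that a match beyond the cutoff yields no snippet.

```python
def find_snippet(text, term, cutoff=500, context=3):
    """Return words around the first match within the first `cutoff` words, else None."""
    words = text.split()
    for i, word in enumerate(words[:cutoff]):
        if word.lower().strip(".,;:!?") == term.lower():
            start = max(0, i - context)
            return " ".join(words[start:i + context + 1])
    # No match inside the cutoff window, even if one occurs later in the text.
    return None


text = "alpha beta gamma delta epsilon"
print(find_snippet(text, "delta", cutoff=3))               # match is the 4th word
print(find_snippet(text, "delta", cutoff=500, context=1))  # match is within the cutoff
```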
disableSnippets boolean Optional. Globally disables snippeting. The default is false, meaning snippeting is enabled. The equivalent command with this parameter is:
endeca-cmd put-dd-profile --snippet-disable false
enableAllDynamicsMAs boolean Optional. Enables all available dynamic attribute value characteristics. If the flag is specified, but the value is not specified, the default value is false. Note that this option has performance implications and is not intended for production use. The equivalent command with this parameter is:
endeca-cmd put-dd-profile --dynamic-category-enable false
disableContraction boolean Optional. Specifies to the Dgraph not to compute implicit managed attributes, and to only compute and present explicitly specified managed attributes, when displaying refinements in navigation results. If the flag is specified, but the value is not specified, the default is false. Specifying this flag does not reduce the size of the resulting record set that is being displayed; however, it improves run-time performance of the Dgraph process.
Be aware that if you use this flag, in order to receive meaningful navigation refinements, you need to make top-level precedence rules work for all outbound queries. The equivalent command with this parameter is:
endeca-cmd put-dd-profile --contraction-disable false
maxWildcards integer Optional. Specifies the maximum number of terms that can match a wildcard term in a wildcard query that contains punctuation, such as ab*c.def*. If the flag is specified, but the value is not specified, the default value is 100. The equivalent command with this parameter is:
endeca-cmd put-dd-profile --wildcard-max 100
sessionIdType string Specifies the method to use for session affinity when routing requests to this data domain. The default method is header: if you do not specify sessionIdType in the data domain profile, the HTTP header is used for session affinity.
Available options are:
  • HEADER. The HTTP headers will be used for session affinity.
  • PARAMETER. The URL parameters will be used.
  • COOKIE. Cookies will be used.
The values for these options are not case-sensitive. The equivalent endeca-cmd command with this parameter is:
endeca-cmd put-dd-profile --session-id-type header
This example specifies header as the session affinity method.
sessionIdKey string Specifies the name of the object to be checked, for establishing session affinity through one of the methods specified with sessionIdType. The value of sessionIdKey can be any string that is allowed to be used as an HTTP header, a URL parameter, or a cookie name. The default is X-Endeca-Session-ID. If you don't specify a value for sessionIdKey, this default is used as the name of the header, URL parameter, or cookie, depending on sessionIdType. The equivalent endeca-cmd command with this parameter is:
endeca-cmd put-dd-profile --session-id-key X-Endeca-Session-ID
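To show how sessionIdType and sessionIdKey work together, the following Python sketch reads a session ID from a request using each of the three methods. It illustrates the concept from the client or router side only; the function name and the request representation (a URL string plus a dictionary of headers) are assumptions and do not describe the Endeca Server's internal routing code.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qs, urlsplit


def extract_session_id(url, headers, session_id_type="header",
                       session_id_key="X-Endeca-Session-ID"):
    """Return the session ID found in the request, or None if it is absent."""
    kind = session_id_type.lower()  # option values are not case-sensitive
    if kind == "header":
        # HTTP header names are case-insensitive.
        for name, value in headers.items():
            if name.lower() == session_id_key.lower():
                return value
        return None
    if kind == "parameter":
        values = parse_qs(urlsplit(url).query).get(session_id_key)
        return values[0] if values else None
    if kind == "cookie":
        raw = next((v for k, v in headers.items() if k.lower() == "cookie"), "")
        cookie = SimpleCookie()
        cookie.load(raw)
        morsel = cookie.get(session_id_key)
        return morsel.value if morsel is not None else None
    raise ValueError(f"unsupported sessionIdType: {session_id_type}")


print(extract_session_id("http://host/path", {"x-endeca-session-id": "s1"}))
print(extract_session_id("http://host/path?sid=s2", {}, "PARAMETER", "sid"))
print(extract_session_id("http://host/path", {"Cookie": "sid=s3; a=b"}, "cookie", "sid"))
```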
autoIdle boolean Indicates whether to idle a data domain after a timeout if no queries are issued for this data domain during the timeout period. The default is false. If set to false, the data domain is never made idle. If set to true, the data domain will be made idle after a timeout specified in the idleTimeoutMinutes setting of the data domain profile. When the data domain is made idle, the Endeca Server stops its Dgraph processes, and stops allocating resources to them. However, if end-users issue a query to such a data domain, the data domain is activated and its Dgraph processes are restarted by the Endeca Server. The equivalent command with this parameter is:
endeca-cmd put-dd-profile --auto-idle false
idleTimeoutMinutes string Optional. Indicates the timeout to use for idling, in minutes. The default is 15 minutes, which is used if you do not specify the timeout. 15 minutes is also the lowest idle timeout value you can set. This timeout allows for long-running queries to complete successfully before the data domain is turned idle. If you have any queries running longer than 15 minutes, increase the idle timeout.
If a data domain is set to auto-idle and for the specified timeout period no queries arrive, it turns idle, and its Dgraph processes are stopped gracefully. If queries arrive, the data domain is activated. The timeout starts for a data domain that is set to auto-idle as soon as it is created and enabled. The timer is reset each time the data domain receives a query (and its Dgraph process is restarted) after being idle. The equivalent command with this parameter is:
endeca-cmd put-dd-profile --auto-idle true --idle-timeout 30
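The interaction between autoIdle and idleTimeoutMinutes can be summarized as a single decision: idle the data domain only if auto-idle is on and no query has arrived within the timeout. This Python sketch models that decision. The function name is hypothetical, and it treats the creation or enable time as the initial value of last_query_at, which is an assumption based on the description above.

```python
from datetime import datetime, timedelta

MIN_IDLE_TIMEOUT_MINUTES = 15  # lowest value the documentation allows


def should_idle(auto_idle, last_query_at, now, idle_timeout_minutes=15):
    """Decide whether a data domain should be turned idle at time `now`."""
    if idle_timeout_minutes < MIN_IDLE_TIMEOUT_MINUTES:
        raise ValueError("idleTimeoutMinutes cannot be lower than 15")
    if not auto_idle:
        return False  # with autoIdle false the data domain is never made idle
    return now - last_query_at >= timedelta(minutes=idle_timeout_minutes)


last_query = datetime(2024, 1, 1, 12, 0)
# Matches --auto-idle true --idle-timeout 30: idle after 30 quiet minutes.
print(should_idle(True, last_query, datetime(2024, 1, 1, 12, 20), 30))
print(should_idle(True, last_query, datetime(2024, 1, 1, 12, 30), 30))
```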
args string List of flags to specify to the Dgraph processes serving a data domain using this profile. If not specified, no Dgraph flags are used. The equivalent command with this parameter is:
endeca-cmd put-dd-profile --args <list_of_args>

To obtain a list of flags, use: endeca-cmd put-dd-profile --args --usage. For a list of Dgraph flags, see the Oracle Endeca Server Administrator's Guide.
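Since the table pairs each WSDL parameter with an endeca-cmd option, the pairs can be collected into a lookup and used to assemble a command line from WSDL-style names. This Python sketch covers only a subset of the pairs listed above. The function name is hypothetical, and placing the profile name directly after put-dd-profile is an assumption; check the endeca-cmd usage output for the exact argument order.

```python
# WSDL parameter name -> endeca-cmd option, taken from the table above (subset).
FLAG_FOR_PARAMETER = {
    "description": "--description",
    "allowQueriesOnLeader": "--query-leader",
    "numFollowers": "--num-followers",
    "readOnly": "--read-only",
    "allowOversubscribe": "--oversubscribe",
    "numComputeThreads": "--num-compute-threads",
    "computeCacheSizeMB": "--compute-cache-size",
    "startupTimeoutSeconds": "--startup-timeout",
    "shutdownTimeoutSeconds": "--shutdown-timeout",
    "sessionIdType": "--session-id-type",
    "sessionIdKey": "--session-id-key",
    "autoIdle": "--auto-idle",
    "idleTimeoutMinutes": "--idle-timeout",
}


def build_put_dd_profile_args(name, **params):
    """Build an argument list for endeca-cmd put-dd-profile (name position assumed)."""
    args = ["endeca-cmd", "put-dd-profile", name]
    for parameter, value in params.items():
        flag = FLAG_FOR_PARAMETER[parameter]  # KeyError for unmapped parameters
        if isinstance(value, bool):
            value = "true" if value else "false"
        args += [flag, str(value)]
    return args


args = build_put_dd_profile_args(
    "test-profile",
    description="my test profile",
    allowQueriesOnLeader=False,
    numFollowers=2,
)
print(args)
```

Passing the arguments as a list (for example to subprocess.run) keeps a description with spaces as a single argument, so no manual quoting is needed in that case.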