Data domain profile operations

These commands operate on Endeca data domain profiles.

A data domain profile is a named template that provides the configuration settings used to create Endeca data domains. The configuration settings for new data domains include:
  • The number of follower nodes required.
  • Whether the leader is dedicated to update requests or also shares the regular query load.
  • Whether the Endeca Server can oversubscribe its nodes while sharing them with other data domains.
  • Whether the Dgraph processes should be read-only.
  • Session affinity configuration.
  • The hardware characteristics of Dgraph processes (the number of threads and the cache size).
  • A set of Dgraph flags to use to start up the Dgraph processes. These flags help fine-tune performance of search and other computations.
  • Additional, low-level Dgraph flags.

The commands described in this topic create and delete data domain profiles, as well as return information about them.

Important: Once you create a data domain profile, you cannot change its configuration. However, you can create a different data domain profile and use the endeca-cmd update-dd command to update the existing data domain with this profile.
The commands for managing data domain profiles are:

put-dd-profile

The put-dd-profile command creates a new data domain profile with the specified name. Note that a default data domain profile (named default) always exists in the Endeca Server and does not need to be created with this command.

The syntax for this command is:
endeca-cmd put-dd-profile <new-profile-name> [global-options] [create-options]
where new-profile-name is the name of the new data domain profile.
The following additional command options can be used to change the created data domain profile's configuration (defaults are used otherwise):
Create Option Description
--description <text> A full description of the data domain profile. If the description contains spaces, it must be enclosed within double quotes. If this option is not specified, the description defaults to an empty string.
--query-leader <bool> Whether the leader node can accept read-only queries. If not specified, defaults to true.
--num-followers <int> How many follower nodes (Dgraph processes) should be configured in the data domain. If not specified, defaults to 0.
--read-only <bool> Whether a data domain should be created as read-only, so that it does not accept update requests, including data loading requests. If not specified, defaults to false.
--oversubscribe <bool> Whether a data domain (or any of its nodes) can be placed on an oversubscribed Endeca Server instance. If not specified, defaults to true.
--num-compute-threads <int> Specifies the number of computational threads in the Dgraph node's threading pool. The value must be an integer greater than or equal to 4. The recommended number of computational threads for the Dgraph process is typically equal to the number of CPU cores on the host Endeca Server node. If no value is specified or 0 is specified, this defaults to the larger of 4 or the number of available processors.
--compute-cache-size <int> Specifies an absolute value in MB for the Dgraph process cache, for each Dgraph node in the data domain.

If no value is specified or 0 is specified, this defaults to the Dgraph default cache size computed as 10% of the amount of RAM available on the Endeca Server node hosting the Dgraph node.

--startup-timeout <int> Specifies the maximum length of time (in seconds) that is allowed for the data domain's Dgraph process to start up. Default is 600 seconds.
--shutdown-timeout <int> Specifies the maximum length of time (in seconds) that is allowed for the data domain's Dgraph process to shut down. Default is 30 seconds.
--ancestor-counts <bool> Optional. If enabled, computes counts for root managed attribute values and any intermediate managed attribute value selections. The default is false. If this option is not enabled, the Dgraph only computes refinement counts for actual managed attribute values. It does not compute counts for root managed attribute values, or for any intermediate managed attribute value selections.
--backlog-timeout <int> Optional. The maximum number of seconds that a query is allowed to spend waiting in the processing queue before the Dgraph responds with a timeout message. The default value is 0 seconds.
--refinement-sampling-min <int> Optional. The minimum number of records to sample during refinement computation. The default is 0. For most applications, larger values reduce performance without improving dynamic refinement ranking quality. For some applications with extremely large, non-hierarchical managed attributes (if they cannot be avoided), larger values can meaningfully improve dynamic refinement ranking quality with minor performance cost.
--implicit-exact <bool> Optional. Disables approximate computation of implicit refinements. Use of this option is not recommended. If this option is not enabled (this is the default), managed attribute values without full coverage of the current result record set may sometimes be returned as implicit refinements, although the probability of such "false" implicit refinements is minuscule.
--implicit-sample <int> Optional. Sets the maximum number of records to sample when computing implicit refinements. This is a performance tuning parameter. The default value is 1024.
--net-timeout <int> Optional. Specifies the maximum number of seconds the Dgraph waits for the client to download data from queries across the network. The default network timeout value is 30 seconds.
--search-max <int> Optional. Sets the maximum number of terms for text search. The default value is 10.
--search-char-limit <int> Optional. Sets the maximum length (in characters) of a search term for record and value searches. The default is 132 characters. Any term exceeding this length will not be indexed, and thus will not be found in record and value searches.
--snippet-cutoff <int> Optional. Limits the number of words in an attribute that the Dgraph evaluates to identify the snippet. If a match is not found within the specified number of words, the Dgraph does not return a snippet, even if a match occurs later in the attribute value. If the option or its value is not specified, the default is 500.
--snippet-disable <bool> Optional. Globally disables snippeting. The default is false, meaning snippeting is enabled.
--dynamic-category-enable <bool> Optional. Enables all available dynamic attribute value characteristics. The default is false. Note that this option has performance implications and is not intended for production use.
--contraction-disable <bool> Optional. Tells the Dgraph not to compute implicit managed attributes, and to compute and present only explicitly specified managed attributes, when displaying refinements in navigation results. The default is false. Specifying this flag does not reduce the size of the resulting record set that is displayed; however, it improves run-time performance of the Dgraph process.

Be aware that if you use this flag, in order to receive meaningful navigation refinements, you need to make top-level precedence rules work for all outbound queries.

--wildcard-max <int> Optional. Specifies the maximum number of terms that can match a wildcard term in a wildcard query that contains punctuation, such as ab*c.def*. The default is 100.
--auto-idle <bool> Specifies whether the Endeca Server should turn this data domain idle if it receives no queries during the period set by --idle-timeout. If set to false, the data domain never turns idle. If set to true, the data domain turns idle when idle-timeout expires without any queries arriving. A data domain is considered idle if it is set to auto-idle in its data domain profile and the Endeca Server has stopped its Dgraph process. An idle data domain automatically activates when it receives a query, and the timeout timer is reset once the data domain is activated.
--idle-timeout <int> Optional. Specifies the timeout period, in minutes, after which the Endeca Server turns idle a data domain that is configured to auto-idle, if the data domain receives no queries during that period. If not specified, the default is 10 minutes.

This setting is not used for data domains for which auto-idle is set to false.

--session-id-type <type> The method to use for establishing session affinity.

The available options are: header (for HTTP headers), parameter (for URL parameters), or cookie. The default method is header.

--session-id-key <name> The name of the key to use for maintaining affinity.

The default name is X-Endeca-Session-ID.

--args <dgraph-flags> Specifies a list of the additional Dgraph flags that will be used for the data domain's Dgraph process. The --args flag must be the last flag on the command line, as all of its arguments are passed on to the Dgraph process.
--args --usage Provides a list of the available Dgraph process flags. See Dgraph flags.
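The defaults described above for --num-compute-threads and --compute-cache-size come down to simple arithmetic, which the following sketch illustrates. It is not Endeca Server code: the function names are hypothetical, the processor count and RAM size are passed in as plain arguments, and rounding the 10% figure down to a whole number of MB is an assumption.

```shell
# Illustrative only: mirrors the documented defaults, not Endeca Server code.

# effective_compute_threads <requested> <available-processors>
# A requested value of 0 (or an empty value) means "use the default":
# the larger of 4 and the number of available processors.
effective_compute_threads() {
  requested="${1:-0}"
  processors="$2"
  if [ "$requested" -ne 0 ]; then
    echo "$requested"
  elif [ "$processors" -gt 4 ]; then
    echo "$processors"
  else
    echo 4
  fi
}

# effective_cache_mb <requested-mb> <node-ram-mb>
# A requested value of 0 (or empty) means 10% of the RAM on the hosting node.
# Integer division rounds down; the real rounding rule is an assumption.
effective_cache_mb() {
  requested="${1:-0}"
  ram_mb="$2"
  if [ "$requested" -ne 0 ]; then
    echo "$requested"
  else
    echo $((ram_mb / 10))
  fi
}

effective_compute_threads 0 2     # prints 4
effective_compute_threads 0 16    # prints 16
effective_cache_mb 0 65536        # prints 6553
```

On a 2-processor node the floor of 4 threads applies, while on a 16-processor node the processor count wins. An explicit non-zero value is returned unchanged in both functions.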
This example:
endeca-cmd put-dd-profile MyProfile --description "group profile" --oversubscribe false --net-timeout 60
creates the data domain profile named MyProfile, which cannot be placed on an oversubscribed Endeca Server instance, and starts the Dgraph with a network timeout value of 60 seconds. MyProfile uses the default values for the other configuration settings.
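As a further illustration, the auto-idle and session affinity options from the table can be combined in one profile. The following is a sketch only: the profile name IdleProfile, the description text, and the key name MySessionKey are hypothetical, and the script prints the command instead of running it so that it can be tried without an Endeca Server.

```shell
# Hypothetical profile name; every flag below comes from the option table.
profile_name="IdleProfile"

# Collect the arguments as positional parameters so that the quoted
# description stays a single argument.
set -- put-dd-profile "$profile_name" \
  --description "idles after 20 quiet minutes" \
  --auto-idle true \
  --idle-timeout 20 \
  --session-id-type cookie \
  --session-id-key MySessionKey

# Dry run: show the command. The printed form drops the quotes around the
# description. To run it for real, use: endeca-cmd "$@"
cmd="endeca-cmd $*"
echo "$cmd"
```

A data domain created from such a profile would turn idle after 20 minutes without queries and would use a cookie, rather than the default HTTP header, for session affinity.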

get-dd-profile

The get-dd-profile command lists the characteristics of the data domain profile with the specified name.

The syntax for this command is:
endeca-cmd get-dd-profile <profile-name> [global-options]
where profile-name is the name of an existing data domain profile. This command has no options.
This example:
endeca-cmd get-dd-profile MyProfile
returns the following details for the data domain profile named MyProfile:
MyProfile
Description: group profile
AllowQueriesOnLeader: true
AllowOverSubscribe: false
NumFollowers: 0
ReadOnly: false
NumComputeThreads: 4
ComputeCacheSizeMB: 0
StartupTimeoutSeconds: 600
ShutdownTimeoutSeconds: 30
AutoIdle: false
SessionIdType: HEADER
SessionIdKey: X-Endeca-Session-ID
Args: []
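
Because the output above is a list of "Key: value" lines, individual settings can be pulled out with standard text tools. The sketch below is an illustration under stated assumptions: the helper name profile_field is hypothetical, the sample text is copied from part of the example output above rather than read from a live server, and the exact output format may differ between Endeca Server releases.

```shell
# profile_field <name>: reads get-dd-profile output on stdin and prints the
# value of the line that starts with "<name>: ". Hypothetical helper.
profile_field() {
  awk -v key="$1" '
    index($0, key ": ") == 1 { print substr($0, length(key) + 3) }
  '
}

# Sample text copied from the example output. With a live server this would be:
#   endeca-cmd get-dd-profile MyProfile | profile_field AllowOverSubscribe
sample_output='MyProfile
Description: group profile
AllowQueriesOnLeader: true
AllowOverSubscribe: false
NumFollowers: 0
StartupTimeoutSeconds: 600
Args: []'

printf '%s\n' "$sample_output" | profile_field AllowOverSubscribe      # prints false
printf '%s\n' "$sample_output" | profile_field StartupTimeoutSeconds   # prints 600
```

Matching on the full "Key: " prefix, instead of splitting each line on colons, keeps values that themselves contain a colon intact.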

list-dd-profiles

The list-dd-profiles command returns a list of data domain profiles configured in the Endeca Server cluster. For each data domain profile, this command returns its name, description, and other characteristics, such as the number of nodes, the number of query processing threads, and the list of arguments (if any) that are sent to the Dgraph processes for this profile.

The syntax for this command is:
endeca-cmd list-dd-profiles [--verbose] [global-options]

The --verbose option includes additional status information for each data domain profile.

delete-dd-profile

The delete-dd-profile command deletes the data domain profile with the specified name. This command does not affect any data domains that may be using this profile.

The syntax for this command is:
endeca-cmd delete-dd-profile <profile-name> [global-options]
where profile-name is the name of an existing data domain profile. This command has no options.
This example:
endeca-cmd delete-dd-profile MyProfile
deletes the data domain profile named MyProfile.
Note that you cannot delete the default data domain profile (named default). If you attempt to do so, the command fails with this error:
OES-000107: Cannot delete the default data domain profile.

You can delete a data domain profile that was used to create an existing data domain. When the data domain is created, the configuration information from the profile is copied into the data domain's state, so the data domain does not use the profile after creation.
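
In a cleanup script, the restriction on the default profile can be checked before the command is issued, which avoids the OES-000107 error. The following is a minimal sketch: the helper name delete_profile_cmd is hypothetical, and it only prints the delete command (a dry run) instead of calling endeca-cmd.

```shell
# delete_profile_cmd <profile-name>
# Prints the delete command for the given profile, but refuses the built-in
# profile named "default", which the Endeca Server does not allow to be
# deleted. Hypothetical helper; it never runs endeca-cmd itself.
delete_profile_cmd() {
  if [ -z "$1" ]; then
    echo "usage: delete_profile_cmd <profile-name>" >&2
    return 2
  fi
  if [ "$1" = "default" ]; then
    echo "refusing to delete the default data domain profile" >&2
    return 1
  fi
  echo "endeca-cmd delete-dd-profile $1"
}

delete_profile_cmd MyProfile    # prints: endeca-cmd delete-dd-profile MyProfile
delete_profile_cmd default || echo "skipped default"
```

The second call writes the refusal message to standard error and returns a non-zero status, so the calling script can skip the default profile and continue.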