You can specify default settings for the Dgraphs in an application, as well as group them into Dgraph clusters in order to easily perform operations on multiple Dgraphs.
The dgraph-defaults element in the DgraphDefaults.xml file defines shared settings that are inherited (or overridden) by each Dgraph in an application. These default properties are used in the baseline and partial update scripts to define operational functionality.
Properties specify general Dgraph properties.
srcIndexDir - Location from which a new index will be copied to a local directory on the Dgraph host.
srcIndexHostId - Host from which a new index will be copied to a local directory on the Dgraph host.
srcPartialsDir - Location from which a new partial update will be copied to a local directory on the Dgraph host.
srcPartialsHostId - Host from which partial updates will be copied to a local directory on the Dgraph host.
srcCumulativePartialsDir - Location from which all partial updates accumulated since the last baseline update will be copied to a local directory on the Dgraph host.
srcCumulativePartialsHostId - Host from which all partial updates accumulated since the last baseline update will be copied to a local directory on the Dgraph host.
srcDgraphConfigDir - Location from which Dgraph configuration files will be copied to a local directory on the Dgraph host.
srcDgraphConfigHostId - Host from which Dgraph configuration files will be copied to a local directory on the Dgraph host.
shutdownTimeout - Number of seconds to wait for a component to stop (after receiving a stop command).
numIdleSecondsAfterStop - Number of seconds to pause/sleep after a component is stopped. Typically, this is used to ensure that log file locks are released by the component before proceeding.
Directories specify local directories where generated Dgraph files are copied.
localIndexDir - Local directory to which a single copy of a new index is copied from the source index directory on the source index host.
localCumulativePartialsDir - Local directory to which partial updates are copied from the source (cumulative) partials directory on the source partials host.
localDgraphConfigDir - Local directory to which Dgraph configuration files are copied from the source Dgraph config directory on the source Dgraph config host.
Args specify arguments passed in as command-line flags. Each flag goes in its own <arg> element, and a flag that requires arguments is followed by successive <arg> elements that specify those arguments.
Note that these settings are not cumulative. Specifying <args> for an individual Dgraph completely overrides any default settings.
<!--
#######################################################################
# Global Dgraph settings, inherited by all dgraphs
#
-->
<dgraph-defaults>
  <properties>
    <property name="srcIndexDir" value="./data/dgidx_output" />
    <property name="srcIndexHostId" value="ITLHost" />
    <property name="srcPartialsDir" value="./data/partials/forge_output" />
    <property name="srcPartialsHostId" value="ITLHost" />
    <property name="srcCumulativePartialsDir" value="./data/partials/cumulative_partials" />
    <property name="srcCumulativePartialsHostId" value="ITLHost" />
    <property name="srcDgraphConfigDir" value="./data/workbench/dgraph_config" />
    <property name="srcDgraphConfigHostId" value="ITLHost" />
    <property name="numLogBackups" value="10" />
    <property name="shutdownTimeout" value="30" />
    <property name="numIdleSecondsAfterStop" value="0" />
  </properties>
  <directories>
    <directory name="localIndexDir">./data/dgraphs/local_dgraph_input</directory>
    <directory name="localCumulativePartialsDir">./data/dgraphs/local_cumulative_partials</directory>
    <directory name="localDgraphConfigDir">./data/dgraphs/local_dgraph_config</directory>
  </directories>
  <args>
    <arg>--threads</arg>
    <arg>2</arg>
    <arg>--whymatch</arg>
    <arg>--spl</arg>
    <arg>--dym</arg>
    <arg>--dym_hthresh</arg>
    <arg>5</arg>
    <arg>--dym_nsug</arg>
    <arg>3</arg>
    <arg>--stat-abins</arg>
  </args>
  <startup-timeout>120</startup-timeout>
</dgraph-defaults>
The <dgraph> element defines settings for a specific Dgraph. Individual Dgraphs are typically defined in the same file that defines the Dgraph clusters which contain them.
<dgraph id="AuthoringDgraph" host-id="AuthoringMDEXHost" port="15002">
  <properties>
    <property name="DgraphContentGroup" value="Authoring" />
  </properties>
  <log-dir>./logs/dgraphs/AuthoringDgraph</log-dir>
  <input-dir>./data/dgraphs/AuthoringDgraph/dgraph_input</input-dir>
  <update-dir>./data/dgraphs/AuthoringDgraph/dgraph_input/updates</update-dir>
</dgraph>
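Because <args> settings are not cumulative, a Dgraph that changes a single flag has to restate every default flag it still needs. The fragment below is a minimal sketch of that, assuming a hypothetical Dgraph with the ID Dgraph1 on a host with the ID MDEXHost. It raises the thread count to 4 and repeats the spelling-related flags from the defaults shown earlier; the flags it does not list (--whymatch and --stat-abins) would no longer be passed to this Dgraph. The placement of <args> ahead of <log-dir> is also an assumption and should be checked against your own template files.

```xml
<!-- Hypothetical Dgraph: the id, host-id, port, and paths are illustrative only. -->
<dgraph id="Dgraph1" host-id="MDEXHost" port="15000">
  <args>
    <!-- Overridden value (the defaults example uses 2). -->
    <arg>--threads</arg>
    <arg>4</arg>
    <!-- Restated from the defaults, because this list replaces them entirely. -->
    <arg>--spl</arg>
    <arg>--dym</arg>
    <arg>--dym_hthresh</arg>
    <arg>5</arg>
    <arg>--dym_nsug</arg>
    <arg>3</arg>
  </args>
  <log-dir>./logs/dgraphs/Dgraph1</log-dir>
  <input-dir>./data/dgraphs/Dgraph1/dgraph_input</input-dir>
  <update-dir>./data/dgraphs/Dgraph1/dgraph_input/updates</update-dir>
</dgraph>
```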
Each Dgraph takes the following attributes:
It also takes the following nested elements:
Dgraph clusters let you apply actions to an entire cluster of Dgraphs instead of manually iterating over individual Dgraphs. They contain logic associated with Dgraph restart strategies, and can also be configured to copy data in parallel or serially. The parallel or serial setting applies to copies that are performed to distribute a new index, partial updates, or configuration updates to each server that hosts a Dgraph.
You can define multiple clusters, with no restriction around which Dgraphs belong to each cluster or how many clusters a Dgraph belongs to.
Typically, clusters are defined with the <dgraph-cluster> element in the AuthoringDgraphCluster and LiveDgraphCluster XML files, with references to all Dgraphs that belong to that cluster. Each file also includes host information for the cluster. Each host must be given a unique ID. The port specified for each host is the port on which the EAC Agent is listening, which is the Endeca HTTP Service port on that server:
<!--
########################################################################
# Authoring MDEX Hosts - The machines used to host all MDEX processes
# for the 'authoring environment' MDEX cluster.
#
-->
<host id="AuthoringMDEXHost" hostName="myhost1.company.com" port="8888" />

<!--
########################################################################
# Authoring Dgraph Cluster - The 'authoring environment' MDEX cluster.
#
-->
<dgraph-cluster id="AuthoringDgraphCluster" getDataInParallel="true" enabled="true"
    configSnapshotDir="./data/dgraphcluster/AuthoringDgraphCluster/config_snapshots">
  <dgraph ref="AuthoringDgraph" />
</dgraph-cluster>
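A cluster that spans several servers follows the same pattern. The sketch below assumes a hypothetical live environment with two hosts and two Dgraphs; every ID, host name, and path in it is invented for illustration. It reuses only the attributes that appear in the example above, and it assumes that setting getDataInParallel to false makes the cluster copy data to the servers serially.

```xml
<!-- Hypothetical live hosts; each host needs a unique id. -->
<host id="LiveMDEXHostA" hostName="livehost1.company.com" port="8888" />
<host id="LiveMDEXHostB" hostName="livehost2.company.com" port="8888" />

<!-- Hypothetical live cluster. DgraphA1 and DgraphB1 are assumed to be
     defined in the same file, each with its own dgraph element. -->
<dgraph-cluster id="LiveDgraphCluster" getDataInParallel="false" enabled="true"
    configSnapshotDir="./data/dgraphcluster/LiveDgraphCluster/config_snapshots">
  <dgraph ref="DgraphA1" />
  <dgraph ref="DgraphB1" />
</dgraph-cluster>
```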
Each Dgraph cluster takes the following attributes:
In addition to standard Dgraph configuration and process arguments in a <dgraph> element, the following custom properties define restart, update, and promotion behavior:
The restartGroup property indicates the Dgraph's membership in a restart group. When applying a new index or configuration updates to a cluster of Dgraphs (or when updating a cluster of Dgraphs with a provisioning change such as a new or modified process argument), the Dgraph cluster object applies changes simultaneously to all Dgraphs in a restart group.
Similarly, the updateGroup property indicates the Dgraph's membership in an update group. When applying partial updates, the Dgraph cluster object applies changes simultaneously to all Dgraphs in an update group.
This means that a few common restart strategies can be applied as follows:
To restart/update all Dgraphs at the same time: specify the same restartGroup/updateGroup value for each Dgraph.
To restart/update Dgraphs one at a time: specify a unique restartGroup/updateGroup value for each Dgraph, or omit one or both of the custom properties on all Dgraphs (causing the template to assign a unique group to each Dgraph).
To restart/update Dgraphs on each server simultaneously: specify the same restartGroup/updateGroup value for each Dgraph on a physical server.
To restart Dgraphs one at a time but apply partial updates to all Dgraphs at once: specify a unique restartGroup value for each Dgraph and specify the same updateGroup value for each Dgraph.
Restart and update group values are arbitrary strings. The DgraphCluster will iterate through the groups in alphabetical order, though non-standard characters may result in groups being updated in an unexpected order.
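As an illustration of the last strategy in the list, the following sketch assumes two hypothetical Dgraphs; the IDs, host IDs, ports, and paths are invented. Each Dgraph has its own restartGroup, so a new index would be applied to group A before group B, while the shared updateGroup value means partial updates are applied to both Dgraphs at once.

```xml
<!-- Hypothetical Dgraph on the first server: restart group A. -->
<dgraph id="DgraphA1" host-id="LiveMDEXHostA" port="15000">
  <properties>
    <property name="restartGroup" value="A" />
    <property name="updateGroup" value="A" />
  </properties>
  <log-dir>./logs/dgraphs/DgraphA1</log-dir>
  <input-dir>./data/dgraphs/DgraphA1/dgraph_input</input-dir>
  <update-dir>./data/dgraphs/DgraphA1/dgraph_input/updates</update-dir>
</dgraph>

<!-- Hypothetical Dgraph on the second server: restart group B,
     but the same update group as DgraphA1. -->
<dgraph id="DgraphB1" host-id="LiveMDEXHostB" port="15000">
  <properties>
    <property name="restartGroup" value="B" />
    <property name="updateGroup" value="A" />
  </properties>
  <log-dir>./logs/dgraphs/DgraphB1</log-dir>
  <input-dir>./data/dgraphs/DgraphB1/dgraph_input</input-dir>
  <update-dir>./data/dgraphs/DgraphB1/dgraph_input/updates</update-dir>
</dgraph>
```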
The DgraphContentGroup property indicates whether a Dgraph belongs to a "Live" or "Authoring" environment.
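For a Dgraph in the live environment, the property would presumably carry the value Live in place of Authoring. A minimal sketch, assuming a hypothetical Dgraph ID, host ID, port, and paths:

```xml
<!-- Hypothetical live Dgraph; only the DgraphContentGroup value is the point here. -->
<dgraph id="DgraphA1" host-id="LiveMDEXHostA" port="15000">
  <properties>
    <property name="DgraphContentGroup" value="Live" />
  </properties>
  <log-dir>./logs/dgraphs/DgraphA1</log-dir>
  <input-dir>./data/dgraphs/DgraphA1/dgraph_input</input-dir>
  <update-dir>./data/dgraphs/DgraphA1/dgraph_input/updates</update-dir>
</dgraph>
```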
Dgraph components can specify the name of a script to invoke prior to shutdown and the name of a script to invoke after the component is started, using the optional pre-shutdown-script and post-startup-script attributes. Each attribute must specify the ID of a script defined in the XML file(s). These BeanShell scripts are executed just before the Dgraph is stopped or just after it is started. The scripts behave identically to other BeanShell scripts, except that they have an additional variable, invokingObject, which holds a reference to the Dgraph that invoked the script. This functionality is typically used to implement calls to a load balancer, adding or removing a Dgraph from the cluster as it is updated.
The following example shows two dummy scripts (which just log a message, but could be extended to call out to a load balancer) provisioned to run pre-shutdown and post-startup for Dgraph1.
<dgraph id="Dgraph1" host-id="MDEXHost" port="15000"
    pre-shutdown-script="DgraphPreShutdownScript"
    post-startup-script="DgraphPostStartupScript">
  <properties>
    <property name="restartGroup" value="A" />
  </properties>
  <log-dir>./logs/dgraphs/Dgraph1</log-dir>
  <input-dir>./data/dgraphs/Dgraph1/dgraph_input</input-dir>
  <update-dir>./data/dgraphs/Dgraph1/dgraph_input/updates</update-dir>
</dgraph>

<script id="DgraphPreShutdownScript">
  <bean-shell-script>
    <![CDATA[
    id = invokingObject.getElementId();
    hostname = invokingObject.getHost().getHostName();
    port = invokingObject.getPort();
    log.info("Removing dgraph with id " + id + " (host: " + hostname
        + ", port: " + port + ") from load balancer cluster.");
    ]]>
  </bean-shell-script>
</script>

<script id="DgraphPostStartupScript">
  <bean-shell-script>
    <![CDATA[
    id = invokingObject.getElementId();
    hostname = invokingObject.getHost().getHostName();
    port = invokingObject.getPort();
    log.info("Adding dgraph with id " + id + " (host: " + hostname
        + ", port: " + port + ") to load balancer cluster.");
    ]]>
  </bean-shell-script>
</script>
The following log excerpt shows these scripts running when a new index is being applied to the dgraph:
[03.10.08 10:03:28] INFO: Applying index to dgraphs in restart group 'A'.
[03.10.08 10:03:28] INFO: [MDEXHost] Starting shell utility 'mkpath_dgraph-input-new'.
[03.10.08 10:03:30] INFO: [MDEXHost] Starting copy utility 'copy_index_to_temp_new_dgraph_input_dir_for_Dgraph1'.
[03.10.08 10:03:35] INFO: Removing dgraph with id Dgraph1 (host: mdex1.mycompany.com, port: 15000) from load balancer cluster.
[03.10.08 10:03:35] INFO: Stopping component 'Dgraph1'.
[03.10.08 10:03:37] INFO: [MDEXHost] Starting shell utility 'move_dgraph-input_to_dgraph-input-old'.
[03.10.08 10:03:39] INFO: [MDEXHost] Starting shell utility 'move_dgraph-input-new_to_dgraph-input'.
[03.10.08 10:03:40] INFO: [MDEXHost] Starting backup utility 'backup_log_dir_for_component_Dgraph1'.
[03.10.08 10:03:42] INFO: [MDEXHost] Starting component 'Dgraph1'.
[03.10.08 10:03:45] INFO: Adding dgraph with id Dgraph1 (host: mdex1.mycompany.com, port: 15000) to load balancer cluster.
[03.10.08 10:03:45] INFO: [MDEXHost] Starting shell utility 'rmdir_dgraph-input-old'.
Note that the dgraph-defaults element can also specify the use of pre-shutdown and post-startup scripts as attributes, allowing all Dgraphs in an application to execute the same scripts. For example:
<dgraph-defaults pre-shutdown-script="DgraphPreShutdownScript"
    post-startup-script="DgraphPostStartupScript">
  ...
</dgraph-defaults>
You can configure the Dgraph for SSL by using the following elements to define the certificates to use for SSL:
cert-file specifies the path of the eneCert.pem certificate file that the Dgraph presents to any client. This is also the certificate that the Application Controller Agent should present to the Dgraph when communicating with it.
ca-file specifies the path of the eneCA.pem Certificate Authority file that the Dgraph uses to authenticate communications with other Guided Search components.
cipher specifies one or more cryptographic algorithms, one of which the Dgraph will use during the SSL negotiation. If you omit this setting, the Dgraph chooses a cryptographic algorithm from its internal list of algorithms. See the Endeca Commerce Security Guide for more information.
All three elements are first-level children of the <dgraph-defaults> element.
The following example shows the three SSL elements being used within the dgraph-defaults element:
<dgraph-defaults>
  ...
  <cert-file>
    C:\Endeca\PlatformServices\workspace\etc\eneCert.pem
  </cert-file>
  <ca-file>
    C:\Endeca\PlatformServices\workspace\etc\eneCA.pem
  </ca-file>
  <cipher>AES128-SHA</cipher>
</dgraph-defaults>