Dgraphs

If a Dgraph deployment type is chosen, a Dgraph cluster component is defined.

This object is used to apply actions to an entire cluster of Dgraphs, rather than manually iterating over a number of Dgraphs. In addition, the object contains logic associated with Dgraph restart strategies, which are described below. Multiple Dgraph clusters can be defined, with no restriction around which Dgraphs belong to each cluster or how many clusters a Dgraph belongs to.

A Dgraph cluster is configured (via the dgraph-cluster element) with references to all Dgraphs that belong to that cluster. In addition, the cluster can be configured, through the getDataInParallel attribute, to copy data in parallel or serially. This setting applies to copies that are performed to distribute a new index, partial updates, or configuration updates to each server that hosts a Dgraph. By default, the template sets getDataInParallel to true.
<!--
########################################################################
# Dgraph Cluster
#
-->
<dgraph-cluster id="DgraphCluster" getDataInParallel="true">
  <dgraph ref="Dgraph1" />
  <dgraph ref="Dgraph2" />
</dgraph-cluster>

Two Dgraphs are defined by the template by default.
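
Because a Dgraph may belong to more than one cluster, a second cluster can reference the same Dgraphs with a different copy setting. The following is only a sketch: the cluster ID DgraphClusterSerial is a hypothetical name chosen for illustration, while the dgraph-cluster element, the getDataInParallel attribute, and the Dgraph references are taken from the template.

```xml
<!-- Hypothetical second cluster: same Dgraphs, data copied serially -->
<dgraph-cluster id="DgraphClusterSerial" getDataInParallel="false">
  <dgraph ref="Dgraph1" />
  <dgraph ref="Dgraph2" />
</dgraph-cluster>
```

Serial copying may reduce simultaneous load on the source host, at the cost of a longer distribution window.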

Global Dgraph settings

In order to avoid defining shared configuration for multiple Dgraphs in each Dgraph's XML configuration, the document provides the dgraph-defaults element, where shared settings can be configured and inherited (or overridden) by each Dgraph defined in the document. This defaults object specifies a number of custom configuration properties that are used by the update scripts to define operational functionality.
  • numLogBackups - Number of log directory backups to store.
  • shutdownTimeout - Number of seconds to wait for a component to stop (after receiving a stop command).
  • numIdleSecondsAfterStop - Number of seconds to pause/sleep after a component is stopped. Typically, this is used to ensure that log file locks are released by the component before proceeding.
  • srcIndexDir - Location from which a new index will be copied to a local directory on the Dgraph's host.
  • srcIndexHostId - Host from which a new index will be copied to a local directory on the Dgraph's host.
  • localIndexDir - Local directory to which a single copy of a new index is copied from the source index directory on the source index host.
  • srcPartialsDir - Location from which a new partial update will be copied to a local directory on the Dgraph's host.
  • srcCumulativePartialsDir - Location from which all partial updates accumulated since the last baseline update will be copied to a local directory on the Dgraph's host.
  • srcPartialsHostId - Host from which partial updates will be copied to a local directory on the Dgraph's host.
  • localCumulativePartialsDir - Local directory to which partial updates are copied from the source (cumulative) partials directory on the source partials host.
  • srcDgraphConfigDir - Location from which Dgraph configuration files will be copied to a local directory on the Dgraph's host.
  • srcDgraphConfigHostId - Host from which Dgraph configuration files will be copied to a local directory on the Dgraph's host.
  • localDgraphConfigDir - Local directory to which Dgraph configuration files are copied from the source Dgraph config directory on the source Dgraph config host.
  • srcXQueryHostId - Host from which XQuery modules will be copied to a local directory on the Dgraph's host.
  • srcXQueryDir - Location from which XQuery modules will be copied to a local directory on the Dgraph's host.
  • localXQueryDir - Local directory to which XQuery modules are copied from the source Dgraph XQuery directory on the source Dgraph XQuery modules host.
  • skipTestingForFilesDuringCleanup - Used for directory-cleaning operations. If set to "true", the operation skips the directory-contents test and proceeds directly to cleaning the directory. The default behavior is to test the directory contents and skip cleanup if the directory is not empty.
  • The properties documented in the "Fault tolerance and polling interval properties" topic.
<!--
#######################################################################
# Global Dgraph settings, inherited by all dgraphs
#
-->
<dgraph-defaults>
  <properties>
    <property name="srcIndexDir" value="./data/dgidx_output" />
    <property name="srcIndexHostId" value="ITLHost" />
    <property name="srcPartialsDir" value="./data/partials/forge_output" />
    <property name="srcPartialsHostId" value="ITLHost" />
    <property name="srcCumulativePartialsDir" value="./data/partials/cumulative_partials" />
    <property name="srcCumulativePartialsHostId" value="ITLHost" />
    <property name="srcDgraphConfigDir" value="./data/web_studio/dgraph_config" />
    <property name="srcDgraphConfigHostId" value="ITLHost" />
    <property name="srcXQueryHostId" value="ITLHost" />
    <property name="srcXQueryDir" value="./config/lib/xquery" />
    <property name="numLogBackups" value="10" />
    <property name="shutdownTimeout" value="30" />
    <property name="numIdleSecondsAfterStop" value="0" />
  </properties>
  <directories>
    <directory name="localIndexDir">./data/dgraphs/local_dgraph_input</directory>
    <directory name="localCumulativePartialsDir">./data/dgraphs/local_cumulative_partials</directory>
    <directory name="localDgraphConfigDir">./data/dgraphs/local_dgraph_config</directory>
    <directory name="localXQueryDir">./data/dgraphs/local_xquery</directory>
  </directories>
  <args>
    <arg>--threads</arg>
    <arg>2</arg>
    <arg>--spl</arg>
    <arg>--dym</arg>
    <arg>--xquery_path</arg>
    <arg>./data/dgraphs/local_xquery</arg>
  </args>
  <startup-timeout>120</startup-timeout>
</dgraph-defaults>
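
Properties from the list above that the template does not set by default can presumably be added in the same way. The following sketch assumes that skipTestingForFilesDuringCleanup is declared like any other entry in the properties section; the value shown is illustrative, not a recommendation.

```xml
<dgraph-defaults>
  <properties>
    <!-- Existing template properties omitted for brevity -->
    <!-- Assumption: set like any other property; skips the contents test -->
    <property name="skipTestingForFilesDuringCleanup" value="true" />
  </properties>
  <!-- directories, args and startup-timeout unchanged -->
</dgraph-defaults>
```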

Each Dgraph defined in the document (via the dgraph element) inherits from the settings defined in the dgraph-defaults element, and also specifies settings that are unique to the Dgraph.
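
For example, a single Dgraph might override one inherited value while keeping the rest. In this sketch, the numLogBackups override and its value are illustrative assumptions, as are the port and directory paths for Dgraph2; the element structure follows the Dgraph definitions used elsewhere in this topic.

```xml
<dgraph id="Dgraph2" host-id="MDEXHost" port="15001">
  <properties>
    <!-- Illustrative override of the inherited default of 10 -->
    <property name="numLogBackups" value="5" />
  </properties>
  <log-dir>./logs/dgraphs/Dgraph2</log-dir>
  <input-dir>./data/dgraphs/Dgraph2/dgraph_input</input-dir>
  <update-dir>./data/dgraphs/Dgraph2/dgraph_input/updates</update-dir>
</dgraph>
```

All other settings, such as shutdownTimeout and the source directories, would still come from dgraph-defaults.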

Note: As of version 3.1 of the Deployment Template, the numCacheWarmupSeconds and offlineUpdate properties are ignored (and warning messages generated) because they are not supported in the 6.1.x MDEX Engine.

Restart and update custom properties

In addition to standard Dgraph configuration and process arguments, the dgraph element adds two custom properties that define restart and update strategies:
  • restartGroup
  • updateGroup

The restartGroup property indicates the Dgraph's membership in a restart group. When applying a new index or configuration updates to a cluster of Dgraphs (or when updating a cluster of Dgraphs with a provisioning change such as a new or modified process argument), the Dgraph cluster object applies changes simultaneously to all Dgraphs in a restart group.

Similarly, the updateGroup property indicates the Dgraph's membership in an update group. When applying partial updates, the Dgraph cluster object applies changes simultaneously to all Dgraphs in an update group.

This means that a few common restart strategies can be applied as follows:
  • To restart/update all Dgraphs at once: specify the same restartGroup/updateGroup value for each Dgraph.
  • To restart/update Dgraphs one at a time: specify a unique restartGroup/updateGroup value for each Dgraph, or omit one or both of the custom properties on all Dgraphs (causing the template to assign a unique group to each Dgraph).
  • To restart/update Dgraphs on each server simultaneously: specify the same restartGroup/updateGroup value for each Dgraph on a physical server.
  • To restart Dgraphs one at a time but apply partial updates to all Dgraphs at once: specify a unique restartGroup value for each Dgraph and specify the same updateGroup value for each Dgraph.
<dgraph id="Dgraph1" host-id="MDEXHost" port="15000">
  <properties>
    <property name="restartGroup" value="A" />
    <property name="updateGroup" value="a" />
  </properties>
  <log-dir>./logs/dgraphs/Dgraph1</log-dir>
  <input-dir>./data/dgraphs/Dgraph1/dgraph_input</input-dir>
  <update-dir>./data/dgraphs/Dgraph1/dgraph_input/updates</update-dir>
</dgraph>

Restart and update group values are arbitrary strings. The DgraphCluster will iterate through the groups in alphabetical order, though non-standard characters may result in groups being updated in an unexpected order.
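
As a sketch of the last strategy in the list above (restart one at a time, apply partial updates to all Dgraphs at once), each Dgraph gets a unique restartGroup and a shared updateGroup. The port for Dgraph2 is an assumed value, and the directory elements are omitted for brevity.

```xml
<dgraph id="Dgraph1" host-id="MDEXHost" port="15000">
  <properties>
    <!-- Unique restart group: restarted on its own, first alphabetically -->
    <property name="restartGroup" value="A" />
    <!-- Shared update group: partial updates applied together -->
    <property name="updateGroup" value="a" />
  </properties>
  <!-- log-dir, input-dir and update-dir omitted -->
</dgraph>

<dgraph id="Dgraph2" host-id="MDEXHost" port="15001">
  <properties>
    <!-- Unique restart group: restarted after group A -->
    <property name="restartGroup" value="B" />
    <property name="updateGroup" value="a" />
  </properties>
  <!-- log-dir, input-dir and update-dir omitted -->
</dgraph>
```

Using simple alphabetic group names such as A and B keeps the iteration order predictable.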

Running scripts

Dgraph components can specify the name of a script to invoke prior to shutdown and the name of a script to invoke after the component is started. These optional attributes (pre-shutdown-script and post-startup-script) must each specify the ID of a script defined in the XML file(s). These BeanShell scripts are executed just before the Dgraph is stopped or just after it is started. The scripts behave identically to other BeanShell scripts, except that they have an additional variable, invokingObject, which holds a reference to the Dgraph that invoked the script. This functionality is typically used to implement calls to a load balancer, removing a Dgraph from the load balancer cluster before it is updated and adding it back afterward.

The following example shows two dummy scripts (which just log a message, but could be extended to call out to a load balancer) provisioned to run pre-shutdown and post-startup for Dgraph1.

<dgraph id="Dgraph1" host-id="MDEXHost" port="15000" 
    pre-shutdown-script="DgraphPreShutdownScript"
    post-startup-script="DgraphPostStartupScript">
  <properties>
    <property name="restartGroup" value="A" />
  </properties>
  <log-dir>./logs/dgraphs/Dgraph1</log-dir>
  <input-dir>./data/dgraphs/Dgraph1/dgraph_input</input-dir>
  <update-dir>./data/dgraphs/Dgraph1/dgraph_input/updates</update-dir>
</dgraph>

<script id="DgraphPreShutdownScript">
  <bean-shell-script>
    <![CDATA[ 
    id = invokingObject.getElementId();
    hostname = invokingObject.getHost().getHostName();
    port = invokingObject.getPort();
    log.info("Removing dgraph with id " + id + " (host: " + hostname + 
      ", port: " + port + ") from load balancer cluster.");
    ]]>
  </bean-shell-script>
</script>

<script id="DgraphPostStartupScript">
  <bean-shell-script>
    <![CDATA[ 
    id = invokingObject.getElementId();
    hostname = invokingObject.getHost().getHostName();
    port = invokingObject.getPort();
    log.info("Adding dgraph with id " + id + " (host: " + hostname + 
      ", port: " + port + ") to load balancer cluster.");
    ]]>
  </bean-shell-script>
</script>

The following log excerpt shows these scripts running when a new index is being applied to the Dgraph:
[03.10.08 10:03:28] INFO: Applying index to dgraphs in restart group 'A'.
[03.10.08 10:03:28] INFO: [MDEXHost] Starting shell utility 'mkpath_dgraph-input-new'.
[03.10.08 10:03:30] INFO: [MDEXHost] Starting copy utility 'copy_index_to_temp_new_dgraph_input_dir_for_Dgraph1'.
[03.10.08 10:03:35] INFO: Removing dgraph with id Dgraph1 (host: mdex1.mycompany.com, port: 15000) from load balancer cluster.
[03.10.08 10:03:35] INFO: Stopping component 'Dgraph1'.
[03.10.08 10:03:37] INFO: [MDEXHost] Starting shell utility 'move_dgraph-input_to_dgraph-input-old'.
[03.10.08 10:03:39] INFO: [MDEXHost] Starting shell utility 'move_dgraph-input-new_to_dgraph-input'.
[03.10.08 10:03:40] INFO: [MDEXHost] Starting backup utility 'backup_log_dir_for_component_Dgraph1'.
[03.10.08 10:03:42] INFO: [MDEXHost] Starting component 'Dgraph1'.
[03.10.08 10:03:45] INFO: Adding dgraph with id Dgraph1 (host: mdex1.mycompany.com, port: 15000) to load balancer cluster.
[03.10.08 10:03:45] INFO: [MDEXHost] Starting shell utility 'rmdir_dgraph-input-old'.

Note that the dgraph-defaults element can also specify pre-shutdown and post-startup scripts as attributes, allowing all Dgraphs in an application to execute the same scripts. For example:
<dgraph-defaults pre-shutdown-script="DgraphPreShutdownScript"
    post-startup-script="DgraphPostStartupScript">

  ...

</dgraph-defaults>

Deploying XQuery modules

The Deployment Template supports the distribution of XQuery modules to each Dgraph in the group. The [appdir]/config/lib/xquery directory is provided for users to store their XQuery modules. In addition, a LoadXQueryModules script (in the AppConfig.xml file) distributes the XQuery modules to Dgraph servers and instructs the Dgraphs to load the modules.

The procedure to deploy the XQuery modules is:
  1. Make certain that the dgraph-defaults section of the AppConfig.xml file has the XQuery properties set. These global Dgraph setting properties are srcXQueryHostId, srcXQueryDir, and localXQueryDir.
  2. Make certain that the Dgraph --xquery_path flag is specified as an argument in the dgraph-defaults section.
  3. Place all the XQuery code in the [appdir]/config/lib/xquery and [appdir]/config/lib/xquery/lib directories.
  4. Execute the runcommand script with the LoadXQueryModules argument, as in this Windows example:
    C:\Endeca\Apps\control>runcommand LoadXQueryModules

The script distributes the XQuery modules to the Dgraphs in the deployment and instructs the Dgraphs to reload and compile the modules.
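
The settings checked in steps 1 and 2 can be sketched as the following fragment of the dgraph-defaults section. The values are the template defaults shown in the "Global Dgraph settings" section; unrelated properties, directories, and arguments are omitted.

```xml
<dgraph-defaults>
  <properties>
    <!-- Step 1: source host and directory for the XQuery modules -->
    <property name="srcXQueryHostId" value="ITLHost" />
    <property name="srcXQueryDir" value="./config/lib/xquery" />
  </properties>
  <directories>
    <!-- Step 1: local directory that receives the modules -->
    <directory name="localXQueryDir">./data/dgraphs/local_xquery</directory>
  </directories>
  <args>
    <!-- Step 2: point the Dgraph at the local XQuery directory -->
    <arg>--xquery_path</arg>
    <arg>./data/dgraphs/local_xquery</arg>
  </args>
</dgraph-defaults>
```

Note that the --xquery_path argument and the localXQueryDir directory refer to the same location in the default template.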

Specifying arguments for the Dgraphs

Both the dgraph and dgraph-defaults elements allow you to use the args sub-element to pass command-line flags to the Dgraphs. However, if you use an args section in both the dgraph and dgraph-defaults configurations, the results are not cumulative.

Instead, the args section for an individual Dgraph completely overrides the dgraph-defaults definition (i.e., it does not inherit the parameters that are specified in the dgraph-defaults section and then add the ones that are unique for that Dgraph).
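
As a sketch of the consequence, a Dgraph that needs one different flag must restate every default flag it still wants. The thread count of 4 is an illustrative value, and the placement of args relative to the other child elements is an assumption; the remaining flags are copied from the dgraph-defaults example earlier in this topic.

```xml
<dgraph id="Dgraph1" host-id="MDEXHost" port="15000">
  <!-- properties, log-dir, input-dir and update-dir omitted -->
  <args>
    <!-- Illustrative change: 4 threads instead of the default 2 -->
    <arg>--threads</arg>
    <arg>4</arg>
    <!-- Defaults restated because they are not inherited -->
    <arg>--spl</arg>
    <arg>--dym</arg>
    <arg>--xquery_path</arg>
    <arg>./data/dgraphs/local_xquery</arg>
  </args>
</dgraph>
```

If the restated flags were left out, this Dgraph would run with only the --threads setting.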

Enabling SSL for the Dgraph

You can configure the Dgraph for SSL by using the following elements to define the certificates to use for SSL:
  • cert-file specifies the path of the eneCert.pem certificate file that the Dgraph presents to any client. This is also the certificate that the Application Controller Agent should present to the Dgraph when communicating with it.
  • ca-file specifies the path of the eneCA.pem Certificate Authority file that the Dgraph uses to authenticate communications with other Oracle Endeca components.
  • cipher specifies an optional cipher string (such as RC4-SHA) that specifies the minimum cryptographic algorithm that the Dgraph uses during the SSL negotiation. If you omit this setting, the SSL software tries an internal list of ciphers, beginning with AES256-SHA. See the Oracle Endeca Platform Services Security Guide for more information.

All three elements are first-level children of the <dgraph-defaults> element.

The following example shows the three SSL elements being used within the dgraph-defaults element:
<dgraph-defaults>
...
   <cert-file>
      C:\Endeca\PlatformServices\workspace\etc\eneCert.pem
   </cert-file>
   <ca-file>
      C:\Endeca\PlatformServices\workspace\etc\eneCA.pem
   </ca-file>
   <cipher>AES128-SHA</cipher>
</dgraph-defaults>