The baseline update script defined in the DataIngest.xml document for a Dgraph deployment is included in this section, with steps describing the actions performed at each point in the script.
<script id="BaselineUpdate">
<bean-shell-script>
<![CDATA[
log.info("Starting baseline update script.");
Obtain lock. The baseline update attempts to set an "update_lock" flag in the EAC to serve as a lock or mutex. If the flag is already set, this step fails, which ensures that two updates cannot run at the same time and interfere with each other's data processing. The flag is removed if an error occurs or when the script completes successfully.
// obtain lock
if (LockManager.acquireLock("update_lock")) {
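The lock behavior described in this step can be illustrated with a minimal sketch in plain Java. It is not the Endeca LockManager implementation: the class FlagLockSketch, its in-memory flag set, and the runExclusively helper are hypothetical stand-ins for flags stored in the EAC. The sketch only shows the semantics stated above: a second attempt to acquire the same flag fails, and the flag is removed on both success and error.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for flags kept in the EAC.
final class FlagLockSketch {
    private final Set<String> flags = ConcurrentHashMap.newKeySet();

    // Returns true only for the caller that sets the flag first.
    boolean acquireLock(String name) {
        return flags.add(name);
    }

    // Removes the flag so that a later update can acquire it.
    void releaseLock(String name) {
        flags.remove(name);
    }

    // Runs the task under the lock; returns false if the lock is already held.
    boolean runExclusively(String name, Runnable task) {
        if (!acquireLock(name)) {
            return false;
        }
        try {
            task.run();
        } finally {
            releaseLock(name); // released on success and on error
        }
        return true;
    }
}
```

With this sketch, calling acquireLock("update_lock") twice in a row returns true and then false, which mirrors the "Failed to obtain lock." branch of the script.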
Validate data readiness. Check that a flag called "baseline_data_ready" has been set in the EAC. This flag is set as part of the data extraction process to indicate that files are ready to be processed (or, in the case of an application that uses direct database access, that a database staging table has been loaded and is ready for processing). The flag is removed as soon as the script copies the data out of the data/incoming directory, indicating that new data may be extracted.
// test if data is ready for processing
if (Forge.isDataReady()) {
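The readiness flag acts as a handshake between the extraction process and the update script. The following sketch in plain Java illustrates that handshake under stated assumptions: the class DataReadySketch, its in-memory flag set, and the methods markReady and fetchIncoming are hypothetical and are not part of the Forge or LockManager components. It also assumes that the incoming directory contains only flat files.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of the "baseline_data_ready" handshake.
final class DataReadySketch {
    static final String READY = "baseline_data_ready";
    private final Set<String> flags = ConcurrentHashMap.newKeySet();

    // Extraction side: called after all files have been written.
    void markReady() {
        flags.add(READY);
    }

    boolean isDataReady() {
        return flags.contains(READY);
    }

    // Update side: copies the incoming files, then clears the flag.
    // Returns false without copying anything if the flag is not set.
    boolean fetchIncoming(Path incoming, Path processing) throws IOException {
        if (!isDataReady()) {
            return false;
        }
        Files.createDirectories(processing);
        try (DirectoryStream<Path> files = Files.newDirectoryStream(incoming)) {
            for (Path file : files) {
                Files.copy(file, processing.resolve(file.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);
            }
        }
        flags.remove(READY); // new data may now be extracted
        return true;
    }
}
```

Clearing the flag only after the copy completes means that the extraction side never overwrites files that the update is still reading.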
Clean processing directories. Files from the previous update are removed from the data/processing, data/forge_output, data/temp, data/dgidx_output, and data/partials/cumulative_partials directories.
// clean directories
Forge.cleanDirs();
PartialForge.cleanCumulativePartials();
Dgidx.cleanDirs();
Copy data to processing directory. Extracted data in data/incoming is copied to data/processing.
// fetch extracted data files to forge input
Forge.getIncomingData();
Release Lock. The "baseline_data_ready" flag is removed from the EAC, indicating that the incoming data has been retrieved for baseline processing.
LockManager.releaseLock("baseline_data_ready");
Copy config to processing directory. Configuration files are copied from data/complete_index_config to data/processing.
// fetch config files to forge input
Forge.getConfig();
Archive Forge logs. The logs/forges/Forge directory is archived, both to save the logs from the previous Forge run and to create a fresh logging directory for the Forge process.
// archive logs
Forge.archiveLogDir();
Forge. The Forge process executes.
Forge.run();
Archive Dgidx logs. The logs/dgidxs/Dgidx directory is archived, both to save the logs from the previous Dgidx run and to create a fresh logging directory for the Dgidx process.
// archive logs
Dgidx.archiveLogDir();
Dgidx. The Dgidx process executes.
Dgidx.run();
Distribute index to each server. A single copy of the new index is distributed to each server that hosts a Dgraph. If multiple Dgraphs are located on the same server but specify different srcIndexDir attributes, multiple copies of the index are delivered to that server.
Update MDEX Engines. The Dgraphs are updated according to the restartGroup property specified for each Dgraph. This somewhat complex update functionality is implemented to minimize the amount of time that a Dgraph is stopped: the restart approach ensures that each Dgraph is stopped just long enough to rename two directories.
// distribute index, update Dgraphs
DistributeIndexAndApply.run();
<script id="DistributeIndexAndApply">
  <bean-shell-script>
    <![CDATA[
DgraphCluster.cleanDirs();
DgraphCluster.copyIndexToDgraphServers();
DgraphCluster.applyIndex();
    ]]>
  </bean-shell-script>
</script>
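The idea of stopping a Dgraph only long enough to rename two directories can be sketched in plain Java. This is not the DgraphCluster implementation: the class IndexSwapSketch, the directory roles (live, staged, previous), and the stopEngine and startEngine callbacks are hypothetical names chosen for illustration. The sketch assumes that the new index has already been copied into the staged directory while the engine was still running, that all three paths are on the same file system, and that the previous directory does not yet exist. Error handling is omitted.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of a two-rename index swap.
final class IndexSwapSketch {
    // The engine is down only for the duration of the two renames.
    static void swap(Path live, Path staged, Path previous,
                     Runnable stopEngine, Runnable startEngine) throws IOException {
        stopEngine.run();
        Files.move(live, previous); // rename 1: retire the current index
        Files.move(staged, live);   // rename 2: promote the new index
        startEngine.run();
    }
}
```

Because the slow work (copying the index to the server) happens before the engine is stopped, the downtime is bounded by two directory renames rather than by the size of the index.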
The latest dimension values generated by the Forge process (including values for autogen and external dimensions) are copied to the configDir defined in the Forge component.
Forge.getPostForgeDimensions();
Archive index and Forge state. The newly created index and the state files in Forge's state directory are archived on the indexing server.
// archive state files, index
Forge.archiveState();
Dgidx.archiveIndex();
Cycle LogServer. The LogServer is stopped and restarted. During the downtime, the LogServer's error and output logs are archived.
// cycle LogServer
LogServer.cycle();
} // closes the Forge.isDataReady() block
Release Lock. The "update_lock" flag is removed from the EAC, indicating that another update may be started.
// release lock
LockManager.releaseLock("update_lock");
log.info("Baseline update script finished.");
} else {
  log.warning("Failed to obtain lock.");
}
]]>
</bean-shell-script>
</script>