This section lists the partial update script defined in the DataIngest.xml document for a Dgraph deployment. Numbered steps describe the action performed at each point in the script.

<script id="PartialUpdate">
  <bean-shell-script>
    <![CDATA[

  1. Obtain lock. The partial update attempts to set an "update_lock" flag in the EAC to serve as a lock or mutex. If the flag is already set, this step fails, ensuring that the update cannot be started more than once simultaneously, as this would interfere with data processing. The flag is removed in the case of an error or when the script completes successfully.

        log.info("Starting partial update script.");
          // obtain lock
          if (LockManager.acquireLock("update_lock")) {

  2. Validate data readiness. Test that the EAC contains at least one flag with the prefix "partial_extract::". One of these flags should be created for each successfully and completely extracted file, with the prefix "partial_extract::" prepended to the extracted file name (e.g. "partial_extract::adds.txt.gz"). These flags are deleted during data processing and must be created as new files are extracted.

        // test if data is ready for processing
        if (PartialForge.isPartialDataReady()) {
    
  3. Archive partial logs. The logs/partial directory is archived, to create a fresh logging directory for the partial update process and to save the previous run's logs.

        // archive logs
        PartialForge.archiveLogDir();
    
  4. Clean processing directories. Files from the previous update are removed from the data/partials/processing, data/partials/forge_output, and data/temp directories.

        // clean directories
        PartialForge.cleanDirs();
    
  5. Move data and config to processing directory. Extracted files in data/partials/incoming with matching "partial_extract::" flags in the EAC are moved to data/partials/processing. Configuration files are copied from config/pipeline to data/processing.

        // fetch extracted data files to forge input
        PartialForge.getPartialIncomingData();
    
        // fetch config files to forge input
        PartialForge.getConfig();
    
  6. Forge. The partial update Forge process executes.

        // run ITL
        PartialForge.run();
    
  7. Apply timestamp to updates. The output XML file generated by the partial update pipeline is renamed to include a timestamp, to ensure it is processed in the correct order relative to files generated by previous or following partial update processes.

        // timestamp partial, save to cumulative partials dir
        PartialForge.timestampPartials();
    
  8. Copy updates to cumulative updates. The timestamped XML file is copied into the cumulative updates directory.

        PartialForge.fetchPartialsToCumulativeDir();

  9. Distribute update to each server. A single copy of the partial update file is distributed to each server specified in the configuration.

        // distribute partial update, update Dgraphs
        DgraphCluster.copyPartialUpdateToDgraphServers();
    
  10. Update MDEX Engines. The Dgraph processes are updated. Engines are updated according to the updateGroup property specified for each Dgraph. The update process for each Dgraph is as follows:

        DgraphCluster.applyPartialUpdates();

  11. Archive cumulative updates. The newly generated update file, along with the files generated by all partial updates processed since the last baseline, is archived on the indexing server.

        // archive partials
        PartialForge.archiveCumulativePartials();
        } // closes the isPartialDataReady() block opened in step 2
    
  12. Release lock. The "update_lock" flag is removed from the EAC, indicating that another update may be started.

        // release lock
        LockManager.releaseLock("update_lock");
        log.info("Partial update script finished.");
          }
          else {
            log.warning("Failed to obtain lock.");
          }
        ]]>
      </bean-shell-script>
    </script>
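
The locking behavior described in steps 1 and 12 can be illustrated with a minimal sketch. It assumes a hypothetical in-memory flag set standing in for the EAC; the class and its methods are illustrative only and are not the LockManager API. The `finally` block mirrors the statement that the flag is removed both on error and on successful completion.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class UpdateLockSketch {
    // Hypothetical in-memory stand-in for the EAC flag store; not the real EAC API.
    static final Set<String> flags = ConcurrentHashMap.newKeySet();

    // Sets the flag and returns true only if it was not already set.
    static boolean acquireLock(String name) {
        return flags.add(name);
    }

    // Removes the flag so that another update may be started.
    static void releaseLock(String name) {
        flags.remove(name);
    }

    // Runs the update body only if the lock was obtained; returns whether it ran.
    static boolean runWithLock(String name, Runnable update) {
        if (!acquireLock(name)) {
            return false; // another update already holds the flag
        }
        try {
            update.run();
        } finally {
            releaseLock(name); // released on success and on error
        }
        return true;
    }

    public static void main(String[] args) {
        boolean ran = runWithLock("update_lock", () -> System.out.println("updating"));
        System.out.println("ran: " + ran);
    }
}
```

A second caller that finds the flag already set gets `false` back and does nothing, which corresponds to the "Failed to obtain lock." branch of the script.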
    
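
The readiness test in step 2 amounts to a prefix check over the flags currently set. The sketch below assumes the flags are available as a plain list of strings; the class and the `readyFiles` helper are hypothetical and do not reflect how PartialForge reads flags from the EAC.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class PartialFlagSketch {
    static final String PREFIX = "partial_extract::";

    // True if at least one flag marks a completely extracted file.
    static boolean isPartialDataReady(List<String> flags) {
        return flags.stream().anyMatch(f -> f.startsWith(PREFIX));
    }

    // Hypothetical helper: file names named by the flags, with the prefix stripped.
    static List<String> readyFiles(List<String> flags) {
        return flags.stream()
                .filter(f -> f.startsWith(PREFIX))
                .map(f -> f.substring(PREFIX.length()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> flags = Arrays.asList("update_lock", "partial_extract::adds.txt.gz");
        System.out.println(isPartialDataReady(flags));
        System.out.println(readyFiles(flags));
    }
}
```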

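
Step 7 relies on the timestamp in the file name to keep update files in order. One way to get that property, shown here as a sketch only, is a fixed-width timestamp prefix, so that sorting names alphabetically also sorts them chronologically. The pattern, separator, and file name are assumptions; the naming scheme that `timestampPartials()` actually uses is not stated in this section.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class TimestampNameSketch {
    // Hypothetical fixed-width pattern; the real format is not stated in this section.
    static final DateTimeFormatter STAMP = DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    // Builds a name whose lexicographic order follows the generation time.
    static String timestampedName(String fileName, LocalDateTime generatedAt) {
        return STAMP.format(generatedAt) + "-" + fileName;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add(timestampedName("updates.xml", LocalDateTime.of(2024, 3, 1, 12, 0, 0)));
        names.add(timestampedName("updates.xml", LocalDateTime.of(2024, 2, 28, 23, 59, 59)));
        Collections.sort(names); // the earlier update sorts first
        System.out.println(names);
    }
}
```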
