Oracle Commerce Guided Search - Dgraph baseline update script using Forge

Dgraph baseline update script using Forge

The baseline update script defined in the DataIngest.xml document for a Dgraph deployment is reproduced in this section. Each fragment of the script is preceded by a description of the action it performs.

<script id="BaselineUpdate">
  <bean-shell-script>
    <![CDATA[ 
  log.info("Starting baseline update script.");

Obtain lock. The baseline update attempts to set an "update_lock" flag in the EAC to serve as a lock or mutex. If the flag is already set, this step fails, ensuring that the update cannot be started more than once simultaneously, as this would interfere with data processing. The flag is removed in the case of an error or when the script completes successfully.
```
    // obtain lock
    if (LockManager.acquireLock("update_lock")) {
```
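The locking behavior described above can be illustrated with a small stand-alone Java sketch. It is not the EAC `LockManager` implementation: the `FlagLockSketch` class and its in-memory flag set are hypothetical stand-ins, assumed here only to show the set-if-absent semantics and the release of the flag on both success and error.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the EAC flag store (an assumption,
// not the real LockManager).
class FlagLockSketch {
    private final Set<String> flags = ConcurrentHashMap.newKeySet();

    // Atomically sets the flag; returns false if it was already set.
    boolean acquireLock(String name) {
        return flags.add(name);
    }

    // Removes the flag; returns true if it had been set.
    boolean releaseLock(String name) {
        return flags.remove(name);
    }

    public static void main(String[] args) {
        FlagLockSketch locks = new FlagLockSketch();
        if (locks.acquireLock("update_lock")) {
            try {
                System.out.println("Lock held, running update steps.");
            } finally {
                // Released whether the update steps succeed or throw.
                locks.releaseLock("update_lock");
            }
        } else {
            System.out.println("Failed to obtain lock.");
        }
    }
}
```

A second call to `acquireLock("update_lock")` while the flag is set returns false, which is what prevents two updates from running at once.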
Validate data readiness. Check that a flag called "baseline_data_ready" has been set in the EAC. This flag is set as part of the data extraction process to indicate that files are ready to be processed (or, in the case of an application that uses direct database access, the flag indicates that a database staging table has been loaded and is ready for processing). This flag is removed as soon as the script copies the data out of the data/incoming directory, indicating that new data may be extracted.
```
    // test if data is ready for processing
    if (Forge.isDataReady()) {
```
Clean processing directories. Files from the previous update are removed from the data/processing, data/forge_output, data/temp, data/dgidx_output and data/partials/cumulative_partials directories.
```
    // clean directories
    Forge.cleanDirs();
    PartialForge.cleanCumulativePartials();
    Dgidx.cleanDirs();
```
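As a rough illustration of what cleaning a working directory involves, the following Java sketch empties a directory while keeping the directory itself. The `CleanDirsSketch` class and its `cleanDir` helper are hypothetical; the real `cleanDirs()` methods are provided by the deployment template components and may behave differently.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper, assumed for illustration only.
class CleanDirsSketch {
    // Deletes everything beneath dir but leaves dir in place.
    static void cleanDir(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return; // nothing to clean
        }
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order visits children before their parent directories.
            walk.sorted(Comparator.reverseOrder())
                .filter(p -> !p.equals(dir))
                .forEach(p -> {
                    try {
                        Files.delete(p);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                });
        }
    }

    public static void main(String[] args) throws IOException {
        Path processing = Files.createTempDirectory("processing");
        Files.write(processing.resolve("stale.txt"), "old data".getBytes());
        cleanDir(processing);
        System.out.println("Directory kept: " + Files.isDirectory(processing));
    }
}
```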
Copy data to processing directory. Extracted data in data/incoming is copied to data/processing.
```
    // fetch extracted data files to forge input
    Forge.getIncomingData();
```
Release Lock. The "baseline_data_ready" flag is removed from the EAC, indicating that the incoming data has been retrieved for baseline processing.
```
    LockManager.releaseLock("baseline_data_ready");
```
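The hand-off between data extraction and the baseline update (set the flag, test it, copy the data, clear the flag) can be sketched in plain Java as follows. This is an illustration under stated assumptions: `DataReadyHandshakeSketch`, `markDataReady`, and `fetchIncoming` are hypothetical names, and the in-memory flag set stands in for the EAC.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the flag handshake; not the Forge component API.
class DataReadyHandshakeSketch {
    private static final String FLAG = "baseline_data_ready";
    private final Set<String> flags = new HashSet<>();

    // Called by the (assumed) extraction process once files are in place.
    void markDataReady() { flags.add(FLAG); }

    boolean isDataReady() { return flags.contains(FLAG); }

    // Copies regular files from incoming to processing, then clears the flag
    // so that a new extraction may begin. Returns false if data is not ready.
    boolean fetchIncoming(Path incoming, Path processing) throws IOException {
        if (!isDataReady()) {
            return false;
        }
        Files.createDirectories(processing);
        try (DirectoryStream<Path> files = Files.newDirectoryStream(incoming)) {
            for (Path file : files) {
                if (Files.isRegularFile(file)) {
                    Files.copy(file, processing.resolve(file.getFileName()),
                               StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
        flags.remove(FLAG);
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("data");
        Path incoming = Files.createDirectories(root.resolve("incoming"));
        Files.write(incoming.resolve("extract.txt"), "rows".getBytes());

        DataReadyHandshakeSketch s = new DataReadyHandshakeSketch();
        s.markDataReady();
        System.out.println("copied: " + s.fetchIncoming(incoming, root.resolve("processing")));
        System.out.println("flag still set: " + s.isDataReady());
    }
}
```

Clearing the flag only after the copy completes means the extraction side never overwrites files that the update is still reading.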
Copy config to processing directory. Configuration files are copied from data/complete_index_config to data/processing.
```
    // fetch config files to forge input
    Forge.getConfig();
```
Archive Forge logs. The logs/forges/Forge directory is archived, to create a fresh logging directory for the Forge process and to save the previous Forge run's logs.
```
    // archive logs
    Forge.archiveLogDir();
```
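One plausible way to archive a log directory is to rename it to a timestamped sibling and then recreate it empty, as in the Java sketch below. The `ArchiveLogDirSketch` class and the archive naming scheme are assumptions made for illustration; the actual archive layout used by `archiveLogDir()` is not specified in this section.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper; the timestamped naming scheme is an assumption.
class ArchiveLogDirSketch {
    // Moves logDir to a timestamped sibling and recreates logDir empty.
    // Returns the archive path, or null if there was nothing to archive.
    static Path archiveLogDir(Path logDir) throws IOException {
        Path archive = null;
        if (Files.exists(logDir)) {
            String stamp = LocalDateTime.now()
                    .format(DateTimeFormatter.ofPattern("yyyyMMdd_HHmmss_SSS"));
            archive = logDir.resolveSibling(logDir.getFileName() + "." + stamp);
            Files.move(logDir, archive);
        }
        Files.createDirectories(logDir);
        return archive;
    }

    public static void main(String[] args) throws IOException {
        Path logs = Files.createTempDirectory("logs");
        Path forgeLogs = Files.createDirectories(logs.resolve("Forge"));
        Files.write(forgeLogs.resolve("Forge.log"), "previous run".getBytes());
        System.out.println("Archived to: " + archiveLogDir(forgeLogs));
    }
}
```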
Forge. The Forge process executes.
```
    Forge.run();
```
Archive Dgidx logs. The logs/dgidxs/Dgidx directory is archived, to create a fresh logging directory for the Dgidx process and to save the previous Dgidx run's logs.
```
    // archive logs
    Dgidx.archiveLogDir();
```
Dgidx. The Dgidx process executes.
```
    Dgidx.run();
```
Distribute index to each server. A single copy of the new index is distributed to each server that hosts a Dgraph. If multiple Dgraphs are located on the same server but specify different srcIndexDir attributes, multiple copies of the index are delivered to that server.
Update MDEX Engines. The Dgraphs are updated according to the restartGroup property specified for each Dgraph. The update sequence for each Dgraph is somewhat complex because it is designed to minimize the time that the Dgraph is stopped: the Dgraph is down only long enough to rename two directories.
```
    // distribute index, update Dgraphs
    DistributeIndexAndApply.run();
```
```
<script id="DistributeIndexAndApply">
  <bean-shell-script>
    <![CDATA[ 
    DgraphCluster.cleanDirs();
    DgraphCluster.copyIndexToDgraphServers();
    DgraphCluster.applyIndex();
    ]]>
  </bean-shell-script>
</script>
```
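The idea of keeping a Dgraph stopped only for the duration of two directory renames can be sketched in Java as follows. This is a simplified illustration, not the `DgraphCluster.applyIndex()` implementation: the directory names `dgraph_input`, `dgraph_input_new`, and `dgraph_input_old`, the `Engine` interface, and the `IndexSwapSketch` class are all assumptions, and error handling is omitted.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical sketch of a stage-then-swap index update.
class IndexSwapSketch {
    // Assumed hooks standing in for stopping and starting a Dgraph process.
    interface Engine {
        void stop();
        void start();
    }

    // Assumes the new index was already copied to baseDir/dgraph_input_new
    // while the engine was still running.
    static void applyIndex(Path baseDir, Engine engine) throws IOException {
        Path live = baseDir.resolve("dgraph_input");
        Path staged = baseDir.resolve("dgraph_input_new");
        Path old = baseDir.resolve("dgraph_input_old");

        engine.stop();
        if (Files.exists(live)) {
            Files.move(live, old);   // first rename: retire the current index
        }
        Files.move(staged, live);    // second rename: promote the new index
        engine.start();

        // Cleanup happens after the engine is serving again.
        deleteRecursively(old);
    }

    private static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(dir)) {
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }
}
```

The slow work (copying the index and deleting the old one) happens while the engine is running, so only the two renames fall inside the stop/start window.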
The latest dimension values generated by the Forge process (including values for autogen and external dimensions) are copied to the configDir defined in the Forge component.
```
    Forge.getPostForgeDimensions();
```
Archive index and Forge state. The newly created index and the state files in Forge's state directory are archived on the indexing server.
```
    // archive state files, index
    Forge.archiveState();
    Dgidx.archiveIndex();
```
Cycle LogServer. The LogServer is stopped and restarted. During the downtime, the LogServer's error and output logs are archived.
```
    // cycle LogServer
    LogServer.cycle();
```

Release Lock. The "update_lock" flag is removed from the EAC, indicating that another update may be started.

    } // end of the Forge.isDataReady() block

    // release lock
    LockManager.releaseLock("update_lock");

    log.info("Baseline update script finished.");
  } else {
    log.warning("Failed to obtain lock.");
  }
    ]]>
  </bean-shell-script>
</script>

Guided Search Administrator's Guide