The Dgraph is the component of Big Data Discovery that performs search and analytical processing on data sets. It handles the query requests that users make against those data sets.
The Dgraph uses data structures and algorithms to provide real-time responses to client requests for analytic processing and data summarization. When source data is loaded into Big Data Discovery, the Dgraph creates a separate Dgraph database for each of the data sets. When the Dgraph receives a client request through Studio, the Dgraph queries the appropriate database and returns the results.
An Oracle Big Data Discovery cluster has one or more Dgraph processes that handle end-user query requests by accessing the Dgraph databases on shared storage. For each database, one Dgraph in the cluster is the leader and is responsible for handling all write operations (updates and configuration changes) for that database. The remaining Dgraphs may serve as read-only followers.
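The leader and follower arrangement can be pictured with a small model. The following is only an illustrative sketch, not Big Data Discovery code: the `DgraphNode` and `DatabaseRouter` classes, the node names, and the round-robin choice for reads are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class DgraphNode:
    """Hypothetical stand-in for one Dgraph process in the cluster."""
    name: str
    handled: list = field(default_factory=list)


class DatabaseRouter:
    """Toy model: each database has exactly one leader that takes all writes."""

    def __init__(self, nodes, leaders):
        self.nodes = {node.name: node for node in nodes}
        self.leaders = dict(leaders)  # database name -> leader node name
        self._next_reader = 0

    def route(self, database, is_write):
        if is_write:
            # Writes (updates, configuration changes) always go to the leader.
            node = self.nodes[self.leaders[database]]
        else:
            # Reads may be served by any node; round-robin is an assumption.
            names = sorted(self.nodes)
            node = self.nodes[names[self._next_reader % len(names)]]
            self._next_reader += 1
        node.handled.append((database, "write" if is_write else "read"))
        return node.name


router = DatabaseRouter(
    [DgraphNode("dgraph-a"), DgraphNode("dgraph-b")],
    {"wine": "dgraph-a", "weather": "dgraph-b"},
)
print(router.route("wine", is_write=True))
print(router.route("weather", is_write=True))
```

Note that leadership in this model is per database, so different Dgraphs can lead different databases at the same time.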
Each Dgraph database is stored under a name of this form:
<dataset>_indexes
For example:
edp_cli_edp_256b0c6b-cacf-478c-80bf-b5332f4f37ae_indexes
Each data set has its own Dgraph database, and there is only one data set per Dgraph database. The databases are stored in the directory you specify for the DGRAPH_INDEX_DIR property in the bdd.conf file. This directory is called the Dgraph databases directory.
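As a rough illustration of how the `DGRAPH_INDEX_DIR` property and the `<dataset>_indexes` pattern fit together, the sketch below builds the expected location of one database. It assumes that bdd.conf uses plain `KEY=value` lines and that the `_indexes` entry sits directly under the configured directory. The helper names and the sample path `/bdd/dgraph_databases` are hypothetical.

```python
import posixpath


def read_property(conf_text, key):
    """Return the value of a KEY=value line from bdd.conf-style text, or None."""
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip().strip('"')
    return None


def database_path(conf_text, dataset):
    """Join DGRAPH_INDEX_DIR with the <dataset>_indexes entry name."""
    index_dir = read_property(conf_text, "DGRAPH_INDEX_DIR")
    if index_dir is None:
        raise KeyError("DGRAPH_INDEX_DIR is not set")
    return posixpath.join(index_dir, f"{dataset}_indexes")


# Hypothetical bdd.conf fragment; the directory value is an assumption.
sample_conf = "# Dgraph settings\nDGRAPH_INDEX_DIR=/bdd/dgraph_databases\n"
print(database_path(sample_conf, "edp_cli_edp_256b0c6b-cacf-478c-80bf-b5332f4f37ae"))
```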
For example, if you create two data sets, Wine and Weather, in Studio, the Dgraph databases directory contains five databases: one for each of the two data sets, plus three internal databases. You may also see other databases in the Dgraph databases directory; these can be created as a result of committing a transformed data set.
This diagram illustrates this example:
DGRAPH NOTIFICATION {database} [0] Mounting database edp_cli_edp_256b0c6b-cacf-478c-80bf-b5332f4f37ae
Note that this entry is written by the Dgraph database log subsystem, which appears as {database} in the entry.
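When scanning Dgraph logs for mount events, a pattern like the following can pull the subsystem and database name out of such an entry. This is a minimal sketch based only on the sample entry above. Real log lines may carry additional leading fields, and the meaning of the bracketed value is not described here, so it is captured under the neutral name `bracket`.

```python
import re

# Matches the portion of the entry shown in the sample; any leading fields are ignored.
ENTRY = re.compile(
    r"DGRAPH\s+(?P<level>[A-Z]+)\s+\{(?P<subsystem>[^}]+)\}\s+"
    r"\[(?P<bracket>\d+)\]\s+(?P<message>.*)"
)


def parse_entry(line):
    """Return the fields of a Dgraph log entry as a dict, or None if no match."""
    match = ENTRY.search(line.strip())
    return match.groupdict() if match else None


def mounted_database(line):
    """Return the database name from a 'Mounting database' entry, else None."""
    fields = parse_entry(line)
    prefix = "Mounting database "
    if fields and fields["subsystem"] == "database" and fields["message"].startswith(prefix):
        return fields["message"][len(prefix):].strip()
    return None


sample = (
    "DGRAPH NOTIFICATION {database} [0] Mounting database "
    "edp_cli_edp_256b0c6b-cacf-478c-80bf-b5332f4f37ae"
)
print(mounted_database(sample))
```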
EDP: ProvisionDataSetFromHiveConfig{hiveDatabaseName=default, hiveTableName=warrantyclaims, newCollectionId=MdexCollectionIdentifier{databaseName=edp_cli_edp_256b0c6b-cacf-478c-80bf-b5332f4f37ae, collectionName=edp_cli_edp_256b0c6b-cacf-478c-80bf-b5332f4f37ae}}
You should also see database names in the logs for Studio, Dgraph HDFS Agent, and Transform Service.
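To correlate entries across these logs, it can help to extract the database name from the EDP entry and search for it elsewhere. The sketch below assumes the `MdexCollectionIdentifier{databaseName=..., collectionName=...}` layout seen in the sample entry; the function name is hypothetical.

```python
import re

IDENTIFIER = re.compile(
    r"MdexCollectionIdentifier\{databaseName=([^,}]+),\s*collectionName=([^,}]+)\}"
)


def collection_identifier(line):
    """Return (database_name, collection_name) from an EDP log entry, or None."""
    match = IDENTIFIER.search(line)
    return (match.group(1), match.group(2)) if match else None


sample = (
    "EDP: ProvisionDataSetFromHiveConfig{hiveDatabaseName=default, "
    "hiveTableName=warrantyclaims, newCollectionId=MdexCollectionIdentifier{"
    "databaseName=edp_cli_edp_256b0c6b-cacf-478c-80bf-b5332f4f37ae, "
    "collectionName=edp_cli_edp_256b0c6b-cacf-478c-80bf-b5332f4f37ae}}"
)
database, collection = collection_identifier(sample)
print(database)
```

The returned database name can then be used as a search string in the Studio, Dgraph HDFS Agent, and Transform Service logs.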
The HDFS Data at Rest Encryption feature, when enabled, allows data to be stored in encrypted HDFS directories called encryption zones. All files within an encryption zone are transparently encrypted and decrypted on the client side. Decrypted data is therefore never stored in HDFS.
If you have enabled HDFS Data at Rest Encryption, you can store your Dgraph databases in an encryption zone in HDFS. For details on enabling HDFS Data at Rest Encryption, see the Installation Guide.
The Dgraph Tracing Utility is a Dgraph diagnostic program used by Oracle Support. It stores Dgraph trace data, which is useful in troubleshooting the Dgraph. The utility starts when the Dgraph starts, keeps track of all Dgraph operations, and stops when the Dgraph shuts down. You can save and download trace data to share it with Oracle Support.
The Tracing Utility stores the Dgraph target trace data it collects in *.ebb files, which are useful in analyzing Dgraph crashes. The files are saved in the $DGRAPH_HOME/bin directory and are intended for use by Oracle Support. You can also manually generate and save the trace data with the bdd-admin script's get-blackbox command, as described in get-blackbox.
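Before contacting Oracle Support, you may want to check which trace files are present. The following sketch lists the *.ebb files under $DGRAPH_HOME/bin, newest first. It assumes the `DGRAPH_HOME` environment variable is set in the current shell (or that the path is passed in explicitly); the function name is hypothetical and this is not part of the bdd-admin script.

```python
import os
from pathlib import Path


def find_trace_files(dgraph_home=None):
    """Return the *.ebb files in <DGRAPH_HOME>/bin, most recently modified first."""
    home = dgraph_home or os.environ.get("DGRAPH_HOME")
    if not home:
        raise RuntimeError("DGRAPH_HOME is not set")
    bin_dir = Path(home) / "bin"
    files = [path for path in bin_dir.glob("*.ebb") if path.is_file()]
    return sorted(files, key=lambda path: path.stat().st_mtime, reverse=True)


if __name__ == "__main__" and os.environ.get("DGRAPH_HOME"):
    for trace_file in find_trace_files():
        print(trace_file)
```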