The Dgraph HDFS Agent writes its stdout/stderr output to a log file.
The Dgraph HDFS Agent --out flag specifies the file name and path of the Dgraph HDFS Agent's stdout/stderr log file. This log file is used for both import (ingest) and export operations.
The name and location of the output log file are set at installation time via the AGENT_OUT_FILE parameter of the bdd.conf configuration file. Typically, the log name is dgraphHDFSAgent.out and the location is the $BDD_HOME/logs directory.
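As a rough illustration of how the log location could be resolved in a script, the sketch below looks up AGENT_OUT_FILE in the text of a bdd.conf file and falls back to the typical name and location. It assumes that bdd.conf uses plain KEY=VALUE lines with # comments and that a value may reference $BDD_HOME literally; the function name resolve_agent_log_path is hypothetical and not part of the product.

```python
import os

def resolve_agent_log_path(conf_text, bdd_home):
    """Return the Dgraph HDFS Agent log path described by bdd.conf-style text.

    Assumption: the file holds KEY=VALUE lines and '#' starts a comment.
    """
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "AGENT_OUT_FILE":
            value = value.strip().strip('"').strip("'")
            # Expand a literal $BDD_HOME reference, if one is present.
            return value.replace("$BDD_HOME", bdd_home)
    # Parameter not found: fall back to the typical name and location.
    return os.path.join(bdd_home, "logs", "dgraphHDFSAgent.out")

conf = 'AGENT_OUT_FILE="$BDD_HOME/logs/dgraphHDFSAgent.out"\n'
print(resolve_agent_log_path(conf, "/opt/bdd"))
```

The fallback only mirrors the typical values mentioned above; an actual installation may use a different path.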
The Dgraph HDFS Agent log is especially important to check if you experience problems with loading records at the end of a Data Processing workflow. Errors received from the Dgraph (such as rejected records) are logged here.
Ingest operation messages
New import request received: MdexCollectionIdentifier{ databaseName=edp_cli_edp_4dd5ac28-2e85-4efc-a3c2-391b6a78f69c, collectionName=edp_cli_edp_4dd5ac28-2e85-4efc-a3c2-391b6a78f69c}, ... requestOrigin: FROM_DATASET
Received request for database edp_cli_edp_4dd5ac28-2e85-4efc-a3c2-391b6a78f69c
Starting ingest for: MdexCollectionIdentifier{ databaseName=edp_cli_edp_4dd5ac28-2e85-4efc-a3c2-391b6a78f69c, collectionName=edp_cli_edp_4dd5ac28-2e85-4efc-a3c2-391b6a78f69c}, ... requestOrigin: FROM_DATASET
Finished reading 9983 records for MdexCollectionIdentifier{ databaseName=edp_cli_edp_4dd5ac28-2e85-4efc-a3c2-391b6a78f69c, collectionName=edp_cli_edp_4dd5ac28-2e85-4efc-a3c2-391b6a78f69c}, ... requestOrigin: FROM_DATASET
createBulkIngester edp_cli_edp_4dd5ac28-2e85-4efc-a3c2-391b6a78f69c
sendRecordsToIngester 9983
closeBulkIngester
Ingest finished with 9983 records committed and 0 records rejected. Status: INGEST_FINISHED. Request info: MdexCollectionIdentifier{ databaseName=edp_cli_edp_4dd5ac28-2e85-4efc-a3c2-391b6a78f69c, collectionName=edp_cli_edp_4dd5ac28-2e85-4efc-a3c2-391b6a78f69c}, location: /user/bdd/edp/data/.dataIngestSwamp/..., user name: fcalvill, notification: {"workflowName":"CLIDataLoad", "sourceDatabaseName":null, "sourceDatasetKey":null, "targetDatabaseName": "edp_cli_edp_4dd5ac28-2e85-4efc-a3c2-391b6a78f69c", "targetDatasetKey":"edp_cli_edp_4dd5ac28-2e85-4efc-a3c2-391b6a78f69c", "ecid":"0000LMSUWCm7ADkpSw4Eyc1NSxM1000000", "status":"IN_PROGRESS", "startTime":1467209085630, "timestamp":1467209136298, "progressPercentage":0.0, "errorMessage":null, "trackingUrl":null, "properties":{"dataSetDisplayName":"WarrantyClaims", "isCli":"true"}}, actualEcid: 0000LMSUWCm7ADkpSw4Eyc1NSxM1000000, requestOrigin: FROM_DATASET
Notification server url: http://busgg2014.us.oracle.com:7003/bdd/v1/api/workflows
About to send notification
Terminating
Notification{workflowName=CLIDataLoad, sourceDatabaseName=null, sourceDatasetKey=null, targetDatabaseName=edp_cli_edp_4dd5ac28-2e85-4efc-a3c2-391b6a78f69c, targetDatasetKey=edp_cli_edp_4dd5ac28-2e85-4efc-a3c2-391b6a78f69c, ecid=0000LMSUWCm7ADkpSw4Eyc1NSxM1000000, status=SUCCEEDED, startTime=1467209085630, timestamp=1467209222088, progressPercentage=100.0, errorMessage=null, properties={dataSetDisplayName=WarrantyClaims, isCli=true}}
Notification sent successfully
Terminating
...
The sample log shows the following events:
The records to be ingested are read from the /user/bdd/edp/data/.dataIngestSwamp directory in HDFS.
The createBulkIngester operation is used to instantiate a Bulk Load ingester instance for the data set.
The sendRecordsToIngester operation sends the 9983 records to the Dgraph's ingester.
The Bulk Load ingester instance is closed with the closeBulkIngester operation.
The Status: INGEST_FINISHED message signals the end of the ingest operation. The message also lists the number of successfully committed records and the number of rejected records. In addition, the Dgraph HDFS Agent notifies Studio that the ingest has finished, at which point Studio updates the status attribute of the DataSet Inventory with the final status of the ingest operation. The status should be FINISHED for a successful ingest or ERROR if an error occurred.
The final notification sent to Studio reports a status of SUCCEEDED.
Note that throughout the workflow, the Dgraph HDFS Agent constantly sends notification updates to Studio, so that Studio can report the progress of the workflow to the end user.
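When checking many ingest runs, it can be convenient to pull the committed and rejected counts out of the log automatically. The following is a minimal sketch that matches the "Ingest finished with ..." message shown in the sample log; the function name ingest_summaries is hypothetical, and the pattern assumes the message wording is the same as in the sample.

```python
import re

# Matches the end-of-ingest message from the sample log.
# \s+ tolerates the message being wrapped across lines.
INGEST_SUMMARY = re.compile(
    r"Ingest finished with (\d+) records committed\s+"
    r"and (\d+) records rejected\.\s+Status:\s+(\w+)"
)

def ingest_summaries(log_text):
    """Return a list of (committed, rejected, status) tuples, one per ingest."""
    results = []
    for match in INGEST_SUMMARY.finditer(log_text):
        committed, rejected, status = match.groups()
        results.append((int(committed), int(rejected), status))
    return results

sample = (
    "closeBulkIngester\n"
    "Ingest finished with 9983 records committed and 0 records rejected. "
    "Status: INGEST_FINISHED. Request info: ...\n"
)
print(ingest_summaries(sample))  # [(9983, 0, 'INGEST_FINISHED')]
```

A non-zero rejected count in the result is a cue to look for the corresponding rejection messages in the same log.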
Rejected records
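When the Dgraph rejects a record, the error that it returns is written to the log, as in this example:
Received error message from server: Record rejected: Character <c> is not legal in XML 1.0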
A source record can also be rejected if it is too large. There is a limit of 128MB on the maximum size of a source record. An attempt to ingest a source record larger than 128MB fails and an error is returned (with the primary key of the rejected record), but the bulk load ingest process continues after that rejected record.
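To find rejected records quickly, the log can be scanned for the "Record rejected" message. The sketch below collects the reason text from each such line; it assumes the message prefix is exactly as shown in the example above. The size check is only illustrative: it assumes that 128MB means 128 * 1024 * 1024 bytes, which the text does not state, and both function names are hypothetical.

```python
import re

# Prefix taken from the rejected-record example message.
REJECTED = re.compile(
    r"Received error message from server: Record rejected: (.*)"
)

# Assumption: 128MB is interpreted as 128 * 1024 * 1024 bytes.
MAX_RECORD_BYTES = 128 * 1024 * 1024

def rejection_reasons(log_text):
    """Return the reason text of every rejected-record message in the log."""
    return [m.group(1).strip() for m in REJECTED.finditer(log_text)]

def exceeds_record_limit(size_in_bytes):
    """Return True if a source record of this size is over the assumed limit."""
    return size_in_bytes > MAX_RECORD_BYTES

line = ("Received error message from server: Record rejected: "
        "Character <c> is not legal in XML 1.0")
print(rejection_reasons(line))
```

Because the bulk load continues after a rejected record, a single log may contain several such messages for one ingest.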
Logging for new and deleted attributes
Finished reading 499 records for Collection name: default_edp_2a0122f2-4d15-46bf-9669-21333442f10b
Adding attributes to collection: default_edp_2a0122f2-4d15-46bf-9669-21333442f10b [NumInStock]
Added attributes to collection: default_edp_2a0122f2-4d15-46bf-9669-21333442f10b
...
Deleting attributes from collection: default_edp_2a0122f2-4d15-46bf-9669-21333442f10b [OldPrice2]
Deleted attributes from collection: default_edp_2a0122f2-4d15-46bf-9669-21333442f10b
In the example, the NumInStock attribute was added to the data set and the OldPrice2 attribute was deleted.
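The attribute messages can be summarized in the same way. This sketch, with the hypothetical function name attribute_changes, reads the bracketed names from the "Adding attributes to" and "Deleting attributes from" messages. The sample shows only one attribute per message, so treating the bracketed text as a comma-separated list is an assumption.

```python
import re

# Captures the action, the collection name, and the bracketed attribute list.
ATTR_CHANGE = re.compile(
    r"(Adding attributes to|Deleting attributes from) collection:\s+"
    r"(\S+)\s+\[([^\]]*)\]"
)

def attribute_changes(log_text):
    """Return {'added': [...], 'deleted': [...]} from attribute log messages."""
    changes = {"added": [], "deleted": []}
    for match in ATTR_CHANGE.finditer(log_text):
        action, _collection, names = match.groups()
        key = "added" if action.startswith("Adding") else "deleted"
        # Assumption: multiple names, if any, are separated by commas.
        changes[key].extend(n.strip() for n in names.split(",") if n.strip())
    return changes

sample_log = (
    "Adding attributes to collection: default_edp_2a0122f2-4d15-46bf-9669-21333442f10b [NumInStock]\n"
    "Added attributes to collection: default_edp_2a0122f2-4d15-46bf-9669-21333442f10b\n"
    "Deleting attributes from collection: default_edp_2a0122f2-4d15-46bf-9669-21333442f10b [OldPrice2]\n"
    "Deleted attributes from collection: default_edp_2a0122f2-4d15-46bf-9669-21333442f10b\n"
)
print(attribute_changes(sample_log))
# {'added': ['NumInStock'], 'deleted': ['OldPrice2']}
```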