This topic describes how to run a Refresh update operation.
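To run a Refresh update on a data set, run data_processing_CLI against that data set.
If the operation succeeds, the output is similar to the following example: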
[2016-06-24T09:56:22.963-04:00] [DataProcessing] [INFO] [] [org.apache.spark.Logging$class] [tid:main] [userID:fcalvill]
client token: N/A
diagnostics: N/A
ApplicationMaster host: 10.152.105.219
ApplicationMaster RPC port: 0
queue: root.fcalvill
start time: 1466776490743
final status: SUCCEEDED
tracking URL: http://bus2014.example.com:8088/proxy/application_1466716670116_0002/A
user: fcalvill
Refreshing existing collection: MdexCollectionIdentifier{
databaseName=edp_cli_edp_ad9a93eb-fbec-49ca-bdc9-8ac897dd5c8f,
collectionName=edp_cli_edp_ad9a93eb-fbec-49ca-bdc9-8ac897dd5c8f}
Collection key for new record: MdexCollectionIdentifier{
databaseName=refreshed_edp_a284bd0c-23fe-4d26-9e92-cbfc22b1555e,
collectionName=refreshed_edp_a284bd0c-23fe-4d26-9e92-cbfc22b1555e}
data_processing_CLI finished with state SUCCESS
EDP: DatasetRefreshConfig{hiveDatabase=, hiveTable=,
collectionToRefresh=MdexCollectionIdentifier{databaseName=edp_cli_edp_ad9a93eb-fbec-49ca-bdc9-8ac897dd5c8f,
collectionName=edp_cli_edp_ad9a93eb-fbec-49ca-bdc9-8ac897dd5c8f},
newCollectionId=MdexCollectionIdentifier{databaseName=refreshed_edp_a284bd0c-23fe-4d26-9e92-cbfc22b1555e,
collectionName=refreshed_edp_a284bd0c-23fe-4d26-9e92-cbfc22b1555e},
op=REFRESH_DATASET}
You can also check the Dgraph HDFS Agent log for the status of the Dgraph ingest operation.
Future Refresh updates on this data set continue to use the same Data Set Logical Name. Use the same name if you set up a Refresh update cron job for this data set.
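If the output is captured to a file or string (for example, from a cron job), the success indicators shown in the sample output above can be checked with a short script. The following is a minimal sketch, not part of the product: the function names summarize_refresh_output and refresh_succeeded are hypothetical, and the patterns assume the field labels appear exactly as in the sample output, which may differ between releases.

```python
import re

# Abbreviated copy of the sample output shown in this topic.
SAMPLE = """\
final status: SUCCEEDED
Refreshing existing collection: MdexCollectionIdentifier{
databaseName=edp_cli_edp_ad9a93eb-fbec-49ca-bdc9-8ac897dd5c8f,
collectionName=edp_cli_edp_ad9a93eb-fbec-49ca-bdc9-8ac897dd5c8f}
Collection key for new record: MdexCollectionIdentifier{
databaseName=refreshed_edp_a284bd0c-23fe-4d26-9e92-cbfc22b1555e,
collectionName=refreshed_edp_a284bd0c-23fe-4d26-9e92-cbfc22b1555e}
data_processing_CLI finished with state SUCCESS
"""


def _first_group(pattern, text):
    """Return the first capture group of pattern in text, or None."""
    match = re.search(pattern, text, re.S)
    return match.group(1) if match else None


def summarize_refresh_output(text):
    """Extract status fields and collection names from captured output.

    Hypothetical helper: assumes the labels match the sample output.
    """
    return {
        "yarn_final_status": _first_group(r"final status:\s*(\S+)", text),
        "cli_state": _first_group(
            r"data_processing_CLI finished with state\s+(\S+)", text),
        "refreshed_collection": _first_group(
            r"Refreshing existing collection:.*?collectionName=([^},\s]+)",
            text),
        "new_collection": _first_group(
            r"Collection key for new record:.*?collectionName=([^},\s]+)",
            text),
    }


def refresh_succeeded(summary):
    """True only if both the YARN job and the CLI report success."""
    return (summary["yarn_final_status"] == "SUCCEEDED"
            and summary["cli_state"] == "SUCCESS")


if __name__ == "__main__":
    summary = summarize_refresh_output(SAMPLE)
    for key, value in summary.items():
        print(f"{key}: {value}")
    print("refresh succeeded:", refresh_succeeded(summary))
```

A missing field is reported as None rather than raising an error, so truncated or failed output is treated as unsuccessful. This check covers only the command output. The status of the Dgraph ingest operation still has to be confirmed in the Dgraph HDFS Agent log.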