The Dgraph HDFS Agent is the conduit for exporting data from a Studio project.
From within a project in Studio, you can export data as a new Avro file (.avro extension), CSV file (.csv extension), or text file (.txt extension). Files can be exported either to an external directory on your computer or to HDFS. For details on the operation, see the Studio User's Guide.
When a user exports a data set to a file in HDFS from Studio, the exported file's owner will always be the owner of the HDFS Agent process (or the HDFS Agent principal owner in a Kerberized cluster). That is, the Dgraph HDFS Agent uses the username from the export request to create a FileSystem object. That way, BDD can guarantee that a file will not be created if the user does not have permissions, and that if the file is created, it is owned by that user. The group is assigned automatically by Hadoop.
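To confirm the ownership of an exported file, you can inspect its entry in the output of `hdfs dfs -ls`. The following is a minimal sketch, not part of BDD, that parses one such listing line and compares its owner with the account you expect. The helper names (`parse_ls_line`, `owned_by`), the user `alice`, the group `supergroup`, and the path are all hypothetical, and the sketch assumes the standard eight-column listing layout (permissions, replication, owner, group, size, date, time, path).

```python
from typing import NamedTuple


class HdfsEntry(NamedTuple):
    """Fields of interest from one `hdfs dfs -ls` listing line."""
    permissions: str
    owner: str
    group: str
    path: str


def parse_ls_line(line: str) -> HdfsEntry:
    """Parse one file line of `hdfs dfs -ls` output (hypothetical helper).

    Assumed columns: permissions, replication, owner, group, size,
    date, time, path.
    """
    # Split at most 7 times so that a path containing spaces stays whole.
    fields = line.strip().split(None, 7)
    if len(fields) != 8:
        raise ValueError(f"unrecognized listing line: {line!r}")
    permissions, _replication, owner, group, _size, _date, _time, path = fields
    return HdfsEntry(permissions, owner, group, path)


def owned_by(line: str, expected_user: str) -> bool:
    """Return True if the listed file is owned by expected_user."""
    return parse_ls_line(line).owner == expected_user


# Hypothetical listing line for an exported CSV file.
sample = (
    "-rw-r--r--   3 alice supergroup       2048 "
    "2016-03-01 10:15 /user/alice/exports/sales.csv"
)
entry = parse_ls_line(sample)
print(entry.owner, entry.group)  # prints "alice supergroup"
```

In practice, you would pass in a line captured from `hdfs dfs -ls` on the export directory, together with whichever account you expect to own the file.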
If you export to HDFS, you also have the option of creating a Hive table from the data. After the Hive table is created, a Data Processing workflow is launched to create a new data set.
The following diagram illustrates the process of exporting data from Studio into HDFS:
Errors that may occur during the export are entered into the Dgraph HDFS Agent's log.