Exporting data from Studio

The Dgraph HDFS Agent is the conduit for exporting data from a Studio project.

From within a project in Studio, you can export data as a new Avro file (.avro extension), CSV file (.csv extension), or text file (.txt extension). Files can be exported either to an external directory on your computer or to HDFS. For details on the operation, see the Studio User's Guide.

When a user exports a data set from Studio to a file in HDFS, the owner of the exported file is always the owner of the Dgraph HDFS Agent process (or the Dgraph HDFS Agent principal owner in a Kerberized cluster). That is, the Dgraph HDFS Agent uses the username from the export request to create a FileSystem object. That way, BDD can guarantee that a file will not be created if the user does not have permissions, and that if the file is created, it is owned by that user. The group is assigned automatically by Hadoop.

As part of the export operation, the user specifies the delimiter to be used in the exported file:
  • If the delimiter is a comma, the export process creates a .csv file.
  • If the delimiter is anything except a comma, the export process creates a .txt file.
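
The delimiter rule above can be sketched in a few lines of Python. This is only an illustration of the rule as described here, not code from Studio or the Dgraph HDFS Agent: the function names export_extension and read_exported are hypothetical, and the sample data (a header row, no quoting) is an assumption about what an exported file might contain.

```python
import csv
import io


def export_extension(delimiter: str) -> str:
    """Return the file extension described above for a given delimiter.

    Hypothetical helper: a comma yields .csv, any other delimiter yields .txt.
    """
    return ".csv" if delimiter == "," else ".txt"


def read_exported(text: str, delimiter: str) -> list:
    """Parse delimited text into a list of rows (each row a list of strings).

    The same delimiter chosen at export time must be supplied when reading.
    """
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


if __name__ == "__main__":
    print(export_extension(","))    # comma-delimited export
    print(export_extension("|"))    # any other delimiter
    # Assumed sample content; a real export may differ in layout.
    print(read_exported("id|name\n1|Ada\n", "|"))
```

A consumer of the exported file needs to know the delimiter that was chosen in Studio, because the .txt extension alone does not identify it.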

If you export to HDFS, you also have the option of creating a Hive table from the data. After the Hive table is created, a Data Processing workflow is launched to create a new data set.

The following diagram illustrates the process of exporting data from Studio into HDFS:

[Diagram: the process of exporting data from Studio (in Big Data Discovery) into HDFS]

In this diagram, the following actions take place:
  1. From Transform in Studio, you choose to export the data into HDFS. Studio sends an internal export request to the Dgraph.
  2. The Dgraph communicates with the Dgraph HDFS Agent, which launches the data export process and writes the file to HDFS.
  3. Optionally, you can choose to create a Hive table from the data. If you do so, the Hive table is created in HDFS.

Errors that occur during the export are written to the Dgraph HDFS Agent's log.