Data loading options

BDD offers several options for data loading. You can load data from Hive tables by running the data processing workflow with DP CLI. In Studio, you can also upload a personal file or import data from a JDBC source.

For initial loading of data into BDD, three methods exist:
  1. Load from Hive tables. When you run DP CLI, it launches the data processing workflow, which loads data from Hive tables into BDD.
  2. Personal file upload. In Studio you can upload a data set from a personal file.
  3. Import from a JDBC source. In Studio you can import a data set from a JDBC source.

This diagram illustrates data loading options:

[Figure: Data loading options]

With any of these methods, you can load either a sample of the data or the full data set. You can also start with a sample and load the full data set later. See Data loading and sample size.