You can load either a sample or a full data set. If you load a sample, you can go to a full data set later. This topic summarizes how to get from a sample to a full data set.
If you loaded a sample from Hive, you can use the DP CLI incremental update flag, or you can use Load Full Data Set in Studio, to load the entire source data set from Hive. You will then have a full data set in BDD. For detailed information on specifying the sample size with DP CLI, see the Data Processing Guide.
If you load a data set from a personal file or import it from a JDBC source, then all of the data in that file or source is loaded. However, this data may still be only a sample of larger source data that you have elsewhere on your system.
If you later want to add the full data from the source, first locate the Hive data set that BDD created when you loaded the file. Next, use the drop command in Hue to drop that data set, and replace it with a production Hive table. You can then run Load Full Data Set on this table in Studio to load the full data set.
This process is known as creating a BDD application. For detailed steps on this procedure, see the topic on creating a BDD application in the Studio User's Guide.
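The drop-and-replace step described above could be sketched as the pair of HiveQL statements that you might run in the Hue query editor. This is only an illustration under stated assumptions: the database and table names are hypothetical, the helper function is not part of BDD, and replacing the table by copying the production table under the same name (CREATE TABLE ... AS SELECT) is one possible reading of "replace", not the documented procedure. Refer to the Studio User's Guide for the actual steps.

```python
import re

# Plain Hive identifiers only; this keeps the sketch from emitting
# malformed or unsafe statements.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _checked(name):
    """Return name unchanged if it is a simple identifier, else raise."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"not a simple Hive identifier: {name!r}")
    return name


def build_replace_statements(database, bdd_table, production_table):
    """Build HiveQL that drops the table BDD created from an uploaded file
    and recreates it from a production table.

    All three arguments are hypothetical names supplied by the caller.
    Recreating the table under the same name is an assumption of this sketch.
    """
    db = _checked(database)
    old = _checked(bdd_table)
    prod = _checked(production_table)
    return [
        # Drop the Hive table that BDD created when the file was loaded.
        f"DROP TABLE IF EXISTS {db}.{old};",
        # Assumed replacement strategy: copy the production table's rows.
        f"CREATE TABLE {db}.{old} AS SELECT * FROM {db}.{prod};",
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for statement in build_replace_statements(
        "default", "uploaded_sales", "sales_full"
    ):
        print(statement)
```

After statements like these have run in Hue, Load Full Data Set would be run in Studio against the resulting table, as described above.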