Refresh flag syntax

This topic describes the syntax of the --refreshData flag.

A DP CLI Refresh update operation uses one of the following syntaxes:
./data_processing_CLI --refreshData <logicalName>
or
./data_processing_CLI --refreshData <logicalName> --table <tableName>
or
./data_processing_CLI --refreshData <logicalName> --table <tableName> --database <dbName>
where:
  • --refreshData (abbreviated as -refresh) is mandatory and specifies the logical name of the data set to be updated.
  • --table (abbreviated as -t) is optional and specifies a Hive table to be used for the source data. This flag allows you to override the source Hive table that was used to create the original data set (the name of the original Hive table is stored in the data set's metadata).
  • --database (abbreviated as -d) is optional and specifies the database of the Hive table specified with the --table flag. This flag allows you to override the database that was used to create the original data set. The --database flag can be used only if the --table flag is also used.

The logicalName value is available in the Data Set Logical Name property in Studio. For details, see Obtaining the Data Set Logical Name.
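The three syntaxes above can be sketched as a small shell helper that assembles the command line and rejects --database without --table. This is only an illustration: build_refresh_cmd is a hypothetical function, not part of DP CLI, and the logical name, table, and database values are placeholders. The helper prints the command instead of running it.

```shell
# Hypothetical helper (not part of DP CLI): prints the refresh command
# for the given arguments without executing it.
# Usage: build_refresh_cmd <logicalName> [tableName] [dbName]
build_refresh_cmd() {
    logical_name=${1:-}
    table_name=${2:-}
    db_name=${3:-}

    # --refreshData is mandatory and needs a logical name.
    if [ -z "$logical_name" ]; then
        echo "error: a data set logical name is required" >&2
        return 1
    fi
    # --database can be used only if --table is also used.
    if [ -n "$db_name" ] && [ -z "$table_name" ]; then
        echo "error: --database requires --table" >&2
        return 1
    fi

    cmd="./data_processing_CLI --refreshData $logical_name"
    if [ -n "$table_name" ]; then
        cmd="$cmd --table $table_name"
    fi
    if [ -n "$db_name" ]; then
        cmd="$cmd --database $db_name"
    fi
    echo "$cmd"
}

# Placeholder values; substitute the logical name shown in Studio.
build_refresh_cmd my_logical_name
build_refresh_cmd my_logical_name sales_staging
build_refresh_cmd my_logical_name sales_staging staging_db
```

Printing the command first makes it easy to review the flags before running the operation against a real data set.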

Use of the --table and --database flags

When a data set is first created, the names of the source Hive table and the source Hive database are stored in the DSI (DataSet Inventory) metadata for that data set. The --table flag allows you to override the default source Hive table, while the --database flag can override the database set in the data set's metadata.

Note that these two flags are ephemeral. That is, they are used only for the specific run of the operation and do not update the metadata of the data set.

If these flags are not specified, then the Hive table and Hive database that are used are the ones in the data set's metadata.
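The fallback behavior described above can be modeled with a short shell function. This is a sketch of the rule only, not the DP implementation: resolve_source is a hypothetical name, the table and database values are placeholders, and it assumes that when only --table is given, the database still comes from the data set's metadata.

```shell
# Hypothetical model of source resolution for a single refresh run.
# Usage: resolve_source <metaDb> <metaTable> [overrideTable] [overrideDb]
# The metadata values are never modified; overrides apply to this call only.
resolve_source() {
    meta_db=$1
    meta_table=$2
    override_table=${3:-}
    override_db=${4:-}

    # --database is valid only together with --table.
    if [ -n "$override_db" ] && [ -z "$override_table" ]; then
        echo "error: --database requires --table" >&2
        return 1
    fi

    # Use each override when present; otherwise fall back to the metadata.
    echo "${override_db:-$meta_db}.${override_table:-$meta_table}"
}

resolve_source default sales                            # prints default.sales
resolve_source default sales sales_staging              # prints default.sales_staging
resolve_source default sales sales_staging staging_db   # prints staging_db.sales_staging
resolve_source default sales                            # prints default.sales again
```

The last call shows the ephemeral nature of the flags: a run without them goes back to the table and database recorded in the metadata.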

Use these flags when you want to temporarily replace the data in a data set with data from another Hive table. If the data change is permanent, it is recommended that you create a new data set from the desired Hive table. This will also allow you to create a Transformation script that is exactly tailored to the new data set.