The DP CLI has a configuration file, edp-cli.properties, that sets some Data Processing properties for Provisioning and Refresh update workflows.
The edp-cli.properties file is located in the $BDD_HOME/dataprocessing/edp_cli/config directory. Some of the default values for the properties are populated from the bdd.conf installation configuration file. After installation, you can change the CLI configuration parameters by opening the edp-cli.properties file with a text editor.
These workflows combine the properties in the edp-cli.properties file with the remaining properties from the Workflow Manager's edp.properties file. You can therefore edit the edp.properties file to change other properties used by these workflows (such as Kerberos properties), as well as properties used by other types of workflows (such as Incremental update workflows).
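As an illustration of what an edited file might look like, the following is a minimal sketch of a fragment of edp-cli.properties. The property names are those documented in this section; the values are illustrative assumptions only, not the shipped defaults, which are populated from bdd.conf at installation. The datasetAccessType property is left out because its allowed values are not listed here.

```properties
# Hypothetical fragment of $BDD_HOME/dataprocessing/edp_cli/config/edp-cli.properties
# All values below are examples, not the installed defaults.

# Upper bound on the number of sampled records in a new data set
# (the default comes from MAX_RECORDS in bdd.conf)
maxRecordsForNewDataSet=1000000

# Whether to run the Data Enrichment modules
# (the default comes from ENABLE_ENRICHMENTS in bdd.conf)
runEnrichment=true

# Language applied to all attributes in the created data set
# (the default comes from LANGUAGE in bdd.conf; "en" is assumed to be a supported code)
defaultLanguage=en
```

Values set in this file act as defaults for the workflows; where a corresponding CLI flag exists (for example --maxRecords or --runEnrichment), the flag overrides the file for that run.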
Data Processing Property | Description |
---|---|
maxRecordsForNewDataSet | Specifies the maximum number of records in the sample size of a new data set (that is, the number of sampled records from the source Hive table). In effect, this sets the maximum number of records in a BDD data set. Note that this setting controls the sample size for all new data sets and it also controls the sample size resulting from transform operations (such as during a Refresh update on a data set that contains a transformation script). The default is set by the MAX_RECORDS property in the bdd.conf file. The CLI --maxRecords flag can override this setting. |
runEnrichment | Specifies whether to run the Data Enrichment modules. The default is set by the ENABLE_ENRICHMENTS property in the bdd.conf file. You can override this setting by using the CLI --runEnrichment flag. The CLI --excludePlugins flag can also be used to exclude some of the Data Enrichment modules. |
defaultLanguage | Sets the language for all attributes in the created data set. The default is set by the LANGUAGE property in the bdd.conf file. For the supported language codes, see Supported languages. |
datasetAccessType | Sets the access type for the data set, which determines which Studio users can access the data set in the Studio UI. This property takes one of these case-insensitive values: |