The DP CLI has a number of runtime flags that control its behavior. To list these flags, run the CLI with the --help flag.
Running the CLI without any arguments also lists the flags. Each flag has a full name that begins with two dashes, such as --maxRecords, and an abbreviated version that begins with one dash, such as -m.
| CLI flag | Description |
|---|---|
| -a, --all | Runs data processing on all Hive tables in all Hive databases. |
| -bl, --blackList <bl-file> | Specifies the file name for the blacklist used to filter out Hive tables. The tables in this list are ignored by Data Processing and not provisioned. |
| -d, --database <db-name> | Runs Data Processing using the specified Hive database. If a Hive table is not specified, runs on all Hive tables in the Hive database. |
| -e, --runEnrichment | Runs the Data Enrichment modules (except for the modules that never automatically run during the sampling phase). |
| -h, --help | Displays usage information. |
| -kryo, --kryoModeFlag | Activates kryoMode for optimized serialization. Test this mode against your specific data sets. |
| -m, --maxRecords <num> | Sets the maximum number of records to process. Overrides the CLI script's configuration setting. |
| -mwt, --maxWaitTime <secs> | Specifies the maximum time (in seconds) to wait for the processing of each table to complete. The next table is processed after this interval elapses or as soon as data ingestion completes, whichever comes first. This flag controls the pace of table processing and prevents the Hadoop and Spark cluster nodes, as well as the Dgraph cluster nodes, from being flooded with a large number of simultaneous requests. |
| -nr, --nonRandomizedCollectionNameFlag | Does not randomize the data set names. This flag is intended for specific testing purposes. |
| -p, --collectionPrefix <prefix> | Specifies the name prefix for data sets. Overrides the script configuration setting. |
| -perf, --perfDataCollection | For Oracle internal use only. |
| -t, --table <name> | Runs data processing on the specified Hive table. If a Hive database is not specified, assumes the default database set in the script configuration. Note that the table is skipped in these cases: it does not exist, is empty, or has the table property skipAutoProvisioning set. |
| -v, --versionNumber | Prints the version number of the current iteration of the Data Processing component within Big Data Discovery. |
| -wl, --whiteList <wl_file> | Specifies the file name for the whitelist used to select qualified Hive tables for processing. Each table on this list is processed by the Data Processing component and is ingested into the Dgraph as a BDD data set. |
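
As a rough illustration of how the full and abbreviated flag names map onto the same invocation, the following Python sketch assembles a DP CLI argument list from a handful of the flags in the table. The script path `./data_processing_CLI`, the helper `build_command`, and the database and table names are assumptions made for this example, not part of the documented interface; adjust them to match your installation.

```python
import shlex

# Assumption: the DP CLI script is invoked as ./data_processing_CLI.
# Adjust this path for your installation.
DP_CLI = "./data_processing_CLI"

# (full name, abbreviated name) pairs, copied from the flag table.
FLAGS = {
    "database": ("--database", "-d"),
    "table": ("--table", "-t"),
    "max_records": ("--maxRecords", "-m"),
    "max_wait_time": ("--maxWaitTime", "-mwt"),
}


def build_command(abbreviated=False, **options):
    """Return the argument list for one DP CLI run.

    Keys of `options` must be keys of FLAGS; options set to None are skipped.
    """
    form = 1 if abbreviated else 0
    argv = [DP_CLI]
    for key, value in options.items():
        if value is None:
            continue
        argv += [FLAGS[key][form], str(value)]
    return argv


# Hypothetical database and table names, for illustration only.
long_form = build_command(database="sales_db", table="orders", max_records=1000)
short_form = build_command(
    abbreviated=True, database="sales_db", table="orders", max_records=1000
)

# Print each argument list as a shell-ready command line.
print(shlex.join(long_form))
print(shlex.join(short_form))
```

Both lists describe the same run; only the spelling of the flag names differs. Either list could be handed to `subprocess.run` on a machine where the script is available.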