9 Using the Oracle Big Data Manager bdm-cli Utility
Use the bdm-cli (Oracle Big Data Manager Command Line Interface) utility to copy data and manage copy jobs at the command line.
               
bdm-cli has several commands that duplicate odcp commands, but bdm-cli also includes additional commands for scheduling and managing copy jobs and other administrative tasks.
                  
You have to download and install bdm-cli yourself, either on a node of the cluster or on a remote operating system. If you install it on your cluster, you must use SSH to connect to the cluster. If you install it on a remote system, you can run the commands without SSH. See Installing the bdm-cli Utility.
                  
bdm-cli when it’s installed outside the cluster. 
- Installing the bdm-cli Utility
 The bdm-cli (Big Data Command Line Interface) is a command line utility for copying data and managing copy jobs. You can download and install bdm-cli from GitHub. You can install it on a remote operating system, so you don’t have to use SSH to connect to the cluster.
- Usage
 You can use bdm-cli at the command line to create and manage copy jobs.
- Options
 Options that can be used by all bdm-cli commands are explained below.
- Subcommands
 The following table summarizes the bdm-cli subcommands. For more details on each, see the section for that command.
- bdm-cli abort_job
 Abort a running job.
- bdm-cli copy
 Execute a job to copy sources to destination.
- bdm-cli create_job
 Execute a new job from an existing template.
- bdm-cli create_job_template
 Create a new job template.
- bdm-cli get_data_source
 Find a data source by name.
- bdm-cli get_job
 Get a job by UUID.
- bdm-cli get_job_log
 Get a job log.
- bdm-cli list_all_jobs
 List all jobs from the execution history.
- bdm-cli list_template_executions
 List all jobs from the execution history for the given template.
- bdm-cli ls
 List files from a specific location.
9.1 Installing the bdm-cli Utility
The bdm-cli (Big Data Command Line Interface) is a command line utility for copying data and managing copy jobs. You can download and install bdm-cli from GitHub. You can install it on a remote operating system, so you don’t have to use SSH to connect to the cluster.
                  
To install bdm-cli:
                     
- If you use a proxy server, first call:
  export http_proxy="your_proxy_server"
  export https_proxy="your_proxy_server"
- Then call:
  curl -L "https://github.com/jazeman/bdm-python-cli/blob/1.0/install-rpm?raw=true" | bash
9.2 Usage
You can use bdm-cli at the command line to create and manage copy jobs. 
                  
Syntax
bdm-cli [global_options] subcommand [options] [arguments]...
Supported Storage Protocols and Paths
The protocols and paths to the file systems and storage services supported by bdm-cli are:
- HDFS: hdfs:///
- Oracle Cloud Infrastructure Object Storage Classic (formerly known as Oracle Storage Cloud Service): swift://container.provider/
- Oracle Cloud Infrastructure Object Storage (formerly known as Oracle Bare Metal Cloud Object Storage Service): oss:///container
For operations with Oracle Cloud Infrastructure Object Storage, you must specify the provider by using the options src-provider and dst-provider. For example, those options are used with bdm-cli create_job when the job works with Oracle Cloud Infrastructure Object Storage.
Finding a Job’s UUID
A number of bdm-cli subcommands require that you identify a job by its Universally Unique Identifier (UUID). To find UUIDs, execute bdm-cli list_all_jobs.
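The layout of the list_all_jobs output is not shown in this chapter, so the following is only a minimal sketch: it filters any text for strings that have the shape of a UUID. The function name extract_uuids is hypothetical, and the filter may also pick up UUID-shaped values that are not job identifiers.

```shell
# Print every UUID-shaped token found on standard input, one per line.
# The match is on shape only (8-4-4-4-12 hexadecimal digits); it does not
# depend on any particular bdm-cli output layout.
extract_uuids() {
    grep -Eo '[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}'
}

# Hypothetical usage (needs bdm-cli and a reachable server; connection
# options are assumed to be supplied separately):
#   bdm-cli -f csv list_all_jobs | extract_uuids
```

Review the extracted values before passing one to a subcommand such as abort_job.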
Specifying Source and Destination Paths
When specifying sources and destinations, fully qualify the paths:
- source ...: File name qualified by protocol and full path, for example: hdfs:///user/oracle/test.raw
- destination: Directory name qualified by protocol and full path, for example: swift://container.storagename/test-dir
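A script that assembles bdm-cli commands could check that each path is qualified before it submits a job. The sketch below is one way to do that; the function name is_qualified_path is hypothetical, and the check looks only at the protocol prefixes listed in this section, not at whether the path exists.

```shell
# Return 0 if the argument starts with one of the supported protocol
# prefixes (hdfs:///, swift://, oss:///), and 1 otherwise.
is_qualified_path() {
    case "$1" in
        hdfs:///*|swift://?*|oss:///?*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report which of a few sample paths would be accepted.
for p in hdfs:///user/oracle/test.raw \
         swift://container.storagename/test-dir \
         /user/oracle/test.raw; do
    if is_qualified_path "$p"; then
        echo "qualified:     $p"
    else
        echo "not qualified: $p"
    fi
done
```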
Setting Environment Variables
You can set some bdm-cli options as environment variables. For example, you can set the Oracle Big Data Manager URL and user password file as follows:
export BDM_URL=https://hostname:8888/bdcs/api && export BDM_PASSWORD=/tmp/password_file
All the bdm-cli options that can be set as environment variables are documented in the sections below.
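Because the password file holds a credential, it is reasonable to create it with restrictive permissions before pointing BDM_PASSWORD at it. The following is a sketch under assumptions: the host name and the file location are placeholders, and the expected contents of the password file are not described in this chapter, so the file is created empty for you to fill in.

```shell
# Placeholder values; replace them with those of your own deployment.
BDM_HOST="hostname"                                   # assumed server host name
PASSWORD_FILE="${TMPDIR:-/tmp}/bdm_password_file.$$"  # hypothetical location

# Create the (empty) password file readable and writable by its owner only.
( umask 077 && : > "$PASSWORD_FILE" )

# Add the password to the file yourself; it is deliberately not hard-coded.

export BDM_URL="https://${BDM_HOST}:8888/bdcs/api"
export BDM_PASSWORD="$PASSWORD_FILE"

echo "BDM_URL=$BDM_URL"
echo "BDM_PASSWORD=$BDM_PASSWORD"
```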
                     
Getting Help
To get help for bdm-cli, use:
bdm-cli --help
bdm-cli command --help
bdm-cli edit_job_template --help
9.3 Options
Options that can be used by all bdm-cli commands are explained below.
| Option | Description | 
|---|---|
| --bdm-passwd path_to_password_file | Path to the Oracle Big Data Manager user password file. Environment variable: BDM_PASSWORD | 
| --bdm-url bdm_url | Oracle Big Data Manager server URL. Environment variable: BDM_URL | 
| --bdm-username username | Oracle Big Data Manager server user name. Can also be set as an environment variable. | 
| -f [table\|csv\|json] | Specify the output format: table, csv, or json. | 
| --fields fields | Specifies comma-separated fields depending on the type of object. | 
| --help | Show this message and exit. | 
| --no-check-certificate | Don't validate the server's certificate. | 
| --proxy proxy | Proxy server. | 
| --tenant-name tenant_name | Name of the tenant. | 
| -v | Print the REST request body. | 
| --version | Show the Oracle Big Data Manager version and exit. | 
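Since the connection options are the same for every call, a small shell function can supply them once. This is a sketch, not part of bdm-cli: the function name bdm is hypothetical, and the variables DATA_HOST, DATA_USER, and USER_PASSWORD_FILE follow the naming used in the examples later in this chapter. It leaves out --no-check-certificate so that the server certificate is validated.

```shell
# Hypothetical wrapper: run any bdm-cli subcommand with the connection
# options taken from shell variables, so they are written only once.
# Global options are placed before the subcommand.
bdm() {
    bdm-cli -f json \
        --bdm-url "${DATA_HOST}:8888/bdcs/api" \
        --bdm-username "${DATA_USER}" \
        --bdm-passwd "${USER_PASSWORD_FILE}" \
        "$@"
}

# Usage, assuming bdm-cli is installed and the variables are set:
#   bdm list_all_jobs
#   bdm get_job "$JOB_UUID"
```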
9.4 Subcommands
The following table summarizes the bdm-cli subcommands. For more details on each, see the section for that command.
| Command | Description | 
|---|---|
| bdm-cli abort_job | Abort a running job. | 
| bdm-cli copy | Execute a job to copy sources to destination. | 
| bdm-cli create_job | Execute a new job from an existing template. | 
| bdm-cli create_job_template | Create a new job template. | 
| bdm-cli get_data_source | Find a data source by name. | 
| bdm-cli get_job | Get a job by UUID. | 
| bdm-cli get_job_log | Get a job log. | 
| bdm-cli list_all_jobs | List all jobs from the execution history. | 
| bdm-cli list_template_executions | List all jobs from the execution history for the given template. | 
| bdm-cli ls | List files from a specific location. | 
9.5 bdm-cli abort_job
Abort a running job.
Syntax
bdm-cli abort_job [options] job_uuid
Options
| Option | Description | 
|---|---|
|  | Force abort job. | 
| --help | Show this message and exit. | 
Example
Abort a job.
/usr/bin/bdm-cli -f json --no-check-certificate --bdm-url ${DATA_HOST}:8888/bdcs/api --bdm-username ${DATA_USER} --bdm-passwd ${USER_PASSWORD_FILE} abort_job 24ef30e8-913b-4402-baf8-74b99c211f50
9.6 bdm-cli copy
Execute a job to copy sources to destination.
Syntax
bdm-cli copy [options] source... destination
Options
| Option | Description | 
|---|---|
|  | Specify the block size in bytes. | 
|  | Data source description. | 
|  | Specify the maximum amount of memory for the Oracle Storage Cloud Service driver. | 
| --dst-provider | Specify the provider of the destination, when using an Oracle Cloud Infrastructure Object Storage Classic destination. | 
| --help | Show this message and exit. | 
|  | Specify the Spark executors memory limit in GB per node. | 
|  | Specify the maximum number of Spark executors per node. | 
|  | Specify the maximum number of threads per node. | 
|  | Specify the part size in bytes. | 
|  | Recursively copy (enabled by default). | 
|  | Retry data transfer in case of failure. | 
| --src-provider | Specify the provider of the source, when using an Oracle Cloud Infrastructure Object Storage Classic source. | 
|  | Synchronize the source with the destination. | 
Example
Copy a file from HDFS to Oracle Cloud Infrastructure Object Storage:
/usr/bin/bdm-cli -f json --no-check-certificate --bdm-url ${DATA_HOST}:8888/bdcs/api --bdm-username ${DATA_USER} --bdm-passwd ${USER_PASSWORD_FILE} copy hdfs:///user/${DATA_USER}/1MFile.raw oss:///${DATA_USER} --dst-provider ${OSS_PROVIDER}
9.7 bdm-cli create_job
Execute a new job from an existing template.
Syntax
bdm-cli create_job [options] job_template_name
Options
| Option | Description | 
|---|---|
|  | Execute job immediately if job scheduling is set. Ignored otherwise. | 
|  | Source file. | 
|  | The destination directory. | 
|  | Specify the maximum amount of memory for an Oracle Storage Cloud Service driver. | 
|  | Specify the Spark executors memory limit in GB per node. | 
|  | Specify the maximum number of Spark executors per node. | 
|  | Specify the maximum number of threads per node. | 
|  | Specify the block size in bytes. | 
|  | Specify the part size in bytes. | 
|  | Retry data transfer in case of failure. | 
|  | Synchronize the source with the destination. | 
|  | Recursively copy (enabled by default). | 
|  | Main Java class used for the Spark job execution. | 
| --src-provider | Specify the provider of the source, when using an Oracle Cloud Infrastructure Object Storage Classic source. | 
| --dst-provider | Specify the provider of the destination, when using an Oracle Cloud Infrastructure Object Storage Classic destination. | 
| --help | Show this message and exit. | 
9.8 bdm-cli create_job_template
Create a new job template.
Syntax
bdm-cli create_job_template [options] job_template_name source ... destination
Options
| Option | Description | 
|---|---|
|  | Abort an already running execution if the next scheduled execution is started. | 
|  | Specify the block size in bytes. | 
|  | Job's data source name. | 
|  | Job template description. | 
|  | Specify for  | 
|  | Environment in JSON format. | 
| --help | Show this message and exit. | 
|  | Count of executions history log. | 
|  | Main Java class used for the Spark job execution. | 
|  | Specify a cron-like job schedule. | 
|  | Specify the job template type. | 
|  | Hadoop libraries. This option can have multiple values. | 
|  | Specify the Spark executors memory limit in GB per node. | 
|  | Specify the maximum number of Spark executors per node. | 
|  | Specify the maximum number of threads per node. | 
|  | Specify the part size in bytes. | 
|  | Recursively copy (enabled by default). | 
|  | Retry data transfer in case of failure. | 
| --src-provider | Specify the provider of the source, when using an Oracle Bare Metal Cloud Object Storage Service source. | 
|  | Synchronize the source with the destination. | 
|  | User-defined tag. This option can have multiple values. | 
9.9 bdm-cli get_data_source
Find a data source by name.
Syntax
bdm-cli get_data_source [options] data_source_name
Options
| Option | Description | 
|---|---|
| --help | Show this message and exit. | 
9.10 bdm-cli get_job
Get a job by UUID.
Syntax
bdm-cli get_job [options] job_uuid
Options
| Option | Description | 
|---|---|
| --help | Show this message and exit. | 
Example
Get information on a job.
/usr/bin/bdm-cli -f json --no-check-certificate --bdm-url ${DATA_HOST}:8888/bdcs/api --bdm-username ${DATA_USER} --bdm-passwd ${USER_PASSWORD_FILE} get_job ${JOB_UUID}
9.11 bdm-cli get_job_log
Get a job log.
Syntax
bdm-cli get_job_log [options] job_uuid
Options
| Option | Description | 
|---|---|
| --help | Show this message and exit. | 
9.12 bdm-cli list_all_jobs
List all jobs from the execution history.
Syntax
bdm-cli list_all_jobs [options]
Options
| Option | Description | 
|---|---|
| --help | Show this message and exit. | 
| --limit | Specify the size of the page. | 
| --offset | Specify the paging offset. | 
Example
List all jobs.
/usr/bin/bdm-cli -f json --no-check-certificate --bdm-url ${DATA_HOST}:8888/bdcs/api --bdm-username ${DATA_USER} --bdm-passwd ${USER_PASSWORD_FILE} list_all_jobs
Use the --offset and --limit options to restrict the results. For example, to get the eighth page when there are 20 rows per page, do the following:
bdm-cli list_all_jobs --offset 8 --limit 20
9.13 bdm-cli list_template_executions
List all jobs from the execution history for the given template.
Syntax
bdm-cli list_template_executions [options] job_uuid
Options
| Option | Description | 
|---|---|
| --help | Show this message and exit. | 
9.14 bdm-cli ls
List files from a specific location.
Syntax
bdm-cli ls [options] path_1 ... path_n
Options
| Option | Description | 
|---|---|
|  | Human readable file sizes. | 
|  | List directories only. | 
| --provider | Specify the provider for Oracle Bare Metal Cloud Object Storage Service paths. | 
| --help | Show this message and exit. | 
Examples
List HDFS content under selected user.
/usr/bin/bdm-cli -f json --no-check-certificate --bdm-url ${DATA_HOST}:8888/bdcs/api --bdm-username ${DATA_USER} --bdm-passwd ${USER_PASSWORD_FILE} ls hdfs:///user/${DATA_USER}/integration_in --provider hdfs
List Oracle Cloud Infrastructure Object Storage Classic content under selected user.
/usr/bin/bdm-cli -f json --no-check-certificate --bdm-url ${DATA_HOST}:8888/bdcs/api --bdm-username ${DATA_USER} --bdm-passwd ${USER_PASSWORD_FILE} ls oss:///${OSS_CONTAINER}/ --provider ${OSS_PROVIDER}