9 Use the odcp Command Line Utility to Copy Data
Use the odcp command line utility to manage jobs that copy data between HDFS on your cluster and remote storage providers.
odcp uses Spark to transfer one or more files in parallel. It splits each input file into chunks, which are then transferred in parallel to the destination. By default, the transferred chunks are merged back into one output file.
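The split, parallel transfer, and merge steps described above can be illustrated with a minimal Python sketch. This is not how odcp is implemented: it assumes local files instead of HDFS or object storage and a thread pool instead of Spark, and the names `split_ranges`, `copy_chunk`, and `chunked_copy` are hypothetical, chosen only for illustration.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def split_ranges(size, chunk_size):
    """Return (offset, length) pairs that cover a file of `size` bytes."""
    return [(off, min(chunk_size, size - off)) for off in range(0, size, chunk_size)]


def copy_chunk(src, part_path, offset, length):
    """Copy one byte range of `src` into its own part file."""
    with open(src, "rb") as fin, open(part_path, "wb") as fout:
        fin.seek(offset)
        fout.write(fin.read(length))
    return part_path


def chunked_copy(src, dst, chunk_size=1024 * 1024, workers=4):
    """Split `src` into chunks, copy them in parallel, then merge into `dst`.

    Illustration only: a thread pool stands in for Spark executors and
    local part files stand in for chunks written to remote storage.
    """
    ranges = split_ranges(os.path.getsize(src), chunk_size)
    with tempfile.TemporaryDirectory() as tmp:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = [
                pool.submit(copy_chunk, src,
                            os.path.join(tmp, f"part-{i:05d}"), off, length)
                for i, (off, length) in enumerate(ranges)
            ]
            # Results are collected in submission order, which is chunk order.
            parts = [f.result() for f in futures]
        # Merge step: concatenate the parts back into one output file.
        with open(dst, "wb") as out:
            for part in parts:
                with open(part, "rb") as fin:
                    out.write(fin.read())
    return len(parts)
```

Calling `chunked_copy("input.bin", "output.bin", chunk_size=1000)` on a 2,560-byte file would copy three chunks (1,000, 1,000, and 560 bytes) and merge them into a byte-identical output file. Skipping the merge loop would leave one part file per chunk, which corresponds to the non-default, unmerged behavior the text implies.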
odcp supports copying files that use the following storage systems and protocols:
- Apache Hadoop Distributed File System (HDFS)
- Apache WebHDFS and Secure WebHDFS (SWebHDFS)
- Amazon Simple Storage Service (S3)
- Oracle Cloud Infrastructure Object Storage
- Hypertext Transfer Protocol (HTTP) and HTTP Secure (HTTPS), used for sources only