What Is odcp?

odcp is a command-line interface for copying very large files in a distributed environment.

odcp uses Spark to provide parallel transfer of one or more files. It takes each input file and splits it into chunks, which are then transferred in parallel to the destination. By default, the transferred chunks are then merged back into a single output file.
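The split, parallel transfer, and merge steps described above can be illustrated with a minimal local sketch. This is not odcp's implementation and does not use Spark: it assumes plain local files and a thread pool in place of Spark executors, and the names plan_chunks, copy_chunk, and chunked_copy are hypothetical helpers introduced only for this illustration.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def plan_chunks(size, chunk_size):
    """Return (offset, length) pairs that cover a file of `size` bytes."""
    return [(off, min(chunk_size, size - off)) for off in range(0, size, chunk_size)]


def copy_chunk(src, part_path, offset, length):
    """Copy one byte range of `src` into its own part file."""
    with open(src, "rb") as fin, open(part_path, "wb") as fout:
        fin.seek(offset)
        fout.write(fin.read(length))
    return part_path


def chunked_copy(src, dst, chunk_size=64 * 1024 * 1024, workers=4):
    """Split `src` into chunks, copy them in parallel, then merge into `dst`.

    Assumption: a local thread pool stands in for distributed workers.
    """
    chunks = plan_chunks(os.path.getsize(src), chunk_size)
    with tempfile.TemporaryDirectory() as tmp:
        parts = [os.path.join(tmp, f"part-{i:05d}") for i in range(len(chunks))]
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = [
                pool.submit(copy_chunk, src, part, offset, length)
                for part, (offset, length) in zip(parts, chunks)
            ]
            for future in futures:
                future.result()  # re-raise any error from a worker
        # Merge step: concatenate the part files in offset order.
        with open(dst, "wb") as out:
            for part in parts:
                with open(part, "rb") as fin:
                    out.write(fin.read())
```

In this sketch the chunks can be copied in any order, because each one is written to its own part file. Only the final merge has to follow offset order.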

odcp is compatible with Cloudera Distribution Including Apache Hadoop (CDH) 5.7.x and supports copying files with the following storage systems and protocols:
  • Apache Hadoop Distributed File System (HDFS)
  • Apache WebHDFS and Secure WebHDFS (SWebHDFS)
  • Oracle Cloud Infrastructure Object Storage Classic
  • Amazon Simple Storage Service (S3)
  • Oracle Cloud Infrastructure Object Storage
  • Hypertext Transfer Protocol (HTTP) and HTTP Secure (HTTPS), supported as sources only