Required Hadoop client libraries

BDD requires a number of client libraries to interact with Hadoop. When the installer runs, it adds these libraries to a single jar, called the Hadoop fat jar, which it distributes to all BDD nodes.
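
To make the idea of a fat jar concrete, the following is a minimal sketch of combining several jars into one. It is not the installer's actual implementation. The function name merge_jars and the first-entry-wins rule for duplicate names are assumptions made for illustration only.

```python
import zipfile


def merge_jars(jar_paths, output_jar):
    """Combine the entries of several jars into one jar (hypothetical helper).

    If the same entry name appears in more than one jar, the first one wins.
    Returns the sorted list of entry names written to the output jar.
    """
    seen = set()
    with zipfile.ZipFile(output_jar, "w", zipfile.ZIP_DEFLATED) as out:
        for jar in jar_paths:
            with zipfile.ZipFile(jar) as src:
                for info in src.infolist():
                    # Skip directory entries and names already written.
                    if info.is_dir() or info.filename in seen:
                        continue
                    seen.add(info.filename)
                    out.writestr(info, src.read(info))
    return sorted(seen)
```

A real fat-jar build typically also has to deal with manifests and signature files under META-INF, which this sketch does not attempt.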

How you obtain the client libraries depends on your Hadoop distribution. If you have CDH, the installer downloads them automatically, which requires an internet connection on the install machine. If the installer can't download all of the client libraries, it fails and you must download them manually. See Failure to download the Hadoop client libraries for more information.

If you have HDP, you must manually copy the client libraries from your Hadoop nodes to the install machine. The specific libraries you need depend on the version of HDP you have.

HDP 2.2.4

Note: These directories might not all be on the same node.
If you have HDP 2.2.4, locate the following directories on your Hadoop nodes and copy them to the install machine:
  • /usr/hdp/<version>/pig/lib/h2/
  • /usr/hdp/<version>/hive/lib/
  • /usr/hdp/<version>/spark/lib/
  • /usr/hdp/<version>/spark/external/spark-native-yarn/lib/
  • /usr/hdp/<version>/hadoop/
  • /usr/hdp/<version>/hadoop/lib/
  • /usr/hdp/<version>/hadoop-hdfs/
  • /usr/hdp/<version>/hadoop-hdfs/lib/
  • /usr/hdp/<version>/hadoop-yarn/
  • /usr/hdp/<version>/hadoop-yarn/lib/
  • /usr/hdp/<version>/hadoop-mapreduce/
  • /usr/hdp/<version>/hadoop-mapreduce/lib/
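
Because the directories can be spread across nodes, one way to gather them is to run a small script on each Hadoop node that copies whichever listed directories exist there into a staging directory. The following is a minimal sketch under that assumption. The names stage_client_dirs, hdp_root, and staging_dir are hypothetical, and hdp_root stands for /usr/hdp/<version> on your system. It needs Python 3.8 or later for dirs_exist_ok.

```python
import shutil
from pathlib import Path

# Subdirectories of /usr/hdp/<version> listed above for HDP 2.2.4.
HDP_224_SUBDIRS = [
    "pig/lib/h2",
    "hive/lib",
    "spark/lib",
    "spark/external/spark-native-yarn/lib",
    "hadoop",
    "hadoop/lib",
    "hadoop-hdfs",
    "hadoop-hdfs/lib",
    "hadoop-yarn",
    "hadoop-yarn/lib",
    "hadoop-mapreduce",
    "hadoop-mapreduce/lib",
]


def stage_client_dirs(hdp_root, subdirs, staging_dir):
    """Copy each listed directory that exists on this node into staging_dir.

    Returns (copied, missing) so you can see what still has to come
    from another node.
    """
    copied, missing = [], []
    for sub in subdirs:
        src = Path(hdp_root) / sub
        if src.is_dir():
            # dirs_exist_ok lets overlapping entries (hadoop and hadoop/lib) coexist.
            shutil.copytree(
                src,
                Path(staging_dir) / sub,
                dirs_exist_ok=True,
                ignore_dangling_symlinks=True,
            )
            copied.append(sub)
        else:
            missing.append(sub)
    return copied, missing
```

The staging directory would then still need to be transferred to the install machine with whatever file-copy tool your site uses. Any subdirectory reported as missing has to be collected from a different node.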

HDP 2.3.x

Note: These directories might not all be on the same node.
If you have HDP 2.3.x, locate the following directories on your Hadoop nodes and copy them to the install machine:
  • /usr/hdp/<version>/hive/lib/
  • /usr/hdp/<version>/spark/lib/
  • /usr/hdp/<version>/hadoop/
  • /usr/hdp/<version>/hadoop/lib/
  • /usr/hdp/<version>/hadoop-hdfs/
  • /usr/hdp/<version>/hadoop-hdfs/lib/
  • /usr/hdp/<version>/hadoop-yarn/
  • /usr/hdp/<version>/hadoop-yarn/lib/
  • /usr/hdp/<version>/hadoop-mapreduce/
  • /usr/hdp/<version>/hadoop-mapreduce/lib/
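
After copying, it can help to confirm on the install machine that every directory in the HDP 2.3.x list arrived and contains jars. The following is a minimal sketch of such a check. It assumes the copies were placed under a single directory that preserves the same relative layout, which the installer itself may not require. The names check_staged and staging_dir are hypothetical.

```python
from pathlib import Path

# Subdirectories of /usr/hdp/<version> listed above for HDP 2.3.x.
HDP_23X_SUBDIRS = [
    "hive/lib",
    "spark/lib",
    "hadoop",
    "hadoop/lib",
    "hadoop-hdfs",
    "hadoop-hdfs/lib",
    "hadoop-yarn",
    "hadoop-yarn/lib",
    "hadoop-mapreduce",
    "hadoop-mapreduce/lib",
]


def check_staged(staging_dir, subdirs):
    """Report, per expected subdirectory, how many .jar files sit directly in it.

    A value of None means the directory was not found at all.
    """
    report = {}
    for sub in subdirs:
        directory = Path(staging_dir) / sub
        if directory.is_dir():
            report[sub] = len(list(directory.glob("*.jar")))
        else:
            report[sub] = None
    return report


def missing_dirs(report):
    """Return the expected subdirectories that were not found."""
    return [sub for sub, count in report.items() if count is None]
```

A directory that is present but reports zero jars is worth rechecking against the source node, since the copy may have been incomplete.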