Next, obtain the Hadoop client libraries and put them on the Admin
Server.
BDD requires a number of client libraries to interact with Hadoop. In
a typical Hadoop cluster, these libraries are spread across different
locations, making it difficult for BDD to find them all. To solve this issue,
the upgrade script adds the required libraries to a single JAR, called the
Hadoop fat JAR, and distributes it to all BDD nodes.
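To illustrate what a fat JAR is, the following minimal Python sketch merges the entries of several JARs into a single archive. It is not the upgrade script's actual implementation; the function name build_fat_jar is hypothetical, and the sketch assumes that the first copy of a duplicated entry wins and that everything under META-INF/ (manifests, signatures) can be skipped.

```python
import zipfile


def build_fat_jar(jar_paths, output_path):
    """Merge the entries of several JARs into one archive.

    Illustrative assumptions: the first occurrence of an entry name wins,
    and everything under META-INF/ is skipped.
    """
    seen = set()
    with zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as fat:
        for jar in jar_paths:
            with zipfile.ZipFile(jar) as src:
                for info in src.infolist():
                    name = info.filename
                    # Skip duplicates and per-JAR metadata.
                    if name in seen or name.upper().startswith("META-INF/"):
                        continue
                    seen.add(name)
                    fat.writestr(info, src.read(name))
    # Return the merged entry names so the caller can inspect the result.
    return sorted(seen)
```

You do not need to run anything like this yourself; the upgrade script builds and distributes the Hadoop fat JAR once the libraries are on the Admin Server.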
The specific libraries you need depend on your Hadoop distribution.
You can put them in any location on the Admin Server, because you will
specify that location in bdd.conf.
Note: If you're upgrading from BDD 1.0, be sure to obtain the libraries
for one of the currently supported CDH versions, even though you haven't
upgraded to it yet.
- CDH: Download the following files from
  http://archive-primary.cloudera.com/cdh5/cdh/5/ to the Admin Server and
  extract them:
  - spark-<spark_version>.cdh.<cdh_version>.tar.gz
  - hive-<hive_version>.cdh.<cdh_version>.tar.gz
  - hadoop-<hadoop_version>.cdh.<cdh_version>.tar.gz
  - avro-<avro_version>.cdh.<cdh_version>.tar.gz
  Be sure to download the files that correspond to the component versions
  you currently have installed (unless you're upgrading from BDD 1.0).
- HDP: Copy the following library directories from your Hadoop nodes to the
  Admin Server. Note that these directories might not all be on the same node.
  - /usr/hdp/<version>/hive/lib/
  - /usr/hdp/<version>/spark/lib/
  - /usr/hdp/<version>/hadoop/
  - /usr/hdp/<version>/hadoop/lib/
  - /usr/hdp/<version>/hadoop-hdfs/
  - /usr/hdp/<version>/hadoop-hdfs/lib/
  - /usr/hdp/<version>/hadoop-yarn/
  - /usr/hdp/<version>/hadoop-yarn/lib/
  - /usr/hdp/<version>/hadoop-mapreduce/
  - /usr/hdp/<version>/hadoop-mapreduce/lib/
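The file names and directories listed above follow fixed patterns, so you can generate them from the versions you have installed and use the result as a checklist. The following is a minimal Python sketch under that assumption; the names cdh_tarballs, hdp_library_dirs, and HDP_SUBDIRS are hypothetical helpers and are not part of BDD. The version strings are supplied by you.

```python
# Base URL for the CDH archives, as given in this section.
CDH_ARCHIVE_URL = "http://archive-primary.cloudera.com/cdh5/cdh/5/"

# Subdirectories of /usr/hdp/<version>/ listed in this section.
HDP_SUBDIRS = [
    "hive/lib",
    "spark/lib",
    "hadoop",
    "hadoop/lib",
    "hadoop-hdfs",
    "hadoop-hdfs/lib",
    "hadoop-yarn",
    "hadoop-yarn/lib",
    "hadoop-mapreduce",
    "hadoop-mapreduce/lib",
]


def cdh_tarballs(component_versions, cdh_version):
    """Return the four CDH archive names for the given versions.

    component_versions maps "spark", "hive", "hadoop", and "avro" to the
    version of each component that you currently have installed.
    """
    components = ("spark", "hive", "hadoop", "avro")
    return [
        f"{name}-{component_versions[name]}.cdh.{cdh_version}.tar.gz"
        for name in components
    ]


def hdp_library_dirs(hdp_version):
    """Return the HDP library directories to copy to the Admin Server."""
    return [f"/usr/hdp/{hdp_version}/{sub}/" for sub in HDP_SUBDIRS]
```

For CDH, appending each returned file name to CDH_ARCHIVE_URL gives the expected download location; confirm that the file actually exists in the archive before relying on it. For HDP, remember that the returned directories might be on different nodes.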
If you're upgrading from BDD 1.0 or 1.1.x, you should now apply the
upgrade hotfix. If you have BDD 1.2.x, move on to
"Backing up your current cluster".