Spark 1.5.x installation

Apache Spark 1.5.x must be installed on the machine that is configured as the BDD Admin Server.

You must download the Apache Spark release that matches the Spark version of your CDH or HDP cluster. You can find the Spark version in your installed CDH/HDP cluster or on the official CDH/HDP website.

To install the Spark 1.5.x component:

  1. Create a directory on the Admin Server machine to store the Spark software component.
    For example, create a /localdisk/hadoop directory.
  2. Download the Spark release that matches the Spark version in your CDH (Cloudera Distribution for Hadoop) or HDP (Hortonworks Data Platform) cluster. For example:
  3. Unpack the archive file into the /localdisk/hadoop directory.
    Unpacking the file produces a Spark directory. For example, the CDH version produces a spark-1.5.0-bin-hadoop2.6 directory.
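
Steps 1 and 3 above can be sketched as a small shell helper. This is only an illustration: the function name unpack_spark is hypothetical, and the archive file name in the usage line is an assumption inferred from the directory name in the CDH example, so substitute the name of the file you actually downloaded.

```shell
# Hypothetical helper (not part of BDD or Spark).
# Usage: unpack_spark ARCHIVE INSTALL_DIR
unpack_spark() {
    archive="$1"
    install_dir="$2"

    # Step 1: create the directory that stores the Spark software component.
    mkdir -p "$install_dir" || return 1

    # Step 3: unpack the downloaded archive into that directory.
    tar -xzf "$archive" -C "$install_dir" || return 1
}
```

With the paths used on this page, the call might look like this (archive name assumed): unpack_spark spark-1.5.0-bin-hadoop2.6.tgz /localdisk/hadoop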

After the Spark directory is created, set that directory as the SPARK_HOME property in the bdd-shell.conf file, as in this CDH example:

## Path to the Spark installation on the server running BDD Shell
SPARK_HOME=/localdisk/hadoop/spark-1.5.0-bin-hadoop2.6
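
Before relying on the SPARK_HOME value, it can help to confirm that the path really points at an unpacked Spark distribution. The following is a minimal sketch, assuming a standard Spark binary distribution that ships an executable bin/spark-submit script; the function name check_spark_home is hypothetical and is not part of BDD.

```shell
# Hypothetical check (not part of BDD): succeeds only if the given directory
# looks like an unpacked Spark binary distribution.
# Usage: check_spark_home SPARK_HOME_DIR
check_spark_home() {
    spark_home="$1"

    # A Spark binary distribution contains an executable bin/spark-submit.
    if [ -x "$spark_home/bin/spark-submit" ]; then
        echo "Spark installation found: $spark_home"
    else
        echo "Not a Spark installation: $spark_home" >&2
        return 1
    fi
}
```

For the CDH example above, the call would be: check_spark_home /localdisk/hadoop/spark-1.5.0-bin-hadoop2.6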