Cloudera Distribution for Hadoop (CDH) provides a number of
Hadoop-related components and tools that BDD requires to process and manage
data. CDH 5.3.0 must be installed on your system before you install BDD.
Note: CDH does not need to be installed on every node that will host BDD
components. Some BDD components require only specific CDH components, and
others require none at all. For more information, see
CDH requirements.
CDH components serve several functions within BDD. For example:
- Cloudera Manager provides the BDD installer with information about your CDH
  cluster at runtime. You can also use Cloudera Manager to administer your CDH
  cluster after installation, although this is not required.
- The Hadoop Distributed File System (HDFS) stores all of your source data.
- ZooKeeper manages the Dgraph instances.
- Spark runs all Data Processing jobs.
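
Because BDD requires a specific CDH version, it can help to confirm what Cloudera Manager reports before you start the installer. The following is a minimal sketch, not part of BDD: it assumes the Cloudera Manager REST API is reachable at a base URL you supply (for example, port 7180 on the Cloudera Manager host), that API version `v9` is available, and that each cluster entry carries a `fullVersion` field. The function names `find_cdh_versions` and `fetch_clusters` are hypothetical and chosen for illustration only.

```python
import base64
import json
import urllib.request

# CDH version named in this section as the prerequisite for BDD.
REQUIRED_CDH_VERSION = "5.3.0"


def find_cdh_versions(clusters_payload):
    """Map each cluster name to the version string found in a /clusters response.

    Assumes the payload has the shape {"items": [{"name": ..., "fullVersion": ...}]}.
    """
    return {
        item.get("name"): item.get("fullVersion")
        for item in clusters_payload.get("items", [])
    }


def fetch_clusters(base_url, user, password, api_version="v9"):
    """Fetch the cluster list from Cloudera Manager using HTTP basic auth.

    base_url and api_version are assumptions; adjust them for your deployment.
    """
    url = f"{base_url}/api/{api_version}/clusters"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def clusters_not_matching(versions, required=REQUIRED_CDH_VERSION):
    """Return the names of clusters whose reported version differs from required."""
    return [name for name, version in versions.items() if version != required]
```

A typical use would be to pass the result of `fetch_clusters(...)` to `find_cdh_versions`, then call `clusters_not_matching` on that mapping. An empty list means every cluster reports the required version.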