D Downloading the Correct Versions of the Hadoop, Hive, and HBase Clients for a Local Repository

If you choose to download these dependencies from a local repository, use these instructions to add the correct client versions to the repository.

By default, the Jaguar installer attempts to download the clients from the Cloudera or HDP repositories on the Internet. If this access is restricted within your data center, you can specify a local repository instead. A local repository can be a local directory or an NFS path. It can also be a URL within the local network or on the Internet.

For CDH 5.x :

First, check the cluster management service (Cloudera Manager or Ambari) and find the versions of the Hadoop, Hive, and HBase services running on the Hadoop cluster. The compatible clients have the same versions as the services. In each case, the client tarball filename includes a version string segment that matches the version of the service installed on the cluster. In the case of CDH, you can then browse the public repository and find the URL of the client that matches the service version.

  1. Log on to Cloudera Manager and go to the Hosts menu. Select All Hosts , then Inspect All Hosts.

  2. When the inspection is finished, select either Show Inspector Results (on the screen) or Download Result Data (to a JSON file).

  3. In either case, scan the result set and find the service versions.

    In the JSON version of the inspector results, there is a componentInfo section for each cluster that shows the versions of the software installed on that cluster. The format of the data set is as follows:

    "componentInfo": [
        ...
        {
            "cdhVersion": "CDH5",
            "componentRelease": "1.cdh5.11.1.p0.6",
            "componentVersion": "2.6.0+cdh5.11.1+2400",
            "name": "hadoop"
        },
        ...

  4. Go to https://archive.cloudera.com/cdh5/cdh/5.

    Note:

    Since February 2021, all Cloudera repositories require password authentication. You need to supply your Cloudera credentials to access and download the client JARs for CDH 5 and the client RPMs for CDH 6. If you are running Big Data Appliance, contact Oracle Support to request a patch with the specific clients you need.

    Look in the “hadoop,” “hive,” and “hbase” subdirectories of the CDH5 section of the archive. In the listings, you should find the client tarball packages for the versions of the services installed on the cluster, such as the following:

    https://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.12.1.tar.gz
    https://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.12.1.tar.gz
    https://archive.cloudera.com/cdh5/cdh/5/hive-1.1.0-cdh5.12.1.tar.gz

After you identify the correct versions of the clients and download them to the local repository, provide the path in the repositories section of the bds-config.json file used by the Jaguar installer.
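The lookup in steps 3 and 4 can be sketched in Python. This is a minimal sketch under stated assumptions, not part of the installer: the function names are hypothetical, the position of componentInfo inside the downloaded file is not documented here (so the sketch searches the whole document for it), and the mapping from componentVersion to a tarball filename is inferred from the sample entry and the example URLs above. Treat each result as a candidate URL and confirm it against the archive listing, which requires Cloudera credentials.

```python
import json
import re

# Archive location taken from step 4 above.
CDH5_ARCHIVE = "https://archive.cloudera.com/cdh5/cdh/5"


def find_component_info(node):
    """Yield every entry of every componentInfo list found anywhere in the JSON."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "componentInfo" and isinstance(value, list):
                yield from value
            else:
                yield from find_component_info(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_component_info(item)


def cdh5_client_tarballs(inspector_json, services=("hadoop", "hive", "hbase")):
    """Map each service name to a candidate client tarball URL (hypothetical helper)."""
    urls = {}
    for component in find_component_info(json.loads(inspector_json)):
        if not isinstance(component, dict) or component.get("name") not in services:
            continue
        # Assumed layout of componentVersion: <upstream>+<cdh release>+<build>,
        # for example "2.6.0+cdh5.11.1+2400".
        match = re.match(r"([\d.]+)\+(cdh[\d.]+)\+", component.get("componentVersion", ""))
        if match:
            upstream, cdh_release = match.groups()
            name = component["name"]
            urls[name] = f"{CDH5_ARCHIVE}/{name}-{upstream}-{cdh_release}.tar.gz"
    return urls


if __name__ == "__main__":
    # Hand-written sample built from the componentInfo entry shown above;
    # the enclosing "clusters" key is an assumption for illustration only.
    sample = json.dumps({"clusters": [{"componentInfo": [{
        "cdhVersion": "CDH5",
        "componentRelease": "1.cdh5.11.1.p0.6",
        "componentVersion": "2.6.0+cdh5.11.1+2400",
        "name": "hadoop",
    }]}]})
    for service, url in cdh5_client_tarballs(sample).items():
        print(service, url)
```

The recursive search is used only because the exact nesting of componentInfo is not shown in this section. If you know where it sits in your result file, a direct lookup is simpler.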

For CDH 6.x:

The dir and url parameters in the bds-config.json configuration file are not supported on Cloudera 6.x systems. For CDH 6.x, set up a local repository before running bds-database-install.sh (the database-side installer), and include the --alternate-repo parameter on the installer command line as described in the Command Line Parameter Reference for bds-database-install.sh.

For HDP:

  1. Log on to Ambari. Go to Admin, then Stack and Versions. On the Stack tab, locate the entries for the HDFS, Hive, and HBase services and note down the version number of each as the “service version.”

  2. Click the Versions tab. Note down the version of HDP that is running on the cluster as the “HDP version base.”

  3. Click Show Details to display a pop-up window that shows the full version string for the installed HDP release. Note this down as the “HDP full version.”

  4. The last piece of information needed is the Linux version (“centos5,” “centos6,” or “centos7”). Note this down as “OS version.”

To search through the HDP repository in Amazon S3 storage and find the correct client URLs using the information acquired in these steps, you would need an S3 browser, a browser extension, or a command line tool. As an alternative, you can piece together the correct URLs from these strings.

For releases earlier than HDP 2.5, the URL pattern is as follows.

http://public-repo-1.hortonworks.com/HDP/<OS version>/2.x/updates/<HDP version base>/tars/{hadoop|apache-hive|hbase}-<service version>.<HDP full version>.tar.gz

Here are some examples. Note that the pattern of the gzip filename is slightly different for Hive. There is an extra “-bin” segment in the name.

http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.3.2.0/tars/hadoop-2.7.1.2.3.2.0-2950.tar.gz
http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.3.2.0/tars/apache-hive-1.2.1.2.3.2.0-2950-bin.tar.gz
http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.3.2.0/tars/hbase-1.1.2.2.3.2.0-2950.tar.gz

For HDP 2.5 and later releases, the pattern is almost the same, except that there is an additional hadoop, hive, or hbase directory under the tars directory:

http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.5.6.0/tars/hadoop/hadoop-2.7.3.2.5.6.0-40.tar.gz
http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.5.6.0/tars/hive/apache-hive-1.2.1000.2.5.6.0-40-bin.tar.gz
http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.5.6.0/tars/hbase/hbase-1.1.2.2.5.6.0-40.tar.gz 
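Piecing the URLs together from the four noted strings can be sketched as follows. This is an illustrative sketch only: the function name and the lookup table are hypothetical, and the rule that the per-service subdirectory starts at HDP 2.5 is read off the examples above rather than confirmed against the repository.

```python
HDP_REPO = "http://public-repo-1.hortonworks.com/HDP"

# Tarball name prefix and suffix per service. Hive uses the "apache-hive"
# prefix and carries the extra "-bin" segment noted above.
TARBALL_NAMES = {
    "hadoop": ("hadoop", ""),
    "hive": ("apache-hive", "-bin"),
    "hbase": ("hbase", ""),
}


def hdp_client_url(service, service_version, hdp_version_base,
                   hdp_full_version, os_version):
    """Assemble a candidate client tarball URL (hypothetical helper)."""
    prefix, suffix = TARBALL_NAMES[service]
    # Assumption: releases from 2.5 onward add a per-service directory under tars/.
    major, minor = (int(part) for part in hdp_version_base.split(".")[:2])
    subdir = f"{service}/" if (major, minor) >= (2, 5) else ""
    return (
        f"{HDP_REPO}/{os_version}/2.x/updates/{hdp_version_base}/tars/"
        f"{subdir}{prefix}-{service_version}.{hdp_full_version}{suffix}.tar.gz"
    )


if __name__ == "__main__":
    # Values taken from the example URLs above.
    print(hdp_client_url("hadoop", "2.7.1", "2.3.2.0", "2.3.2.0-2950", "centos6"))
    print(hdp_client_url("hive", "1.2.1000", "2.5.6.0", "2.5.6.0-40", "centos6"))
```

With the values from the examples, the two calls reproduce the Hadoop URL for HDP 2.3.2.0 and the Hive URL for HDP 2.5.6.0 listed above.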

Alternative Method for HDP:

You can get the required software versions from the command line instead of using Ambari.

  • # hdp-select versions

    Copy and save the numbers to the left of the dash as the “HDP version base”.

  • # hadoop version 
    # beeline --version 
    # hbase version
    Use the output from these commands to formulate the <service version>.<HDP full version> segment for each URL.
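Extracting the version segment from the command output can be sketched as shown below. The sketch assumes that each command prints a version token of the form <service version>.<HDP full version> somewhere in its output, and that hdp-select versions reports the HDP full version. The function name is hypothetical, and the sample output lines are illustrative rather than captured from a real cluster.

```python
import re


def split_service_version(command_output, hdp_full_version):
    """Find <service version>.<HDP full version> in command output.

    Returns (service_version, combined_segment), or None if no token in the
    output ends with the given HDP full version. Hypothetical helper.
    """
    pattern = (
        r"(?<![\d.])(\d+(?:\.\d+)*)\."   # service version, such as 2.7.3
        + re.escape(hdp_full_version)    # HDP full version, such as 2.5.6.0-40
        + r"(?!\d)"                      # do not stop in the middle of a build number
    )
    match = re.search(pattern, command_output)
    if match is None:
        return None
    return match.group(1), match.group(0)


if __name__ == "__main__":
    hdp_full_version = "2.5.6.0-40"              # as reported by hdp-select versions
    hdp_version_base = hdp_full_version.split("-")[0]
    print("HDP version base:", hdp_version_base)

    # Illustrative output lines; real output may contain additional text.
    for line in ("Hadoop 2.7.3.2.5.6.0-40",
                 "Beeline version 1.2.1000.2.5.6.0-40 by Apache Hive",
                 "HBase 1.1.2.2.5.6.0-40"):
        print(split_service_version(line, hdp_full_version))
```

Matching on the known HDP full version, rather than on a fixed output layout, keeps the sketch independent of any banner or log lines that the commands print around the version string.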