7 Dependency Downloader

The {OGGBD install}/DependencyDownloader directory contains utility scripts that download the client dependency jars for the supported Oracle GoldenGate for Big Data integrations.

These scripts use Java and Apache Maven to download the dependency jars from the Maven Central Repository and other publicly available repositories (Hortonworks, Cloudera, Confluent).

7.1 Dependency Downloader Setup

To complete the Dependency Downloader setup:
  1. To verify that Java is installed, execute the following from the command line: java -version.

    Note:

    The Dependency Downloader utility scripts require Java to run. Ensure that Oracle Java is downloaded and is available in the PATH on the machine where the scripts are installed.
  2. Configure the proxy settings in the following script: {OGGBD install}/DependencyDownloader/config_proxy.sh. The file contains the following two entries:
    • #export PROXY_SERVER_HOST=www-proxy-hqdc.us.oracle.com
    • #export PROXY_SERVER_PORT=80
    To configure the proxy settings:
    1. Uncomment the configuration settings (remove the # at the beginning of each line).
    2. Change the host name and port number to match your proxy server settings.

    Note:

    Most companies maintain a private network that is shielded from the public Internet by a network firewall. Most companies also maintain a forwarding proxy server, which serves as a gateway between the private network and the public Internet. The Dependency Downloader utilities must access Maven repositories, which are available on the Internet, so you must supply the HTTP proxy settings in order to download dependency libraries. A proxy server is identified by its host name and port. If you do not know whether your company uses a proxy server, or what its settings are, contact your IT or network administrators.
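
As an illustration of the proxy step above, the two entries in config_proxy.sh might look like the following after editing. The host name proxy.example.com and port 8080 are placeholders chosen for this sketch, not values from the product; substitute the settings supplied by your network administrators.

```shell
# config_proxy.sh after uncommenting both entries.
# proxy.example.com and 8080 are placeholder values (assumptions for this
# example); replace them with your own proxy host name and port.
export PROXY_SERVER_HOST=proxy.example.com
export PROXY_SERVER_PORT=8080
```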

The Dependency Downloader uses Bash scripts to invoke Maven and download dependencies. The Windows Command Prompt does not natively support the Bash shell. You can still run the Dependency Downloader scripts on Windows, but doing so requires a Unix emulator. A Unix emulator provides a Unix-style command line on Windows and supports various Unix shells, including Bash. One such emulator is Cygwin, which is available free of charge. After Cygwin is installed, the setup process is the same. Perform the setup and run the scripts from the Cygwin64 Terminal. See https://www.cygwin.com/.

7.2 Running the Dependency Downloader Scripts

To run the dependency downloader scripts:
  1. Using a Unix terminal, navigate to the following directory: {OGGBD install}/DependencyDownloader.
  2. Execute the following to run the scripts: ./{the dependency script} {version of the dependencies to download}

    For example: ./aws.sh 1.11.893

    Dependency libraries are downloaded to the following directory:

    {OGGBD install}/DependencyDownloader/dependencies/{the dependency name}_{the_dependency_version}.

    For example: {OGGBD install}/DependencyDownloader/dependencies/aws_sdk_1.11.893.
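
The directory naming pattern above can be sketched as a small helper that assembles the expected download location. The function name dependency_dir and the install path /opt/oggbd are hypothetical, introduced only for this example. Note that the dependency name (aws_sdk in the example above) is not necessarily the same as the script name (aws.sh), so it is passed in explicitly rather than derived.

```shell
# Hypothetical helper (not part of the product): prints the directory where
# a download is expected to land, following the pattern
#   {OGGBD install}/DependencyDownloader/dependencies/{name}_{version}
# $1 = install directory, $2 = dependency name, $3 = dependency version
dependency_dir() {
  printf '%s/DependencyDownloader/dependencies/%s_%s\n' "$1" "$2" "$3"
}
```

For instance, `dependency_dir /opt/oggbd aws_sdk 1.11.893` prints `/opt/oggbd/DependencyDownloader/dependencies/aws_sdk_1.11.893`, which can then be checked with `ls` after the script finishes.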

Ensure that the version string exactly matches the version string of the dependency being downloaded. If a dependency version does not exist in the public Maven repository, the dependency cannot be downloaded and running the script results in an error. Most public Maven repositories provide a web-based GUI in which you can browse the available versions of each dependency. The exception is the Confluent Maven repository, which does not provide a web-based GUI. This makes downloading Confluent dependencies more difficult, because the version string cannot be independently verified through a web interface.

After the dependencies are successfully downloaded, you must configure the gg.classpath variable in the Java Adapter properties file to include the dependencies for the corresponding replicat process.
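
As a rough sketch of this step, the helper below prints a gg.classpath entry for a downloaded dependency directory. The function name classpath_entry and the path /opt/oggbd are hypothetical. The trailing /* wildcard, which is intended to pick up every jar in the directory, is an assumption of this example; confirm the classpath syntax for your release in the handler documentation before relying on it.

```shell
# Hypothetical helper (not part of the product): prints a gg.classpath line
# for the Java Adapter properties file.
# $1 = directory that holds the downloaded dependency jars
# Assumption: a trailing /* selects all jars in that directory.
classpath_entry() {
  printf 'gg.classpath=%s/*\n' "$1"
}
```

Calling `classpath_entry /opt/oggbd/DependencyDownloader/dependencies/aws_sdk_1.11.893` prints a line that can be pasted into the properties file and merged with any classpath entries already present there.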

Note:

Best Practices
  1. Whenever possible, use client libraries whose version exactly matches the version of the server or application to which you are connecting.
  2. Prior to running the Dependency Downloader scripts, independently verify that the version string exists in the repository through the web GUI.
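
For dependencies hosted on Maven Central, the version check can also be approximated from the command line. The sketch below assumes the standard Maven Central directory layout under https://repo1.maven.org/maven2/ and assumes curl is installed; the function names maven_central_url and version_exists are hypothetical. It does not apply to the Cloudera, Hortonworks, or Confluent repositories, and behind a proxy curl needs its own proxy configuration.

```shell
# Hypothetical helpers (not part of the product).
# $1 = Maven groupId, $2 = artifactId, $3 = version

# Prints the Maven Central directory URL for one artifact version.
# Dots in the groupId become path separators.
maven_central_url() {
  group_path=$(printf '%s' "$1" | tr '.' '/')
  printf 'https://repo1.maven.org/maven2/%s/%s/%s/\n' "$group_path" "$2" "$3"
}

# Returns success (0) if the URL answers without an HTTP error.
# Requires network access; no output is printed.
version_exists() {
  curl --silent --fail --head --output /dev/null "$(maven_central_url "$1" "$2" "$3")"
}
```

A possible use before running aws.sh is `version_exists com.amazonaws aws-java-sdk 1.11.893 && ./aws.sh 1.11.893`, which runs the download only when the lookup succeeds.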

7.3 Dependency Downloader Scripts

Table 7-1 Dependency Downloader Scripts

| Client | Script | Description | Relevant Handlers | Versions Supported | Dependency Link |
|---|---|---|---|---|---|
| Amazon Web Services SDK | aws.sh | This script downloads the Amazon Web Services (AWS) SDK, which provides client libraries for connectivity to the AWS cloud. | Kinesis Handler, S3 Event Handler | 1.11.x | https://search.maven.org/artifact/com.amazonaws/aws-java-sdk |
| Google BigQuery | bigquery.sh | This script downloads the required client libraries for Google BigQuery. | BigQuery Handler | 1.x | https://search.maven.org/artifact/com.google.cloud/google-cloud-bigquery |
| Cassandra DSE (Datastax Enterprise) Client | cassandra_dse.sh | This script downloads the Cassandra DSE client. Cassandra DSE is the for-purchase version of Cassandra available from Datastax. | Cassandra Handler | 2.0.0 and higher | https://search.maven.org/artifact/com.datastax.dse/dse-java-driver-core |
| Apache Cassandra Client | cassandra.sh | This script downloads the Apache Cassandra client. | Cassandra Handler | 4.0.0 and higher | https://search.maven.org/artifact/com.datastax.oss/java-driver-core |
| Elasticsearch REST client | elasticsearch_rest.sh | This script downloads the Elasticsearch High Level Rest Client. | Elasticsearch Handler | 7.x | https://search.maven.org/artifact/org.elasticsearch.client/elasticsearch-rest-high-level-client |
| Elasticsearch Transport Client | elasticsearch_transport.sh | This script downloads the Elasticsearch Transport Client. | Elasticsearch Handler | 5.x, 6.x, and 7.x | https://search.maven.org/artifact/org.elasticsearch.client/transport |
| Hadoop Azure Client from Cloudera | hadoop_azure_cloudera.sh | This script downloads the Hadoop Azure client libraries provided by Cloudera. The Hadoop Azure client libraries cannot be loaded along with the Hadoop client because in Cloudera, the version numbers between the two components do not line up perfectly. | HDFS Handler, HDFS Event Handler, ORC Event Handler, Parquet Event Handler | 2.x and 3.x | https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/hadoop/hadoop-azure/ |
| Hadoop Client from Cloudera | hadoop_cloudera.sh | This script downloads the Hadoop client libraries provided by Cloudera. | HDFS Handler, HDFS Event Handler, ORC Event Handler, Parquet Event Handler | 2.x and 3.x | https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/hadoop/hadoop-client/ |
| Hadoop Client from Hortonworks | hadoop_hortonworks.sh | The Hadoop client including the libraries for connectivity to Azure Data Lake available from Hortonworks. | HDFS Handler, HDFS Event Handler, ORC Event Handler, Parquet Event Handler | 2.x and 3.x | https://repo.hortonworks.com/content/groups/public/org/apache/hadoop/hadoop-azure/ |
| Apache Hadoop Client Plus Required Libraries for Azure Connectivity | hadoop.sh | The Hadoop client including the libraries for connectivity to Azure Data Lake. | HDFS Handler, HDFS Event Handler, ORC Event Handler, Parquet Event Handler | 2.7.x and higher, and 3.x | https://search.maven.org/artifact/org.apache.hadoop/hadoop-azure |
| HBase Client Provided by Cloudera | hbase_cloudera.sh | The HBase client libraries provided by Cloudera. | HBase Handler | 1.x and 2.x | https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/hbase/hbase-client/ |
| HBase Client Provided by Hortonworks | hbase_hortonworks.sh | The HBase client libraries provided by Hortonworks. | HBase Handler | 1.x and 2.x | https://repo.hortonworks.com/content/groups/public/org/apache/hbase/hbase-client/ |
| Apache HBase Client | hbase.sh | The HBase client. | HBase Handler | 1.x and 2.x | https://search.maven.org/artifact/org.apache.hbase/hbase-client |
| Apache Kafka Client plus Kafka Connect Framework and JSON Converter from Cloudera | kafka_cloudera.sh | The Kafka Client plus libraries for the Kafka Connect framework and the Kafka Connect JSON Converter provided by Cloudera. | Kafka Handler, Kafka Connect Handler, Kafka Capture | 0.9.x to current | https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/kafka/kafka-clients/ |
| Apache Kafka Client plus Kafka Connect Framework and JSON Converter from Hortonworks | kafka_hortonworks.sh | The Kafka Client plus libraries for the Kafka Connect framework and the Kafka Connect JSON Converter provided by Hortonworks. | Kafka Handler, Kafka Connect Handler, Kafka Capture | 0.9.x to current | https://repo.hortonworks.com/content/groups/public/org/apache/kafka/kafka-clients/ |
| Apache Kafka Client plus Kafka Connect Framework and JSON Converter | kafka.sh | The Kafka Client plus libraries for the Kafka Connect framework and the Kafka Connect JSON Converter. | Kafka Handler, Kafka Connect Handler, Kafka Capture | 0.9.x to current | https://search.maven.org/artifact/org.apache.kafka/kafka-clients |
| Confluent Kafka Client plus Kafka Connect Framework and JSON and Avro Converters | kafka_confluent.sh | The Kafka Client plus libraries for the Kafka Connect framework and the Kafka Connect JSON Converter and the Kafka Connect Avro Converter available from Confluent. | Kafka Handler, Kafka Connect Handler, Kafka Capture | Confluent platform 4.1.0 and higher | The Confluent Maven repository does not provide a web GUI interface. |
| MongoDB Client | mongodb.sh | The MongoDB client libraries. | MongoDB Handler | 3.x | https://search.maven.org/artifact/org.mongodb/mongo-java-driver |
| Oracle NoSQL Client | oracle_nosql.sh | The Oracle NoSQL client libraries. | Oracle NoSQL Handler | 3.x, 4.x, and 18.x | https://search.maven.org/artifact/com.oracle.kv/oracle-nosql-client |
| Oracle OCI Client | oracle_oci.sh | The Oracle OCI client libraries. | Oracle OCI Event Handler | 1.x | https://search.maven.org/artifact/com.oracle.oci.sdk/oci-java-sdk-full |
| Apache ORC (Optimized Row Columnar) Client | orc.sh | The Apache ORC client libraries. ORC is built on top of the Hadoop client so the ORC Event Handler needs the Hadoop client in order to run. The Hadoop client needs to be downloaded separately. | ORC Event Handler | 1.x | https://search.maven.org/artifact/org.apache.orc/orc-core |
| Apache Parquet Client | parquet.sh | The Apache Parquet client libraries. Parquet is built on top of the Hadoop client, so the Parquet Event Handler needs the Hadoop client in order to run. The Hadoop client needs to be downloaded separately. | Parquet Event Handler | 1.x | https://search.maven.org/artifact/org.apache.parquet/parquet-hadoop |