7.8 Realtime Message Ingestion to Azure Event Hubs with Oracle GoldenGate for Distributed Applications and Analytics

Overview

This Quickstart covers a step-by-step process showing how to ingest messages into Azure Event Hubs in real time with Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA).

Azure Event Hubs is a cloud native data streaming service that can stream high volumes of messages. Azure Event Hubs provides an Apache Kafka endpoint on an event hub, which enables users to connect to the event hub using the Kafka protocol.

GG for DAA connects to the Azure Event Hubs Apache Kafka endpoint using its Kafka Handler. GG for DAA reads source operations from the trail file, formats them, maps them to Azure Event Hubs, and delivers them.

7.8.1 Prerequisites

To successfully complete this Quickstart, you must have the following:

  • An Azure Event Hubs Namespace
  • Shared Access Policies for your Azure Event Hubs Namespace
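
If you do not already have an Event Hubs namespace and a target event hub, one way to create them is with the Azure CLI. The following is a minimal sketch and not part of the original setup: <resource_group>, <namespace>, <region>, and <event_hub> are placeholders, and the Standard SKU is used because the Kafka endpoint is not available in the Basic tier.

# Create an Event Hubs namespace (Standard tier or higher is required for the Kafka endpoint)
az eventhubs namespace create \
  --resource-group <resource_group> \
  --name <namespace> \
  --location <region> \
  --sku Standard

# Create the target event hub inside the namespace
az eventhubs eventhub create \
  --resource-group <resource_group> \
  --namespace-name <namespace> \
  --name <event_hub>
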
In this Quickstart, a sample trail file (named tr), which is shipped with GG for DAA, is used. The sample trail file is located at GG_HOME/opt/AdapterExamples/trail/ in your GG for DAA instance.

7.8.2 Install Dependency Files

GG for DAA uses the Apache Kafka client libraries to connect to Azure Event Hubs. You can download these libraries with the Dependency Downloader utility shipped with GG for DAA. Dependency Downloader is a set of shell scripts that downloads dependency JAR files from Maven and other repositories.

  1. In your GG for DAA VM, go to the Dependency Downloader utility, located at GG_HOME/opt/DependencyDownloader/, and locate kafka.sh.
  2. Execute kafka.sh with the required version, as shown in the example after these steps. You can check the available versions and any reported vulnerabilities in Maven Central. This document uses 3.7.0, which was the latest version when this Quickstart was published.

    Figure 7-51 Execute kafka.sh

  3. A new directory is created in /u01/app/ogg/opt/DependencyDownloader/dependencies/kafka_3.7.0.
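
For reference, a typical invocation looks like the following. This is a minimal sketch that assumes the default installation path shown above and that kafka.sh accepts the Kafka version as its single argument; adjust the paths for your environment.

# Run the Dependency Downloader script for the Kafka 3.7.0 client libraries
cd /u01/app/ogg/opt/DependencyDownloader
./kafka.sh 3.7.0

# The downloaded JAR files are placed in a version-specific directory
ls /u01/app/ogg/opt/DependencyDownloader/dependencies/kafka_3.7.0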

7.8.3 Create a producer.properties for Azure Event Hubs

In your GG for DAA instance, create a file named producer.properties and copy the following configuration into it.

bootstrap.servers=<namespace>.servicebus.windows.net:9093
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=XXXXXXXXXXXXXXXX";
value.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
key.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
For more information, see Azure Event Hub Shared Access Key.
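
The password value is the connection string of your Event Hubs namespace (or of a more narrowly scoped shared access policy). As a convenience, it can be retrieved with the Azure CLI; in this sketch, <resource_group> and <namespace> are placeholders and RootManageSharedAccessKey is the default namespace-level policy.

# Print the primary connection string for the RootManageSharedAccessKey policy
az eventhubs namespace authorization-rule keys list \
  --resource-group <resource_group> \
  --namespace-name <namespace> \
  --name RootManageSharedAccessKey \
  --query primaryConnectionString \
  --output tsv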

7.8.4 Create a Replicat in Oracle GoldenGate for Distributed Applications and Analytics

To create a Replicat in Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA):

  1. In the GG for DAA UI, in the Administration Service tab, click the + sign to add a replicat.

    Figure 7-52 Click + in the Administration Service tab.

  2. Select Classic Replicat as the Replicat Type and click Next. There are two different Replicat types available: Classic and Coordinated. Classic Replicat is a single-threaded process, whereas Coordinated Replicat is a multithreaded process that applies transactions in parallel.

    Figure 7-53 Add Replicat

  3. Enter the basic information, and click Next:
    1. Replicat Trail: Name of the required trail file (if using the sample trail, enter tr)
    2. Subdirectory: Enter GG_HOME/opt/AdapterExamples/trail/ if using the sample trail.
    3. Target: Kafka

    Figure 7-54 Replicat Options

  4. Leave Managed Options as is and click Next.

    Figure 7-55 Managed Options

  5. Enter the Parameter File details and click Next. In the Parameter File, you can specify source-to-target mapping or leave it as-is with a wildcard selection (see the parameter file sketch after this procedure).

    Figure 7-56 Parameter File

  6. In the Properties file, update the properties marked as TODO and click Create and Run.
    #Kafka Handler Template
    gg.handlerlist=kafkahandler
    gg.handler.kafkahandler.type=kafka
    #TODO: Set the name of the Kafka producer properties file.
    gg.handler.kafkahandler.kafkaProducerConfigFile=/path_to/producer.properties
    #TODO: Set the template for resolving the topic name.
    gg.handler.kafkahandler.topicMappingTemplate=<target_event_hub_name>
    gg.handler.kafkahandler.keyMappingTemplate=${primaryKeys}
    gg.handler.kafkahandler.mode=op
    gg.handler.kafkahandler.format=json
    gg.handler.kafkahandler.format.metaColumnsTemplate=${objectname[table]},${optype[op_type]},${timestamp[op_ts]},${currenttimestamp[current_ts]},${position[pos]}
    #TODO: Set the location of the Kafka client libraries.
    gg.classpath=path_to/dependencies/kafka_3.7.0/*
    jvm.bootoptions=-Xmx512m -Xms32m
    
    GG for DAA supports dynamic topic mapping through template keywords. For example, if you set topicMappingTemplate to ${tablename}, GG for DAA creates one Event Hub per source table, named after that table, and maps the events to these topics.

    Oracle recommends using keyMappingTemplate=${primaryKeys}, so that GG for DAA sends source operations with the same primary key to the same partition. This guarantees that the order of the source operations is maintained while delivering to Azure Event Hubs.
  7. If the Replicat starts successfully, it will be in the Running state. You can go to Action > Details > Statistics to see the replication statistics.

    Figure 7-57 Replicat Statistics

  8. You can go to the Azure Event Hubs console and check the statistics. A command-line verification sketch is also included after this procedure.

    Figure 7-58 Azure Event Hub console

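
The Parameter File from step 5 can be as simple as the following minimal sketch, which assumes a hypothetical Replicat name of reph and keeps the default wildcard mapping; the Kafka Handler resolves the actual event hub name from topicMappingTemplate in the properties file.

REPLICAT reph
-- Map all source tables to targets of the same name; adjust the MAP clause
-- if you need explicit source-to-target mapping.
MAP *.*, TARGET *.*;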

For more details about Azure Event Hubs replication, see Apache Kafka.
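
As an optional command-line check (not part of the original steps), you can read a few records back from the target event hub with the standard Apache Kafka console consumer. This sketch assumes the Kafka CLI tools are installed and that client.properties contains the same bootstrap.servers and SASL settings as producer.properties, without the serializer entries.

# Consume a handful of records from the target event hub to confirm delivery
kafka-console-consumer.sh \
  --bootstrap-server <namespace>.servicebus.windows.net:9093 \
  --topic <target_event_hub_name> \
  --consumer.config client.properties \
  --from-beginning \
  --max-messages 5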

Note:

  • If the target Kafka topic (event hub) does not exist, it is auto-created when GG for DAA writes to it, provided that automatic topic creation is enabled for your Azure Event Hubs namespace. You can use Template Keywords to dynamically assign topic names (see the example after this note).
  • See the blog on improving the performance of Kafka replication with GG for DAA.
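
For illustration only, a hypothetical dynamic mapping that creates one event hub per source table and partitions records by primary key would set the templates as follows in the handler properties:

# One event hub per source table; records with the same primary key go to the same partition
gg.handler.kafkahandler.topicMappingTemplate=${tablename}
gg.handler.kafkahandler.keyMappingTemplate=${primaryKeys}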