Learn About Connecting Real-Time Data Streams to Oracle Autonomous Database

Streaming real-time data to your data warehouse platform for processing and generating insightful reports can be a challenging task.

When the streaming data arrives as JSON, converting it to a relational format adds to the difficulty. In this solution, you learn how to use Oracle Cloud Infrastructure resources, such as Oracle GoldenGate Stream Analytics, to stream Kafka data to Oracle Autonomous Database.

Before You Begin

Before you begin, complete the following installation from Oracle Live Labs:

Architecture

This architecture shows on-premises Kafka Streams connected to Oracle GoldenGate Stream Analytics and Oracle Autonomous Database in an Oracle Cloud Infrastructure (OCI) region.

Use this architecture for ingesting data from an on-premises Kafka stream into Oracle GoldenGate Stream Analytics (GGSA).

Description of the illustration kafka-stream-adb-goldengate-arch.png

kafka-stream-adb-goldengate-arch.zip

  1. The OCI region containing GGSA ingests data from the on-premises Kafka streams.
  2. GGSA streams and converts the JSON data to relational data.
  3. GGSA stores the data in a relational table in Autonomous Database.
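
The three steps above can be pictured with a minimal Python sketch. It is not how GGSA is implemented or configured; it only illustrates the idea of flattening a nested JSON event into columns and storing it as a row. The sample event, its field names, the `orders` table, and the `flatten` and `store` helpers are all hypothetical, and the standard-library `sqlite3` module stands in for Autonomous Database.

```python
import json
import sqlite3

# Hypothetical sample event; the field names are illustrative assumptions.
SAMPLE_EVENT = '{"order_id": 1001, "customer": {"id": 42, "region": "EMEA"}, "amount": 19.99}'


def flatten(record, parent_key="", sep="_"):
    """Flatten nested JSON objects into a single-level dict of column -> value."""
    flat = {}
    for key, value in record.items():
        column = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Nested objects become prefixed columns, e.g. customer.id -> customer_id.
            flat.update(flatten(value, column, sep))
        else:
            flat[column] = value
    return flat


def store(conn, row):
    """Insert one flattened row into a relational table.

    sqlite3 is only a stand-in for the target database in this sketch.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id INTEGER, customer_id INTEGER, customer_region TEXT, amount REAL)"
    )
    conn.execute(
        "INSERT INTO orders (order_id, customer_id, customer_region, amount) "
        "VALUES (:order_id, :customer_id, :customer_region, :amount)",
        row,
    )


conn = sqlite3.connect(":memory:")
row = flatten(json.loads(SAMPLE_EVENT))  # step 2: JSON -> relational columns
store(conn, row)                         # step 3: write the row to a table
stored = conn.execute("SELECT * FROM orders").fetchall()
print(stored)
```

In the actual solution, the equivalent mapping is defined in GGSA rather than in hand-written code; the sketch is only meant to show what "JSON to relational" means for a single event.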

This architecture supports the following components:

  • Kafka Streams

    Kafka Streams is a client library for building applications and microservices, where the input and output data are stored in Kafka clusters. It combines the simplicity of writing and deploying standard Java and Scala applications on the client side with the benefits of Kafka's server-side cluster technology.

    Kafka Streams simplifies stream processing from topics. Built on top of the Kafka client libraries, it provides data parallelism, distributed coordination, fault tolerance, and scalability. Kafka Streams uses partitions and tasks as logical units of parallelism, with each task tied to specific topic partitions. It also uses threads to process tasks in parallel within an application instance. Another important capability is state stores, which Kafka Streams uses to store and query data coming from the topics.

    The Oracle GoldenGate for Kafka Handler streams change capture data from an Oracle GoldenGate trail to a Kafka topic.

  • Oracle GoldenGate

    Oracle Cloud Infrastructure GoldenGate is a fully managed service that ingests data from sources residing on premises or in any cloud. It uses GoldenGate change data capture (CDC) technology to capture data nonintrusively and efficiently, and delivers it to Oracle Autonomous Data Warehouse in real time and at scale, so that relevant information is available to consumers as quickly as possible.

  • Autonomous Database

    Oracle Cloud Infrastructure Autonomous Database is a fully managed, preconfigured database environment that you can use for transaction processing and data warehousing workloads. You do not need to configure or manage any hardware, or install any software. Oracle Cloud Infrastructure handles creating the database, as well as backing up, patching, upgrading, and tuning the database.
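
The Kafka Streams description above mentions partitions, tasks, and state stores. The following Python sketch is a conceptual model only, not the Kafka Streams API (which is a Java and Scala library). The partition count, the `Task` class, the `partition_for` and `query` helpers, and the sample records are assumptions made for illustration. A CRC32 hash stands in for Kafka's own partitioner, and a dictionary stands in for a state store.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # assumed partition count for the illustrative topic


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a record key to a partition with a stable hash.

    CRC32 is used here for simplicity; it is a stand-in, not Kafka's partitioner.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


class Task:
    """One task per partition, each with its own local state store (a dict here)."""

    def __init__(self, partition):
        self.partition = partition
        self.state_store = defaultdict(int)

    def process(self, key, value):
        # Keep a running total per key, the kind of data a state store holds.
        self.state_store[key] += value
        return self.state_store[key]


# One task per partition; in Kafka Streams, threads would run these in parallel.
tasks = {p: Task(p) for p in range(NUM_PARTITIONS)}

# Hypothetical records: (key, value) pairs arriving on the topic.
records = [("sensor-a", 5), ("sensor-b", 7), ("sensor-a", 3)]
for key, value in records:
    tasks[partition_for(key)].process(key, value)


def query(key):
    """Look up a key in the state store of the task that owns its partition."""
    return tasks[partition_for(key)].state_store.get(key)


print(query("sensor-a"), query("sensor-b"))
```

Because records with the same key always map to the same partition, a single task sees every update for that key, which is why its local state store can answer queries for it.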

About Required Services and Roles

This solution requires the following services and roles:

  • Oracle Autonomous Data Warehouse

  • Oracle GoldenGate Stream Analytics (GGSA)

  • Oracle Cloud Infrastructure

These are the roles needed for each service.

Service Name: Role: Required to...
Oracle Autonomous Data Warehouse: admin: Create the credentials.
Oracle GoldenGate Stream Analytics: admin: Access the GGSA console.
Oracle Cloud Infrastructure: admin:
  1. Install GGSA from Marketplace.
  2. Configure the Kafka producer to ingest data.
  3. Connect GGSA for Kafka to Autonomous Database.

See "Learn how to get Oracle Cloud services for Oracle Solutions" to get the cloud services you need.