1 What is Oracle Data Integration Platform Cloud?

Oracle Data Integration Platform Cloud (DIPC) is a unified platform for real-time data replication, data transformation, data quality, and data governance. Learn about Oracle Data Integration Platform Cloud features, its common use cases, and the different components of its architecture. Understand what features are available in different types of Data Integration Platform Cloud instances and editions, and choose the ones that are most suitable for your data integration needs.

Why Use Oracle Data Integration Platform Cloud?

Data Integration Platform Cloud helps you migrate and extract value from data by bringing together the capabilities of a complete Data Integration, Data Quality, and Data Governance solution in a single unified cloud-based platform. You can use Data Integration Platform Cloud to move, transform, cleanse, integrate, replicate, analyze, and govern data.

With Data Integration Platform Cloud you can:

  • Perform seamless batch and real-time data movement among cloud and on-premises data sources.

  • Synchronize an entire data source or copy high volumes of data in batches to a new Oracle Database Cloud deployment. You can then access, profile, transform, and cleanse your data.

  • Stream data in real time to new data sources, perform data analysis on streaming data, and keep any number of data sources synchronized.

  • Perform bulk transformation by importing and executing Oracle Data Integrator scenarios.

  • Copy or move data from flat files or on-premises data sources to Oracle Data Lake. This is applicable only on the Data Integration Platform Cloud Classic instance.

  • Integrate with big data technologies.

The most common uses of Data Integration Platform Cloud are:

  • Accelerate Data Warehouses

  • Automate Data Mart generation

  • Migrate data without any downtime (zero downtime migration)

  • Support redundancy through an active-active model

  • Integrate Big Data

  • Synchronize data

  • Replicate data to Kafka and bootstrap stream analytics

  • Monitor data health

  • Profile and validate data

Data Integration Platform Cloud Certifications

The following sections provide information about the data sources, operating systems, and platforms that you can use as connections to create tasks and run jobs for Data Integration Platform Cloud (DIPC) and Data Integration Platform Cloud Classic (DIPC-C) instances.

What’s Certified for each of the Data Integration Platform Cloud Tasks

Before you create tasks, you must have an agent set up and running. Review what’s certified for agents first, before you review what’s certified for your tasks.

Other Links

If none of these tasks work as a data integration solution for you, then refer to What's Certified for Hosted VM Tasks for a do-it-yourself approach.

High Level Concepts in Oracle Data Integration Platform Cloud

Data Integration Platform Cloud offers a number of different tasks to meet your data integration needs. Here’s a high-level overview of each task.

Data Integration Platform Cloud’s elevated integration tasks enable you to organize, consolidate, cleanse, transform, synchronize, replicate, and store data in the cloud. Supported data sources for these tasks include:

  • Oracle databases

  • Oracle Object Storage Classic

  • Autonomous Data Warehouse

  • Flat files

  • MySQL databases

  • SQL Server databases

  • Kafka

  • Salesforce

You create Connections to identify your data source in Data Integration Platform Cloud. After a Connection is created, Data Integration Platform Cloud harvests the metadata from the schema associated with that Connection, then profiles the data and calculates the popularity. This metadata is referred to as a Data Entity. You can view and edit data entities in the Catalog.

To learn about creating Connections, see Create a Connection.

Integration tasks that you can perform in Data Integration Platform Cloud include:

  • Synchronizing Data
  • Replicating Data
  • Executing an ODI Scenario
  • Preparing Data
  • Adding Data to a Data Lake

Synchronize Data

The Synchronize Data Task enables you to synchronize data between two Cloud databases, between an on-premises database and a cloud database, or between two on-premises databases. You can configure your Synchronize Data task to include Initial Load and/or Replication. You can also select the data entities to include in your task.

To learn about the data sources supported for this task, see What’s Certified for Synchronize Data?

For information about creating this task, see Create a Synchronize Data Task.
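
As a rough illustration of how the Initial Load and Replication options differ, the following Python sketch models both against in-memory dictionaries. It is not the product's implementation or API. The `Change` record and the `initial_load` and `replicate` functions are hypothetical names introduced here only to show the idea: an initial load copies the current state of the source once, while replication applies later source changes to the target in order.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Change:
    """One captured source change (hypothetical shape, for illustration only)."""
    op: str                      # "insert", "update", or "delete"
    key: int                     # primary key of the affected row
    row: Optional[dict] = None   # new row values; None for deletes


def initial_load(source: dict, target: dict) -> None:
    """Copy every row currently in the source into the target."""
    target.clear()
    target.update({key: dict(row) for key, row in source.items()})


def replicate(changes: list, target: dict) -> None:
    """Apply captured source changes to the target, in the order they occurred."""
    for change in changes:
        if change.op == "delete":
            target.pop(change.key, None)
        else:
            # Inserts and updates both write the latest row values.
            target[change.key] = dict(change.row)


source = {1: {"name": "Ada"}, 2: {"name": "Grace"}}
target = {}
initial_load(source, target)

# Changes made on the source after the initial load completed.
changes = [
    Change("update", 2, {"name": "Grace H."}),
    Change("insert", 3, {"name": "Edsger"}),
    Change("delete", 1),
]
replicate(changes, target)
print(target)
```

Selecting only Initial Load corresponds to calling the first function alone. Selecting both corresponds to running the load once and then applying changes continuously.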

Replicate Data

The Replicate Data Task of Data Integration Platform Cloud captures changes in your data source and updates the target in real time.

To learn about the data sources supported for this task, see What’s Certified for Replicate Data?

For more information about creating this task, see Replicate Data.

ODI Execution

Perform data transformations and bulk operations using new or existing Scenarios from ODI.

To learn about the data sources supported for this task, see What’s Certified for ODI Execution?

For more information about creating this task, see ODI Execution.

Prepare Data

The Prepare Data Task enables you to harvest data from a source Connection, then perform various transformations on that data.

To learn about the data sources supported for this task, see What’s Certified for Data Preparation.

For information about creating this task, see Create a Data Preparation Task.

Add Data to Data Lake

A Data Lake is a repository used to store vast amounts of data in its natural form until you're ready to use it. When you create a Data Lake in Data Integration Platform Cloud, you can find it in the Catalog as a Data Asset. You can then create an Add Data to Data Lake task, where you can add data from a variety of sources, including relational databases or flat files. You can also create an Execution Environment to run the Add Data to Data Lake task from Spark on Big Data Cloud or YARN.

To learn about the data sources supported for this task, see What’s Certified for Add Data to Data Lake?

For information about creating this task, see Add Data to Data Lake.

Monitor Jobs

A Job is created when a task is run. You can monitor Job actions in the Monitor dashboard. Select a Job to see its details as it’s running, or review Job history after it’s completed.

For information about monitoring jobs, see Monitor Jobs.

Tour Data Integration Platform Cloud

Take a tour through Data Integration Platform Cloud, where you can perform all your real-time data replication, integration, transformation, streaming, analytics and data quality management from a single platform.

Using Data Integration Platform Cloud, you can move your data quickly and easily to Oracle Autonomous Data Warehouse Cloud, smoothly create connections between data sources and targets, and execute a process for bulk data movement. This product tour takes you through the Data Integration Platform Cloud interfaces and various features.

Data Integration Platform Cloud versus Data Integration Platform Cloud Classic

Oracle offers Data Integration Platform Cloud in two forms: Oracle-managed Data Integration Platform Cloud and user-managed Data Integration Platform Cloud Classic.

The following table outlines the differences between Data Integration Platform Cloud and Data Integration Platform Cloud Classic. It also provides information about the services that are available in each of them. Read this information carefully and choose the one that’s appropriate for you.

Infrastructure

  • Data Integration Platform Cloud: Oracle Cloud Infrastructure hosts the DIPC instances. Oracle Cloud Infrastructure is designed for automating, scaling, and managing containerized applications. Data regions are automatically selected based on your location.
  • Data Integration Platform Cloud Classic: Oracle Cloud Infrastructure Classic hosts the DIPC Classic instance. See Data Regions to select a proper region for your instance.

Management

  • Data Integration Platform Cloud: Instance patching, upgrade, rollback, start, stop, and restart operations are automatic. The Instance Administration page provides the details of the Oracle-managed backups.
  • Data Integration Platform Cloud Classic: You have to manage instance patching, upgrade, rollback, start, stop, and restart operations. For example, if you need backups, you have to create an object storage container either before or during instance creation.

Access to Data Integration applications

  • Data Integration Platform Cloud: VM access is not available with Oracle-managed instances. The DIPC host includes Oracle Data Integrator and Oracle Enterprise Data Quality installations. You can access Oracle Enterprise Data Quality Director using any web browser. The Oracle-managed DIPC instance does not provide access to the Oracle Data Integrator Studio on the instance itself. You won't have access to the database and storage associated with your instance.
  • Data Integration Platform Cloud Classic: The DIPC host includes the Oracle GoldenGate, Oracle Data Integrator, and Oracle Enterprise Data Quality installations. You can access Oracle Data Integrator Studio and Oracle GoldenGate using a VNC Server, and Oracle Enterprise Data Quality Director using any supported web browser.

Connection Type

  • Data Integration Platform Cloud: By default, the DIPC instance connects with the DIPC agent using the HTTP protocol. You can choose to use Oracle FastConnect instead of HTTP.
  • Data Integration Platform Cloud Classic: By default, the DIPC instance connects with the DIPC agent using the HTTP protocol. You can choose to use Oracle FastConnect or Oracle VPNaaS instead of HTTP.

Delivering Data to Big Data and MySQL

  • Data Integration Platform Cloud: You can deliver data to Big Data targets using binaries obtained from the Big Data (OGG) agent.
  • Data Integration Platform Cloud Classic: You can deliver data to Big Data and MySQL by setting up Oracle GoldenGate's Big Data and MySQL binaries for heterogeneous replication through the VM hosted on the DIPC instance. For a complete list of supported sources and targets, see Oracle GoldenGate (OGG) Certifications.

Scaling the compute shape of a node up and down

  • Data Integration Platform Cloud: Not applicable.
  • Data Integration Platform Cloud Classic: Supported.

Stream Analytics

  • Data Integration Platform Cloud: Not applicable.
  • Data Integration Platform Cloud Classic: Available.

Data Integration Platform Cloud Architecture

This topic applies only to Data Integration Platform Cloud.

The Data Integration Platform Cloud instance is set up on Oracle Cloud Infrastructure. The following diagram shows the relationship between Oracle Data Integration Platform Cloud and the on-premises and cloud data sources.

Figure 1-1 Data Integration Platform Cloud Architecture


The Data Integration Platform Cloud (DIPC) instance architecture includes the following components:

  • DIPC host - The DIPC host is located on Oracle Cloud Infrastructure. The DIPC host includes Oracle Data Integrator and Oracle Enterprise Data Quality installations. You can access Oracle Enterprise Data Quality Director using any web browser. The Oracle-managed DIPC instance does not provide access to the Oracle Data Integrator Studio on the instance itself.
  • DIPC repository - The DIPC repository is located on an embedded Oracle Database Cloud Service within Oracle Cloud Infrastructure. The DIPC repository stores metadata, such as connection details of the data sources, data objects, tasks, jobs, catalog, and so on. This metadata is required to run the DIPC tasks. DIPC does not store any type of user information.
  • DIPC agent - The DIPC agent establishes a connection between the DIPC host and the on-premises and cloud data sources. The DIPC agent also orchestrates the jobs that you run on the data sources from the DIPC console. You can download the DIPC agent from the DIPC console and install it on any machine, on-premises or in the cloud, that has access to your data sources. For example, you can install the DIPC agent on premises or in the cloud, such as on Oracle Cloud VMs or third-party cloud VMs.
  • Components - The Components comprise a set of binary files and an agent properties file that you need to set up to run the DIPC tasks. Based on the DIPC tasks you want to run, you need to select the appropriate components when you download the DIPC agent from the DIPC console. Depending on the components you select, the appropriate files are included in the DIPC agent package, which you download as a zip file. You need to copy this zip file to the machine where you want to configure the DIPC agent and the components, and unzip it there.
  • Connection type - By default, the DIPC instance connects with the DIPC agent using the HTTP protocol. You can choose to use Oracle FastConnect instead of HTTP.
  • DIPC console - The DIPC console provides an easy-to-use graphical user interface that lets you perform the data integration tasks on your data sources. You can access the DIPC console using any supported web browser.
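
The Components entry above describes downloading the agent package as a zip file and unzipping it on the machine that will run the agent. A minimal sketch of that unpacking step in Python follows. The function name `unpack_agent_package`, the archive name, and the file names inside the stand-in archive are all hypothetical, because the actual package layout depends on the components you select in the DIPC console. The sketch builds a placeholder archive in a temporary directory so that it can run without a real download.

```python
import tempfile
import zipfile
from pathlib import Path


def unpack_agent_package(package: Path, install_dir: Path) -> list:
    """Unzip a downloaded agent package and return the names of its entries."""
    install_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(package) as archive:
        archive.extractall(install_dir)
        return archive.namelist()


# Stand-in for the real download: every name below is a placeholder.
with tempfile.TemporaryDirectory() as tmp:
    package = Path(tmp) / "agent_package.zip"
    with zipfile.ZipFile(package, "w") as archive:
        archive.writestr("conf/agent.properties", "# placeholder properties\n")
        archive.writestr("bin/placeholder.sh", "#!/bin/sh\n")

    install_dir = Path(tmp) / "agent_home"
    extracted = unpack_agent_package(package, install_dir)
    properties_found = (install_dir / "conf" / "agent.properties").is_file()

print(extracted, properties_found)
```

In practice you would pass the path of the package downloaded from the DIPC console and the directory where the agent should be configured.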

Data Integration Platform Cloud Classic Architecture

This topic applies only to Data Integration Platform Cloud Classic.

The Data Integration Platform Cloud Classic instance is set up on Oracle Cloud Infrastructure Classic. The following diagram shows the relationship between Oracle Data Integration Platform Cloud and the on-premises and cloud data sources.

Figure 1-2 Data Integration Platform Cloud Classic Architecture


The Data Integration Platform Cloud Classic instance architecture includes the following components:

  • DIPC host - The DIPC host is located on Oracle Cloud Infrastructure Classic. The DIPC host includes the DIPC host agent, Oracle GoldenGate, Oracle Data Integrator, and Oracle Enterprise Data Quality. You can access Oracle Data Integrator Studio and Oracle GoldenGate using a VNC Server, and Oracle Enterprise Data Quality Director using any supported web browser.
  • DIPC repository - The DIPC repository is located on the Oracle Database Cloud Service Classic within the Oracle Cloud Infrastructure. The DIPC repository stores metadata, such as connection details of the data sources, data objects, tasks, jobs, catalog, and so on. This metadata is required to run the DIPC tasks. DIPC does not store any type of user information.
  • DIPC agent - The DIPC agent establishes a connection between the DIPC host and the on-premises and cloud data sources. The DIPC agent also orchestrates the jobs that you run on the data sources from the DIPC console. You can download the DIPC agent from the DIPC console and install it on any machine, on-premises or in the cloud, that has access to your data sources. For example, you can install the DIPC agent on premises or in the cloud, such as on Oracle Cloud VMs or third-party cloud VMs.
  • DIPC host agent - The DIPC host agent is a pre-configured ready-to-use agent that is located on the DIPC host. This agent works in the same manner as the DIPC agent, but only with the cloud data sources and components. If you want to use on-premises data sources or components, you must either download and use the DIPC agent, or set up a VPN connection so that the DIPC host agent can access the on-premises data sources.
  • Components - The Components comprise a set of binary files and an agent properties file that you need to set up to run the DIPC tasks. Based on the DIPC tasks you want to run, you need to select the appropriate components when you download the DIPC agent using the DIPC console. Depending on the components you select, the appropriate files are included in the DIPC agent package, which you download as a zip file. You need to copy this zip file to the machine where you want to configure the DIPC agent and the components, and unzip it there.
  • Connection type - By default, the DIPC instance connects with the DIPC agent using the HTTP protocol. You can choose to use Oracle FastConnect or Oracle VPNaaS instead of HTTP.
  • DIPC console - The DIPC console provides an easy-to-use graphical user interface that lets you perform the data integration tasks on your data sources. You can access the DIPC console using any supported web browser.
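
The choice between the pre-configured DIPC host agent and a downloaded DIPC agent, as described in the list above, can be summarized as a small decision function. This is only a sketch of the rule stated in the text. The function name `usable_agents` and its parameters are hypothetical and are not part of any DIPC interface.

```python
def usable_agents(source_location: str, vpn_to_on_premises: bool = False) -> list:
    """List the agents that can reach a data source in DIPC Classic.

    source_location is "cloud" or "on-premises" (labels chosen for this sketch).
    """
    if source_location not in ("cloud", "on-premises"):
        raise ValueError(f"unknown source location: {source_location!r}")

    # A downloaded DIPC agent can be installed anywhere that has access to the source.
    agents = ["DIPC agent"]

    # The host agent works with cloud sources, or with on-premises sources over a VPN.
    if source_location == "cloud" or vpn_to_on_premises:
        agents.insert(0, "DIPC host agent")
    return agents


print(usable_agents("cloud"))
print(usable_agents("on-premises"))
print(usable_agents("on-premises", vpn_to_on_premises=True))
```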

What are the Oracle Data Integration Platform Cloud Editions?

Data Integration Platform Cloud is available in three editions: Standard, Enterprise, and Governance. Each edition provides a different set of features. Understand what features are available in each of the editions and choose the one most suitable for your requirements.

Edition: Features

Standard

High-performance Extract-Transform-Load (ETL) functions. You can bulk copy your data sources, and then extract, enrich, and transform your data.

Enterprise

In addition to the features of the Standard edition, you have access to execute all the integration tasks from the Data Integration Platform Cloud console, Stream Analytics, and Big Data technologies.

Governance

In addition to the features of the Enterprise edition, you have access to the data quality, data profiling, and data governance features. You can profile, cleanse, and govern your data sources using customized dashboards.
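
Because each edition includes everything in the edition before it, the edition descriptions above can be modeled as cumulative feature sets. The Python sketch below is illustrative only. The feature labels are paraphrased from the descriptions above, and the names `features_for` and `smallest_edition_with` are hypothetical helpers for reasoning about which edition covers a given set of needs.

```python
# Editions in order, each building on the previous one.
EDITION_ORDER = ["Standard", "Enterprise", "Governance"]

# Features each edition adds (labels paraphrased for this sketch).
ADDED_FEATURES = {
    "Standard": {"ETL functions", "bulk copy"},
    "Enterprise": {"console integration tasks", "Stream Analytics", "Big Data technologies"},
    "Governance": {"data quality", "data profiling", "data governance"},
}


def features_for(edition: str) -> set:
    """Return the cumulative feature set of an edition."""
    position = EDITION_ORDER.index(edition)  # raises ValueError for unknown names
    features = set()
    for name in EDITION_ORDER[: position + 1]:
        features |= ADDED_FEATURES[name]
    return features


def smallest_edition_with(required: set) -> str:
    """Return the first edition whose cumulative features cover the requirement."""
    for edition in EDITION_ORDER:
        if required <= features_for(edition):
            return edition
    raise ValueError(f"no edition provides: {sorted(required)}")


print(smallest_edition_with({"bulk copy", "Stream Analytics"}))
print(smallest_edition_with({"data profiling"}))
```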