1 What is Oracle Data Integration Platform Cloud?

Oracle Data Integration Platform Cloud (DIPC) is a unified platform for real-time data replication, data transformation, data quality, and data governance. Learn about Data Integration Platform Cloud features, its common use cases, and the different components of its architecture. Understand what features are available in different types of Data Integration Platform Cloud instances and editions, and choose the ones that are most suitable for your data integration needs.

Why Use Oracle Data Integration Platform Cloud?

Data Integration Platform Cloud helps you migrate data and extract value from it by bringing together the capabilities of a complete Data Integration, Data Quality, and Data Governance solution in a single, unified, cloud-based platform. You can use Data Integration Platform Cloud to move, transform, cleanse, integrate, replicate, analyze, and govern data.

With Data Integration Platform Cloud you can:

  • Perform seamless batch and real-time data movement among cloud and on-premises data sources.

  • Synchronize an entire data source or copy high volumes of data in batches to a new Oracle Database Cloud deployment. You can then access, profile, transform, and cleanse your data.

  • Stream data in real time to new data sources, perform data analysis on streaming data, and keep any number of data sources synchronized.

  • Perform bulk transformation by importing and executing Oracle Data Integrator scenarios.

  • Copy or move data from flat files or on-premises data sources to Oracle Data Lake. This is applicable only on the User-managed Data Integration Platform Cloud instance.

  • Integrate with big data technologies.

The most common uses of Data Integration Platform Cloud are:

  • Accelerate Data Warehouses

  • Automate Data Mart generation

  • Migrate data without any down time (zero down time migration)

  • Support redundancy through active-active model

  • Integrate Big Data

  • Synchronize data

  • Replicate data to Kafka and bootstrap stream analytics

  • Monitor data health

  • Profile and validate data

High Level Concepts in Oracle Data Integration Platform Cloud

Data Integration Platform Cloud offers a number of different tasks to meet your data integration needs. Here’s a high-level overview of each task.

Data Integration Platform Cloud’s elevated integration tasks enable you to organize, consolidate, cleanse, transform, synchronize, replicate, and store data in the cloud. Supported data sources for these tasks include:

  • Oracle databases

  • Oracle Object Storage Classic

  • Autonomous Data Warehouse

  • Flat files

  • MySQL databases

  • SQL Server databases

  • Kafka

  • Salesforce

You create Connections to identify your data source in Data Integration Platform Cloud. After a Connection is created, Data Integration Platform Cloud harvests the metadata from the schema associated with that Connection, then profiles the data and calculates the popularity. This metadata is referred to as a Data Entity. You can view and edit data entities in the Catalog.

To learn about creating Connections, see Create a Connection.

Integration tasks that you can perform in Data Integration Platform Cloud include:
  • Synchronizing Data
  • Replicating Data
  • Executing an ODI Scenario
  • Preparing Data
  • Adding Data to a Data Lake

Synchronize Data

The Synchronize Data Task enables you to synchronize data between two Cloud databases, between an on-premises database and a cloud database, or between two on-premises databases. You can configure your Synchronize Data task to include Initial Load and/or Replication. You can also select the data entities to include in your task.

To learn about the data sources supported for this task, see What’s Certified for Synchronize Data?.

For information about creating this task, see Create a Synchronize Data Task.
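
The options described above can be pictured as a small configuration record. The following Python sketch is illustrative only: the class, field, and connection names are hypothetical and are not part of any Data Integration Platform Cloud API. It assumes that "Initial Load and/or Replication" means at least one of the two must be selected, and that Initial Load runs before Replication when both are chosen.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SynchronizeDataTask:
    """Illustrative model only; not a DIPC API object."""

    name: str
    source_connection: str  # name of an existing source Connection (hypothetical)
    target_connection: str  # name of an existing target Connection (hypothetical)
    initial_load: bool = True
    replication: bool = True
    data_entities: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Assumption: "Initial Load and/or Replication" means at least one.
        if not (self.initial_load or self.replication):
            raise ValueError("Select Initial Load, Replication, or both.")

    def phases(self) -> List[str]:
        """Return the selected phases, Initial Load first (assumed order)."""
        selected = []
        if self.initial_load:
            selected.append("Initial Load")
        if self.replication:
            selected.append("Replication")
        return selected


task = SynchronizeDataTask(
    name="sync_hr",
    source_connection="onprem_oracle",
    target_connection="cloud_oracle",
    data_entities=["HR.EMPLOYEES", "HR.DEPARTMENTS"],
)
print(task.phases())  # ['Initial Load', 'Replication']
```

In the product itself you make these choices in the console when you create the task; the sketch only makes the combinations explicit.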

Replicate Data

The Replicate Data Task of Oracle Data Integration Platform Cloud captures changes in your data source and applies them to the target in real time.

ODI Execution

Use the ODI Execution Task to perform data transformations and bulk operations.

Prepare Data

The Prepare Data Task enables you to harvest data from a source Connection, then perform various transformations on that data.

To learn about the data sources supported for this task, see What’s Certified for Data Preparation.

For information about creating this task, see Create a Data Preparation Task.

Add Data to Data Lake

A Data Lake is a repository used to store vast amounts of data in its natural form until you're ready to use it. When you create a Data Lake in Data Integration Platform Cloud, you can find it in the Catalog as a Data Asset. You can then create an Add Data to Data Lake task to add data from a variety of sources, including relational databases and flat files. You can also create an Execution Environment to run the Add Data to Data Lake task from Spark on Big Data Cloud or YARN.

Monitor Jobs

A Job is created when a task is run. You can monitor Job actions in the Monitor dashboard. Select a Job to see its details while it's running, or review the Job history after it's completed.

Oracle-Managed or User-Managed?

Data Integration Platform Cloud instances are offered as Oracle-managed and user-managed.

Whether you choose Oracle-managed or user-managed, both instances provide the same Data Integration Platform Cloud console, an all-in-one platform with graphical menu options for data preparation, integration, synchronization and quality management. If you'd like your instances to take care of themselves in terms of management while you perform your data integration tasks from a console, then go with Oracle-managed. If you have special data synchronization cases that require you to access the GoldenGate or ODI applications to set up the parameter files, then go with user-managed.

Use the information in this table to determine whether you should choose Oracle-managed or user-managed.

Feature: Infrastructure where the VM is hosted

Oracle-managed: Instances are hosted on Oracle Cloud Infrastructure (OCI), a new infrastructure designed for automating, scaling, and managing containerized applications.

Note: Data regions are automatically selected based on your location.

User-managed: Instances are hosted on Oracle Cloud Infrastructure Classic (OCI Classic), where Oracle Public Cloud instances were traditionally hosted.

See Data Regions to select a proper region for your instance.

Feature: Management

Oracle-managed: Instance patching, upgrade, rollback, start, stop, and restart operations are automatic. Details of Oracle-managed backups and patching are available through the Instance Administration page.

User-managed: You manage instance patching, upgrade, rollback, start, stop, and restart operations. For example, you create an object storage container either before or during instance creation if backups are needed.

Feature: Data Integration Platform Cloud Console

Oracle-managed: You get access to this console through the menu option of the instance to perform the following tasks:

  • Create Connections

  • Synchronize Data (one-way replication from source to target; does not include Oracle 11g to 12c)

  • Replicate Data

  • ODI Execution

  • Data Preparation

  • Add Data to Data Lake

User-managed: You get access to this console through the menu option of the instance to perform the following tasks:

  • Create Connections

  • Synchronize Data (one-way replication from source to target)

  • Replicate Data

  • ODI Execution

  • Data Preparation

  • Add Data to Data Lake

Feature: Agents

Oracle-managed: There are no pre-configured cloud agents available with your Oracle-managed instance. You can download remote agents from the Agents page of the Data Integration Platform console.

See Set up an Agent.

User-managed: A cloud agent is available with your user-managed instance. Remote agents are available for download from the Agents page of the Data Integration Platform console.

See Set up an Agent.

Feature: Application Access

Oracle-managed: With Oracle-managed instances, you won't manage access rules or get SSH access to the following applications:

  • ODI Studio, accessible through SSH tunneling

  • ODI console

  • GoldenGate application on the VM

If you need ODI Studio, run it from a user-managed instance or on-premises and import or export the design work. You won't have access to WebLogic Server to fix an ODI data source. Instead, you file a report with the operations team.

You won't have access to the database and storage associated with your instance.

User-managed: You manage access rules and have SSH access to the following applications:

  • ODI Studio, accessible through SSH tunneling

  • ODI console

  • GoldenGate application on the VM

You have access to the IP addresses of the instance and its associated database and storage.

You also have access to:

  • Fusion Middleware Control Console

  • WebLogic Server Console

Feature: Data Integration Platform Cloud Repository

Oracle-managed: Oracle-managed instances have embedded Database Cloud Service instances that store Data Integration Platform Cloud metadata.

User-managed: User-managed instances must have Database Cloud Service instances created independently to store repository data.

Feature: Metering

Oracle-managed: You are charged a flat rate based on the volume of data processed per hour by instances. Your options are:

  • 1 GB data processed per hour

  • 5 GB data processed per hour

  • 10 GB data processed per hour

  • 20 GB data processed per hour

For pricing information, see Data Integration Platform Cloud Universal Credits.

User-managed: You are charged based on the shape of your instances. Your options are:

  • OC1M - 1.0 OCPU - 15.0 GB Memory

  • OC3M - 4.0 OCPU - 60.0 GB Memory

For more information about instance shapes, see About Shapes.

For pricing information, see Data Integration Platform Cloud Universal Credits.

Feature: Managing Users and Roles

Oracle-managed: Identity Cloud Service is available through the Instance Overview page.

User-managed: Identity Cloud Service is available through the Instance Overview page.

Feature: VPN as a Service

Oracle-managed: Not available. This topic does not apply to Oracle Autonomous services.

User-managed: Available.

Feature: Delivering Data to Big Data and MySQL

Oracle-managed: Available through binaries obtained from the Big Data (OGG) Agent.

User-managed: Available through setting up Oracle GoldenGate's Big Data and MySQL binaries for heterogeneous replication through the instance VM.

For a complete list of supported sources and targets, see Oracle GoldenGate (OGG) Certifications.

Feature: Scaling the compute shape of a node up and down

Oracle-managed: Not applicable.

User-managed: Supported.

Feature: Load balancer options

Oracle-managed: An Oracle-managed load balancer controls traffic to instances, but is not visible to the users.

User-managed: An automatic load balancer controls traffic to instances.

Feature: Stream Analytics

Oracle-managed: Not available. This topic does not apply to Oracle Autonomous services.

User-managed: Available.
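
The comparison above can be read as a simple decision rule: choose user-managed only if you need one of the capabilities that the comparison lists as unavailable on Oracle-managed instances. The following Python sketch encodes that reading. It is a hedged illustration, not an Oracle tool: the class, field, and function names are hypothetical, and it covers only the rows where the two instance types differ in availability (application access, VPN as a Service, node scaling, and Stream Analytics).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Requirements:
    """Hypothetical checklist; field names are illustrative only."""

    needs_ssh_or_app_access: bool = False  # ODI Studio, ODI console, GoldenGate on the VM
    needs_vpn_as_a_service: bool = False
    needs_node_scaling: bool = False       # scale the compute shape of a node
    needs_stream_analytics: bool = False


def recommend_instance_type(req: Requirements) -> Tuple[str, List[str]]:
    """Return the suggested instance type and the requirements that drove it."""
    reasons = []
    if req.needs_ssh_or_app_access:
        reasons.append("SSH and application access")
    if req.needs_vpn_as_a_service:
        reasons.append("VPN as a Service")
    if req.needs_node_scaling:
        reasons.append("Scaling the compute shape")
    if req.needs_stream_analytics:
        reasons.append("Stream Analytics")
    if reasons:
        return "user-managed", reasons
    # Nothing requires user-managed features, so automatic management applies.
    return "Oracle-managed", []


print(recommend_instance_type(Requirements(needs_stream_analytics=True)))
```

Other rows, such as metering and repository setup, affect cost and provisioning effort rather than availability, so they are deliberately left out of the sketch.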

Oracle-Managed Data Integration Platform Cloud Architecture

The Oracle-Managed Data Integration Platform Cloud instance is set up on the Oracle Cloud Infrastructure. The following diagram shows the relationship between Oracle Data Integration Platform Cloud and the on-premises and cloud data sources.

Figure 1-1 Oracle-Managed Data Integration Platform Architecture

The Oracle-Managed Data Integration Platform Cloud (DIPC) instance architecture includes the following components:

  • DIPC host - The DIPC host is located on the Oracle Cloud Infrastructure. The DIPC host includes Oracle Data Integrator and Oracle Enterprise Data Quality installations. You can access Oracle Enterprise Data Quality Director using any web browser. The Oracle-managed DIPC instance does not provide access to the Oracle Data Integrator Studio on the instance itself.
  • DIPC repository - The DIPC repository is located on an embedded Oracle Database Cloud Service within the Oracle Cloud Infrastructure. The DIPC repository stores metadata, such as connection details of the data sources, data objects, tasks, jobs, catalog, and so on. This metadata is required to run the DIPC tasks. DIPC does not store any type of user information.
  • DIPC agent - The DIPC agent establishes a connection between the DIPC host and the on-premises and cloud data sources. The DIPC agent also orchestrates the jobs that you run on the data sources from the DIPC console. You can download the DIPC agent from the DIPC console and install it on any machine, on-premises or in the cloud, that has access to your data sources. For example, you can install the DIPC agent on an on-premises machine or on a cloud VM, such as an Oracle Cloud VM or a third-party cloud VM.
  • Components - The components are a set of binary files and an agent properties file that you need to set up to run the DIPC tasks. Based on the DIPC tasks you want to run, select the appropriate components when you download the DIPC agent from the DIPC console. Depending on the components you select, the appropriate files are included in the DIPC agent package, which you download as a zip file. Copy this zip file to the machine where you want to configure the DIPC agent and the components, and unzip it there.
  • Connection type - By default, the DIPC instance connects with the DIPC agent using the HTTP protocol. You can choose to use Oracle FastConnect instead of HTTP.
  • DIPC console - The DIPC console provides an easy-to-use graphical user interface that lets you perform the data integration tasks on your data sources. You can access the DIPC console using any supported web browser.

User-Managed Data Integration Platform Cloud Architecture

The User-Managed Data Integration Platform Cloud instance is set up on the Oracle Cloud Infrastructure - Classic. The following diagram shows the relationship between Oracle Data Integration Platform Cloud and the on-premises and cloud data sources.

Figure 1-2 User-Managed Data Integration Platform Cloud Architecture

The User-Managed Data Integration Platform Cloud (DIPC) instance architecture includes the following components:

  • DIPC host - The DIPC host is located on the Oracle Cloud Infrastructure. The DIPC host includes the DIPC host agent, Oracle GoldenGate, Oracle Data Integrator, and Oracle Enterprise Data Quality. You can access Oracle Data Integrator Studio and Oracle GoldenGate using a VNC Server, and Oracle Enterprise Data Quality Director using any supported web browser.
  • DIPC repository - The DIPC repository is located on the Oracle Database Cloud Service Classic within the Oracle Cloud Infrastructure. The DIPC repository stores metadata, such as connection details of the data sources, data objects, tasks, jobs, catalog, and so on. This metadata is required to run the DIPC tasks. DIPC does not store any type of user information.
  • DIPC agent - The DIPC agent establishes a connection between the DIPC host and the on-premises and cloud data sources. The DIPC agent also orchestrates the jobs that you run on the data sources from the DIPC console. You can download the DIPC agent from the DIPC console and install it on any machine, on-premises or in the cloud, that has access to your data sources. For example, you can install the DIPC agent on an on-premises machine or on a cloud VM, such as an Oracle Cloud VM or a third-party cloud VM.
  • DIPC host agent - The DIPC host agent is a pre-configured, ready-to-use agent that is located on the DIPC host. This agent works in the same manner as the DIPC agent, but only with cloud data sources and components. If you want to use on-premises data sources or components, you must either download and use the DIPC agent, or set up a VPN connection so that the DIPC host agent can access the on-premises data sources.
  • Components - The components are a set of binary files and an agent properties file that you need to set up to run the DIPC tasks. Based on the DIPC tasks you want to run, select the appropriate components when you download the DIPC agent using the DIPC console. Depending on the components you select, the appropriate files are included in the DIPC agent package, which you download as a zip file. Copy this zip file to the machine where you want to configure the DIPC agent and the components, and unzip it there.
  • Connection type - By default, the DIPC instance connects with the DIPC agent using the HTTP protocol. You can choose to use Oracle FastConnect or Oracle VPNaaS instead of HTTP.
  • DIPC console - The DIPC console provides an easy-to-use graphical user interface that lets you perform the data integration tasks on your data sources. You can access the DIPC console using any supported web browser.
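
The agent rules described for the two architectures can be summarized as a short decision function. The Python sketch below is illustrative only; the enum and function names are hypothetical, and it returns the simplest option that the descriptions allow (a downloaded DIPC agent can also be used with cloud data sources).

```python
from enum import Enum


class InstanceType(Enum):
    ORACLE_MANAGED = "Oracle-managed"
    USER_MANAGED = "user-managed"


def choose_agent(
    instance_type: InstanceType,
    source_on_premises: bool,
    vpn_to_on_premises: bool = False,
) -> str:
    """Suggest which agent reaches a data source (hypothetical helper)."""
    if instance_type is InstanceType.ORACLE_MANAGED:
        # Oracle-managed instances have no pre-configured agent on the host.
        return "DIPC agent"
    if not source_on_premises:
        # The host agent works with cloud data sources out of the box.
        return "DIPC host agent"
    if vpn_to_on_premises:
        # A VPN connection lets the host agent reach on-premises sources.
        return "DIPC host agent"
    return "DIPC agent"


print(choose_agent(InstanceType.USER_MANAGED, source_on_premises=True))
```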

What are the Oracle Data Integration Platform Cloud Editions?

Data Integration Platform Cloud is available in three editions: Standard, Enterprise, and Governance. Each edition provides a different set of features. Understand what features are available in each of the editions and choose the one most suitable for your requirements.

Edition: Features

Standard

High-performance Extract-Transform-Load (ETL) functions. You can bulk copy your data sources, and then extract, enrich, and transform your data.

Enterprise

In addition to the features of the Standard edition, you have access to execute all the integration tasks from the Data Integration Platform Cloud console, stream analytics, and Big Data technologies.

Governance

In addition to the features of the Enterprise edition, you have access to the data quality, data profiling, and data governance features. You can profile, cleanse, and govern your data sources using customized dashboards.
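
Because each edition includes everything in the one before it, choosing an edition amounts to finding the lowest edition that covers every feature you need. The following Python sketch shows that lookup. The feature keys are hypothetical labels invented for this example, mapped to editions according to the descriptions above; they are not official feature identifiers.

```python
from typing import Iterable

# Editions in ascending order; each one includes the features of those before it.
EDITION_ORDER = ["Standard", "Enterprise", "Governance"]

# Hypothetical feature labels mapped to the first edition that provides them.
FEATURE_MIN_EDITION = {
    "etl": "Standard",
    "bulk_copy": "Standard",
    "console_integration_tasks": "Enterprise",
    "stream_analytics": "Enterprise",
    "big_data": "Enterprise",
    "data_quality": "Governance",
    "data_profiling": "Governance",
    "data_governance": "Governance",
}


def minimum_edition(required_features: Iterable[str]) -> str:
    """Return the lowest edition that covers all required features."""
    highest_rank = 0  # default to Standard when nothing specific is required
    for feature in required_features:
        if feature not in FEATURE_MIN_EDITION:
            raise ValueError(f"Unknown feature label: {feature}")
        rank = EDITION_ORDER.index(FEATURE_MIN_EDITION[feature])
        highest_rank = max(highest_rank, rank)
    return EDITION_ORDER[highest_rank]


print(minimum_edition(["etl", "stream_analytics"]))  # Enterprise
```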