Deploy an Analytics Platform for Informatica IDMC on Oracle Cloud

The partnership between Oracle and Informatica brings together two industry leaders in database and data management to deliver a comprehensive enterprise data warehouse and lakehouse ecosystem.

This reference architecture shows how the Informatica IDMC Secure Agent operates in Oracle Cloud Infrastructure (OCI). Data can be exported from a wide range of on-premises and cloud sources by using any of over 300 connectors and then imported into an Oracle Autonomous Database to be consumed by either analytics or data science processes.

Without this integration, we can access actionable information from our application data (for example, Oracle E-Business Suite), but we cannot enrich it with other data sources to unlock further insight. Running analytical workloads on operational systems is also poor practice.

This reference architecture positions the technology solution within the overall business context:



The integration provides an analytical platform where application data containing a record of interactions is combined with other sets of curated data in the management layer and is refined into actionable information and insight in the exploitation layer.

Architecture

This reference architecture shows how the Informatica IDMC Secure Agent operates in Oracle Cloud Infrastructure (OCI). Data is exported from on-premises and cloud-based enterprise applications, files, object stores, and databases and is then imported into Oracle Autonomous Database to be consumed by either analytics or data science processes.

The following diagram is a functional representation of the reference architecture.



informatica-oci-oracle.zip

In general, the architecture includes the following logical divisions. This reference architecture focuses on the data refinery and data persistence architecture components:

  • Ingest, Transform

    Ingests and refines the data for use in each of the data layers in the architecture.

  • Persist, Curate, Create

    Facilitates access and navigation of the data to show the current and historical business view. It contains raw data as well as granular and aggregated curated data. For relational technologies, data may be logically or physically structured in simple relational, longitudinal, dimensional, or OLAP forms. For non-relational data, this layer contains one or more pools of data, either output from an analytical process or data optimized for a specific analytical task.

    Oracle Autonomous Data Warehouse is a self-driving, self-securing, self-repairing database service that is optimized for data warehousing workloads. You do not need to configure or manage any hardware, or install any software. Oracle Cloud Infrastructure handles creating the database, as well as backing up, patching, upgrading, and tuning the database.

  • Analyze, Learn, Predict

    Abstracts the logical business view of the data for the consumers. This abstraction facilitates agile approaches to development, migration to the target architecture, and the provision of a single reporting layer from multiple federated sources.



informatica-oci-arch-oracle.zip

In the above architecture, a compute instance hosts the Informatica Cloud Secure Agent. The Secure Agent is a lightweight program that runs all tasks and enables secure communication across the firewall between your organization and Informatica Intelligent Data Management Cloud. When the Secure Agent runs a task, it connects to the Informatica Cloud hosting facility to access task information. It then connects directly and securely to sources and targets, transfers data between them, orchestrates the flow of tasks, runs processes, and handles any additional task requirements.

The architecture has the following components:

  • Tenancy

    A tenancy is a secure and isolated partition that Oracle sets up within Oracle Cloud when you sign up for Oracle Cloud Infrastructure. You can create, organize, and administer your resources in Oracle Cloud within your tenancy. A tenancy is synonymous with a company or organization. Usually, a company will have a single tenancy and reflect its organizational structure within that tenancy. A single tenancy is usually associated with a single subscription, and a single subscription usually only has one tenancy.

  • Region

    An Oracle Cloud Infrastructure region is a localized geographic area that contains one or more data centers, called availability domains. Regions are independent of other regions, and vast distances can separate them (across countries or even continents).

  • Availability domain

    Availability domains are standalone, independent data centers within a region. The physical resources in each availability domain are isolated from the resources in the other availability domains, which provides fault tolerance. Availability domains don’t share infrastructure such as power or cooling, or the internal availability domain network. So, a failure at one availability domain is unlikely to affect the other availability domains in the region.

  • Compartment

    Compartments are cross-region logical partitions within an Oracle Cloud Infrastructure tenancy. Use compartments to organize your resources in Oracle Cloud, control access to the resources, and set usage quotas. To control access to the resources in a given compartment, you define policies that specify who can access the resources and what actions they can perform.

  • Virtual cloud network (VCN) and subnets

    A VCN is a customizable, software-defined network that you set up in an Oracle Cloud Infrastructure region. Like traditional data center networks, VCNs give you complete control over your network environment. A VCN can have multiple non-overlapping CIDR blocks that you can change after you create the VCN. You can segment a VCN into subnets, which can be scoped to a region or to an availability domain. Each subnet consists of a contiguous range of addresses that don't overlap with the other subnets in the VCN. You can change the size of a subnet after creation. A subnet can be public or private.

  • Security list

    For each subnet, you can create security rules that specify the source, destination, and type of traffic that must be allowed in and out of the subnet.

  • Route table

    Virtual route tables contain rules to route traffic from subnets to destinations outside a VCN, typically through gateways.

  • Internet gateway

    The internet gateway allows traffic between the public subnets in a VCN and the public internet.

  • Network address translation (NAT) gateway

    A NAT gateway enables private resources in a VCN to access hosts on the internet, without exposing those resources to incoming internet connections.

  • Service gateway

    The service gateway provides access from a VCN to other services, such as Oracle Cloud Infrastructure Object Storage. The traffic from the VCN to the Oracle service travels over the Oracle network fabric and never traverses the internet.

  • Bastion service

    Oracle Cloud Infrastructure Bastion provides restricted and time-limited secure access to resources that don't have public endpoints and that require strict resource access controls, such as bare metal and virtual machines, Oracle MySQL Database Service, Autonomous Transaction Processing (ATP), Oracle Container Engine for Kubernetes (OKE), and any other resource that allows Secure Shell Protocol (SSH) access. With Oracle Cloud Infrastructure Bastion service, you can enable access to private hosts without deploying and maintaining a jump host. In addition, you gain improved security posture with identity-based permissions and a centralized, audited, and time-bound SSH session. Oracle Cloud Infrastructure Bastion removes the need for a public IP for bastion access, eliminating the hassle and potential attack surface when providing remote access.

  • Compute

    The Oracle Cloud Infrastructure Compute service enables you to provision and manage compute hosts in the cloud. You can launch compute instances with shapes that meet your resource requirements for CPU, memory, network bandwidth, and storage. After creating a compute instance, you can access it securely, restart it, attach and detach volumes, and terminate it when you no longer need it.

  • Identity and Access Management (IAM)

    Oracle Cloud Infrastructure Identity and Access Management (IAM) is the access control plane for Oracle Cloud Infrastructure (OCI) and Oracle Cloud Applications. The IAM API and the user interface enable you to manage identity domains and the resources within the identity domain. Each OCI IAM identity domain represents a standalone identity and access management solution or a different user population.

  • Policy

    An Oracle Cloud Infrastructure Identity and Access Management policy specifies who can access which resources, and how. Access is granted at the group and compartment level, which means you can write a policy that gives a group a specific type of access within a specific compartment, or to the tenancy.

  • Object storage

    Object storage provides quick access to large amounts of structured and unstructured data of any content type, including database backups, analytic data, and rich content such as images and videos. You can safely and securely store and then retrieve data directly from the internet or from within the cloud platform. You can seamlessly scale storage without experiencing any degradation in performance or service reliability. Use standard storage for "hot" data that you need to access quickly and frequently. Use archive storage for "cold" data that you retain for long periods of time and rarely access.

  • Autonomous Data Warehouse

    Oracle Autonomous Data Warehouse is a self-driving, self-securing, self-repairing database service that is optimized for data warehousing workloads. You do not need to configure or manage any hardware, or install any software. Oracle Cloud Infrastructure handles creating the database, as well as backing up, patching, upgrading, and tuning the database.
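
The Policy component above describes statements that grant a group a type of access within a compartment or across the tenancy. As a minimal sketch of what such statements look like, the following Python helper assembles them in the general form "Allow group ... to ... in ...". The group names (DataEngineers, Auditors) and the compartment name (analytics) are hypothetical and do not come from this architecture; the exact policies a deployment needs depend on your own requirements.

```python
from typing import Optional

# The four OCI IAM policy verbs, from least to most privileged.
VERBS = ("inspect", "read", "use", "manage")


def policy_statement(group: str, verb: str, resource_type: str,
                     compartment: Optional[str] = None) -> str:
    """Build one policy statement; compartment=None scopes it to the tenancy."""
    if verb not in VERBS:
        raise ValueError(f"unknown verb {verb!r}; expected one of {VERBS}")
    scope = f"compartment {compartment}" if compartment else "tenancy"
    return f"Allow group {group} to {verb} {resource_type} in {scope}"


# Hypothetical group and compartment names, for illustration only.
statements = [
    policy_statement("DataEngineers", "manage", "autonomous-database-family", "analytics"),
    policy_statement("DataEngineers", "read", "object-family", "analytics"),
    policy_statement("Auditors", "inspect", "all-resources"),
]

for statement in statements:
    print(statement)
```

Keeping the verb list explicit makes it easy to review which groups hold "manage" rights, which is usually the access level worth auditing first.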

Recommendations

Use the following recommendations as a starting point for integrating the Informatica IDMC platform on Oracle Cloud.

Your requirements might differ from the architecture described here.

  • Virtual cloud network (VCN)

    When you create a VCN, determine the number of CIDR blocks required and the size of each block based on the number of resources that you plan to attach to subnets in the VCN. Use CIDR blocks that are within the standard private IP address space.

    Select CIDR blocks that don't overlap with any other network (in Oracle Cloud Infrastructure, your on-premises data center, or another cloud provider) to which you intend to set up private connections.

    After you create a VCN, you can change, add, and remove its CIDR blocks.

    When you design the subnets, consider your traffic flow and security requirements. Attach all the resources within a specific tier or role to the same subnet, which can serve as a security boundary.

    Use regional subnets.

  • Virtual Machines and other recommendations

    For virtual machine sizing and other recommendations, see the links in the Deploy section.

  • Security lists

    Use security lists to define ingress and egress rules that apply to the entire subnet.
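
The VCN recommendations above can be checked before anything is provisioned. The sketch below uses Python's standard ipaddress module to verify that a VCN CIDR block sits inside the standard private (RFC 1918) ranges, that each subnet fits inside the VCN, that subnets do not overlap, and that the VCN does not overlap any network you plan to connect privately. All CIDR values shown are illustrative assumptions, not values taken from this architecture, and the sketch assumes IPv4 only.

```python
import ipaddress
from itertools import combinations

# RFC 1918 private IPv4 ranges.
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def check_cidr_plan(vcn_cidr, subnet_cidrs, other_networks=()):
    """Return a list of problems found; an empty list means the plan passes."""
    problems = []
    vcn = ipaddress.ip_network(vcn_cidr)
    if not any(vcn.subnet_of(private) for private in PRIVATE_RANGES):
        problems.append(f"VCN {vcn} is outside the private address space")

    subnets = [ipaddress.ip_network(cidr) for cidr in subnet_cidrs]
    for subnet in subnets:
        if not subnet.subnet_of(vcn):
            problems.append(f"subnet {subnet} is not inside VCN {vcn}")
    for first, second in combinations(subnets, 2):
        if first.overlaps(second):
            problems.append(f"subnets {first} and {second} overlap")

    for cidr in other_networks:
        other = ipaddress.ip_network(cidr)
        if vcn.overlaps(other):
            problems.append(f"VCN {vcn} overlaps connected network {other}")
    return problems


# Illustrative plan: one VCN, two regional subnets, one on-premises network.
good = check_cidr_plan("10.0.0.0/16", ["10.0.1.0/24", "10.0.2.0/24"],
                       other_networks=["192.168.0.0/16"])
bad = check_cidr_plan("10.0.0.0/16", ["10.0.1.0/24", "10.0.1.128/25"],
                      other_networks=["10.0.0.0/8"])
print(good)
for problem in bad:
    print(problem)
```

Running a check like this against the planned address ranges catches overlaps early, while changing a CIDR block is still cheap.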

Considerations

When integrating the Informatica IDMC platform on Oracle Cloud, consider these implementation options.

Informatica Integration | Data Refinery | Data Persistence Platform
Recommended | Informatica Intelligent Data Management Cloud (IDMC) | Oracle Autonomous Data Warehouse (ADW)
Other Options | N/A | Exadata
Rationale | Informatica considers Informatica IDMC to be the most comprehensive, microservices-based, API-driven, and AI-powered enterprise integration platform-as-a-service. With IDMC, a customer has the flexibility to use any cloud service Informatica has available to meet their integration and governance needs. | ADW is an easy-to-use, fully autonomous database that scales elastically, delivers fast query performance, and requires no database administration. It also offers direct access to the data from object storage by using external tables.

Deploy

The Terraform code is available as a sample stack in Oracle Cloud Infrastructure Resource Manager. You can also download the code from GitHub, and customize it to your requirements.

  • Deploy using the sample stack in Oracle Cloud Infrastructure Resource Manager:
    1. Go to Deploy to Oracle Cloud.

      If you aren't already signed in, enter the tenancy and user credentials.

    2. Select the region where you want to deploy the stack.
    3. Follow the on-screen prompts and instructions to create the stack.
    4. After creating the stack, click Terraform Actions, and select Plan.
    5. Wait for the job to be completed, and review the plan.

      To make any changes, return to the Stack Details page, click Edit Stack, and make the required changes. Then, run the Plan action again.

    6. If no further changes are necessary, return to the Stack Details page, click Terraform Actions, and select Apply.
  • Deploy using the Terraform code in GitHub:
    1. Go to GitHub.
    2. Clone or download the repository to your local computer.
    3. Follow the instructions in the README document.
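
For the GitHub path, the README in the repository is the authority on required variables and steps. As a rough sketch only, assuming the repository follows the standard Terraform CLI workflow (init, then plan, then apply), the Python wrapper below lists those commands and runs them on request. The plan file name tfplan and the working directory are assumptions; by default the sketch only prints the commands so that you can review them first.

```python
import subprocess


def terraform_commands(plan_file="tfplan"):
    """Standard Terraform CLI sequence: initialize, save a plan, apply that plan."""
    return [
        ["terraform", "init"],
        ["terraform", "plan", f"-out={plan_file}"],
        ["terraform", "apply", plan_file],
    ]


def run_workflow(workdir, dry_run=True):
    """Print each command; execute it in workdir only when dry_run is False."""
    for command in terraform_commands():
        print("$", " ".join(command))
        if not dry_run:
            # check=True stops the sequence if any step fails.
            subprocess.run(command, cwd=workdir, check=True)


# Dry run: prints the commands without requiring Terraform to be installed.
run_workflow(".", dry_run=True)
```

Saving the plan to a file and applying that file mirrors the Plan and Apply actions in Resource Manager: what you reviewed is what gets applied.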

Acknowledgments

  • Authors: Larry Fumagalli, Wei Han
  • Contributor: Robert Lies