1 Oracle Data Integrator Overview

This chapter provides an introduction to Oracle Data Integrator, the technical architecture, and the contents of this Getting Started guide.

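This chapter includes the following sections:

  • Introduction to Oracle Data Integrator

  • ODI Component Architecture

  • Get Started with Oracle Data Integrator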

1.1 Introduction to Oracle Data Integrator

A widely used data integration software product, Oracle Data Integrator provides a new declarative design approach to defining data transformation and integration processes, resulting in faster and simpler development and maintenance. Based on a unique E-LT architecture (Extract - Load Transform), Oracle Data Integrator not only guarantees the highest level of performance possible for the execution of data transformation and validation processes but is also the most cost-effective solution available today.

Oracle Data Integrator provides a unified infrastructure to streamline data and application integration projects.

1.1.1 The Business Problem

In today's increasingly fast-paced business environment, organizations need to use more specialized software applications. They also need to ensure that these applications coexist on heterogeneous hardware platforms and systems, and that data can be shared between applications and systems. Projects that implement these integration requirements need to be delivered on spec, on time, and on budget.

1.1.2 A Unique Solution

Oracle Data Integrator employs a powerful declarative design approach to data integration, which separates the declarative rules from the implementation details. Oracle Data Integrator is also based on a unique E-LT (Extract - Load Transform) architecture which eliminates the need for a standalone ETL server and proprietary engine, and instead leverages the inherent power of your RDBMS engines. This combination provides the greatest productivity for both development and maintenance, and the highest performance for the execution of data transformation and validation processes.
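The E-LT pattern described above can be illustrated with a minimal sketch. This is not code generated by Oracle Data Integrator: an in-memory SQLite database stands in for the target RDBMS, and the table names (stg_orders, customer_totals) and sample rows are hypothetical. The point is only the order of operations: rows are loaded unchanged into the target, and the transformation then runs as a single set-based SQL statement inside the database engine.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the target RDBMS.
target = sqlite3.connect(":memory:")
target.executescript("""
    CREATE TABLE stg_orders (order_id INTEGER, customer TEXT, amount_cents INTEGER);
    CREATE TABLE customer_totals (customer TEXT PRIMARY KEY, total_cents INTEGER);
""")

# Extract and Load: copy source rows unchanged into a staging table on the target.
source_rows = [
    (1, "acme", 1250),
    (2, "acme", 750),
    (3, "globex", 4000),
]
target.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)", source_rows)

# Transform: one set-based statement executed by the database engine itself,
# instead of a row-by-row loop in a separate ETL engine.
target.execute("""
    INSERT INTO customer_totals (customer, total_cents)
    SELECT customer, SUM(amount_cents)
    FROM stg_orders
    GROUP BY customer
""")
target.commit()

# Read the result back as {customer: total_cents}.
totals = dict(target.execute("SELECT customer, total_cents FROM customer_totals"))
print(totals)
```

No intermediate transformation server appears in the sketch; the only components are the source data and the target database.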

Here are the key reasons why companies choose Oracle Data Integrator for their data integration needs:

  • Faster and simpler development and maintenance: The declarative, rules-driven approach to data integration greatly reduces the learning curve of the product and increases developer productivity while facilitating ongoing maintenance. This approach separates the definition of the processes from their actual implementation, and separates the declarative rules (the "what") from the data flows (the "how").

  • Data quality firewall: Oracle Data Integrator ensures that faulty data is automatically detected and recycled before insertion in the target application. This is performed without the need for programming, following the data integrity rules and constraints defined both on the target application and in Oracle Data Integrator.

  • Better execution performance: Traditional data integration software (ETL) is based on proprietary engines that perform data transformations row by row, thus limiting performance. By implementing an E-LT architecture based on your existing RDBMS engines and SQL, you can execute data transformations on the target server at a set-based level, giving you much higher performance.

  • Simpler and more efficient architecture: the E-LT architecture removes the need for an ETL Server sitting between the sources and the target server. It utilizes the source and target servers to perform complex transformations, most of which happen in batch mode when the server is not busy processing end-user queries.

  • Platform Independence: Oracle Data Integrator supports all platforms, hardware and OSs with the same software.

  • Data Connectivity: Oracle Data Integrator supports all RDBMSs, including all leading Data Warehousing platforms such as Oracle, Exadata, Teradata, IBM DB2, Netezza, and Sybase IQ, as well as numerous other technologies such as flat files, ERPs, LDAP, and XML.

  • Cost-savings: the elimination of the ETL Server and ETL engine reduces both the initial hardware and software acquisition and maintenance costs. The reduced learning curve and increased developer productivity significantly reduce the overall labor costs of the project, as well as the cost of ongoing enhancements.
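The data quality firewall listed above can be sketched in the same spirit. This is an illustration under stated assumptions, not the mechanism Oracle Data Integrator actually generates: SQLite stands in for the database, and the tables (stg_customers, customers, err_customers) and the rule names are hypothetical. Each rule is declared as a SQL condition that a valid row satisfies; rows that violate a rule are diverted to an error table, where they remain available for correction and recycling, and only clean rows reach the target.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the database hosting the target.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE stg_customers (id INTEGER, name TEXT, age INTEGER);
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER);
    CREATE TABLE err_customers (id INTEGER, name TEXT, age INTEGER, failed_rule TEXT);
""")
db.executemany(
    "INSERT INTO stg_customers VALUES (?, ?, ?)",
    [(1, "Ann", 34), (2, None, 51), (3, "Raj", -5), (4, "Lee", 28)],
)

# Declarative rules (hypothetical): a name plus the condition a VALID row satisfies.
rules = {
    "NAME_REQUIRED": "name IS NOT NULL",
    "AGE_NOT_NEGATIVE": "age >= 0",
}

# Divert every row that violates a rule into the error table, tagged with the rule.
# Note: a condition that evaluates to NULL is treated as neither valid nor invalid
# here; a fuller sketch would handle that case explicitly.
for rule_name, condition in rules.items():
    db.execute(
        "INSERT INTO err_customers "
        f"SELECT id, name, age, ? FROM stg_customers WHERE NOT ({condition})",
        (rule_name,),
    )

# Only rows that satisfy all rules are inserted into the target table.
all_valid = " AND ".join(f"({condition})" for condition in rules.values())
db.execute(
    f"INSERT INTO customers SELECT id, name, age FROM stg_customers WHERE {all_valid}"
)
db.commit()

loaded = [row[0] for row in db.execute("SELECT id FROM customers ORDER BY id")]
rejected = db.execute(
    "SELECT id, failed_rule FROM err_customers ORDER BY id"
).fetchall()
print(loaded, rejected)
```

The checks are expressed as data (the rules dictionary) rather than as procedural code, which mirrors the declarative approach described in this section.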

1.2 ODI Component Architecture

The Oracle Data Integrator platform integrates into the broader Fusion Middleware platform and becomes a key component of this stack. Oracle Data Integrator provides its run-time components as Java EE applications, enhanced to fully leverage the capabilities of the Oracle WebLogic Application Server. Oracle Data Integrator components include exclusive features for enterprise-scale deployments, high availability, scalability, and hardened security. Figure 1-1 shows the ODI component architecture.

Figure 1-1 Oracle Data Integrator Component Architecture


1.2.1 Repositories

The central component of the architecture is the Oracle Data Integrator Repository. It stores configuration information about the IT infrastructure, metadata of all applications, projects, scenarios, and the execution logs. Many instances of the repository can coexist in the IT infrastructure. The repository architecture is designed to allow several separated environments that exchange metadata and scenarios (for example, Development, Test, QA, User Acceptance, Maintenance, and Production environments). The repository also acts as a version control system where objects are archived and assigned a version number.

The Oracle Data Integrator Repository is composed of one Master Repository and several Work Repositories. Objects developed or configured through the user interfaces are stored in one of these repository types.

There is usually only one master repository that stores the following information:

  • Security information including users, profiles and rights for the ODI platform

  • Topology information including technologies, server definitions, schemas, contexts, languages and so forth.

  • Versioned and archived objects.

The work repository is the one that contains actual developed objects. Several work repositories may coexist in the same ODI installation (for example, to have separate environments or to match a particular versioning life cycle). A Work Repository stores information for:

  • Models, including schema definitions, datastore structures and metadata, field and column definitions, data quality constraints, cross references, data lineage and so forth.

  • Projects, including business rules, packages, procedures, folders, Knowledge Modules, variables and so forth.

  • Scenario execution, including scenarios, scheduling information and logs.

When the Work Repository contains only the execution information (typically for production purposes), it is then called an Execution Repository.
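The relationship between the repository types can be summarized as a small data model. The sketch below is only a reading aid: the class and field names are hypothetical and do not correspond to ODI's actual repository schema or SDK. It models one master repository, several work repositories, and the rule that a work repository holding only execution information is an Execution Repository.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class WorkRepository:
    """Hypothetical model of a work repository and what it stores."""
    name: str
    models: List[str] = field(default_factory=list)      # schema and datastore metadata
    projects: List[str] = field(default_factory=list)    # packages, procedures, and so forth
    scenarios: List[str] = field(default_factory=list)   # scenarios, schedules, logs

    @property
    def is_execution_repository(self) -> bool:
        # Only execution information is present: no models and no projects.
        return not self.models and not self.projects


@dataclass
class MasterRepository:
    """Hypothetical model of the single master repository."""
    users: List[str] = field(default_factory=list)             # security information
    topology: Dict[str, str] = field(default_factory=dict)     # servers, schemas, contexts
    work_repositories: Dict[str, WorkRepository] = field(default_factory=dict)

    def attach(self, repo: WorkRepository) -> None:
        # Assumption of this sketch: work repositories are registered by name.
        self.work_repositories[repo.name] = repo


master = MasterRepository(users=["SUPERVISOR"])
dev = WorkRepository("DEV", models=["SALES_MODEL"], projects=["DEMO"], scenarios=["LOAD_SALES"])
prod = WorkRepository("PROD", scenarios=["LOAD_SALES"])
master.attach(dev)
master.attach(prod)

execution_only = sorted(
    name for name, repo in master.work_repositories.items() if repo.is_execution_repository
)
print(execution_only)
```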

1.2.2 ODI Studio and User Interfaces

Administrators, Developers and Operators use the Oracle Data Integrator Studio to access the repositories. This Fusion Client Platform (FCP) based UI is used for administering the infrastructure (security and topology), reverse-engineering the metadata, developing projects, scheduling, operating and monitoring executions.

ODI Studio provides four Navigators for managing the different aspects and steps of an ODI integration project:

  • Designer Navigator is used to design data integrity checks and to build transformations. Its features include, for example:

    • Automatic reverse-engineering of existing applications or databases

    • Graphical development and maintenance of transformation and integration interfaces

    • Visualization of data flows in the interfaces

    • Automatic documentation generation

    • Customization of the generated code

  • Operator Navigator is the production management and monitoring tool. It is designed for IT production operators. Through Operator Navigator, you can manage your interface executions in the sessions, as well as the scenarios in production.

  • Topology Navigator is used to manage the data describing the information system's physical and logical architecture. Through Topology Navigator you can manage the topology of your information system, the technologies and their datatypes, the data servers linked to these technologies and the schemas they contain, the contexts, the languages and the agents, as well as the repositories. The site, machine, and data server descriptions will enable Oracle Data Integrator to execute the same integration interfaces in different physical environments.

  • Security Navigator is the tool for managing the security information in Oracle Data Integrator. Through Security Navigator you can create users and profiles, assign user rights for methods (edit, delete, and so forth) on generic objects (data servers, datatypes, and so forth), and fine-tune these rights on the object instances (Server 1, Server 2, and so forth).
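The idea behind contexts in Topology Navigator, running the same integration interface in different physical environments, can be sketched as a simple lookup. This is a simplified illustration rather than ODI's actual resolution logic: the context names, the logical schema name SALES, the physical schema names, and the functions are all hypothetical. A statement is written once against a logical name and resolved to a physical schema only when a context is chosen.

```python
# Hypothetical topology: (context, logical schema) -> physical schema.
topology = {
    ("DEVELOPMENT", "SALES"): "dev_db.sales_dev",
    ("PRODUCTION", "SALES"): "prod_db.sales",
}


def render(template: str, context: str) -> str:
    """Replace {LOGICAL_SCHEMA} placeholders with the physical schemas of a context."""
    schemas = {
        logical: physical
        for (ctx, logical), physical in topology.items()
        if ctx == context
    }
    if not schemas:
        raise LookupError(f"no schemas defined for context {context!r}")
    return template.format(**schemas)


# Defined once, independent of any physical environment.
template = "SELECT COUNT(*) FROM {SALES}.orders"

dev_sql = render(template, "DEVELOPMENT")
prod_sql = render(template, "PRODUCTION")
print(dev_sql)
print(prod_sql)
```

Moving from development to production changes only the context passed to render; the statement definition itself stays the same.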

Oracle Data Integrator also provides a Java API for performing all these run-time and design-time operations. This Oracle Data Integrator Software Development Kit (SDK) is available for standalone Java applications and application servers.

1.2.3 Run-Time Agent

At design time, developers generate scenarios from the business rules that they have designed. The code of these scenarios is then retrieved from the repository by the Run-Time Agent. This agent then connects to the data servers and orchestrates the code execution on these servers. It retrieves the return codes and messages for the execution, as well as additional logging information such as the number of processed records and the execution time, and stores them in the Repository.

The Agent comes in two different flavors:

  • The Java EE Agent can be deployed as a web application and benefit from the features of an application server.

  • The Standalone Agent runs in a simple Java Virtual Machine (JVM) and can be deployed where needed to perform the integration flows.

Both agents are multi-threaded Java programs that support load balancing and can be distributed across the information system. Each agent holds its own execution schedule, which can be defined in Oracle Data Integrator, and can also be called from an external scheduler. An agent can also be invoked from a Java API or a web service interface.
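The agent's role, as described in this section, can be sketched as a short loop: fetch the scenario code from the repository, run it on a data server, and record the outcome. This is a conceptual illustration only. It does not use the ODI Java API, the table layout (scenario_steps, session_log) and the function run_scenario are hypothetical, two in-memory SQLite databases stand in for the repository and the data server, and threading, scheduling, and load balancing are left out.

```python
import sqlite3
import time

# Assumption: one SQLite database stands in for the repository.
repository = sqlite3.connect(":memory:")
repository.executescript("""
    CREATE TABLE scenario_steps (scenario TEXT, step_no INTEGER, sql_text TEXT);
    CREATE TABLE session_log (scenario TEXT, step_no INTEGER, return_code INTEGER,
                              message TEXT, row_count INTEGER, seconds REAL);
""")
repository.executemany(
    "INSERT INTO scenario_steps VALUES (?, ?, ?)",
    [
        ("LOAD_SALES", 1, "CREATE TABLE sales (region TEXT, amount INTEGER)"),
        ("LOAD_SALES", 2, "INSERT INTO sales VALUES ('EMEA', 10), ('APAC', 20)"),
        ("LOAD_SALES", 3, "UPDATE sales SET amount = amount * 2"),
    ],
)


def run_scenario(name, repo, data_server):
    """Fetch a scenario's steps, execute them on the data server, log each result."""
    steps = repo.execute(
        "SELECT step_no, sql_text FROM scenario_steps WHERE scenario = ? ORDER BY step_no",
        (name,),
    ).fetchall()
    for step_no, sql_text in steps:
        started = time.perf_counter()
        try:
            cursor = data_server.execute(sql_text)
            data_server.commit()
            # rowcount is -1 for statements that do not modify rows (for example DDL).
            code, message, rows = 0, "OK", max(cursor.rowcount, 0)
        except sqlite3.Error as exc:
            data_server.rollback()
            code, message, rows = 1, str(exc), 0
        repo.execute(
            "INSERT INTO session_log VALUES (?, ?, ?, ?, ?, ?)",
            (name, step_no, code, message, rows, time.perf_counter() - started),
        )
        repo.commit()
        if code != 0:
            return False  # stop at the first failing step
    return True


# Assumption: a second SQLite database stands in for a data server.
data_server = sqlite3.connect(":memory:")
succeeded = run_scenario("LOAD_SALES", repository, data_server)
log = repository.execute(
    "SELECT step_no, return_code, row_count FROM session_log ORDER BY step_no"
).fetchall()
print(succeeded, log)
```

The transformation itself runs inside the data server; the agent in this sketch only orchestrates the steps and writes return codes, messages, record counts, and timings back to the repository.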

1.2.4 Oracle Data Integrator Console

Business users (as well as developers, administrators and operators) can have read access to the repository, and can perform topology configuration and production operations, through a web-based UI called Oracle Data Integrator Console. This web application can be deployed in a Java EE application server such as Oracle WebLogic.

To manage and monitor the Java EE and Standalone Agents as well as the ODI Console, Oracle Data Integrator provides a new plug-in that integrates into Oracle Fusion Middleware Control Console.

1.3 Get Started with Oracle Data Integrator

Table 1-1 summarizes the contents of this guide.

Table 1-1 Content Summary

Each entry gives the chapter, followed by what it describes:

  • Chapter 2, "Installing Oracle Data Integrator and the Demonstration Environment": Install Oracle Data Integrator and the demonstration environment

  • Chapter 3, "Working with the ETL Project": Provides an introduction to the demonstration environment delivered with Oracle Data Integrator Studio

  • Chapter 4, "Starting Oracle Data Integrator": Start the demonstration environment and Oracle Data Integrator Studio

  • Chapter 5, "Implementing Data Quality Control": Implement data quality control

  • Chapter 6, "Working with Integration Interfaces": Create and work with integration interfaces in Oracle Data Integrator

  • Chapter 7, "Working with Packages": Create and work with Packages in Oracle Data Integrator

  • Chapter 8, "Executing Your Developments and Reviewing the Results": Execute your developments, follow the execution, and interpret the execution results

  • Chapter 9, "Deploying Integrated Applications": Run an ODI Package automatically in a production environment

  • Chapter 10, "Going Further with Oracle Data Integrator": Perform advanced tasks with Oracle Data Integrator