What's New In Oracle Data Integrator?

This section summarizes the new features and significant product changes for Oracle Data Integrator (ODI) in the Oracle Fusion Middleware 12c release.

This chapter includes the following sections:

  • New and Changed Features for Release 12c (12.2.1)

  • New and Changed Features for Release 12c (12.1.3.0.1)

  • New and Changed Features for Release 12c (12.1.3)

  • New and Changed Features for Release 12c (12.1.2)

New and Changed Features for Release 12c (12.2.1)

Oracle Data Integrator 12c (12.2.1) introduces the following enhancements:

Big Data Enhancement: Oozie Execution Modes

You can now choose between Task and Session execution modes for Oozie workflow generation. The new Session mode adds support for transactions, scripting, and loops in packages. ODI automatically selects the appropriate mode based on the executed object, or you can select the mode manually.

Note that the Big Data enhancements were first made available in ODI 12.1.3.0.1 and have been improved in this release. For more information about the Big Data enhancements, see "New and Changed Features for Release 12c (12.1.3.0.1)".

Lifecycle Management of ODI Objects

  • Oracle Data Integrator is now integrated with Subversion, which enables you to version control ODI objects in Subversion. For more information on Subversion, visit the Subversion website.

  • Using the Subversion integration capabilities, you can create tags to take snapshots of ODI object versions. You can create branches for parallel development from distributed locations or across multiple releases.

  • Release management capabilities are introduced to provide a distinction between the development and deployment environments. You can create Deployment Archives (DA) from a development environment, which can be deployed in a QA environment for testing, and then delivered to the production environment. The DA can be created using ODI Studio or from a command line.

ODI Exchange for Sharing Global ODI Objects

You can now browse, download, and install global ODI objects made available to you by Oracle or other ODI users through Official or Third-Party Update Centers. This feature is available for Global Knowledge Modules, Global User Functions and Mapping Components. The Check for Updates menu item in the Help menu in ODI Studio enables you to connect to the Update Centers and obtain Global ODI Objects.

Complex File Enhancements

The Native Format Builder utility is now included with ODI Studio and allows you to create nXSD files without leaving the ODI user interface.

Complex File, File, LDAP, JMS Queue XML, JMS Topic XML, and XML Technology Enhancements

All JDBC properties for Complex File, File, LDAP, JMS Queue XML, JMS Topic XML, and XML technologies are now displayed at the Data Server level along with default values where applicable and a description of the properties, thereby enhancing usability.

Pre/Post Processing for XML and Complex JDBC Drivers

You can now customize the way data is fed into the XML and Complex File drivers. This feature adds support for intermediate processing stages that can process data either after it has been retrieved from an external endpoint by Oracle Data Integrator or before it is written out to an external endpoint. It also supports complex configuration of these intermediate processing stages as part of the configuration of data servers that use the ODI XML or Complex File JDBC drivers.

Improved Web Service Support

A new SOAP Web Service technology is now available in Topology and allows the creation of data servers, physical schemas, and logical schemas for Web Services. Oracle Web Service Management (OWSM) policies can also be attached to Web Services data servers. In addition, the OdiInvokeWebService tool is enhanced to support Web Services data servers through Contexts and Logical Schemas.

Ability to Cancel Import/Export and Reverse Engineering Operations

You can now cancel import/export and reverse-engineering operations that may run for a long time.

Support for Analytic or Window Functions

Analytic or Window functions are now supported out of the box at the Mapping level. Analytic functions such as PERCENT_RANK, LAST, FIRST, or LAG can be used at the Mapping Expression level in any component.
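
For illustration, the following minimal sketch shows the same analytic functions (LAG and PERCENT_RANK) expressed in standalone PySpark rather than in ODI mapping expression syntax; the SparkSession setup, DataFrame, and column names are assumptions made for the example.

    # Illustrative only: the LAG and PERCENT_RANK logic that a mapping
    # expression can now carry, shown here in standalone PySpark.
    # Column names (region, sale_date, amount) are invented for the example.
    from pyspark.sql import SparkSession, Window
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("window_demo").getOrCreate()

    sales = spark.createDataFrame(
        [("EMEA", "2015-01-01", 100.0),
         ("EMEA", "2015-01-02", 150.0),
         ("APAC", "2015-01-01", 80.0)],
        ["region", "sale_date", "amount"],
    )

    w = Window.partitionBy("region").orderBy("sale_date")

    result = sales.select(
        "region",
        "sale_date",
        "amount",
        F.lag("amount", 1).over(w).alias("prev_amount"),  # previous row's amount
        F.percent_rank().over(w).alias("pct_rank"),       # relative rank within the region
    )
    result.show()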

Oracle Connectivity Enhancements

A new Knowledge Module to perform Partition Exchange Loading is now available allowing you to swap partitions when needed. In addition, improvements have been made to Loading Knowledge Modules using External Tables, which can now load more than one file at a time. Knowledge Modules using Data Pump have also been improved.

Enhanced Integration with Oracle Enterprise Data Quality

A new Oracle Enterprise Data Quality (EDQ) technology is available in Topology and allows the creation of data servers, physical schemas, and logical schemas for EDQ. Also, the OdiEnterpriseDataQuality tool is enhanced to support EDQ data servers through Contexts and Logical Schemas.

Ability to View a List of Users Connected to Studio/Repository

The Review User Activity menu item has been added to the Security menu. Using this menu item, you can view, purge, and save user activity records in the User Connections dialog. This feature is available in both ODI Studio and ODI Console.

ODI Console Enhancements

The overall look and feel of ODI Console has been improved. Security tasks such as creating users or profiles can now be performed in ODI Console. Also, Release Management activities can now be performed in ODI Console and the functionality related to Topology activities has been enhanced.

New and Changed Features for Release 12c (12.1.3.0.1)

Oracle Data Integrator 12c (12.1.3.0.1) introduces the following enhancements:

Execution of ODI Mappings using Spark and Pig

ODI allows you to define mappings through a logical design that is independent of the implementation language. For Hadoop-based transformations, you can select Hive, Spark, or Pig as the generated transformation language. This allows you to pick the best implementation based on the environment and use case; you can also choose different implementations simultaneously using multiple physical designs. This selection makes development for Big Data flexible and future-proof.

  • Generate Pig Latin transformations: You can choose Pig Latin as the transformation language and execution engine for ODI mappings. Apache Pig is a platform for analyzing large data sets in Hadoop and uses the high-level language Pig Latin to express data analysis programs. Pig transformations can be executed in either Local or MapReduce mode. Custom Pig code can be added through user-defined functions or the table function component.

  • Generate Spark transformations: ODI mappings can also generate PySpark code, which exposes the Spark programming model in the Python language. Apache Spark is a transformation engine for large-scale data processing that provides fast in-memory processing of large data sets. Custom PySpark code can be added through user-defined functions or the table function component. A minimal PySpark sketch follows this list.
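
The following minimal sketch illustrates the kind of filter-join-aggregate flow that a Spark physical design expresses, written as standalone PySpark; it is not the code ODI generates, and the table and column names are invented for the example.

    # Conceptual sketch of a filter -> join -> aggregate flow in PySpark,
    # mirroring the component flow of a logical mapping design.
    # Table and column names are invented for the example.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("odi_style_flow").getOrCreate()

    orders = spark.createDataFrame(
        [(1, "C1", 200.0), (2, "C2", 50.0), (3, "C1", 75.0)],
        ["order_id", "cust_id", "amount"],
    )
    customers = spark.createDataFrame(
        [("C1", "EMEA"), ("C2", "APAC")],
        ["cust_id", "region"],
    )

    result = (
        orders.filter(F.col("amount") > 60)          # FILTER component
              .join(customers, "cust_id")            # JOIN component
              .groupBy("region")                     # AGGREGATE component
              .agg(F.sum("amount").alias("total_amount"))
    )
    result.show()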

Orchestration of ODI Jobs using Oozie

You can now choose between the traditional ODI Agent or Apache Oozie as the orchestration engine for ODI jobs such as mappings, packages, scenarios, and procedures. Apache Oozie allows a fully native execution on a Hadoop infrastructure without installing an ODI environment for orchestration. You can utilize Oozie tools to schedule, manage, and monitor ODI jobs. ODI uses Oozie's native actions to execute Hadoop processes and conditional branching logic.

Enhanced Hive Driver and Knowledge Modules

ODI includes the WebLogic Hive JDBC driver, which provides a number of advantages compared to the Apache Hive driver, such as total JDBC compliance and improved performance. All Hive Knowledge Modules have been rewritten to benefit from this new driver. Also, the Knowledge Modules whose main purpose is to load from a source are now provided as Load Knowledge Modules, enabling them to be combined in a single mapping with other Load Knowledge Modules. A new class of "direct load" Load Knowledge Modules also allows the loading of targets without intermediate staging. The table function component has been extended to support Hive constructs.

Retrieval of Hadoop Audit Logs

ODI integrates results from Hadoop Audit Logs in Operator tasks for executions of Oozie, Pig, and other tasks. The log results show MapReduce statistics and provide a link to Hadoop statistics in native web consoles.

HDFS access in ODI File Tools

The file-based tools used in ODI packages and procedures have been enhanced to include Hadoop Distributed File System (HDFS) file processing. This includes copying, moving, appending, and deleting files, detecting file changes, managing folders, and transferring files using FTP directly into HDFS.
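
As an illustration of the underlying operations (using the standard hdfs dfs command line rather than the ODI tools themselves), the sketch below creates a folder, then copies, moves, appends to, and deletes HDFS files from Python; the paths are placeholders and a working Hadoop client configuration is assumed.

    # Illustrative only: standard "hdfs dfs" commands corresponding to the
    # kinds of HDFS operations the ODI file tools now cover. Paths are
    # placeholders and a configured Hadoop client is assumed.
    import subprocess

    def hdfs(*args):
        """Run an hdfs dfs subcommand and fail loudly if it errors."""
        subprocess.run(["hdfs", "dfs", *args], check=True)

    hdfs("-mkdir", "-p", "/user/demo/in")                 # manage folders
    hdfs("-put", "local_data.csv", "/user/demo/in/")      # copy a local file into HDFS
    hdfs("-mv", "/user/demo/in/local_data.csv",
         "/user/demo/in/data.csv")                        # move/rename within HDFS
    hdfs("-appendToFile", "more_rows.csv",
         "/user/demo/in/data.csv")                        # append to an HDFS file
    hdfs("-rm", "/user/demo/in/data.csv")                 # delete an HDFS file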

Flatten and Jagged Components

The new Flatten component for mappings allows complex sub-structures to be processed as part of a flat list of attributes. The new Jagged component converts key-value lists into named attributes for further processing.
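
As a loose PySpark analogy for these components (not ODI-generated code), the sketch below flattens a nested array of structs with explode and promotes keys of a key-value map to named columns; the schema and data are invented for the example.

    # Loose analogy for the Flatten and Jagged components, in PySpark:
    # explode a nested array of structs into flat rows, and turn a
    # key-value map into named columns. Schema and data are invented.
    from pyspark.sql import Row, SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("flatten_jagged_demo").getOrCreate()

    df = spark.createDataFrame([
        Row(order_id="o1",
            lines=[Row(sku="A", qty=2), Row(sku="B", qty=1)],  # nested sub-structure
            attrs={"color": "red", "size": "M"}),               # key-value list
    ])

    # "Flatten": one row per element of the nested array, with its fields as columns
    flat = df.select("order_id", F.explode("lines").alias("line")) \
             .select("order_id",
                     F.col("line.sku").alias("sku"),
                     F.col("line.qty").alias("qty"))

    # "Jagged": promote selected keys of the key-value map to named columns
    jagged = df.select("order_id",
                       F.col("attrs").getItem("color").alias("color"),
                       F.col("attrs").getItem("size").alias("size"))

    flat.show()
    jagged.show()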

New Big Data Guide added to the ODI Documentation Set

A new guide, Integrating Big Data with Oracle Data Integrator, has been added to the ODI documentation set. This guide provides information on how to integrate Big Data, deploy and execute Oozie workflows, and generate code in languages such as Pig Latin and Spark.

New and Changed Features for Release 12c (12.1.3)

Oracle Data Integrator 12c (12.1.3) introduces the following enhancements:

ODI FIPS Compliance

ODI now uses Advanced Encryption Standard (AES) as the standard encryption algorithm for encrypting Knowledge Modules, procedures, scenarios, actions, and passwords. You can configure the encryption algorithm and key length to meet requirements. Passwords and other sensitive information included in repository exports are now encrypted and secured by a password.

For more information, see "Advanced Encryption Standard".

ODI XML Driver Enhancements

The following XML Schema support enhancements have been added:

  • Recursion: ODI now supports recursion inside XML Schemas.

  • any, anyType, and anyAttribute: Data defined by these types is stored in string type columns with XML markup from the original document.

  • Metadata annotations can be added inside an XML Schema to specify the table name, column name, type, length, and precision that the ODI XML Driver should use.

For more information, see "Oracle Data Integrator Driver for XML Reference" in Connectivity and Knowledge Modules Guide for Oracle Data Integrator.

JSON Support

The ODI Complex File Driver can now read and write files in JSON format. The JSON structure is defined through an nXSD schema.

For more information, see "JSON Support" in Connectivity and Knowledge Modules Guide for Oracle Data Integrator.

Hadoop SQOOP Integration

ODI can now use Hadoop SQOOP to load data between the following sources and targets:

  • From relational databases to HDFS, Hive, and HBase through Knowledge Module IKM SQL to Hive-HBase-File (SQOOP)

  • From HDFS and Hive to relational databases through Knowledge Module IKM File-Hive to SQL (SQOOP)

SQOOP enables load and unload mechanisms using parallel JDBC connections in Hadoop Map-Reduce processes.
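
For context, the sketch below shows the kind of Sqoop import that such a load relies on, launched here from Python; the JDBC URL, credentials, table, split column, and mapper count are placeholder assumptions rather than ODI-generated values.

    # Illustrative only: a Sqoop import of the kind the SQOOP Knowledge
    # Modules drive. The JDBC URL, credentials, table name, target directory,
    # split column, and mapper count are placeholders.
    import subprocess

    sqoop_import = [
        "sqoop", "import",
        "--connect", "jdbc:oracle:thin:@//dbhost:1521/ORCL",  # source database
        "--username", "demo_user",
        "--password-file", "/user/demo/.sqoop_pwd",           # keep the password off the command line
        "--table", "SALES",                                   # source table
        "--target-dir", "/user/demo/sales",                   # HDFS target directory
        "--split-by", "SALE_ID",                              # column used to split the load
        "--num-mappers", "4",                                 # 4 parallel JDBC connections
    ]
    subprocess.run(sqoop_import, check=True)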

Hadoop HBase Integration

ODI now supports Hadoop HBase through a new technology and the following knowledge modules:

  • LKM HBase to Hive (HBase-SerDe)

  • IKM Hive to HBase Incremental Update (HBase-SerDe)

  • RKM HBase

Hive Append Optimization

Knowledge Modules writing to Hive now support the Hive 0.8+ capability to append data to existing data files, rather than copying the existing data into a new appended file.
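
To make the distinction concrete, the following sketch contrasts the append-style INSERT INTO available from Hive 0.8 onward with the older overwrite-and-copy pattern, issued here through a Hive-enabled SparkSession; the table names are invented for the example.

    # Illustrative HiveQL only: INSERT INTO (Hive 0.8+) appends to existing
    # data files, while the older approach rewrites the whole table.
    # Table names are invented for the example.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("hive_append_demo")
             .enableHiveSupport()
             .getOrCreate())

    # Hive 0.8+ append: new rows are added to the existing data files
    spark.sql("INSERT INTO TABLE sales_target SELECT * FROM sales_staging")

    # Older pattern: copy existing rows plus new rows into a rewritten table
    spark.sql("""
        INSERT OVERWRITE TABLE sales_target
        SELECT * FROM (
            SELECT * FROM sales_existing_copy
            UNION ALL
            SELECT * FROM sales_staging
        ) merged
    """)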

Multi-threaded Target Table Load in ODI Engine

ODI can now load a target table using multiple parallel connections. This capability is controlled through the Degree of Parallelism for Target property in the data server.

For more information, see "Creating a Data Server".

Improved Control for Scenario and Load Plan Concurrent Execution

You can now limit concurrent executions in a scenario or load plan and force a concurrent execution to either wait or raise an execution error.

For more information, see "Controlling Concurrent Execution of Scenarios and Load Plans" in Developing Integration Projects with Oracle Data Integrator.

Create New Model and Topology Objects

The Create New Model and Topology Objects dialog in the Designer Navigator provides the ability to create a new model and associate it with new or existing topology objects, if connected to a work repository. This dialog enables you to create topology objects without having to use Topology editors unless more advanced options are required.

For more information, see "Creating a Model and Topology Objects" in Developing Integration Projects with Oracle Data Integrator.

Documentation Changes

The information that was previously available in the Oracle Data Integrator Developer's Guide is now reorganized. The following new guides have been added to the ODI documentation library:

  • Understanding Oracle Data Integrator

  • Administering Oracle Data Integrator

  • Oracle Data Integrator Tools Reference

For more information, see "What's New In Oracle Data Integrator?" in Developing Integration Projects with Oracle Data Integrator.

New and Changed Features for Release 12c (12.1.2)

Oracle Data Integrator 12c (12.1.2) introduces the following enhancements:

Declarative Flow-Based User Interface

The new declarative flow-based user interface combines the simplicity and ease-of-use of the declarative approach with the flexibility and extensibility of configurable flows. Mappings (the successor of the Interface concept in Oracle Data Integrator 11g) connect sources to targets through a flow of components such as Join, Filter, Aggregate, Set, Split, and so on.

Reusable Mappings

Reusable Mappings can be used to encapsulate flow sections that can then be reused in multiple mappings. A reusable mapping can have input and output signatures to connect to an enclosing flow; it can also contain sources and targets that are encapsulated inside the reusable mapping.

Multiple Target Support

A mapping can now load multiple targets as part of a single flow. The order of target loading can be specified, and the Split component can be optionally used to route rows into different targets, based on one or several conditions.

Step-by-Step Debugger

Mappings, Packages, Procedures, and Scenarios can now be debugged in a step-by-step debugger. You can manually traverse task execution within these objects and set breakpoints to interrupt execution at pre-defined locations. Values of variables can be introspected and changed during a debugging session, and data of underlying sources and targets can be queried, including the content of uncommitted transactions.

Runtime Performance Enhancements

The runtime execution has been improved to enhance performance. Various changes have been made to reduce overhead of session execution, including the introduction of blueprints, which are cached execution plans for sessions.

Performance is improved by loading sources in parallel into the staging area. Parallelism of loads can be customized in the physical view of a mapping.

You also have the option to use unique names for temporary database objects, allowing parallel execution of the same mapping.

Oracle GoldenGate Integration Improvements

The integration of Oracle GoldenGate as a source for the Change Data Capture (CDC) framework has been improved in the following areas:

  • Oracle GoldenGate source and target systems are now configured as data servers in Topology. Extract and replicate processes are represented by physical and logical schemas. This representation in Topology allows separate configuration of multiple contexts, following the general context philosophy.

  • Most Oracle GoldenGate parameters can now be added to extract and replicate processes in the physical schema configuration. The UI provides support for selecting parameters from lists. This minimizes the need for the modification of Oracle GoldenGate parameter files after generation.

  • A single mapping can now be used for journalized CDC load and bulk load of a target. This is enabled by applying the Oracle GoldenGate JKM to the source model rather than to the Oracle GoldenGate replication target, and by configuring journalizing in the mapping as part of a deployment specification. Multiple deployment specifications can be used in a single mapping for journalized load and bulk load.

  • Oracle GoldenGate parameter files can now be automatically deployed to source and target Oracle GoldenGate instances, and the corresponding processes started, through the JAgent technology.

Standalone Agent Management with WebLogic Management Framework

Oracle Data Integrator Standalone agents are now managed through the WebLogic Management Framework. This has the following advantages:

  • UI-driven configuration through Configuration Wizard

  • Multiple configurations can be maintained in separate domains

  • Node Manager can be used to control and automatically restart agents

Integration with OPSS Enterprise Roles

Oracle Data Integrator can now use the authorization model in Oracle Platform Security Services (OPSS) to control access to resources. Enterprise roles can be mapped into Oracle Data Integrator roles to authorize enterprise users across different tools.

XML Improvements

The following XML Schema constructs are now supported:

  • list and union - List or union-based elements are mapped into VARCHAR columns.

  • substitutionGroup - Elements based on substitution groups create a separate table for each type in the substitution group.

  • Mixed content - Elements with mixed content map into a VARCHAR column that contains text and markup content of the element.

  • Annotation - The content of XML Schema annotations is stored in the table metadata.

Oracle Warehouse Builder Integration

Oracle Warehouse Builder (OWB) jobs can now be executed in Oracle Data Integrator through the OdiStartOwbJob tool. The OWB repository is configured as a data server in Topology. All the details of the OWB job execution are displayed as a session in the Operator tree.

Unique Repository IDs

Master and work repositories now use unique IDs following the GUID convention. This avoids collisions during import of artifacts and allows for easier management and consolidation of multiple repositories in an organization.