This section summarizes the new features and significant product changes for Oracle Data Integrator (ODI) in the Oracle Fusion Middleware 12c release.
Oracle Data Integrator 12c (12.2.1) introduces the following enhancements:
You can now choose between Task and Session execution modes for Oozie workflow generation. The new Session mode adds support for transactions, scripting, and loops in packages. ODI automatically chooses the appropriate mode based on the executed object, or you can select the mode manually.
Note that Big Data enhancements were made available in ODI 12.1.3.0.1 and improved upon in this release. For more information regarding Big Data enhancements, see "New and Changed Features for Release 12c (12.1.3.0.1)".
Oracle Data Integrator is now integrated with Subversion, enabling you to version control ODI objects in a Subversion repository. For more information on Subversion, visit the Subversion website.
Using the Subversion integration capabilities, you can create tags to take a snapshot of ODI object versions. You can create branches for parallel development from distributed locations or for parallel development for multiple releases.
Release management capabilities are introduced to provide a distinction between the development and deployment environments. You can create Deployment Archives (DA) from a development environment, which can be deployed in a QA environment for testing, and then delivered to the production environment. The DA can be created using ODI Studio or from a command line.
You can now browse, download, and install global ODI objects made available to you by Oracle or other ODI users through Official or Third-Party Update Centers. This feature is available for Global Knowledge Modules, Global User Functions and Mapping Components. The Check for Updates menu item in the Help menu in ODI Studio enables you to connect to the Update Centers and obtain Global ODI Objects.
The Native Format Builder utility is now included with ODI Studio and allows you to create nXSD files without leaving the ODI user interface.
All JDBC properties for Complex File, File, LDAP, JMS Queue XML, JMS Topic XML, and XML technologies are now displayed at the Data Server level along with default values where applicable and a description of the properties, thereby enhancing usability.
You can now customize the way data is fed to the XML and Complex File drivers. This feature adds support for intermediate processing stages that can transform data either after it has been retrieved from an external endpoint by Oracle Data Integrator, or before it is written out to an external endpoint. Complex configurations of these intermediate processing stages can be defined as part of the configuration of data servers that use the ODI XML or Complex File JDBC drivers.
A new SOAP Web Service technology is now available in Topology and allows the creation of data servers, physical schemas, and logical schemas for Web Services. Oracle Web Service Management (OWSM) policies can also be attached to Web Services data servers. In addition, the OdiInvokeWebService tool is enhanced to support Web Services data servers through Contexts and Logical Schemas.
You can now cancel import/export and reverse-engineering operations that may run for a long time.
Analytic or Window functions are now supported out of the box at the Mapping level. Analytic functions such as PERCENT_RANK, LAST, FIRST, or LAG can be used at the Mapping Expression level in any component.
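As a hedged illustration (this is not ODI code), the snippet below runs the same kind of LAG analytic expression that can now appear in a mapping expression, here against an in-memory SQLite database; the table and column names are invented for the example.

```python
import sqlite3

# Illustrative sketch: LAG compares each row's amount with the previous
# month's amount within a region, the sort of window expression that can
# now be placed directly in a mapping expression.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, month INTEGER, amount INTEGER);
INSERT INTO sales VALUES
  ('EMEA', 1, 100), ('EMEA', 2, 130), ('EMEA', 3, 120);
""")

rows = conn.execute("""
SELECT month,
       amount,
       amount - LAG(amount) OVER (PARTITION BY region ORDER BY month) AS delta
FROM sales
ORDER BY month
""").fetchall()

for month, amount, delta in rows:
    print(month, amount, delta)
```

The first row has no predecessor, so its delta is NULL; the remaining rows report the month-over-month change.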
A new Knowledge Module to perform Partition Exchange Loading is now available allowing you to swap partitions when needed. In addition, improvements have been made to Loading Knowledge Modules using External Tables, which can now load more than one file at a time. Knowledge Modules using Data Pump have also been improved.
A new Oracle Enterprise Data Quality (EDQ) technology is available in Topology and allows the creation of data servers, physical schemas, and logical schemas for EDQ. Also, the OdiEnterpriseDataQuality tool is enhanced to support EDQ data servers through Contexts and Logical Schemas.
The Review User Activity menu item has been added to the Security menu. Using this menu item, you can view, purge, and save user activity records in the User Connections dialog. This feature is available in ODI Studio and ODI Console.
The overall look and feel of ODI Console has been improved. Security tasks such as creating users or profiles can now be performed in ODI Console. Also, Release Management activities can now be performed in ODI Console and the functionality related to Topology activities has been enhanced.
Oracle Data Integrator 12c (12.1.3.0.1) introduces the following enhancements:
ODI allows the defining of mappings through a logical design, which is independent of the implementation language. For Hadoop-based transformations, you can select between Hive, Spark, and Pig as the generated transformation language. This allows you to pick the best implementation based on the environment and use case; you can also choose different implementations simultaneously using multiple physical designs. This selection makes development for Big Data flexible and future-proof.
Generate Pig Latin transformations: You can choose Pig Latin as the transformation language and execution engine for ODI mappings. Apache Pig is a platform for analyzing large data sets in Hadoop and uses the high-level language, Pig Latin for expressing data analysis programs. Any Pig transformations can be executed either in Local or MapReduce mode. Custom Pig code can be added through user-defined functions or the table function component.
Generate Spark transformations: ODI mapping can also generate PySpark, which exposes the Spark programming model in the Python language. Apache Spark is a transformation engine for large-scale data processing. It provides fast in-memory processing of large data sets. Custom PySpark code can be added through user-defined functions or the table function component.
You can now choose between the traditional ODI Agent or Apache Oozie as the orchestration engine for ODI jobs such as mappings, packages, scenarios, and procedures. Apache Oozie allows a fully native execution on a Hadoop infrastructure without installing an ODI environment for orchestration. You can utilize Oozie tools to schedule, manage, and monitor ODI jobs. ODI uses Oozie's native actions to execute Hadoop processes and conditional branching logic.
ODI includes the WebLogic Hive JDBC driver, which provides a number of advantages over the Apache Hive driver, such as total JDBC compliance and improved performance. All Hive Knowledge Modules have been rewritten to benefit from this new driver. Also, the Knowledge Modules whose main purpose is to load from a source are now provided as Load Knowledge Modules, enabling them to be combined in a single mapping with other Load Knowledge Modules. A new class of "direct load" Load Knowledge Modules also allows the loading of targets without intermediate staging. The table function component has been extended to support Hive constructs.
ODI integrates results from Hadoop Audit Logs in Operator tasks for executions of Oozie, Pig, and other tasks. The log results show MapReduce statistics and provide a link to Hadoop statistics in native web consoles.
The file based tools used in ODI packages and procedures have been enhanced to include Hadoop Distributed File System (HDFS) file processing. This includes copying, moving, appending, and deleting files, detecting file changes, managing folders, and transferring files using FTP directly into HDFS.
The new Flatten component for mappings allows complex sub-structures to be processed as part of a flat list of attributes. The new Jagged component converts key-value lists into named attributes for further processing.
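To make the two component behaviors concrete, here is a minimal plain-Python sketch (the function names, row shapes, and field names are illustrative, not the ODI API): Flatten promotes each element of a nested list to its own flat row, and Jagged turns a key-value list into named attributes.

```python
def flatten(rows, nested_key):
    """Flatten-style behavior: one output row per element of the nested list,
    with the parent's remaining attributes copied onto each row."""
    out = []
    for row in rows:
        for item in row[nested_key]:
            flat = {k: v for k, v in row.items() if k != nested_key}
            flat.update(item)
            out.append(flat)
    return out

def jagged(rows, key_field="key", value_field="value"):
    """Jagged-style behavior: a key-value list becomes named attributes."""
    return {r[key_field]: r[value_field] for r in rows}

orders = [{"order_id": 1,
           "lines": [{"sku": "A", "qty": 2}, {"sku": "B", "qty": 1}]}]
print(flatten(orders, "lines"))

props = [{"key": "city", "value": "Paris"}, {"key": "zip", "value": "75001"}]
print(jagged(props))
```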
A new guide, Integrating Big Data with Oracle Data Integrator, has been added to the ODI documentation set. This guide provides information on how to integrate Big Data, deploy and execute Oozie workflows, and generate code in languages such as Pig Latin and Spark.
Oracle Data Integrator 12c (12.1.3) introduces the following enhancements:
ODI now uses Advanced Encryption Standard (AES) as the standard encryption algorithm for encrypting Knowledge Modules, procedures, scenarios, actions, and passwords. You can configure the encryption algorithm and key length to meet requirements. Passwords and other sensitive information included in repository exports are now encrypted and secured by a password.
For more information, see "Advanced Encryption Standard".
The following XML Schema support enhancements have been added:
Recursion: ODI now supports recursion inside XML Schemas.
anyAttribute: Data defined by these types is stored in string type columns with XML markup from the original document.
Metadata annotations can be added inside an XML Schema to instruct the ODI XML Driver which table name, column name, type, length, and precision should be used.
For more information, see "Oracle Data Integrator Driver for XML Reference" in Connectivity and Knowledge Modules Guide for Oracle Data Integrator.
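The annotation mechanism can be pictured with a small sketch. The actual annotation vocabulary is defined by the ODI XML driver (see the driver reference); the `tableName=CUST` payload below is a made-up stand-in that simply shows how table and column hints embedded in an `xs:annotation` can be read out of a schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema: the appinfo payload format is invented for this
# example and is NOT the ODI driver's real annotation syntax.
XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="customer">
    <xs:annotation>
      <xs:appinfo>tableName=CUST</xs:appinfo>
    </xs:annotation>
  </xs:element>
</xs:schema>"""

NS = {"xs": "http://www.w3.org/2001/XMLSchema"}
root = ET.fromstring(XSD)

hints = {}
for elem in root.findall("xs:element", NS):
    info = elem.find("xs:annotation/xs:appinfo", NS)
    if info is not None:
        key, _, val = info.text.partition("=")
        hints[elem.get("name")] = {key: val}

print(hints)  # {'customer': {'tableName': 'CUST'}}
```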
The ODI Complex File Driver can now read and write files in JSON format. The JSON structure is defined through an nXSD schema.
For more information, see "JSON Support" in Connectivity and Knowledge Modules Guide for Oracle Data Integrator.
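The relational view that a JSON-capable driver exposes can be sketched as follows; the column names and parent-key convention here are illustrative, not the driver's actual generated schema.

```python
import json

# A nested JSON document: one order with two line items.
doc = json.loads('{"order": {"id": 7, "lines": [{"sku": "A"}, {"sku": "B"}]}}')

# Parent record becomes one row; each nested record becomes a child row
# carrying a foreign key back to its parent (illustrative shape only).
order_row = {"id": doc["order"]["id"]}
line_rows = [{"order_id": order_row["id"], "sku": line["sku"]}
             for line in doc["order"]["lines"]]

print(order_row)
print(line_rows)
```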
ODI can now load the following sources and targets using Hadoop SQOOP:
From relational databases to HDFS, Hive, and HBase through Knowledge Module IKM SQL to Hive-HBase-File (SQOOP)
From HDFS and Hive to relational databases through Knowledge Module IKM File-Hive to SQL (SQOOP)
SQOOP enables load and unload mechanisms using parallel JDBC connections in Hadoop MapReduce processes.
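The parallelism rests on a split-by mechanism: a numeric key range is cut into one chunk per parallel connection. The sketch below shows that partitioning idea in plain Python (illustrative only; Sqoop itself derives the bounds from MIN/MAX queries against the split column).

```python
def split_ranges(lo, hi, mappers):
    """Cut the inclusive key range [lo, hi] into `mappers` half-open
    (start, end) chunks, one per parallel connection."""
    size, rem = divmod(hi - lo + 1, mappers)
    ranges, start = [], lo
    for i in range(mappers):
        end = start + size + (1 if i < rem else 0)
        ranges.append((start, end))
        start = end
    return ranges

# Keys 1..10 shared among 4 parallel connections.
print(split_ranges(1, 10, 4))  # [(1, 4), (4, 7), (7, 9), (9, 11)]
```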
ODI now supports Hadoop HBase through a new technology and the following knowledge modules:
LKM HBase to Hive (HBase-SerDe)
IKM Hive to HBase Incremental Update (HBase-SerDe)
Knowledge Modules writing to Hive now support the append capability introduced in Hive 0.8 and can append data to the existing data files rather than copying existing data into a new appended file.
ODI can now load a target table using multiple parallel connections. This capability is controlled through the Degree of Parallelism for Target property in the data server.
For more information, see "Creating a Data Server".
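The idea behind the Degree of Parallelism for Target property can be sketched as follows: rows are partitioned across N concurrent writer connections. In this hedged example, in-memory lists and a thread pool stand in for JDBC connections.

```python
from concurrent.futures import ThreadPoolExecutor

DOP = 3                               # degree of parallelism (illustrative)
rows = list(range(10))

# Round-robin partitioning: bucket i gets rows i, i+DOP, i+2*DOP, ...
buckets = [rows[i::DOP] for i in range(DOP)]

def load(bucket):
    # A real writer would INSERT each row over its own connection;
    # here we just report how many rows this "connection" loaded.
    return len(bucket)

with ThreadPoolExecutor(max_workers=DOP) as pool:
    loaded = sum(pool.map(load, buckets))

print(loaded)  # 10
```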
You can now limit concurrent executions in a scenario or load plan and force a concurrent execution to either wait or raise an execution error.
For more information, see "Controlling Concurrent Execution of Scenarios and Load Plans" in Developing Integration Projects with Oracle Data Integrator.
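The two policies — wait for a free slot versus raise an execution error — can be sketched with a semaphore standing in for the scenario's concurrency limit (a plain-Python analogy, not ODI's implementation).

```python
import threading

MAX_CONCURRENT = 1                       # illustrative limit
slot = threading.BoundedSemaphore(MAX_CONCURRENT)

def run_scenario(policy="wait", timeout=5):
    """policy='wait' blocks up to `timeout` seconds for a free slot;
    policy='raise' fails immediately when the limit is reached."""
    acquired = slot.acquire(timeout=timeout if policy == "wait" else 0)
    if not acquired:
        raise RuntimeError("concurrent execution limit reached")
    try:
        return "done"          # the scenario's work would happen here
    finally:
        slot.release()

print(run_scenario("wait"))    # done
```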
The Create New Model and Topology Objects dialog in the Designer Navigator provides the ability to create a new model and associate it with new or existing topology objects, if connected to a work repository. This dialog enables you to create topology objects without having to use Topology editors unless more advanced options are required.
For more information, see "Creating a Model and Topology Objects" in Developing Integration Projects with Oracle Data Integrator.
The information that was previously available in the Oracle Data Integrator Developer's Guide is now reorganized. The following new guides have been added to the ODI documentation library:
Understanding Oracle Data Integrator
Administering Oracle Data Integrator
Oracle Data Integrator Tools Reference
For more information, see "What's New In Oracle Data Integrator?" in Developing Integration Projects with Oracle Data Integrator.
Oracle Data Integrator 12c (12.1.2) introduces the following enhancements:
The new declarative flow-based user interface combines the simplicity and ease-of-use of the declarative approach with the flexibility and extensibility of configurable flows. Mappings (the successor of the Interface concept in Oracle Data Integrator 11g) connect sources to targets through a flow of components such as Join, Filter, Aggregate, Set, Split, and so on.
Reusable Mappings can be used to encapsulate flow sections that can then be reused in multiple mappings. A reusable mapping can have input and output signatures to connect to an enclosing flow; it can also contain sources and targets that are encapsulated inside the reusable mapping.
A mapping can now load multiple targets as part of a single flow. The order of target loading can be specified, and the Split component can be optionally used to route rows into different targets, based on one or several conditions.
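The Split routing described above can be sketched in plain Python (names and row shapes are illustrative): each row goes to every target whose condition it satisfies, with an optional default target for rows that match none.

```python
def split(rows, routes, default=None):
    """routes is a list of (target_name, predicate) pairs; rows matching
    no predicate fall through to the default target, if one is given."""
    targets = {name: [] for name, _ in routes}
    if default is not None:
        targets[default] = []
    for row in rows:
        matched = False
        for name, cond in routes:
            if cond(row):
                targets[name].append(row)
                matched = True
        if not matched and default is not None:
            targets[default].append(row)
    return targets

rows = [{"amount": 50}, {"amount": 500}]
out = split(rows,
            [("small_orders", lambda r: r["amount"] < 100)],
            default="large_orders")
print(out)
```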
Mappings, Packages, Procedures, and Scenarios can now be debugged in a step-by-step debugger. You can manually traverse task execution within these objects and set breakpoints to interrupt execution at pre-defined locations. Values of variables can be introspected and changed during a debugging session, and data of underlying sources and targets can be queried, including the content of uncommitted transactions.
The runtime execution has been improved to enhance performance. Various changes have been made to reduce overhead of session execution, including the introduction of blueprints, which are cached execution plans for sessions.
Performance is improved by loading sources in parallel into the staging area. Parallelism of loads can be customized in the physical view of a mapping.
You also have the option to use unique names for temporary database objects, allowing parallel execution of the same mapping.
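A minimal sketch of the unique-name idea: suffixing work-table names with a unique token lets two sessions of the same mapping run in parallel without colliding on temporary objects. The `C$_` prefix follows ODI's loading-table convention, but the suffix scheme here is invented for the example.

```python
import uuid

def temp_table_name(base):
    """Illustrative only: append a random 8-hex-digit token so that two
    concurrent sessions never contend for the same work table."""
    return f"C$_{base}_{uuid.uuid4().hex[:8]}"

a = temp_table_name("CUSTOMER")
b = temp_table_name("CUSTOMER")
print(a != b)  # True
```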
The integration of Oracle GoldenGate as a source for the Change Data Capture (CDC) framework has been improved in the following areas:
Oracle GoldenGate source and target systems are now configured as data servers in Topology. Extract and replicate processes are represented by physical and logical schemas. This representation in Topology allows separate configuration of multiple contexts, following the general context philosophy.
Most Oracle GoldenGate parameters can now be added to extract and replicate processes in the physical schema configuration. The UI provides support for selecting parameters from lists. This minimizes the need for the modification of Oracle GoldenGate parameter files after generation.
A single mapping can now be used for journalized CDC load and bulk load of a target. This is enabled by the Oracle GoldenGate JKM using the source model as opposed to the Oracle GoldenGate replication target, as well as configuration of journalizing in mapping as part of a deployment specification. Multiple deployment specifications can be used in a single mapping for journalized load and bulk load.
Oracle GoldenGate parameter files can now be automatically deployed to, and started on, source and target Oracle GoldenGate instances through the JAgent technology.
Oracle Data Integrator Standalone agents are now managed through the WebLogic Management Framework. This has the following advantages:
UI-driven configuration through Configuration Wizard
Multiple configurations can be maintained in separate domains
Node Manager can be used to control and automatically restart agents
Oracle Data Integrator can now use the authorization model in Oracle Platform Security Services (OPSS) to control access to resources. Enterprise roles can be mapped into Oracle Data Integrator roles to authorize enterprise users across different tools.
The following XML Schema constructs are now supported:
list and union - List or union-based elements are mapped into VARCHAR columns.
substitutionGroup - Elements based on substitution groups create one table for each type in the substitution group.
Mixed content - Elements with mixed content map into a VARCHAR column that contains text and markup content of the element.
Annotation - Content of XML Schema annotations are stored in the table metadata.
Oracle Warehouse Builder (OWB) jobs can now be executed in Oracle Data Integrator through the OdiStartOwbJob tool. The OWB repository is configured as a data server in Topology. All the details of the OWB job execution are displayed as a session in the Operator tree.
Master and work repositories now use unique IDs following the GUID convention. This avoids collisions during import of artifacts and allows for easier management and consolidation of multiple repositories in an organization.