9 Creating an Integration Project

This chapter describes the different components involved in an integration project, and explains how to start a project.

This chapter contains these sections:

Section 9.1, "Introduction to Integration Projects"
Section 9.2, "Creating a New Project"
Section 9.3, "Managing Knowledge Modules"
Section 9.4, "Organizing the Project with Folders"

9.1 Introduction to Integration Projects

An integration project is composed of several components. These components include organizational objects, such as folder, and development objects such as interfaces or variables. Section 9.1.1, "Oracle Data Integrator Project Components" details the different components involved in an integration project.

A project has also a defined life cycle which can be adapted to your practises. Section 9.1.2, "Project Life Cycle" suggests a typical project lifestyle.

9.1.1 Oracle Data Integrator Project Components

Components involved in a project include components contained in the project and global components referenced by the project. In addition, a project also uses components defined in the models and topology.

9.1.1.1 Oracle Data Integrator Project Components

The following components are stored into a project. The appear in the in the Project accordion in the Designer Navigator, under the project's node.

Folder

Folders are components that help organizing the work into a project. Sub-folders can be inserted into folders. Folders contain Packages, Interfaces and Procedure.

Packages

The package is the largest unit of execution in Oracle Data Integrator. A package is a workflow, made up of a sequence of steps organized into an execution diagram. Packages assemble and reference other components from a project such as interfaces, procedure or variable. See Chapter 10, "Working with Packages" for more information on packages.

Interface

An interface is a reusable dataflow. It is set of declarative rules that describe the loading of a datastore or a temporary target structure from one or more source datastores. See Chapter 11, "Working with Integration Interfaces" for more information on interfaces.

Procedure

A Procedure is a reusable component that groups a sequence of operations that do not fit in the interface concept.

Examples of procedures:

Wait and unzip a file
Send a batch of files via FTP
Receive emails
Purge a database

Variable

A variable's value is stored in Oracle Data Integrator. This value may change during the execution.

Sequence

A sequence is an variable automatically incremented when used. Between two uses the value is persistent.

User Functions

User functions enable to define customized functions or "functions aliases", for which you will define technology-dependant implementations. They are usable in the interfaces and procedures.

See Chapter 12, "Working with Procedures, Variables, Sequences, and User Functions" for more information about the components described above.

Knowledge Modules

Oracle Data Integrator uses Knowledge Modules at several points of a project design. A Knowledge Module is a code template related to a given technology that provides a specific function (loading data, reverse-engineering, journalizing).

Marker

Component of a project may be flagged in order to reflect a methodology or organization. Flags are defined using the markers. These markers are organized into groups, and can be applied to most objects in a project. See Chapter 18, "Organizing and Documenting your Work" for more information on markers.

Scenario

When a package, interface, procedure or variable component is finished, it is compiled in a scenario. A scenario is the execution unit for production. Scenarios can be scheduled for automated execution. See Chapter 13, "Working with Scenarios" for more information on scenarios.

9.1.1.2 Global Components

Global components are similar to the project objects. The main different is their scope. They have a global scope and can be used in any project. Global objects include Variables, Knowledge Modules, Sequences, Markers and User Functions.

9.1.2 Project Life Cycle

The project life cycle depends on the methods and organization of your development team. The following steps must be considered as guidelines for creating, working with and maintaining an integration project.

Create a new project and import Knowledge Modules for this project.
Define the project organization and practises using folders, markers and documentation.
Create reusable components: interfaces, procedures, variables, sequences. Perform unitary tests.
Assemble these components into packages. Perform integration tests.
Release the work in scenarios.

9.2 Creating a New Project

To create a project:

In Designer Navigator, click New Project in the toolbar of the Projects accordion.
Enter the Name of the project.
Keep or change the automatically-generated project Code. As this code is used to identify objects within this project, it is recommended to make this code for a compact string. For example, if the project is called Corporate Datawarehouse, a compact code could be CORP_DWH.
From the File menu, click Save.

The new project appears in the Projects tree with one empty folder.

9.3 Managing Knowledge Modules

Knowledge Modules (KMs) are components of Oracle Data Integrator' Open Connector technology. KMs contain the knowledge required by Oracle Data Integrator to perform a specific set of tasks against a specific technology or set of technologies.

Oracle Data Integrator uses six different types of Knowledge Modules:

RKM (Reverse Knowledge Modules) are used to perform a customized reverse-engineering of data models for a specific technology. These KMs are used in data models. See Chapter 5, "Creating and Reverse-Engineering a Model".
LKM (Loading Knowledge Modules) are used to extract data from source systems (files, middleware, database, etc.). These KMs are used in interfaces. See Chapter 11, "Working with Integration Interfaces".
JKM (Journalizing Knowledge Modules) are used to create a journal of data modifications (insert, update and delete) of the source databases to keep track of the changes. These KMs are used in data models and used for Changed Data Capture. See Chapter 7, "Working with Changed Data Capture".
IKM (Integration Knowledge Modules) are used to integrate (load) data to the target tables. These KMs are used in interfaces. See Chapter 11, "Working with Integration Interfaces"
CKM (Check Knowledge Modules) are used to check that constraints on the sources and targets are not violated. These KMs are used in data model's static check and interfaces flow checks. See Chapter 5, "Creating and Reverse-Engineering a Model" and Chapter 11, "Working with Integration Interfaces".
SKM (Service Knowledge Modules) are used to generate the code required for creating data services. These KMs are used in data models. See Chapter 8, "Working with Data Services".

9.3.1 Project and Global Knowlegde Modules

Knowledge Modules can be created and used as Project Knowledge Modules or Global Knowledge Modules. Global Knowledge Modules can be used in all projects, while Project Knowledge Modules can only be used within the project into which they have been imported.

Global KMs are listed in Designer Navigator in the Global Objects accordion, while Project KMs appear under the project into which they have been imported. See Section 20.2.6, "Importing Objects" for more information on how to import a Knowledge Module.

When using global KMs, note the following:

Global KMs should only reference global objects. Project objects are not allowed.
You can only use global markers to tag a global KM.
It is not possible to transform a project KM into a global KM and vice versa.
If a global KM is modified, the changes will be seen by any ODI object using the Knowledge Module.
Be careful when deleting a global KM. A missing KM causes execution errors.
To distinguish global from project KMs, the prefix GLOBAL is used for the name of global KMs if they are listed with project KMs.
The order in which the global and project KMs are displayed changes depending on the context:
- The KM Selector lists in the Interface editor displays first the project KMs, then the global KMs. The GLOBAL or PROJECT_CODE prefix is used.
- The KM Selector lists in the Model editor displays first the global KMs, then the project KMs. The GLOBAL or PROJECT_CODE prefix is used.

9.3.2 Knowledge Modules Naming Convention

Oracle Data Integrator's KMs are named according to a convention that facilitates the choice of the KM. This naming convention is as follows:

Loading Knowledge Modules

They are named with the following convention: LKM <source technology> to <target technology> [(loading method)].

In this convention the source and target technologies are the source and target of the data movement this LKM can manage. When the technology is SQL, then the technology can be any technology supporting JDBC and SQL. When the technology is JMS, the technology can be any technology supporting JMS connectivity.

The loading method is the technical method used for moving the data. This method is specific to the technology involved. When no method is specified, the technical method used is a standard Java connectivity (JDBC, JMS and such) and data is loaded via the run-time agent. Using a KM that uses a loading method specific to the source and/or target technology usually brings better performances.

Examples of LKMs are given below:

LKM Oracle to Oracle (DBLink) loads data from an Oracle data server to another Oracle data server using the Oracle DBLink.
LKM File to Oracle (SQLLDR) loads data from a file into an Oracle data server using SQLLoader.
LKM SQL to SQL loads data from a data server supporting SQL into another one. This is the most generic loading Knowledge Module, which works for most data servers.

Integration Knowledge Modules

They are named with the following convention: IKM [<staging technology>] <target technology> [<integration mode>] [(<integration method>)].

In this convention, the target technology is the technology of the target into which data will be integrated. IKMs may have a staging technology when the target is not located on the same server as the staging area. These KMs are referred to as Multi-technology IKMs. They are used when the target cannot be used as the staging area. For example, with the File technology.

The integration mode is the mode used for integrating record from the data flow into the target. Common modes are:

Append: Insert records from the flow into the target. It is possible to optionally delete all records from the target before the insert. Existing records are not updated.
Control Append: Same as above, but in addition the data flow is checked in the process.
Incremental Update: Same as above. In addition, it is possible to update existing records with data from the flow.
Slowly Changing Dimension: Integrate data into a table using Type 2 slowly changing dimensions (SCD).

The integration method is the technical method used for integrating the data into the target. This method is specific to the technologies involved. When no method is specified, the technical method used is a standard Java connectivity (JDBC, JMS and such) and SQL language. Using a KM that uses a integration method specific to a given technology usually brings better performances.

Examples of IKMs are given below:

IKM Oracle Incremental Update (MERGE) integrates data from an Oracle staging area into an Oracle target using the incremental update mode. This KM uses the Oracle Merge Table feature.
IKM SQL to File Append integrates data from a SQL-enabled staging area into a file. It uses the append mode.
IKM SQL Incremental Update integrates data from a SQL-enabled staging area into a target located in the same data server. This IKM is suitable for all cases when the staging area is located on the same data server as the target, and works with most technologies.
IKM SQL to SQL Append integrates data from a SQL-enabled staging area into a target located in a different SQL-enabled data server. This IKM is suitable for cases when the staging area is located on a different server than the target, and works with most technologies.

Check Knowledge Modules

They are named with the following convention: CKM <staging technology>.

In this convention, the staging technology is the technology of the staging area into which data will be checked.

Examples of CKMs are given below:

CKM SQL checks the quality of an integration flow when the staging area is in a SQL-enabled data server. This is a very generic check Knowledge Module that works with most technologies.
CKM Oracle checks the quality of an integration flow when the staging area is in an Oracle data server.

Reverse-engineering Knowledge Modules

They are named with the following convention: RKM <reversed technology> [(reverse method)].

In this convention, the reversed technology is the technology of the data model that is reverse-engineered. The reverse method is the technical method used for performing the reverse-engineering process.

Examples of RKMs are given below:

RKM Oracle reverse-engineers an Oracle data model
RKM Netezza reverse-engineers a Netezza data model

Journalizing Knowledge Modules

They are named with the following convention: JKM <journalized technology> <journalizing mode> (<journalizing method>).

In this convention, the journalized technology is the technology into which changed data capture is activated. The journalizing mode is either Consistent or Simple. For more information about these modes, see Chapter 7, "Working with Changed Data Capture".

The journalizing method is the technical method for capturing the changes. When not specified, the method used for performing the capture process is triggers.

Examples of JKMs are given below:

JKM Oracle 11g Consistent (Streams) enables CDC for Oracle 11g in consistent set mode using Oracle Streams features.
JKM Oracle Simple enables CDC for Oracle in simple mode using triggers.
JKM DB2 400 Simple (Journal) enables CDC for DB2/400 simple mode using DB2/400 Journals technology.

Service Knowledge Modules

They are named with the following convention: SKM <data server technology>.

In this convention, the data server technology is the technology into which the data to be accessed with web services is stored.

9.3.3 Choosing the Right Knowledge Modules

Oracle Data Integrator provides a large range of Knowledge Modules out of the box. When starting an integration project, you must import the Knowledge Module appropriate for your project.

It is possible to import additional KMs after setting up the project, and it is possible to change the KMs used afterwards. The following guidelines can be used for choosing the right KMs when starting a new project:

Start with Generic KMs. The SQL KMs work with almost all technologies. If you are not comfortable with the source/target technologies you are working with, you can start by using the generic SQL KMs, as they use standard SQL. A simple project can start with the following generic KMs: LKM File to SQL, LKM SQL to SQL, IKM SQL to SQL Append, IKM SQL Control Append, CKM SQL.
Start with simple KMs. If you are not comfortable with the technologies you are integrating, do not start using the KMs using complex integration methods or modes.
Select KMs that match your source/target combinations to increase performance. The more specific the KM to a technology combination, the better the performance. For achieving the best performances, make sure to switch to KMs that match the source/target combination you have, and that leverage the features from these sources/targets.
Select KMs according to your infrastructure limitations. If it is not possible to use the target data servers as the staging area for security reasons, make sure to have multi-technology IKMs available in your project.
Select JKMs and SKMs only if you need them. Do not import JKMs or SKMs if you do not plan to use Changed Data Capture or Data Services. You can import them later when needed.
Review the KM documentation and options. KMs include a Description field that contain useful information. Each of the KM options is also described. All KMs are detailed in the Oracle Fusion Middleware Connectivity and Knowledge Modules Guide for Oracle Data Integrator.

9.3.4 Importing and Replacing Knowledge Modules

Two main operations allow you to manage KMs into a project:

When you create a new project or want to use new KMs in an existing project, you must import the KMs. You can either import a project KM or a global KM. See Section 9.3.1, "Project and Global Knowlegde Modules" for more information on the knowledge module's scope.
If you want to start using a new version of an existing global or project KM, or if you want to replace an existing KM in use with another one, then you can replace this KM.

This section includes the following topics:

Importing a Project Knowlegde Module
Replacing a Knowledge Module
Importing a Global Knowledge Module

Importing a Project Knowlegde Module

To import a Project Knowledge Module into a project:

In the Projects accordion in Designer Navigator, select the project into which you want to import the KM.
Right-click and select Import > Import Knowledge Modules....
Specify the File Import Directory. A list of the KMs export files available in this directory appears.
Select several KMs from the list and then click OK.
Oracle Data Integrator imports the selected KMs and presents an import report.
Click Close to close this report.

The Knowledge Modules are imported into you project. They are arranged under the Knowledge Modules node of the project, grouped per KM type.

Note:

Knowledge modules can be imported in Duplication mode only. To replace an existing Knowledge Modules, use the import replace method described below. When importing a KM in Duplication mode and if the KM already exists in the project, ODI creates a new KM with prefix copy_of.

Replacing a Knowledge Module

When you want to replace a global KM or a KM in a project by another one and have all interfaces automatically use the new KM, you must use the Import Replace mode.

To import a Knowledge Module in replace mode:

In Designer Navigator, select the Knowledge Module you wish to replace.
Right-click and select Import Replace.
In the Replace Object dialog, select the export file of the KM you want to use as a replacement.
Click OK.

The Knowledge Module is now replaced by the new one.

Note:

When replacing a Knowledge module by another one, Oracle Data Integrator sets the options in interfaces for the new module using the option name similarities with the old module's options. When a KM option was set by the user in an interface, this value is preserved if the new KM has an option with the same name. New options are set to the default value. It is advised to check the values of these options in the interfaces.

Replacing a KM by another one may lead to issues if the KMs have different structure or behavior, for example when you replace a IKM with a RKM. It is advised to check the interfaces' design and execution with the new KM.

Importing a Global Knowledge Module

To import a global knowledge module in Oracle Data Integrator:

In the Navigator, select the Global Knowledge Modules node in the Global Objects accordion.
Right-click and select Import Knowledge Modules.
In the Import dialog:
1. Select the Import Type. See Section 20.1.3, "Import Types" for more information.
2. Enter the File Import Directory.
3. Select the file(s) to import from the list.
Click OK.

The global KM is now available in all your projects.

9.3.5 Encrypting and Decrypting a KM

Encrypting a Knowledge Module (KM) allows you to protect valuable code. An encrypted KM cannot be read or modified if it is not decrypted. The commands generated in the log by an Encrypted KM or procedure are also unreadable.

Oracle Data Integrator uses a DES Encryption algorithm based on a personal encryption key. This key can be saved in a file and can be reused to perform encryption or decryption operations.

WARNING:

There is no way to decrypt an encrypted KM or procedure without the encryption key. It is therefore strongly advised to keep this key in a safe location.

To Encrypt a KM or a Procedure:

In the Projects tree in Designer Navigator, expand the project, and select the KM you want to encrypt.
Right-click and select Encrypt.
In the Encryption Options window, you can either:
- Encrypt with a personal key that already exists by giving the location of the personal key file or by typing in the value of the personal key.
- Or click Get a new encryption key to have a new key generated.
Click OK to encrypt the KM. If you have chosen to generate a new key, a window will appear with the new key. You can save the key in a file from here.

Note:

If you type in a personal key with too few characters, an invalid key size error appears. In this case, please type in a longer personal key. A personal key of 10 or more characters is required.

To decrypt a KM or a procedure:

In the Projects tree in Designer Navigator, expand the project, and select the KM you want to decrypt.
Right-click and select Decrypt.
In the KM Decryption window, either
- Select an existing encryption key file;
- or type in (or paste) the string corresponding to your personal key.
Click OK to decrypt.

9.4 Organizing the Project with Folders

In a project, interfaces, procedures and packages are organized into folders and sub-folders. It recommended to maintain a good organization of the project by using folders. Folders simplify finding objects developed in the project and facilitate the maintenance tasks. Organization is detailed in Chapter 18, "Organizing and Documenting your Work".