3 Working with the ETL Project

This chapter provides an introduction to the ETL (Extract Transform Load) project that is delivered in the demonstration environment with Oracle Data Integrator Studio.

This chapter includes the following sections:

3.1 The Example Environment

The ETL project is an example to help you understand how to transform and check the integrity of the data in your information systems.

The examples in this getting started guide track sales from various heterogeneous data sources issued from the production systems. Figure 3-1 shows the example environment.

Figure 3-1 Example Environment

Surrounding text describes Figure 3-1 .

The example environment uses the following elements:

  • The Repository: The Repository contains all of the metadata required for the training examples. It is hosted, for these evaluation purposes, in a supplied database.

  • Orders Application: An application for tracking customer orders, hosted in a supplied database (the "srcdemo" sample database).

  • Parameters (File): Flat files (ASCII) issued from the production system containing a list of sales representatives and the segmentation of ages into age ranges.

  • Sales Administration: The administration or tracking of sales, hosted in another supplied database (the "trgdemo" sample database). This data warehouse is populated with our transformations.

3.2 The Data Models

The demonstration environment includes three ODI data models:

This section provides the schema diagrams for these data models.

3.2.1 Orders Application - HSQL

The Orders Application data model is based on the HSQL technology and includes five datastores:

  • SRC_CITY

  • SRC_CUSTOMER

  • SRC_ORDERS

  • SRC_ORDER_LINES

  • SRC_PRODUCT

  • SRC_REGION

Figure 3-2 shows the schema diagram of this data model.

Note that this data model does not enforce any foreign key constraints, even if some functional relations exist between the data.

Figure 3-2 Orders Application Schema Diagram

Surrounding text describes Figure 3-2 .

3.2.2 Parameters - FILE

The Parameters data model is based on the File technology and includes two datastores:

  • SRC_SALES _PERSON

  • SRC_AGE_GROUP

Figure 3-3 shows the schema diagram of this data model.

Figure 3-3 Parameters Schema Diagram

Surrounding text describes Figure 3-3 .

3.2.3 Sales Administration - HSQL

The Sales Administration data model is based on the HSQL technology and includes six datastores:

  • TRG_CITY

  • TRG_COUNTRY

  • TRG_CUSTOMER

  • TRG_PRODUCT

  • TRG_PROD_FAMILY

  • TRG_REGION

  • TRG_SALES

Figure 3-4 shows the schema diagram of this data model.

Figure 3-4 Sales Administration Schema Diagram

Surrounding text describes Figure 3-4 .

3.3 Integration Challenges

The challenges common to all data integration and transformation projects are:

  • Checking and improving the quality of your application data

  • Accurately and easily exchanging data between your applications while respecting the business rules of your information system

The examples used in this guide illustrate how to address these issues. During this getting started guide, you will learn how to:

  • Implement Data Quality Control to check data in a database

    By implementing two examples, you will learn how Oracle Data Integrator enables you to ensure the quality of the data in your applications while segregating invalid rows. The Orders Application tables contain a number of data inconsistencies that you will detect.

  • Create integration interfaces to move and transform data

    Two simple examples will show you how to improve productivity by loading the data from the Orders Application and Parameters (File) applications into the Sales Administration data warehouse.

  • Automate the execution of these interfaces into packages

    This part of the Getting Started guide will show you how to automate your Oracle Data Integrator processes. The aim of this exercise is to load the entire Sales Administration data warehouse with a single click.

  • Execute the package and review the execution results

    You will learn how to execute the Load Sales Administration Package and the integration interfaces Pop. TRG_CUSTOMER and Pop. TRG_SALES you have created and how to review the results of these executions.

  • Prepare the developed components for deployment

    You will learn how to run the Load Sales Administration Package automatically in a production environment.

Note:

In this guide, we will be looking at processes that focus on ETL. While it is beyond the scope of this document, implementing different integration patterns (real-time, for example) can be carried out in the same fashion. For more information on this, see the Oracle Data Integrator documentation after completing this guide.

Now that you have been introduced to the concepts of the ETL-Project and its components, you can move on to Starting Oracle Data Integrator.