Skip Headers
Oracle® Warehouse Builder User's Guide
11g Release 1 (11.1)

B31278-06
Go to Documentation Home
Home
Go to Book List
Book List
Go to Table of Contents
Contents
Go to Index
Index
Go to Master Index
Master Index
Go to Feedback page
Contact Us

Go to previous page
Previous
Go to next page
Next
PDF · Mobi · ePub

15 Data Transformation

One of the main functions of an Extract, Transformation, and Loading (ETL) tool is to transform data. This chapter provides an overview of data transformation in Warehouse Builder.

This chapter contains the following topics:

About Data Transformation in Warehouse Builder

After you import your source data and define the target, you can consider how to transform the source data into the output desired for the target. In Warehouse Builder, you specify how to transform the data by designing mappings in the Mapping Editor. A mapping is a Warehouse Builder entity that describes the sequence of operations required to extract data from sources, transform the data, and load the data into one or more targets.

The fundamental unit of design for a mapping is the operator. For each distinct operation you want to perform in the mapping, you represent that operation with an operator. To indicate the order of operations, you connect the mappings with data flow connections.

To specify data transformation in a mapping, select from the many prebuilt transformation operators or design a new transformation operator. The prebuilt transformation operators enable commonly performed operations such as filtering, joining, and sorting. Warehouse Builder also includes prebuilt operators for complex operations such as merging data, cleansing data, or profiling data.

If none of the prebuilt transformation operators meet your needs, you can design a new one. You can design the new transformation operator based on the Oracle database library of PL/SQL functions, procedures, package functions, and package procedures.

Extraction and loading operations are represented by any of the numerous source and target operators. A table operator represents a table and a flat file operator represents a flat file. Whether that operator specifies an extraction or loading operation depends on how you connect the operator relative to other operators in the mapping.

An important distinction to understand is the difference between the operator in the mapping and the object it represents. The operator and the object are separate entities until you bind the two together. For example, when you add a table operator to a mapping, you can bind that operator to a specific table in the repository. With the operator bound to the table, you can synchronize changing definitions between the two. If the table operator represents a target and you change the operator in the mapping, then you can propagate those changes back to the table in the repository. If the operator represents a source that incurred a change in its metadata definition, then you can reimport the table in the Design Center and then propagate those changes to the table operator in the Mapping Editor.

About Mappings

Mappings describe a series of operations that extract data from sources, transform it, and load it into targets. They provide a visual representation of the flow of the data and the operations performed on the data. When you design a mapping in Warehouse Builder, you use the Mapping Editor interface.

Alternatively, you can create and define mappings using OMB Plus, the scripting interface for Warehouse Builder as described in the Oracle Warehouse Builder API and Scripting Reference.

Based on the ETL logic that you define in a mapping, Warehouse Builder generates the code required to implement your design. Warehouse Builder can generate code for the following languages:

About Operators

The basic design element for a mapping is the operator. Use operators to represent sources and targets in the data flow. Also use operators to define how to transform the data from source to target. The operators you select as sources have an impact on how you design the mapping. Based on the operators you select, Warehouse Builder assigns the mapping to one of the following Mapping Generation Languages:

Each of these code languages require you to adhere to certain rules when designing a mapping.

Types of Operators

As you design a mapping, you select operators from the Mapping Editor palette and drag them onto the canvas.

This section introduces the types of operators and refers you to other chapters in this manual for detailed information.

Oracle Source/Target Operators

Use source and target operators to represent relational database objects and flat file objects.

Table 15-1 lists each source and target operator alphabetically, gives a brief description.

Table 15-1 Source and Target Operators

Icon Operator Description
This illustration is described in the surrounding text.

Constant operator

Produces a single output group that can contain one or more constant attributes.

This illustration is described in the surrounding text.

Construct Object operator

Produces object types and collection types.

This illustration is described in the surrounding text.

Cube operator

Represents a cube that you previously defined.

This illustration is described in the surrounding text.

Data Generator operator

Provides information such as record number, system date, and sequence values.

This illustration is described in the surrounding text.

Dimension operator

Represents a dimension that you previously defined.

This illustration is described in the surrounding text.

Expand Object operator

Expands an object type to obtain the individual attributes that comprise the object type.

This illustration is described in the surrounding text.

External Table operator

Represents an external table that you previously defined or imported.

This illustration is described in the surrounding text.

Flat File operator

Represents a flat file that you previously defined or imported.

This illustration is described in the surrounding text.

Materialized View operator

Represents a materialized view that you previously defined.

This illustration is described in the surrounding text.

Sequence operator

Generates sequential numbers that increment for each row.

This illustration is described in the surrounding text.

Table operator

Represents a table that you previously defined or imported.

This illustration is described in the surrounding text.

Varray Iterator operator

Iterates through the values in the table type.

This illustration is described in the surrounding text.

View operator

Represents a view that you previously defined or imported.


Data Flow Operators

Use data flow operators to transform data in a mapping.

Table 15-2 lists each data flow operator alphabetically, gives a brief description. For more information on these transformation operators, see "Data Flow Operators".

Table 15-2 Data Flow Operators

Icon Operator Description
This illustration is described in the surrounding text.

Aggregator operator

Performs data aggregations, such as SUM and AVG, and provides an output row set with aggregated data.

This illustration is described in the surrounding text.

Anydata Cast operator

Converts an object of type Sys.AnyData to either a primary type or to a user-defined type.

This illustration is described in the surrounding text.

Deduplicator operator

Removes duplicate data in a source by placing a DISTINCT clause in the select code represented by the mapping.

This illustration is described in the surrounding text.

Expression operator

Enables you to write SQL expressions that define non-procedural algorithms for one output parameter of the operator. The expression text can contain combinations of input parameter names, variable names, and library functions.

This illustration is described in the surrounding text.

Filter operator

Conditionally filters out rows from a row set.

This illustration is described in the surrounding text.

Joiner operator

Joins multiple row sets from different sources with different cardinalities and produces a single output row set.

This illustration is described in the surrounding text.

Key Lookup operator

Performs a lookup of data from a lookup object such as a table, view, cube, or dimension.

This illustration is described in the surrounding text.

Match Merge operator

Data quality operator that identifies matching records and merges them into a single record.

This illustration is described in the surrounding text.

Name and Address operator

Identifies and corrects errors and inconsistencies in name and address source data.

This illustration is described in the surrounding text.

Pivot operator

Transforms a single row of attributes into multiple rows. Use this operator to transform data that contained across attributes instead of rows.

This illustration is described in the surrounding text.

Set Operation operator

Performs union, union all, intersect, and minus operations in a mapping.

This illustration is described in the surrounding text.

Sorter operator

Sorts attributes in ascending or descending order.

This illustration is described in the surrounding text.

Splitter operator

Splits a single input row set into several output row sets using a boolean split condition.

This illustration is described in the surrounding text.

Table Function operator

Enables you to develop custom code to manipulate a set of input rows and return a set of output rows of the same or different cardinality that can be queried like a physical table.

You can use a table function operator as a target.

This illustration is described in the surrounding text.

Transformation operator

Transforms the attribute value data of rows within a row set using a PL/SQL function or procedure.

This illustration is described in the surrounding text.

Unpivot operator

Converts multiple input rows into one output row. It enables you to extract from a source once and produce one row from a set of source rows that are grouped by attributes in the source data.


Pre/Post Processing Operators

Use Pre/Post Processing operators to perform processing before or after executing a mapping. The Mapping parameter operator is used to provide values to and from a mapping.

Table 15-3 lists the Pre/Post Process operators and the Mapping Parameter operators.

Table 15-3 Pre/Post Processing Operators

Icon Operator Description
This illustration is described in the surrounding text.

Mapping Input Parameter operator

Passes parameter values into a mapping.

This illustration is described in the surrounding text.

Mapping Output Parameter operator

Sends values out of a mapping.

This illustration is described in the surrounding text.

Post-Mapping Process operator

Calls a function or procedure after executing a mapping.

This illustration is described in the surrounding text.

Pre-Mapping Process operator

Calls a function or procedure prior to executing a mapping.


Pluggable Mapping Operators

A pluggable mapping is a reusable grouping of mapping operators that behaves as a single operator.

Table 15-4 lists the Pluggable Mappings operators.

Table 15-4 Pluggable Mapping Operators

Icon Operator Description
This illustration is described in the surrounding text.

Pluggable Mapping operator

Represents a reusable mapping.

This illustration is described in the surrounding text.

Pluggable Mapping Input Signature operator

A combination of input attributes that flow into the pluggable mapping.

This illustration is described in the surrounding text.

Pluggable Mapping Output Signature operator

A combination of output attributes that flow out of the pluggable mapping.


About Transformations

Transformations are PL/SQL functions, procedures, packages, and types that enable you to transform data. You use transformations when designing mappings and process flows that define ETL processes.

Transformations are stored in the Warehouse Builder workspace and can be used in the project in which they are defined.

Transformation packages are deployed at the package level but executed at the transformation level.

Types of Transformations

Transformations, in Warehouse Builder, can be categorized as follows:

The following sections provide more details about these types of transformations.

Predefined Transformations

Warehouse Builder provides a set of predefined transformations that enable you to perform common transformation operations. These predefined transformations are part of the Oracle Library that consists of built-in and seeded functions and procedures. You can directly use these predefined transformations to transform your data. For more information on the Oracle Library, see "Types of Transformation Libraries".

Predefined transformations are organized into the following categories:

  • Administration

  • Character

  • Control Center

  • Conversion

  • Date

  • Numeric

  • OLAP

  • Others

  • SYS

  • Spatial

  • Streams

  • XML

For more information about the transformations that belong to each category, see "Oracle Warehouse Builder Transformations".

Custom Transformations

A custom transformation is one this is created by the user. Custom transformations can use predefined transformations as part of their definition.

Custom transformations contains the following categories:

  • Functions: The Functions category contains standalone functions. This category is available under the Custom node of the Public Transformations node in the Global Explorer. It is also created automatically under the Transformations node of every Oracle module in the Project Explorer.

    Functions can be defined by the user or imported from a database. A function transformation takes 0-n input parameters and produces a result value.

  • Procedures: The Procedures category contains any standalone procedures used as transformations. This category is available under the Custom node of the Public Transformations node in the Global Explorer. It is also automatically created under the Transformations node of each Oracle module in the Global Explorer.

    Procedures can be defined by the user or imported from a database. A procedure transformation takes 0-n input parameters and produces 0-n output parameters.

  • Packages: The Packages category contains packages, which in turn contain functions, procedures, and PL/SQL types. This category is available under the Custom node of the Public Transformations node in the Global Explorer. It is also automatically created under the Transformations node of each Oracle module in the Global Explorer.

    PL/SQL packages can be created or imported in Warehouse Builder. The package body may be modified. The package header, which is the signature for the function or procedure, cannot be modified.

  • PL/SQL Types: The PL/SQL Types category contains any standalone PL/SQL types. This includes PL/SQL record types, REF cursor types, and nested table types. The PL/SQL Types category is automatically created in each package that you define using the Packages node in the Transformations node of the Project Explorer. It is also available under every package that you define in the following path of the Global Explorer: Public Transformations -> Custom -> Packages.

For further instructions, see "Defining Custom Transformations".

In addition to the above categories, you can also import PL/SQL packages. Although you can modify the package body of an imported package, you cannot modify the package header, which is the signature for the function or procedure. For instructions, see "Importing PL/SQL".

About Transformation Libraries

A transformation library consists of a set of reusable transformations. Each time you create a repository, Warehouse Builder creates a Transformation Library containing transformation operations for that project. This library contains the standard Oracle Library and an additional library for each Oracle module defined within the project.

Transformation libraries are available under the Public Transformations node of the Global Explorer in the Design Center.

Types of Transformation Libraries

Transformation libraries can be categorized as follows:

  • Oracle Library

    This is a collection of predefined functions from which you can define procedures for your Global Shared Library. The Oracle Library is contained in the Global Explorer. Expand the Pre-Defined node under the Public Transformations node. Each category of predefined transformations is represented by a separate node. Expand the node for a category to view the predefined transformations in that category. For example, expand the Character node to view the predefined character transformations contained in the Oracle Library.

  • Global Shared Library

    This is a collection of reusable transformations created by the user. These transformations are categorized as functions, procedures, and packages defined within your workspace.

    The transformations in the Global Shared Library are available under the Custom node of the Public Transformations node. Any transformation that you create under this node is available across all projects in the workspace. For information on creating transformations in the Global Shared Library, see "Defining Custom Transformations".

    When you deploy a transformation defined in the Global Shared Library, the transformation is deployed to the location that is associated with the default control center.

Accessing Transformation Libraries

Since transformations can be used at different points in the ETL process, Warehouse Builder enables you to access transformation libraries from different points in the Design Center.

You can access the transformation libraries using the following:

  • Expression Builder

    While creating mappings, you may need to create expressions to transform your source data. The Expression Builder interface enables you to create the expressions required to transform data. Since these expressions can include transformations, Warehouse Builder enables you to access transformation libraries from the Expression Builder.

    Transformation libraries are available under the Transformations tab of the Expression Builder. The Private node under TRANSFORMLIBS contains transformations that are available only in the current project. These transformations are created under the Transformations node of the Oracle module. The Public node contains the custom transformations from the Global Shared Library and the predefined transformations from the Oracle Library.

  • Add Transformation Operator Dialog Box

    The Transformation operator in the Mapping Editor enables you to add transformations, both from the Oracle library and the Global Shared Library, to a mapping. You can use this operator to transform data as part of the mapping.

  • Create Function Wizard, Create Procedure Wizard, Edit Function Dialog Box, or Edit Procedure Dialog Box

    The Implementation page on the these wizards or the Implementation tab of these editors enable you to specify the PL/SQL code that is part of the function or procedure body. You can use transformations in the PL/SQL code.