10 Netezza

This chapter describes how to work with Netezza in Oracle Data Integrator.

This chapter includes the following sections:

10.1 Introduction

Oracle Data Integrator (ODI) seamlessly integrates data in a Netezza database. Oracle Data Integrator features are designed to work best with Netezza, including reverse-engineering, data integrity check, and mappings.

10.1.1 Concepts

The Netezza database concepts map the Oracle Data Integrator concepts as follows: A Netezza cluster corresponds to a data server in Oracle Data Integrator. Within this server, a database/owner pair maps to an Oracle Data Integrator physical schema.

Oracle Data Integrator uses Java Database Connectivity (JDBC) to connect to a Netezza database.

10.1.2 Knowledge Modules

Oracle Data Integrator provides the Knowledge Modules (KM) listed in Table 10-1 for handling Netezza data. These KMs use Netezza specific features. It is also possible to use the generic SQL KMs with the Netezza database. See Chapter 4, "Generic SQL" for more information.

Table 10-1 Netezza Knowledge Modules

Knowledge Module Description

CKM Netezza

Checks data integrity against constraints defined on a Netezza table. Rejects invalid records in the error table created dynamically. Can be used for static controls as well as flow controls.

IKM Netezza Control Append

Integrates data in a Netezza target table in replace/append mode. When flow data needs to be checked using a CKM, this IKM creates a temporary staging table before invoking the CKM.

IKM Netezza Incremental Update

Integrates data in a Netezza target table in incremental update mode.

IKM Netezza To File (EXTERNAL TABLE)

Integrates data in a target file from a Netezza staging area. It uses the native EXTERNAL TABLE feature of Netezza.

LKM File to Netezza (EXTERNAL TABLE)

Loads data from a File to a Netezza Server staging area using the EXTERNAL TABLE feature (dataobject).

LKM File to Netezza (NZLOAD)

Loads data from a File to a Netezza Server staging area using NZLOAD.

RKM Netezza

Retrieves JDBC metadata from a Netezza database. This RKM may be used to specify your own strategy to convert Netezza JDBC metadata into Oracle Data Integrator metadata.

Consider using this RKM if you encounter problems with the standard JDBC reverse-engineering process due to some specificities of the Netezza JDBC driver.


10.2 Installation and Configuration

Make sure you have read the information in this section before you start using the Netezza Knowledge Modules:

10.2.1 System Requirements and Certifications

Before performing any installation you should read the system requirements and certification documentation to ensure that your environment meets the minimum installation requirements for the products you are installing.

The list of supported platforms and versions is available on Oracle Technical Network (OTN):

http://www.oracle.com/technology/products/oracle-data-integrator/index.html.

10.2.2 Technology Specific Requirements

Some of the Knowledge Modules for Netezza use the NZLOAD utility.

The following requirements and restrictions apply for these Knowledge Modules:

  • The source file must be accessible by the ODI agent executing the mapping.

  • The run-time agent machine must have Netezza Performance Server client installed. And the NZLOAD install directory needs to be in the PATH variable when the agent is started.

  • All mappings need to be on the staging area.

  • All source fields need to be mapped, and must be in the same order as the target table in Netezza.

  • Date, Time, Timestamp and Numeric formats should be specified in consistent with Netezza Data Type definition.

For KMs using the EXTERNAL TABLE feature: Make sure that the file is accessible by the Netezza Server.

10.2.3 Connectivity Requirements

This section lists the requirements for connecting to a Netezza database.

JDBC Driver

Oracle Data Integrator uses the Netezza JDBC to connect to a NCR Netezza database. This driver must be installed in your Oracle Data Integrator drivers directory.

10.3 Setting up the Topology

Setting up the Topology consists of:

  1. Creating a Netezza Data Server

  2. Creating a Netezza Physical Schema

10.3.1 Creating a Netezza Data Server

A Netezza data server corresponds to a Netezza cluster connected with a specific Netezza user account. This user will have access to several databases in this cluster, corresponding to the physical schemas in Oracle Data Integrator created under the data server.

10.3.1.1 Creation of the Data Server

Create a data server for the Netezza technology using the standard procedure, as described in "Creating a Data Server" of the Developing Integration Projects with Oracle Data Integrator. This section details only the fields required or specific for defining a Netezza data server:

  1. In the Definition tab:

    • Name: Name of the data server that will appear in Oracle Data Integrator

    • Server: Physical name of the data server

    • User/Password: Netezza user with its password

  2. In the JDBC tab:

    • JDBC Driver: org.netezza.Driver

    • JDBC URL: jdbc:Netezza://<host>:<port>/<database>

Note:

Note that Oracle Data Integrator will have write access only on the database specified in the URL.

10.3.2 Creating a Netezza Physical Schema

Create a Netezza physical schema using the standard procedure, as described in "Creating a Physical Schema" in Administering Oracle Data Integrator.

Note:

When performing this configuration, the work and data databases names must match. Note also that the dollar sign ($) is an invalid character for names in Netezza. Remove the dollar sign ($) from work table and journalizing elements prefixes.

Create for this physical schema a logical schema using the standard procedure, as described in "Creating a Logical Schema" in Administering Oracle Data Integrator and associate it in a given context.

10.4 Setting Up an Integration Project

Setting up a project using the Netezza database follows the standard procedure. See "Creating an Integration Project" of the Developing Integration Projects with Oracle Data Integrator.

It is recommended to import the following knowledge modules into your project for getting started with Netezza:

  • CKM NetezzaIKM Netezza Control AppendIKM Netezza Incremental UpdateIKM Netezza To File (EXTERNAL TABLE)LKM File to Netezza (EXTERNAL TABLE)LKM File to Netezza (NZLOAD)RKM Netezza

10.5 Creating and Reverse-Engineering a Netezza Model

This section contains the following topics:

10.5.1 Create a Netezza Model

Create a Netezza Model using the standard procedure, as described in "Creating a Model" of the Developing Integration Projects with Oracle Data Integrator.

10.5.2 Reverse-engineer a Netezza Model

Netezza supports both Standard reverse-engineering - which uses only the abilities of the JDBC driver - and Customized reverse-engineering.

In most of the cases, consider using the standard JDBC reverse engineering for starting.

Consider switching to customized reverse-engineering if you encounter problems with the standard JDBC reverse-engineering process due to some specificities of the Netezza JDBC driver.

Standard Reverse-Engineering

To perform a Standard Reverse-Engineering on Netezza use the usual procedure, as described in "Reverse-engineering a Model" of the Developing Integration Projects with Oracle Data Integrator.

Customized Reverse-Engineering

To perform a Customized Reverse-Engineering on Netezza with a RKM, use the usual procedure, as described in "Reverse-engineering a Model" of the Developing Integration Projects with Oracle Data Integrator. This section details only the fields specific to the Netezza technology:

  1. In the Reverse Engineer tab of the Netezza Model, select the KM: RKM Netezza.<project name>.

The reverse-engineering process returns tables, views, attributes, Keys and Foreign Keys.

10.6 Setting up Data Quality

Oracle Data Integrator provides the CKM Netezza for checking data integrity against constraints defined on a Netezza table. See "Flow Control and Static Control" in Developing Integration Projects with Oracle Data Integrator for details.

10.7 Designing a Mapping

You can use Netezza as a source, staging area, or a target of a mapping.

The KM choice for a mapping or a check determines the abilities and performance of this mapping or check. The recommendations in this section help in the selection of the KM for different situations concerning a Netezza data server.

10.7.1 Loading Data from and to Netezza

Netezza can be used as a source, target or staging area of a mapping. The LKM choice in the Loading Knowledge Module tab to load data between Netezza and another type of data server is essential for the performance of a mapping.

10.7.1.1 Loading Data from Netezza

Use the Generic SQL KMs or the KMs specific to the other technology involved to load data from a Netezza database to a target or staging area database.

For extracting data from a Netezza staging area to a file, use the IKM Netezza to File (EXTERNAL TABLE). See Section 10.7.2, "Integrating Data in Netezza" for more information.

10.7.1.2 Loading Data to Netezza

Oracle Data Integrator provides Knowledge Modules that implement optimized methods for loading data from a source or staging area into a Netezza database. These optimized Netezza KMs are listed in Table 10-2. In addition to these KMs, you can also use the Generic SQL KMs or the KMs specific to the other technology involved.

Table 10-2 KMs for loading data to Netezza

Source or Staging Area Technology KM Notes

File

LKM File to Netezza (EXTERNAL TABLE)

Loads data from a File to a Netezza staging area database using the Netezza External table feature.

File

LKM File to Netezza (NZLOAD)

Loads data from a File to a Netezza staging area database using the NZLOAD bulk loader.


10.7.2 Integrating Data in Netezza

Oracle Data Integrator provides Knowledge Modules that implement optimized data integration strategies for Netezza. These optimized Netezza KMs are listed in Table 10-3. In addition to these KMs, you can also use the Generic SQL KMs.

The IKM choice in the Integration Knowledge Module tab determines the performances and possibilities for integrating.

Table 10-3 KMs for integrating data to Netezza

KM Notes

IKM Netezza Control Append

Integrates data in a Netezza target table in replace/append mode.

IKM Netezza Incremental Update

Integrates data in a Netezza target table in incremental update mode.

This KM implements a DISTRIBUTE_ON option to define the processing distribution. It is important that the chosen column has a high cardinality (many distinct values) to ensure evenly spread data to allow maximum processing performance.

Please follow Netezza's recommendations on choosing a such a column.Valid options are:

  • [PK]: Primary Key of the target table.

  • [UK]: Update key of the mapping

  • [RANDOM]: Random distribution

  • <list of column>: a comma separated list of columns

If no value is set (empty), no index will be created.

This KM also uses an ANALYZE_TARGET option to generate statistics on the target after integration.

IKM Netezza to File (EXTERNAL TABLE)

Integrates data from a Netezza staging area to a file using external tables.

This KM implements an optional BASE_TABLE option to specify the name of a table that will be used as a template for the external table.