Designing Data Integrator Projects

Sun Data Integrator Overview

Extracting, Transforming, and Loading: ETL

ETL stands for Extract, Transform, and Load. ETL programs periodically extract data from source systems, transform the data into a common format, and then load the data into the target data store or warehouse. ETL processes bring together and combine data from multiple source systems into a data warehouse or other target database, enabling all users to work from a single, integrated set of data.

Sun Data Integrator Methodology

Extract, Transform, and Load (ETL) is a data integration methodology that extracts data from data sources, transforms and cleanses the data, and then loads the data in a uniform format into one or more target data sources.

Data Integrator provides high-volume extraction and loading of tabular data sets for Java CAPS, NetBeans, or OpenESB projects, or as a standalone product. You can use Data Integrator to acquire a temporary subset of data for reports or other purposes, or to acquire a more permanent data set to populate a data mart or data warehouse. You can also use ETL for database type conversions or to migrate data from one database or platform to another.

Data Integrator applies the following ETL methodology:

  1. Extraction: The input data is extracted from data sources. Using Data Integrator, the data can be filtered and joined from multiple, heterogeneous sources, which results in a desired subset of data suitable for transformation.

  2. Transformation: Data Integrator applies the operators specified for the process to transform and cleanse the data to the desired state. Sun Data Integrator supports normalization and parsing of certain data.

  3. Load: The transformed data is loaded into one or multiple databases or data warehouses.
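The three steps above can be sketched in plain Java. This is an illustrative, in-memory example only; the class, method, and field names are hypothetical and do not represent the Data Integrator API. Extraction filters the source down to a desired subset, transformation cleanses and normalizes field values, and loading appends the result to a target store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Minimal ETL sketch with an in-memory source and target.
// All names here are illustrative, not Data Integrator classes.
public class EtlSketch {

    // A simple row with a name field and an amount field.
    static class Row {
        final String name;
        final int amount;
        Row(String name, int amount) { this.name = name; this.amount = amount; }
    }

    // Extraction: filter the source to the subset suitable for transformation.
    static List<Row> extract(List<Row> source) {
        List<Row> out = new ArrayList<>();
        for (Row r : source) {
            if (r.amount > 0) {          // keep only valid rows
                out.add(r);
            }
        }
        return out;
    }

    // Transformation: cleanse and normalize the extracted data.
    static List<Row> transform(List<Row> rows) {
        List<Row> out = new ArrayList<>();
        for (Row r : rows) {
            // Trim whitespace and normalize case as a sample cleansing rule.
            out.add(new Row(r.name.trim().toUpperCase(Locale.ROOT), r.amount));
        }
        return out;
    }

    // Load: write the transformed rows into the target store.
    static void load(List<Row> rows, List<Row> target) {
        target.addAll(rows);
    }

    public static void main(String[] args) {
        List<Row> source = new ArrayList<>();
        source.add(new Row(" alice ", 10));
        source.add(new Row("bob", -5));   // invalid row, dropped at extraction

        List<Row> target = new ArrayList<>();
        load(transform(extract(source)), target);

        for (Row r : target) {
            System.out.println(r.name + " " + r.amount);
        }
    }
}
```

In a real ETL process, the source and target would be databases or flat files rather than lists, and the transformation step would apply the operators configured for the collaboration; the control flow, however, follows the same extract-transform-load sequence.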

Sun Data Integrator Features

The following is a list of features for Sun Data Integrator: