Sun Data Integrator provides your business with a powerful assortment of design-time features that you can use to create and configure ETL processes. The runtime features allow you to monitor the ETL processes and to review any data errors.
Sun Data Integrator provides the following features:
Stores all data transformation logic in one place and enables users, managers, and architects to understand, review, and modify the various interfaces.
Generates a schema based on the master index object structure in order to extract data from legacy systems and load it into a staging database for cleansing and analysis.
Can integrate with a wide variety of source data types, including HTML, XML, and RSS.
Simplifies and standardizes ETL processes, requiring little database expertise to build high-performance ETL processes.
Automatically discovers metadata, enabling you to design ETL processes faster.
Loads data warehouses faster by taking advantage of database bulk, no-logging tuning where applicable.
Supports creating automatic joins based on primary key and foreign key relationships, and generates the code to ensure data integrity.
Takes advantage of the database engine by pushing much of the workload onto the target and source databases.
Supports an extensive set of non-relational data formats.
Provides transform, filter, and sort features at the data source where appropriate.
Provides data cleansing operators to ensure data quality and a dictionary-driven system for complete parsing of names and addresses of individuals, organizations, products, and locations.
Provides the ability to normalize and denormalize data.
Converts data into a consistent, standardized form to enable loading to conformed target databases.
Provides built-in data integrity checks.
Allows you to define customized transformation rules, data type conversion rules, and null value handling.
Provides a robust error handler to ensure data quality, and a comprehensive system for reporting and responding to all ETL error events. Sun Data Integrator also provides automatic notification of significant failures.
Supports concurrent and parallel processing of multiple source data streams.
Supports full refresh and incremental extraction.
Supports data federation that enables you to use SQL as the scripting language to define ETL processes.
Supports near real-time click-stream data warehousing (in conjunction with the JDBC Binding Component (BC)).
Supports ERP/CRM data sources (in conjunction with various components from OpenESB or Java CAPS).
Is platform independent, and can be scaled to enterprise data warehousing applications.
Provides built-in transformation objects so you can easily specify complex transformations.
Supports scheduling of ETL sessions based on time or on the occurrence of a specific event.
Participates in BPEL business processes by exposing the ETL process as a web service.
Is able to extract data from outside a firewall in conjunction with the FTP BC and the HTTP BC.
Provides analysis of transformations that failed or were rejected, and allows you to resubmit them after the data is corrected.
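The pushdown and SQL-scripting features above can be illustrated with a minimal sketch (not Sun Data Integrator's actual API; the table and column names are invented). A single set-based INSERT ... SELECT lets the database engine perform the extract, the automatic join driven by the primary/foreign key relationship, and the load, rather than streaming rows through the ETL tool:

```python
import sqlite3

# Hypothetical staging scenario, using an in-memory SQLite database as a
# stand-in for the source and target databases.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        cust_id  INTEGER REFERENCES customer(cust_id),
        amount   REAL
    );
    CREATE TABLE stg_customer_orders (cust_id INTEGER, name TEXT, total REAL);
""")
conn.executemany("INSERT INTO customer VALUES (?, ?)",
                 [(1, "Acme"), (2, "Globex")])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(10, 1, 99.5), (11, 1, 0.5), (12, 2, 42.0)])

# One INSERT ... SELECT pushes the join (on the primary/foreign key
# relationship), the aggregation, and the load down to the engine.
conn.execute("""
    INSERT INTO stg_customer_orders (cust_id, name, total)
    SELECT c.cust_id, c.name, SUM(o.amount)
    FROM customer c JOIN orders o ON o.cust_id = c.cust_id
    GROUP BY c.cust_id, c.name
""")
rows = conn.execute(
    "SELECT cust_id, name, total FROM stg_customer_orders ORDER BY cust_id"
).fetchall()
print(rows)  # [(1, 'Acme', 100.0), (2, 'Globex', 42.0)]
```

Because the statement is ordinary SQL, the same set-based style works against any relational target that the JDBC Binding Component can reach.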
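The customized transformation rules, data type conversion, and null value handling described above can be sketched as per-field rule functions (the rule names and record layout here are invented for illustration, not part of the product):

```python
def to_int(value, default=0):
    """Type-conversion rule: coerce to int, mapping nulls and bad data
    to a caller-supplied default."""
    if value is None or value == "":
        return default
    try:
        return int(value)
    except ValueError:
        return default

def normalize_name(value):
    """Transformation rule: trim and title-case a name; a null becomes
    the empty string."""
    return (value or "").strip().title()

# Each target field is bound to one rule.
RULES = {"age": to_int, "name": normalize_name}

def transform(record):
    """Apply every rule to the corresponding source field."""
    return {field: rule(record.get(field)) for field, rule in RULES.items()}

print(transform({"age": "42", "name": "  jane DOE "}))
# {'age': 42, 'name': 'Jane Doe'}
print(transform({"age": None, "name": None}))
# {'age': 0, 'name': ''}
```

Keeping each rule as a small pure function makes the transformation logic easy to review and modify in one place, which is the point of centralizing it.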