Designing Data Integrator Projects

Creating a Bulk Loader ETL Collaboration


Note –

The Data Integrator Wizard was enhanced in Java CAPS 6 Update 1. The instructions in this topic might differ from what is available in Release 6.


You can use the Data Integrator Wizard to generate the Bulk Loader for a master index application. The Bulk Loader loads data that has already been cleansed, standardized, and matched into a master index database. The source files for the Bulk Loader are those generated by the Bulk Matcher.

To Create a Bulk Loader ETL Collaboration

Before You Begin
  1. On the NetBeans Projects window, expand the new Data Integrator project and right-click Collaborations.

  2. Point to New, and then select ETL.

    The New File Wizard appears with the Name and Location window displayed.

  3. Enter a name for the collaboration.

    Figure shows the Name and Location window of the Data Integrator Wizard.
  4. Click Next.

  5. On the Select Type of ETL Loader window on the New File Wizard, select Bulk Loader.

    Figure shows the Select Type of ETL Loader window of the Data Integrator Wizard.
  6. Click Next.

    The Select or Create Database window appears.

  7. To specify a staging database to use for external data sources (for this project only), do one of the following:

    1. Select an existing database to use from the DB URL field.

    2. Select Create and Use New Database, enter a name for a new database in the DB Name field, and then click Create Database. Select the new database in the DB URL field.


      Note –

      This database is required and is used for internal processing only.


    Figure shows the Select or Create Database window of the Data Integrator Wizard.
  8. Click Next.

    The Select JDBC Target Tables window appears.

  9. To choose the target tables to load the extracted data into, do the following:

    1. Under Available Connections, select the master index database.

    2. Under Schemas, select the schema that contains the tables to load the data into.

    3. Under Schemas, select only the tables that correspond to the data files produced by the Bulk Matcher, and then click Select.


      Tip –

      You can use the Shift and Control keys to select multiple tables at once. If you select target tables that do not correspond to the Bulk Matcher files, collaborations without source tables are generated and the project fails to build.


    Figure shows the Select Target Tables window of the Data Integrator Wizard.
  10. Click Next.

    The Choose Bulk Loader Data Source window appears.

  11. To specify the source data for the Bulk Loader, do the following:

    1. In the upper portion of the window, browse to the location of the output files from the Bulk Matcher.


      Note –

      These files are located in NetBeansProjects_Home/Project_Name/loader-generated/loader/work/masterindex, where work is the location you specified for the working directory in loader-config.xml.


    2. Select all of the data files in the masterindex directory, and then click Add.

    Figure shows the Choose Bulk Loader Data Source window of the Data Integrator Wizard.
  12. Click Next.

    The Map Selected Collaboration Tables window appears.

  13. To map source and target data, do the following:

    1. To disable constraints on the target tables, select Disable Target Table Constraints.

    2. Select the SQL statement type to use for the transfer. You can select insert, update, or both.

    3. The wizard automatically maps the source and target tables for you. Review the mapping to verify its accuracy.


      Note –

      Not every table on the left will be mapped. For example, system tables such as SBYN_COMMON_HEADER, SBYN_COMMON_DETAIL, SBYN_APPL, and SBYN_SYSTEMS do not need to be mapped.


    Figure shows the Map Selected Collaboration Tables window of the Data Integrator Wizard.
  14. Click Finish.

    An ETL collaboration is created for each target table. Generating the collaborations might take a few minutes.
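
The tip in step 9 warns that selecting target tables with no corresponding Bulk Matcher file causes the project build to fail. The following Python sketch shows one way to check a table selection against the masterindex directory before running the wizard. It is not part of Data Integrator. It assumes that each data file is named after its target table (for example, a hypothetical SBYN_PERSON.data file for the SBYN_PERSON table) and that the working directory in loader-config.xml is a relative directory name. The function names and sample paths are illustrative only, so adjust them to match your own Bulk Matcher output.

```python
from pathlib import Path


def masterindex_dir(projects_home, project_name, work="work"):
    """Build the Bulk Matcher output path described in the note for step 11.

    ASSUMPTION: `work` is the relative working-directory name configured
    in loader-config.xml.
    """
    return (Path(projects_home) / project_name / "loader-generated"
            / "loader" / work / "masterindex")


def check_tables(data_dir, selected_tables):
    """Split the selected tables into those with and without a data file.

    ASSUMPTION: a data file corresponds to a table when the file name
    without its extension equals the table name, ignoring case.
    """
    stems = {p.stem.upper() for p in Path(data_dir).iterdir() if p.is_file()}
    matched = [t for t in selected_tables if t.upper() in stems]
    unmatched = [t for t in selected_tables if t.upper() not in stems]
    return matched, unmatched


if __name__ == "__main__":
    # Hypothetical project location and table selection, for illustration only.
    data_dir = masterindex_dir(Path.home() / "NetBeansProjects", "MyProject")
    if data_dir.is_dir():
        matched, unmatched = check_tables(data_dir, ["SBYN_PERSON", "SBYN_SYSTEMS"])
        print("Tables with a data file:", matched)
        print("Tables without a data file:", unmatched)
    else:
        print("Directory not found:", data_dir)
```

Under these assumptions, any table reported as having no data file should be left out of the selection in step 9.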

Next Steps

You can further configure the ETL collaboration in the ETL Collaboration Editor. For more information, see Configuring ETL Collaborations.

To load the data into the master index database, you can either run each collaboration individually or generate a batch file that runs all of the collaborations for you. For more information, see Loading Matched Data Using the Data Integrator Wizard Bulk Loader in Loading the Initial Data Set for a Sun Master Index.