Designing Data Integrator Projects

ProcedureTo Create an ETL Collaboration for a Master Index Staging Database

Before You Begin

Complete the following tasks:

  1. On the NetBeans Projects window, expand the new Data Integrator project and right-click Collaborations.

  2. Point to New, and then select ETL.

    The New File Wizard appears with the Name and Location window displayed.

  3. Enter name for the collaboration.

    Figure shows the Name and Location window of the Data
Integrator Wizard.
  4. Click Next.

  5. On the Select Type of ETL Loader window on the New File Wizard, select Advanced Extract – Transform – Load (ETL).

    Figure shows the Select Type of ETL Loader window of
the Data Integrator Wizard.
  6. Click Next.

    The Select or Create Database window appears.

  7. To specify a staging database to use for external data sources (for this project only), do one of the following:

    1. Select an existing database to use from the DB URL field.

    2. Select Create and Use New Database, enter a name for a new database in the DB Name field, and then click Create Database. Select the new database in the DB URL field.


      Note –

      This database is required and is used for internal processing only.


    Figure shows the Select or Create Database window of
the Data Integrator Wizard.
  8. Click Next.

    The Choose Data Source window appears.

  9. Do one of the following:

    • If you do not have any file data sources, click Next and skip to step 15 (choosing JDBC data sources).

    • To specify a file data source using a URL, enter the URL and click Add.

    • To specify a file data source that is stored on your network, browse for and select a file containing source data in the Choose a File box, and then click Add.

    • Repeat the above two steps until all file data sources are selected.

    Figure shows the Choose Data Source window of the Data
Integrator Wizard.
  10. Click Next.

    The Enter Table Details window appears, with the information for the first data file displayed.

  11. If necessary, modify the table name, the type of data encoding, and the type of document that contains the source data.

    Data Integrator automatically fills in these fields based on the information from the previous window, so the existing values should be correct.

    Figure shows the Enter Table Details window of the Data
Integrator Wizard.
  12. Click Next.

    If the data file is a spreadsheet, the Choose a Sheet window appears; otherwise, the Import Table MetaData window appears.

  13. If the Choose a Sheet window appears, select the name of the sheet in the spreadsheet that contains the source data, and then click Next.


    Tip –

    To view the data in a sheet, click the Preview button.


    Figure shows the Choose a Sheet window of the Data Integrator
Wizard.
  14. When the Import Table Metadata window appears, modify the information about the data file as needed.

    Data Integrator automatically fills in this information, but you might need to customize it. For more information about the properties you can configure, see Virtual Database Table Metadata Options.

    Figure shows the Import Table Metadata window of the
Data Integrator Wizard.
  15. Preview the information in the bottom portion of the window, and then click Next.

    The Enter Column Properties window appears.

  16. In the upper portion of the window, customize any of the column properties.

    For more information about these properties, see Virtual Database Column Properties.

    Figure shows the Enter Column Properties window of the
Data Integrator Wizard.
  17. Preview the information in the lower portion of the window, and then click Next.

  18. Do one of the following:

    1. If you selected multiple file data sources, the wizard returns to the Enter Table Details window with the attributes for a different file displayed. Repeat the above steps beginning with step 7.

    2. If all the files you specified are configured, a dialog box appears confirming the database table creation. Click OK on the dialog box and continue to the next step.

      The Select Source Tables window appears.

  19. If you specified file data sources, they are already listed under Selected Tables. Click Next if you have no JDBC data sources to specify, or do the following to specify a JDBC data source:

    Figure shows the Select Source Tables window of the Data
Integrator Wizard.
    1. Under Available Connections, select the database that contains the source data.

    2. If there are multiple schemas in the database, select the schema to use.

    3. Under Schemas, select the tables that contain the source data and then click Select.

    4. Click Next.

      If there are tables to join, the Select Source Tables for Join window appears; otherwise, the Generate Target Database window appears.

  20. To define join conditions, do the following. If there are no join conditions, click Next and skip to step 17.

    1. Under Available Tables, select the tables to join, and then click the right arrow to add them to the Selected Tables list.

    2. In the Preview panel, click the drop-down menu at the top of the join box and select the type of join to use from one of the following options:

      • Inner – Use this if all tables to be joined contain the same column.

      • Left Outer – Use this if the results should always include the records from the left table in the join clause.

      • Right Outer – Use this if the results should always include the records from the right table in the join clause.

      • Full Outer – Use this if the results should always include the records from both the right and left tables in the join clause.

      Figure shows the Select Source Tables for Join window
of the Data Integrator Wizard.
    3. To specify columns to exclude from each joined table, click the Select Column tab in the Preview pane and deselect any columns to exclude.

    4. Click Next.

      The Generate Target Database Master Index Model window appears.

  21. To create the staging database, do the following:

    1. Deselect the check box for Use Existing Database Target Tables.

    2. In the Object Definition File field, browse to and select the object.xml file generated for the Master Index project.


      Note –

      This file is located in NetBeansProjects_Home/Project_Name/src/Configuration.


    3. In the Target Database Folder field, select or enter the path where you want to store the database.

    4. In the Target Database Name field, enter a name for the database.

    5. Click Generate Database.

    Figure shows the Generate Target Database Master Index
Model window of the Data Integrator Wizard.
  22. Click Next.

    The Select JDBC Target Tables window appears. The target tables to load the extracted data into are already listed under Available Connections. It is not recommended you change these.

    Figure shows the Select Target Tables window of the Data
Integrator Wizard.
  23. Click Next.

    The Map Selected Collaboration Tables window appears.

  24. To map source and target data, do the following:

    1. To disable constraints on the target tables, select Disable Target Table Constraints.

    2. Select the SQL statement type to use for the transfer. You can select insert, update, or both.

    3. For each target table listed on the right, select one or more source tables from the list directly to the left of the target table. These are the source tables that will be mapped to the target in the collaboration.


      Note –

      If you do not specify a mapping here, the source tables do not appear in the ETL collaboration. You can add the source tables directly to the collaboration using the Select Source and Target Tables function. To select multiple source tables for one target, hold down the Control key while you select the required source tables. If you select multiple source tables for one target, the source tables are automatically joined.


    Figure shows the Map Selected Collaboration Tables window
of the Data Integrator Wizard.
  25. Click Finish.

    The new ETL collaboration appears in the Projects window. If multiple collaboration are created, they are given the name you specified for the collaboration with a target table name appended. To load the data into the staging database, run each of the collaborations. Make sure you are connected to both databases first.

Next Steps

You can further configure the ETL collaboration using the ETL Collaboration Editor. For more information, see Configuring ETL Collaborations.