Analyzing and Cleansing Data for Sun Master Index

ProcedureTo Create the ETL Collaboration and Extract the Data

Before You Begin

Complete the instructions under Connecting to the Source Database.

  1. In the NetBeans Projects window, right-click the Data Integrator Module you created in Creating the Staging Database.

  2. Point to New, and then select Other.

    The New File Wizard appears.

  3. Select SOA under categories, and select ETL under File Types.

  4. Click Next.

  5. On the Name and Location window, enter a name for the collaboration and enter a name for the collaboration directory .

  6. Click Next.

  7. Select Basic Extract – Transform – Load (ETL), and then click Next.

  8. On the Select Source Tables window, do the following:

    1. Under Available Connections, select your source database name or URL.

    2. Under Schemas, select a table containing data to extract and then click Select.

    3. Repeat the above step for each table containing data to extract.

    4. Click Next.

      If there are multiple tables, the Select Source Tables to Create Join window appears.

  9. Do one of the following:

    • If the tables do not need to be joined, click Next.

    • To join tables, select the tables to be joined Under Available Tables, and then click the right arrow to move them under Selected Tables. Click Next.

  10. On the Select Target Tables window, do the following:

    1. Under Available Connections, select your target database name or URL.

    2. Under Schemas, select a table that will store extracted data and then click Select.

    3. Repeat the above step for each table that will store extracted data.

    4. Click Finish.

      The collaboration canvas appears with source tables on the left and target tables on the right.

  11. To map the data fields to be extracted from the source tables to the appropriate fields in the target tables, do one of the following:

    • To map fields individually, click the arrow next to the source field name and drag it to the target field name. Repeat for each field.

    • To map fields automatically, right-click a source table to map and then click Automap. If there is only one table available to map to, Data Integrator performs the mapping; otherwise, a dialog box appears so you can select the target table. Repeat for each table.


      Caution – Caution –

      You must map a unique value to the ObjectID column in each target table (where Object is the name of the object for which the table stores data). You can concatenate the source system and local ID of each incoming record to create a unique value, or you can define a custom function to populate the unique values.


  12. To configure the target tables, do the following for each target table:

    1. Select the target table and open its Properties panel (select Properties in the Window menu).

    2. Deselect Create Target Table.

    3. (Optional) Select Truncate Target Table.

  13. Save your work, and then click Run Collaboration on the collaboration toolbar.


    Tip –

    If any errors occur, the error message appears in the output panel at the bottom of the window. To fix the collaboration, select a target table, open its Properties panel, and delete the value of the Target Join Condition property. Repeat this step for each target table. Save and rerun the collaboration.


  14. After the extraction process is complete, verify the data in the target database table by right-clicking each table on the canvas and then selecting Show Data.

    The data in the tables appears in the output panel.

Next Steps

Continue to Determining the Fields to Analyze.