Skip Headers
Oracle® Business Intelligence Applications Data Warehouse Administration Console Guide
Version 7.9.4
E10759-01
  Go To Documentation Library
Library
Go To Product List
Product
Go To Table Of Contents
Contents
Go To Index
Index

Previous
Previous
 
Next
Next
 

5 Customizing, Designing, Executing and Monitoring ETL Processes

This chapter provides information about customizing, designing, executing, and monitoring ETL processes.

This section includes the following topics:

Creating or Copying a Source System Container

The metadata for a source system is held in a container. You cannot change the metadata for preconfigured containers. If you want to customize the metadata in a preconfigured container, you must first make a copy of the container. The DAC keeps track of all customizations in the copied container, so that at any time you can find the newly created objects and modified objects, as well as the original objects.

You can also create a new, empty container if you want to build your own container with customized metadata.

To create a new container or copy an existing container

  1. In the DAC menu bar, select File, then select New Source System Container.

  2. Enter an ID and a name for the container.

    The ID and Name fields are alphanumeric. The Name can contain spaces but the ID cannot.

  3. Select one of the following:

    • Create Empty New Source System Container

    • Create as a Copy of Source System Container

  4. If you are creating an empty, new container, click OK.

  5. If you are making a copy of an existing container, select the existing container from the drop-down list, and then click OK.

About Customizing the Data Warehouse

You can add tables, columns, and indexes to the data warehouse, and you can modify these existing objects. Customizing the data warehouse in this way requires using the DAC and Informatica client tools. For more information about using Informatica client tools to customize the Oracle Business Analytics Warehouse, see the Oracle Business Intelligence Applications Installation and Configuration Guide.

Figure 5-1 shows the major steps required for adding a new object to the data warehouse or modifying existing objects. As shown in Figure 5-1, you can begin the customization process by adding or modifying the new data warehouse object in the DAC and then using the DAC's Data Warehouse Configurator to create or update the object in the data warehouse. Alternatively, you can add or modify the object directly in the data warehouse database and then use the DAC's Import from Database command to add the new object in the DAC.

Figure 5-1 Process Flow to Add New Object to Data Warehouse

This image is described in the surrounding text.

Adding a New Table and Columns to the Data Warehouse

As shown in Figure 5-1, there are two alternative process flows for adding a new object to the data warehouse. You can enter the table and column definitions in the DAC and then use the DAC's Data Warehouse Configurator to create the table and columns in the data warehouse database; for this method, follow the procedure, "To add a new table and columns to the data warehouse using the DAC's Data Warehouse Configurator".

Alternatively, you can add the new table and column definitions directly in the data warehouse database and then use the DAC's Import from Database command to add the new table and columns in the DAC; for this method, follow the procedure, "To add a new table and columns using the DAC's Import command".

To add a new table and columns to the data warehouse using the DAC's Data Warehouse Configurator

  1. In the DAC toolbar, select the appropriate source system container from the drop-down list in the toolbar.

  2. In the Menu bar, select Views, then select Design, then select Tables.

  3. Create the new table.

    1. In the Tables tab, click New.

    2. In the Edit child tab, enter the appropriate information about the table, and click Save.

      For a description of the fields in this tab, see "Tables Tab".

  4. Add the columns for the new table.

    1. In the Columns child tab, click New.

    2. Enter the appropriate column information for each column you want to add to the table, and click Save.

    3. Enter the appropriate foreign key table and column information.


      Note:

      For performance purposes, it is recommended that you do not enter more than 254 columns to a dimension of fact table.

  5. Create the new tables and columns in the data warehouse database.

    1. Select Tools, then select ETL Management, then select Configure.

    2. Select the appropriate Source and Target database platforms, and then click OK.

    3. In the Data Warehouse Configuration Wizard, select Create Data Warehouse Tables, and then click Next.

    4. Enter the required information, and then click Start.

      An informational message reports whether the process was successful. For information about the process, you can review the createwtables.log file in the OracleBI\DAC\log\config folder.

To add a new table and columns using the DAC's Import command

  1. Add the new table and column definitions into the data warehouse database.

  2. In the DAC toolbar, select the appropriate source system container from the drop-down list in the toolbar.

  3. In the Menu bar, select Views, then select Design, then select Tables.

  4. Import the new table definition.

    1. Right-click and select Import from Database, then select Import Database Tables.

    2. In the Import Tables dialog, select DataWarehouse.

    3. Optionally, enter filter criteria to identity the table name you entered in Step 0.

      See "DAC Query Commands and Operators" for available filter commands and operators.

    4. Click Read Tables.

    5. In the list of tables displayed, select the Import check box for the tables you want to import.

    6. Click Import Tables.

      An informational message indicates whether the process was successful.

  5. Import the new column definitions.

    1. In the Tables tab, query for the table you imported in Step 4.

    2. With the table highlighted, right-click and select Import from Database, then select Import Database Columns.

    3. In the Importing Columns... dialog, select Selected Record Only, and then click OK.

    4. In the Import Table Columns dialog, click Read Columns.

      The Changes column displays a description of column changes, which are explained below:

      Change Explanation
      The object was added to the database. The column is in the database but not the DAC repository. Importing it will add the column to the DAC repository.
      The object was added to the repository. The column is in the DAC repository but not in the database. Importing it will delete it from the DAC repository.
      The object was modified. The column definition in the database does not match the definition in the DAC repository.

    5. In the list of columns displayed, select the Import check box for the columns you want to import.

    6. Click Import Columns.

      An informational message indicates whether the process was successful.

Adding an Index to the Data Warehouse

Follow this procedure to add a new index to the data warehouse.

To add a new index to the data warehouse

  1. Add the new index definition into the data warehouse database.

  2. In the DAC toolbar, select the appropriate source system container from the drop-down list in the toolbar.

  3. In the Menu bar, select Views, then select Design, then select Tables.

  4. Query for the table for which you want to import index definitions.

  5. Right-click and select Import from Database, then select Import Indices.

  6. Choose to import indexes for a selected table or for all the records retrieved in your query, and click OK.

  7. In the Import Indices dialog, select DataWarehouse from the Data Sources drop-down list.

  8. Click Read Indices.

    1. In the list of indexes displayed, select the Import check box for the indexes you want to import.

    2. Click Import Indices.

      An informational message indicates whether the process was successful.

Importing New Data Warehouse Objects into the Informatica Repository

This step requires using Informatica client tools to import new data warehouse objects into the Informatica repository. For instructions on this step of customizing the data warehouse, see Oracle Business Intelligence Applications Installation and Configuration Guide.

Creating Informatica Mappings and Workflows

This step requires using Informatica client tools to create new Informatica mappings and workflows for the data warehouse objects that you imported into the Informatica repository. For instructions on this step of customizing the data warehouse, see Oracle Business Intelligence Applications Installation and Configuration Guide.

Creating Tasks in the DAC for New or Modified Informatica Workflows

You need to perform this step for all new workflows you create in Informatica and for all workflows that you modify.

To create a task in the DAC for new or modified Informatica workflows

  1. In the DAC toolbar, select the appropriate source system container from the drop-down list in the toolbar.

  2. In the DAC, create custom logical and physical task folders for the custom folder you created in the Informatica repository.

    1. In the DAC, navigate to Tools, then select Seed Data, then select Task Folders.

    2. To create a custom logical folder, click New.

    3. In the Name field, enter a name for the custom logical folder, for example, Custom Logical.

    4. In the Type field, select Logical.

    5. To create a custom physical folder, click New.

    6. In the Name field, enter a name for the custom physical folder, for example, Custom Physical.

    7. In the Type field, select Physical.

  3. Register the folders you created in Step 2 in the Source System Folders tab.

    1. Navigate to Design, then select Source System Folders.

    2. Click New.

    3. In the Edit child tab, enter the name of the custom logical folder in the Logical Folder field.

    4. Enter the name of the custom physical folder in the Physical Folder field, and click Save.

  4. Create new tasks for the workflows.

    1. Navigate to Design, then select Tasks, and click New in the top pane toolbar.

    2. In the Edit child tab, enter the workflow name as it appears in Informatica Workflow Manager.

    3. Right-click and select Synchronize Tasks.

    4. Select Selected Record Only, and click OK. Click OK in the informational message box.

      This command synchronizes the source and target table information between the DAC and Informatica.

    5. In the Tasks tab, enter the remaining information required for the task.

      For a description of the fields in this tab, see "Tasks Tab".

    The new table is now ready to be associated with a subject area. For information about creating a subject area, see "Creating a Subject Area".

Setting a Task Phase Dependency

A task phase dependency enables you to change the order in which tasks are executed.

To set a task phase dependency

  1. In the DAC toolbar, select the appropriate source system container from the drop-down list in the toolbar.

  2. In the Menu bar, select Views, then Design, then select Tasks.

  3. Query for the task for which you want to add a phase dependency, and make sure it is highlighted.

  4. Click the Phase Dependency child tab.

  5. Click Add/Remove.

  6. In the Choose Phases dialog, select a phase and click Add.

  7. Click OK in the message box that states the phase was added.

  8. Select an Action and a Grain, click OK.

    For information about these fields, see "Task Tab: Phase Dependency Subtab".

    The task phase dependency appears in the Phase Dependency child tab.

  9. Reassemble the appropriate subject area.

    1. In the Subject Areas tab, query for the appropriate subject area.

    2. Click Assemble.

  10. Rebuild the execution plan.

    1. Navigate to the Execution Plans tab in the Execute view.

    2. Query for the appropriate execution plan.

    3. In the Parameters child tab, click Generate.

    4. In the top pane toolbar, click Build.

Creating a Task Group

The DAC automatically organizes tasks into a dependency structure based on dependency rules. For information about the DAC's dependency rules, see "How the DAC Determines the Order of Task Execution within an Execution Plan". The DAC assigns priority randomly to tasks that have the same properties. You can use the Task Group feature to group such tasks that share the same properties and enforce a priority of your choosing.

This feature can be useful for the following: truncation and restartability purposes; when more than one task with similar properties writes to the same table; and when there is a circular read/write relationship between tables; for example, task 1 reads from table A and writes to table B, and task 2 reads from table B and writes to table A.

To create a task group

  1. In the DAC, select the appropriate source system container from the drop-down list in the toolbar.

  2. In the Menu bar, select Views, then select Design, then select Task Groups.

  3. Create a new task group.

    1. Click New in the top pane toolbar.

    2. In the Edit child tab, enter a name and select the appropriate properties.

  4. Click the Child Tasks child tab, and click Add/Remove in the toolbar.

  5. In the left-hand window of the Choose Child Tasks dialog, query for the tasks you want to add to the task group.

  6. Select the tasks, and click Add.

  7. In the right-hand window, enter an execution order.

  8. Click Save, and then click OK to close the window.

About Parameter Management

This section describes how the DAC handles parameters and how you can define and manage parameters at the source system and task levels. It contains the following topics:

Overview of Parameters

The ETL logic in Oracle Business Intelligence Applications uses parameters in the Informatica mappings and sessions. You define and manage parameters using the DAC parameter management feature. A parameter can apply to all tasks under a source system container (referred to as a source system parameter) or it can apply to a particular task (referred to as a task level parameter). Parameters set up at the task level have priority over parameters set up at the source system level.

In the DAC, there are two types of parameters: static and runtime. The value of static parameters remains constant for all ETL runs. Examples of static parameters include language codes and currencies. The value of runtime parameters is dynamic, and the DAC updates this value for each ETL run. Examples of dynamic parameters include last refresh dates and last WID sequence numbers.

Parameter Data Types

Parameters can have one of the following data types.

Text

The value for the parameter is defined as text. You can use the Text data type for both static and runtime parameters.

DB Specific Text

The value for the parameter is defined as database-specific text. This parameter should be used only if you have a heterogeneous database environment, and the parameter value needs to be different for the different database types. The DAC evaluates the string based on the source or target database type. If you do not specify database-specific text, the DAC returns the default value.

Timestamp

The value for the parameter is defined as a timestamp. You can use the Timestamp data type for both static and runtime parameters. A static timestamp can be any time that is constant. A runtime timestamp parameter is a variable for which the value is supplied by the DAC at runtime. You can define the timestamp in one of multiple formats or define a custom format. You can also use SQL to fetch any value that can be fired against the specified logical database connection. The DAC executes the SQL against a data source that maps to the specified logical connection and then formats the resulting value in the specified format. A SQL specified for a given timestamp parameter can include nested DAC parameters. For information about nested parameters, see "Nesting Parameters within Other Parameters".

SQL

The DAC fetches the value for the parameter from a database using SQL.

Preconfigured Parameters

Oracle Business Intelligence Applications ships with preconfigured parameters. Some of these preconfigured parameters are held in text files named parameterfileDW.txt and parameterfileOLTP.txt, which are stored in the folder \OracleBI\DAC\Informatica\parameters\input. Other preconfigured parameters are held in the DAC. The parameters held in the DAC are specific to the different source system containers.

You can add new parameters in the DAC or change the existing parameters held in the DAC. However, Oracle recommends that you do not change the parameters held in the parameter text files. You can override the parameters in the text files by creating new parameters in the DAC.

If you do make changes to the parameter text files, however, make sure the files remain in the folder \OracleBI\DAC\Informatica\parameters\input.

How the DAC Handles Parameters at Runtime

During an ETL execution, the DAC reads and evaluates the parameters held in the text files parameterfileDW.txt and parameterfileOLTP.txt along with the parameters held in the DAC. The DAC then creates an individual parameter file for each session. This file contains the evaluated name-value pairs for all parameters, both static and runtime. The naming convention for this parameter file is <Informatica folder name>.<Informatica session name>.txt. The DAC copies this file to a location specified in the DAC system property InformaticaParameterFileLocation.


Note:

The Informatica Server must be configured to read parameter files from the location specified in the DAC system property InformaticaParameterFileLocation. For instructions on setting this property, see Oracle Business Intelligence Applications Installation and Configuration Guide.

Nesting Parameters within Other Parameters

You can nest any parameter definition within another parameter definition. For example, you could nest a runtime text parameter that returns the current run ID within a where clause of a SQL parameter, or you could use database specific text inside another parameter of the text or SQL data types. Parameters that are nested within other parameters must use the prefix @DAC_.

An example of a text parameter that returns the current run ID placed in a where clause of a SQL parameter would look similar to the following:

SELECT VALUE FROM PARAM_TEST WHERE ROW_ID=`@DAC_p1'


Note:

Avoid circular nesting, such as parameter A nested within parameter B, which is nested within parameter A. In such situations, the DAC randomly picks one of the parameters in the circle and evaluates it as an empty string.

Defining a Text Type Parameter

Follow this procedure to define a parameter using the Text data type. This procedure applies to parameters defined at both the source system and task levels.

To define a text type parameter:

  1. Do one of the following:

    • To define a source system parameter, from the Views menu, select Design, and then select Source System Parameters.

    • To define a task level parameter, from the Views menu, select Design, then select Tasks, and then click the Parameters subtab.

  1. Click New.

  1. Enter a parameter name.

  1. Select the Text data type.

  1. Click in the Value field to open the Enter Parameter Value dialog.

  1. Select one of the following options:

    • Static. This option specifies a value that remains constant for all ETL runs.

    • Runtime. This option specifies a value will be updated by the DAC before each ETL run.

  1. If you selected the Static option, enter a text value in the text window, and click OK.

  1. If you selected the Runtime option, select a DAC Variable from the list, and click OK.

  1. (Optional) To inactivate the parameter, select Inactive.

  2. Click Save.

Defining a Database Specific Text Type Parameter

Follow this procedure to define a parameter using the DB Specific Text data type. This procedure applies to parameters defined at both the source system and task levels.

To define a database specific text type parameter:

  1. Do one of the following:

    • To define a source system parameter, from the Views menu, select Design, and then select Source System Parameters.

    • To define a task level parameter, from the Views menu, select Design, then select Tasks, and then click the Parameters subtab.

  1. Click New.

  1. Enter a parameter name.

  1. Select the DB Specific Text data type.

  1. Click in the Value field to open the Enter Parameter Value dialog.

  1. Select one of the following Connection Type options:

    • @DAC_SOURCE_DBTYPE. This option specifies a source database connection.

    • @DAC_TARGET_DBTYPE. This option specifies a target database connection.

  1. To define a parameter specific to all database types:

    1. Click in the Default field to open the Default text box.

    2. Enter the parameter definition, and click OK.

  1. To define a parameter specific to a particular database type:

    1. Click in the appropriate database type field to open the text box.

    2. Enter the parameter definition, and click OK.

  1. Click OK to close the Enter Parameter Value dialog.

  1. (Optional) To inactivate the parameter, select Inactive.

  2. Click Save.

Defining a Timestamp Type Parameter

Follow this procedure to define a parameter using the Timestamp data type. This procedure applies to parameters defined at both the source system and task levels.

To define a Timestamp type parameter:

  1. Do one of the following:

    • To define a source system parameter, from the Views menu, select Design, and then select Source System Parameters.

    • To define a task level parameter, from the Views menu, select Design, then select Tasks, and then click the Parameters subtab.

  1. Click New.

  1. Enter a parameter name.

  1. Select the Timestamp data type.

  1. Click in the Value field to open the Enter Parameter Value dialog.

  1. Select one of the following options:

    • Static. This option specifies a value that remains constant for all ETL runs.

    • Runtime. This option specifies the value will be updated by the DAC before each ETL run.

    • SQL. This option

  1. If you selected the Static option:

    1. Click in the Date field to open the Date dialog.

    2. Enter a data and time, click OK.

  1. If you selected the Runtime option:

    1. Click in the Value field to open the Enter Parameter Value dialog.

    2. Select a Variable from the list.

    3. From the Function list, select a format to which the DAC will convert the date. If you select Custom, enter a custom date format.

      If you select SQL Syntax or SQL Syntax (Date Only), select a Connection Type.

  1. If you selected the SQL option:

    1. Click in the SQL field to open the Enter Parameter Value dialog.

    2. Select a Logical Data Source from the list.

    3. Enter the parameter definition and click OK.

  1. Click OK to close the Enter Parameter Value dialog.

  1. (Optional) To inactivate the parameter, select Inactive.

  2. Click Save.

Defining a SQL Type Parameter

Follow this procedure to define a parameter using the text data type. This procedure applies to parameters defined at both the source system and task levels.

To define a SQL type parameter:

  1. Do one of the following:

    • To define a source system parameter, from the Views menu, select Design, and then select Source System Parameters.

    • To define a task level parameter, from the Views menu, select Design, then select Tasks, and then click the Parameters subtab.

  1. Click New.

  1. Enter a parameter name.

  1. Select the SQL data type.

  1. Click in the Value field to open the Enter Parameter Value dialog.

  1. Select a Logical Data Source.

  1. Enter a SQL statement, and click OK.

  1. (Optional) To inactivate the parameter, select Inactive.

  2. Click Save.

Specifying Tablespaces for Indexes by Table Type

You can specify tablespaces for indexes by table type in the Setup view in the DAC. For example, you could specify a tablespace for indexes of the dimension table type.


Note:

You must create the tablespace in the database before you can specify tablespaces for indexes.

To specify tablespaces for indexes by table type

  1. In the DAC toolbar, click Setup, then click the Physical Data Sources tab, then click the Index Spaces subtab.

  2. In the Index Spaces toolbar, click Generate.

    A list of available table types appears in the Table Type list.

  3. In the Index Space column, enter an index space for each table type, and click Save.

    If the Index Space property is left empty for a table type, the default index space for the default database connection will be used. The default index space is specified on the Edit subtab of the Physical Data Sources tab in the Setup view. If the Default Index Space property is also empty, the default tablespace assigned for the table owner will be used.

Working with Configuration Tags

A configuration tag is an object that controls the inclusion of tasks in subject areas. When a task is tagged, it is not eligible to be included in the collection of tasks for any subject area, unless the tag is part of the subject area definition "Include Task" property.

A configuration tag can function in one of the following ways:

To remove tasks from all subject areas

  1. In the DAC toolbar, select the appropriate source system container from the drop-down list in the toolbar.

  2. In the Menu bar, select Views, then select Design, then select Configuration Tags.

  3. Create a new configuration tag.

    1. Click New in the top pane toolbar.

    2. In the Edit child tab, enter a name.

    3. Make sure the Include Tasks check box is not selected.

    4. Click Save.

  4. Add tasks to the configuration tag.

    1. With the new configuration tag highlighted in the top pane, click the Tasks child tab.

    2. In the bottom pane toolbar, click Add/Remove.

    3. In the Tasks dialog, query for the tasks you want to add to the configuration tag.

    4. Highlight the tasks, and then click Add.

      The tasks appear in the right-hand window.

    5. Click Save, and then click OK to close the window.

      These tasks will not be eligible to participate in any subject area.

To reassign autogenerated tasks to a subject area

  1. In the DAC toolbar, select the appropriate source system container from the drop-down list in the toolbar.

  2. In the Menu bar, select Views, then Design, then select Configuration Tags.

  3. Query for the configuration tag that contains the tasks you want to reassign to a subject area.

  4. Verify the configuration tag contains the appropriate tasks by clicking the Tasks child tab and reviewing the list of tasks associated with this configuration tag.


    Note:

    Only a subject area's autogenerated tasks will be reassigned. If non-autogenerated tasks appear in the list, the DAC will ignore them.

  5. Associate the configuration tag with the subject areas to which you want to reassign the tasks.

    1. With the configuration tag highlighted in the top pane, click the Subject Areas child tab.

    2. Click Add/Remove in the bottom pane toolbar.

    3. In the Subject Areas dialog, query for one or more subject areas to which you want to reassign the task or tasks.

    4. Highlight the appropriate subject areas, and click Add.

    5. Click Save, and then click OK to close the window.

  6. Reassemble the subject area.

    1. In the Subject Area tab, query for all the subjects areas you added to the configuration tag.

    2. Highlight the subject areas, and click Reassemble.

To add non-autogenerated tasks to a subject area

  1. In the DAC toolbar, select the appropriate source system container from the drop-down list in the toolbar.

  2. In the Menu bar, select Views, then select Design, then select Configuration Tags.

  3. Create a new configuration tag.

    1. Click New in the top pane toolbar.

    2. In the Edit child tab, enter a name.

    3. Select the Include Tasks check box.

    4. Click Save.

  4. Add the non-autogenerated tasks to the configuration tag.

    1. With the new configuration tag highlighted in the top pane, click the Tasks child tab.

    2. In the bottom pane toolbar, click Add/Remove.

    3. In the Tasks dialog, query for the extraneous tasks you want to add to the configuration tag.

    4. Highlight the tasks, and then click Add.

    5. Click Save, and then click OK to close the window.

  5. Associate the configuration tag with the subject areas to which you want to add the non-autogenerated tasks.

    1. With the configuration tag highlighted in the top pane, click the Subject Areas child tab.

    2. Click Add/Remove in the bottom pane toolbar.

    3. In the Subject Areas dialog, query for one or more subject areas to which you want to add the non-autogenerated tasks.

    4. Highlight the appropriate subject areas, and click Add.

    5. Click Save, and then click OK to close the window.

  6. Reassemble the subject area.

    1. In the Subject Area tab, query for all the subjects areas you added to the configuration tag.

    2. Highlight the subject areas, and click Reassemble.

To assign only configuration tag tasks to a subject area (excludes the subject area's autogenerated tasks)

  1. In the DAC toolbar, select the appropriate source system container from the drop-down list in the toolbar.

  2. In the Menu bar, select Views, then select Design, then select Subject Areas.

  3. Query for the subject area to which you want to add configuration tag tasks.


    Note:

    The autogenerated tasks for this subject area will be excluded.

  4. Select the Configuration Tag Tasks Only check box, and click Save.

  5. Create a configuration tag.

    1. Navigate to the Configuration Tags tab.

    2. Click New in the top pane toolbar.

    3. In the Edit child tab, enter a name.

    4. Select the Include Tasks check box.

    5. Click Save.

  6. Add the tasks to the configuration tag.

    1. With the new configuration tag highlighted in the top pane, click the Tasks child tab.

    2. In the bottom pane toolbar, click Edit.

    3. In the Tasks dialog, query for the tasks you want to add to the configuration tag.

    4. Highlight the tasks, and then click Add.

    5. Click Save, and then click OK to close the window.

  7. Associate the configuration tag with the subject area.

    1. With the configuration tag highlighted in the top pane, click the Subject Areas child tab.

    2. Click Add/Remove in the bottom pane toolbar.

    3. In the Subject Areas dialog, query for the appropriate subject area.

    4. Highlight the subject area, and click Add.

    5. Click Save, and then click OK to close the window.

  8. Reassemble the subject area.

    1. In the Subject Area tab, query for all the subjects areas you added to the configuration tag.

    2. Highlight the subject areas, and click Reassemble.

Considerations in Designing a Subject Area

Oracle Business Intelligence Applications provides preconfigured subject areas. You can change these preconfigured subject areas or create new subject areas to correspond to your particular business processes.


Note:

To change a preconfigured subject area or to create a new subject area, you must first make a copy of an existing source system container or create a new container. For instructions, see "Creating or Copying a Source System Container".

Designing a Subject Area

In designing a subject area, you should consider the following questions:

  • Tables. Which tables need to be populated for the data warehouse? From which tables does your organization source data? What tables will create the star schemas.

  • Subject areas. Do the subject areas cover all the relevant tables?

  • Tasks. Are the tasks that load this table defined?

  • Indexes. Do the target tables have the correct indexes defined?

How the DAC Determines Tasks Required for Subject Areas

A subject area is a collection of tasks. When a subject area is defined, the DAC uses the following logic to assemble the collection of tasks:

  1. Initial selection of tables.

    Find all the fact tables that belong to the subject areas.

  2. Recursive selection of related tables.

    Recursively find all the tables directly related through foreign keys and all other logically related tables.

  3. Initial selection of tasks.

    Find all the tasks that load into the tables selected above, that is, tasks whose target tables are one of the tables identified above.

  4. Recursive selection of all tasks.

    Depending on the source and target table relationships, recursively figure out the prerequisite tasks.

How the DAC Determines the Order of Task Execution within an Execution Plan

An execution plan is a collection of subject areas and a unique collection of tasks. A task can have prerequisite tasks that need to be executed before its own execution. The DAC determines the order of tasks based on the following considerations:

  • A task's source and target table

    The DAC server first looks at a task's source and target table. For example, suppose table A is populated by task T1 by reading from table B, and table B is populated by task T2 by reading from table C. The DAC server would determine task T2 should be executed before T1.

    The DAC server next considers the following:

  • Task phase

    An ETL process typically goes through several phases. An example of a typical order in which phases are executed is as follows:

    1. Extract Dimension

    2. Extract Fact

    3. Load Dimension

    4. Load Fact and Load Hierarchy (executed in parallel)

    5. Load Aggregate tables

    6. Update Dimensions

  • A table's Truncate Always properties

    The order of execution based on Truncate Always properties is as follows:

    1. Insert

    2. Upsert

    3. Physical data source

    4. Priority

  • The DAC randomly organizes tasks that are the same in all properties. If some tasks need to be executed in a particular order, the DAC enables you to create a task group in which you can specify an execution order.

Creating a Subject Area

When you create a new subject area, you assign one or more fact tables to the subject area. The DAC then determines which dimension and other related tables are required as well as the tasks and their order of execution.

To create a new subject area

  1. In the DAC, select the appropriate source system container from the drop-down list in the toolbar.

  2. In the Menu bar, select Views, then select Design, then Subject Areas.

  3. In the top pane toolbar, click New.

  4. In the Edit child tab, enter a name for the subject area, and click Save.

  5. Make sure the new subject area name is highlighted in the top pane, and click the Tables child tab.

  6. Click Add/Remove in the child tab toolbar.

    The Choose Tables dialog opens. The left-hand window lists all the tables held in the selected container.

  7. Query for one or more fact tables.

  8. Select the fact table (use Shift+click to select more than one table), and click Add.

    The tables are added to the right-hand window, which represents the subject area.

  9. Click OK to close the Choose Tables dialog.

  10. In the top pane toolbar, click Assemble.

  11. In the Assembling... dialog, select Selected Record Only.

    If you select the option All Records in the List, the DAC will reassemble all the subject areas listed in the top pane.

    The DAC assembles the selected subject area by determining what dimensions and other related tables are required and what tasks are needed to load these tables.

    You will receive an informational message when the assemble process is completed.

  12. Click the Tasks tab to view which tasks the DAC has determined are required for this subject area.

    Tasks that are automatically assigned to the subject area by the DAC are indicated with the Autogenerated check mark.

    You can inactivate a task from participating in the subject area by selecting the Inactive check box. When the Inactive check box is selected, the task remains inactive even if you reassemble the subject area.

    You can also remove a task from the subject area using the Add/Remove command, but when you remove a task it is only removed from the subject area until you reassemble the subject area.

Building and Running an Execution Plan with the DAC

Execution plans are subject areas that are used to execute ETL processes. Before you attempt to run an execution plan, make sure you have completed the following:


Note:

All subject areas in an execution plan must belong to the same source system container.

To build and run an execution plan

  1. Navigate to the Execute view, then select the Execution Plans tab.

  2. Create a new execution plan.

    1. In the top pane toolbar, click New.

    2. In the Edit child tab, enter a name for the execution plan and other appropriate information.

      For a description of the fields in this tab, see "Execution Plans Tab".

    3. Click Save.

  3. Associate one or more subject areas with the execution plan.

    1. Click the Subject Areas child tab.

    2. Click Add/Remove in the bottom pane toolbar.

    3. In the Choose Subject Areas dialog, select the appropriate source system container.

    4. Query for the subject area you want to associate with the execution plan.

    5. Select the subject area and click Add.

      You can associate multiple subject areas with an execution plan, but all the subject areas must be from the same source system container.

    6. Click OK to close the window.

  4. Generate the runtime parameters.

    1. Click the Parameters child tab.

    2. Click Generate in the bottom pane toolbar.

    3. For each Datasource type, enter the appropriate name in the Value field.

      For the data source type of FlatFileConnection, make sure you have copied all files into the directory specified in the DAC system property InformaticaParameterFileLocation.

  5. In the top pane of the Execution Plans tab, make sure the new execution plan is highlighted, and click Build.

    The DAC builds the execution plan.

  6. To run the execution plan, select the execution plan in the top pane, and click Run Now.

    Once the ETL process starts running you can monitor its progress in the Current Run tab.

    For information about how refresh dates are tracked, see "About Refresh Dates".

    To schedule an execution plan, see "Scheduling an Execution Plan".

Creating a Micro ETL Execution Plan

Micro ETL execution plans are ETL processes that you schedule at very frequent intervals, such as hourly or half-hourly. They usually handle small subject areas or subsets of larger subject areas. The DAC tracks refresh dates for tables in micro ETL execution plans separately from other execution plans and uses these refresh dates in the change capture process.

After a micro ETL execution plan runs, the DAC populates refresh date values in the Refresh Dates child tab of the Execution Plans tab. If a subject area is used in a regular execution plan (an execution plan with the Keep Separate Refresh Dates option not selected) as well as a micro ETL execution plan, the DAC maintains refresh dates for the tables in the regular execution plan in the Refresh Dates child tab of the Physical Data Sources tab (Setup view).

In cases of a subject area being used in both a regular and micro ETL execution plan and the micro ETL execution plan is suspended for a few days but the regular execution plan runs nightly, the DAC automatically detects the last refresh date for the tables common to both execution plans and intelligently extracts only the most recent records for the micro ETL execution plan.


Caution:

Micro ETL processes can cause issues with data inconsistencies, data availability, and additional load on the transactional database. Therefore, you should consider the following factors before implementing a micro ETL process:

To create a micro ETL execution plan

  1. In the DAC toolbar, select the appropriate source system container from the drop-down list in the toolbar.

  2. In the Menu bar, select Views, then select Design, then select Subject Areas.

  3. In the Subject Areas tab, assemble a small subject area.

  4. In the Tasks child tab, inactivate all tasks that are not required for the execution plan.

  5. Create a new execution plan.

    1. Navigate to the Execute view, then select the Execution Plans tab.

    2. Enter a name for the execution plan

    3. Select the Keep Separate Refresh Dates check box.

    4. Click Save.

  6. Associate one or more subject areas with the execution plan.

    1. Click the Subject Areas child tab.

    2. Click Add/Remove in the bottom pane toolbar.

    3. In the Choose Subject Areas dialog, select the appropriate source system container.

    4. Query for the subject area you want to associate with the execution plan.

    5. Select the subject area and click Add.

      You can associate multiple subject areas with an execution plan, but all the subject areas must be from the same source system container.

    6. Click OK to close the window.

  7. Generate the runtime parameters.

    1. Click the Parameters child tab.

    2. Click Generate in the bottom pane toolbar.

    3. For each Datasource type, enter the appropriate name in the Value field.

  8. In the top pane of the Execution Plans tab, make sure the new execution plan is highlighted, and click Build.

    The DAC builds the execution plan.

  9. Navigate to the Ordered Tasks child tab and verify the sequence of tasks.

    The execution plan is now ready to run as a micro ETL execution plan.

  10. Create a schedule for the micro ETL execution plan. For instructions, see "Scheduling an Execution Plan".

Scheduling an Execution Plan

Use this procedure to schedule an execution plan in the DAC.

To schedule an execution plan

  1. In the DAC, navigate to the Scheduler tab.

    The current list of schedules appears in the top pane.

  2. Click New in the top pane toolbar.

    The Edit tab in the bottom pane becomes active.

  3. Enter a name for the schedule.

  4. Select an execution plan.

  5. If you want the schedule to run once, select the Run Only Once check box, and then select a start and end date and time.

  6. To create a periodic schedule, select a recurrence pattern, and enter the appropriate date and time parameters.

  7. Click Save.

About Refresh Dates

Refresh dates refer to the date of the last ETL process (the last time data was extracted from tables in a given database or loaded into tables in a given database). The DAC uses the refresh dates to determine whether to run the incremental load commands or to run full load commands and whether to truncate the target tables.

Refresh dates are tracked only for tables that are either a primary source or a primary target on tasks in a completed run of an execution plan. The DAC runs the full load command for tasks on which a table is a primary source or target if the refresh date against the table is null. When there are multiple primary sources, the earliest of the refresh dates will trigger a full load or an incremental load. If any one of the primary source tables has no refresh date, then the DAC will run the full load command.

Table 5-1 shows the possible scenarios regarding refresh dates.

Table 5-1 Refresh Date Scenarios

Scenario Table Type (in Tasks child tabs) Refresh Date Command DAC Will Use Truncate Target Table?

1

Primary Source

Null

Full Load

Yes

1

Primary Target

Null

Not applicable

Not applicable

2 (See note below)

Primary Source

Null

Full Load

No

2

Primary Target

Not Null

Not applicable

Not applicable

3 (See note below)

Primary Source

Not Null

Full Load

Yes

3

Primary Target

Null

Not applicable

Not applicable

4

Primary Source

Not Null

Incremental Load

No

4

Primary Target

Not Null

Not applicable

Not applicable


Monitoring Execution Plan Processes

The Current Run tab in the Execute view provides predefined reports that enable you to monitor execution plan processes in order to isolate bottlenecks and enhance performance.

To monitor an execution plan process

  1. In the DAC, navigate to the Current Run tab.

  2. Right-click and select Get Run Information.

  3. The following options are available:

    • Get log file

    • Analyze run

    • Get chart

    • Get phase chart

    • Get graph