Oracle® Business Intelligence Data Warehouse Administration Console Guide Version 7.9.5 Part Number E12085-01 |
|
|
View PDF |
This chapter provides information about customizing, designing, executing, and monitoring ETL processes.
This section includes the following topics:
Importing New Data Warehouse Objects into the Informatica Repository
Creating Tasks in the DAC for New or Modified Informatica Workflows
The metadata for a source system is held in a container. You cannot change the metadata for preconfigured containers. If you want to customize the metadata in a preconfigured container, you must first make a copy of the container. The DAC keeps track of all customizations in the copied container, so that at any time you can find the newly created objects and modified objects, as well as the original objects.
You can also create a new, empty container if you want to build your own container with customized metadata.
To create a new container or copy an existing container
In the DAC menu bar, select File, then select New Source System Container.
Enter an ID and a name for the container.
The ID and Name fields are alphanumeric. The Name can contain spaces but the ID cannot.
Select one of the following:
Create Empty New Source System Container
Create as a Copy of Source System Container
If you are creating an empty, new container, click OK.
If you are making a copy of an existing container, select the existing container from the drop-down list, and then click OK.
You can add tables, columns, and indexes to the data warehouse, and you can modify these existing objects. Customizing the data warehouse in this way requires using the DAC and Informatica client tools. For more information about using Informatica client tools to customize the Oracle Business Analytics Warehouse, see the Oracle Business Intelligence Applications Fusion Edition Installation and Configuration Guide.
Figure 5-1 shows the major steps required for adding a new object to the data warehouse or modifying existing objects. As shown in Figure 5-1, you can begin the customization process by adding or modifying the new data warehouse object in the DAC and then using the DAC's Data Warehouse Configurator to create or update the object in the data warehouse. Alternatively, you can add or modify the object directly in the data warehouse database and then use the DAC's Import from Database command to add the new object in the DAC.
As shown in Figure 5-1, there are two alternative process flows for adding a new object to the data warehouse. You can enter the table and column definitions in the DAC and then use the DAC's Data Warehouse Configurator to create the table and columns in the data warehouse database; for this method, follow the procedure, "To add a new table and columns to the data warehouse using the DAC's Data Warehouse Configurator".
Alternatively, you can add the new table and column definitions directly in the data warehouse database and then use the DAC's Import from Database command to add the new table and columns in the DAC; for this method, follow the procedure, "To add a new table and columns using the DAC's Import command".
To add a new table and columns to the data warehouse using the DAC's Data Warehouse Configurator
In the DAC toolbar, select the appropriate source system container from the drop-down list in the toolbar.
In the Menu bar, select Views, then select Design, then select Tables.
Create the new table.
In the Tables tab, click New.
In the Edit child tab, enter the appropriate information about the table, and click Save.
For a description of the fields in this tab, see "Tables Tab".
Add the columns for the new table.
In the Columns child tab, click New.
Enter the appropriate column information for each column you want to add to the table, and click Save.
Enter the appropriate foreign key table and column information.
Note:
For performance purposes, it is recommended that you do not enter more than 254 columns to a dimension of fact table.Create the new tables and columns in the data warehouse database.
Select Tools, then select ETL Management, then select Configure.
Select the appropriate Source and Target database platforms, and then click OK.
In the Data Warehouse Configuration Wizard, select Create Data Warehouse Tables, and then click Next.
Enter the required information, and then click Start.
An informational message reports whether the process was successful. For information about the process, you can review the createwtables.log file in the OracleBI\DAC\log\config folder.
To add a new table and columns using the DAC's Import command
Add the new table and column definitions into the data warehouse database.
In the DAC toolbar, select the appropriate source system container from the drop-down list in the toolbar.
In the Menu bar, select Views, then select Design, then select Tables.
Import the new table definition.
Right-click and select Import from Database, then select Import Database Tables.
In the Import Tables dialog, select DataWarehouse.
Optionally, enter filter criteria to identity the table name you entered in Step 0.
See "DAC Query Commands and Operators" for available filter commands and operators.
Click Read Tables.
In the list of tables displayed, select the Import check box for the tables you want to import.
Click Import Tables.
An informational message indicates whether the process was successful.
Import the new column definitions.
In the Tables tab, query for the table you imported in Step 4.
With the table highlighted, right-click and select Import from Database, then select Import Database Columns.
In the Importing Columns... dialog, select Selected Record Only, and then click OK.
In the Import Table Columns dialog, click Read Columns.
The Changes column displays a description of column changes, which are explained below:
Change | Explanation |
---|---|
The object was added to the database. | The column is in the database but not the DAC repository. Importing it will add the column to the DAC repository. |
The object was added to the repository. | The column is in the DAC repository but not in the database. Importing it will delete it from the DAC repository. |
The object was modified. | The column definition in the database does not match the definition in the DAC repository. |
In the list of columns displayed, select the Import check box for the columns you want to import.
Click Import Columns.
An informational message indicates whether the process was successful.
Follow this procedure to add a new index to the data warehouse.
To add a new index to the data warehouse
Add the new index definition into the data warehouse database.
In the DAC toolbar, select the appropriate source system container from the drop-down list in the toolbar.
In the Menu bar, select Views, then select Design, then select Tables.
Query for the table for which you want to import index definitions.
Right-click and select Import from Database, then select Import Indices.
Choose to import indexes for a selected table or for all the records retrieved in your query, and click OK.
In the Import Indices dialog, select DataWarehouse from the Data Sources drop-down list.
Click Read Indices.
In the list of indexes displayed, select the Import check box for the indexes you want to import.
Click Import Indices.
An informational message indicates whether the process was successful.
This step requires using Informatica client tools to import new data warehouse objects into the Informatica repository. For instructions on this step of customizing the data warehouse, see Oracle Business Intelligence Applications Fusion Edition Installation and Configuration Guide.
This step requires using Informatica client tools to create new Informatica mappings and workflows for the data warehouse objects that you imported into the Informatica repository. For instructions on this step of customizing the data warehouse, see Oracle Business Intelligence Applications Fusion Edition Installation and Configuration Guide.
You need to perform this step for all new workflows you create in Informatica and for all workflows that you modify.
To create a task in the DAC for new or modified Informatica workflows
In the DAC toolbar, select the appropriate source system container from the drop-down list in the toolbar.
In the DAC, create custom logical and physical task folders for the custom folder you created in the Informatica repository.
In the DAC, navigate to Tools, then select Seed Data, then select Task Folders.
To create a custom logical folder, click New.
In the Name field, enter a name for the custom logical folder, for example, Custom Logical.
In the Type field, select Logical.
To create a custom physical folder, click New.
In the Name field, enter a name for the custom physical folder, for example, Custom Physical.
In the Type field, select Physical.
Register the folders you created in Step 2 in the Source System Folders tab.
Navigate to Design, then select Source System Folders.
Click New.
In the Edit child tab, enter the name of the custom logical folder in the Logical Folder field.
Enter the name of the custom physical folder in the Physical Folder field, and click Save.
Create new tasks for the workflows.
Navigate to Design, then select Tasks, and click New in the top pane toolbar.
In the Edit child tab, enter the workflow name as it appears in Informatica Workflow Manager.
Right-click and select Synchronize Tasks.
Select Selected Record Only, and click OK. Click OK in the informational message box.
This command synchronizes the source and target table information between the DAC and Informatica.
In the Tasks tab, enter the remaining information required for the task.
For a description of the fields in this tab, see "Tasks Tab".
The new table is now ready to be associated with a subject area. For information about creating a subject area, see "Creating a Subject Area".
A task phase dependency enables you to change the order in which tasks are executed.
To set a task phase dependency
In the DAC toolbar, select the appropriate source system container from the drop-down list in the toolbar.
In the Menu bar, select Views, then Design, then select Tasks.
Query for the task for which you want to add a phase dependency, and make sure it is highlighted.
Click the Phase Dependency child tab.
Click Add/Remove.
In the Choose Phases dialog, select a phase and click Add.
Click OK in the message box that states the phase was added.
Select an Action and a Grain, click OK.
For information about these fields, see "Tasks Tab: Phase Dependency Subtab".
The task phase dependency appears in the Phase Dependency child tab.
Reassemble the appropriate subject area.
In the Subject Areas tab, query for the appropriate subject area.
Click Assemble.
Rebuild the execution plan.
Navigate to the Execution Plans tab in the Execute view.
Query for the appropriate execution plan.
In the Parameters child tab, click Generate.
In the top pane toolbar, click Build.
The DAC automatically organizes tasks into a dependency structure based on dependency rules. For information about the DAC's dependency rules, see "How the DAC Determines the Order of Task Execution within an Execution Plan". The DAC assigns priority randomly to tasks that have the same properties. You can use the Task Group feature to group such tasks that share the same properties and enforce a priority of your choosing.
This feature can be useful for the following: truncation and restartability purposes; when more than one task with similar properties writes to the same table; and when there is a circular read/write relationship between tables; for example, task 1 reads from table A and writes to table B, and task 2 reads from table B and writes to table A.
To create a task group
In the DAC, select the appropriate source system container from the drop-down list in the toolbar.
In the Menu bar, select Views, then select Design, then select Task Groups.
Create a new task group.
Click New in the top pane toolbar.
In the Edit child tab, enter a name and select the appropriate properties.
Click the Child Tasks child tab, and click Add/Remove in the toolbar.
In the left-hand window of the Choose Child Tasks dialog, query for the tasks you want to add to the task group.
Select the tasks, and click Add.
In the right-hand window, enter an execution order.
Click Save, and then click OK to close the window.
This section describes how the DAC handles parameters and how you can define and manage parameters at the source system and task levels. It contains the following topics:
The ETL logic in Oracle Business Intelligence Applications uses parameters in the Informatica mappings and sessions. You define and manage parameters using the DAC parameter management feature. A parameter can apply to all tasks under a source system container (referred to as a source system parameter) or it can apply to a particular task (referred to as a task level parameter). Parameters set up at the task level have priority over parameters set up at the source system level.
In the DAC, there are two types of parameters: static and runtime. The value of static parameters remains constant for all ETL runs. Examples of static parameters include language codes and currencies. The value of runtime parameters is dynamic, and the DAC updates this value for each ETL run. Examples of dynamic parameters include last refresh dates and last WID sequence numbers.
Parameters can have one of the following data types.
Text
The value for the parameter is defined as text. You can use the Text data type for both static and runtime parameters.
DB Specific Text
The value for the parameter is defined as database-specific text. This parameter should be used only if you have a heterogeneous database environment, and the parameter value needs to be different for the different database types. The DAC evaluates the string based on the source or target database type. If you do not specify database-specific text, the DAC returns the default value.
Timestamp
The value for the parameter is defined as a timestamp. You can use the Timestamp data type for both static and runtime parameters. A static timestamp can be any time that is constant. A runtime timestamp parameter is a variable for which the value is supplied by the DAC at runtime. You can define the timestamp in one of multiple formats or define a custom format. You can also use SQL to fetch any value that can be fired against the specified logical database connection. The DAC executes the SQL against a data source that maps to the specified logical connection and then formats the resulting value in the specified format. A SQL specified for a given timestamp parameter can include nested DAC parameters. For information about nested parameters, see "Nesting Parameters within Other Parameters".
SQL
The DAC fetches the value for the parameter from a database using SQL.
Oracle Business Intelligence Applications ships with preconfigured parameters. Some of these preconfigured parameters are held in text files named parameterfileDW.txt and parameterfileOLTP.txt, which are stored in the folder \OracleBI\DAC\Informatica\parameters\input. Other preconfigured parameters are held in the DAC. The parameters held in the DAC are specific to the different source system containers.
You can add new parameters in the DAC or change the existing parameters held in the DAC. However, Oracle recommends that you do not change the parameters held in the parameter text files. You can override the parameters in the text files by creating new parameters in the DAC.
If you do make changes to the parameter text files, however, make sure the files remain in the folder \OracleBI\DAC\Informatica\parameters\input.
During an ETL execution, the DAC reads and evaluates the parameters held in the text files parameterfileDW.txt and parameterfileOLTP.txt along with the parameters held in the DAC. The DAC then creates an individual parameter file for each session. This file contains the evaluated name-value pairs for all parameters, both static and runtime. The naming convention for this parameter file is <Informatica folder name>.<Informatica session name>.txt. The DAC copies this file to a location specified in the DAC system property InformaticaParameterFileLocation.
Note:
The Informatica Server must be configured to read parameter files from the location specified in the DAC system property InformaticaParameterFileLocation. For instructions on setting this property, see Oracle Business Intelligence Applications Fusion Edition Installation and Configuration Guide.You can nest any parameter definition within another parameter definition. For example, you could nest a runtime text parameter that returns the current run ID within a where clause of a SQL parameter, or you could use database specific text inside another parameter of the text or SQL data types. Parameters that are nested within other parameters must use the prefix @DAC_.
An example of a text parameter that returns the current run ID placed in a where clause of a SQL parameter would look similar to the following:
SELECT VALUE FROM PARAM_TEST WHERE ROW_ID=`@DAC_p1'
Note:
Avoid circular nesting, such as parameter A nested within parameter B, which is nested within parameter A. In such situations, the DAC randomly picks one of the parameters in the circle and evaluates it as an empty string.Follow this procedure to define a parameter using the Text data type. This procedure applies to parameters defined at both the source system and task levels.
To define a text type parameter:
Do one of the following:
To define a source system parameter, from the Views menu, select Design, and then select Source System Parameters.
To define a task level parameter, from the Views menu, select Design, then select Tasks, and then click the Parameters subtab.
Click New.
Enter a parameter name.
Select the Text data type.
Click in the Value field to open the Enter Parameter Value dialog.
Select one of the following options:
Static. This option specifies a value that remains constant for all ETL runs.
Runtime. This option specifies a value will be updated by the DAC before each ETL run.
If you selected the Static option, enter a text value in the text window, and click OK.
If you selected the Runtime option, select a DAC Variable from the list, and click OK.
(Optional) To inactivate the parameter, select Inactive.
Click Save.
Follow this procedure to define a parameter using the DB Specific Text data type. This procedure applies to parameters defined at both the source system and task levels.
To define a database specific text type parameter:
Do one of the following:
To define a source system parameter, from the Views menu, select Design, and then select Source System Parameters.
To define a task level parameter, from the Views menu, select Design, then select Tasks, and then click the Parameters subtab.
Click New.
Enter a parameter name.
Select the DB Specific Text data type.
Click in the Value field to open the Enter Parameter Value dialog.
Select one of the following Connection Type options:
@DAC_SOURCE_DBTYPE. This option specifies a source database connection.
@DAC_TARGET_DBTYPE. This option specifies a target database connection.
To define a parameter specific to all database types:
Click in the Default field to open the Default text box.
Enter the parameter definition, and click OK.
To define a parameter specific to a particular database type:
Click in the appropriate database type field to open the text box.
Enter the parameter definition, and click OK.
Click OK to close the Enter Parameter Value dialog.
(Optional) To inactivate the parameter, select Inactive.
Click Save.
Follow this procedure to define a parameter using the Timestamp data type. This procedure applies to parameters defined at both the source system and task levels.
To define a Timestamp type parameter:
Do one of the following:
To define a source system parameter, from the Views menu, select Design, and then select Source System Parameters.
To define a task level parameter, from the Views menu, select Design, then select Tasks, and then click the Parameters subtab.
Click New.
Enter a parameter name.
Select the Timestamp data type.
Click in the Value field to open the Enter Parameter Value dialog.
Select one of the following options:
Static. This option specifies a value that remains constant for all ETL runs.
Runtime. This option specifies the value will be updated by the DAC before each ETL run.
SQL. This option
If you selected the Static option:
Click in the Date field to open the Date dialog.
Enter a data and time, click OK.
If you selected the Runtime option:
Click in the Value field to open the Enter Parameter Value dialog.
Select a Variable from the list.
From the Function list, select a format to which the DAC will convert the date. If you select Custom, enter a custom date format.
If you select SQL Syntax or SQL Syntax (Date Only), select a Connection Type.
If you selected the SQL option:
Click in the SQL field to open the Enter Parameter Value dialog.
Select a Logical Data Source from the list.
Enter the parameter definition and click OK.
Click OK to close the Enter Parameter Value dialog.
(Optional) To inactivate the parameter, select Inactive.
Click Save.
Follow this procedure to define a parameter using the text data type. This procedure applies to parameters defined at both the source system and task levels.
To define a SQL type parameter:
Do one of the following:
To define a source system parameter, from the Views menu, select Design, and then select Source System Parameters.
To define a task level parameter, from the Views menu, select Design, then select Tasks, and then click the Parameters subtab.
Click New.
Enter a parameter name.
Select the SQL data type.
Click in the Value field to open the Enter Parameter Value dialog.
Select a Logical Data Source.
Enter a SQL statement, and click OK.
(Optional) To inactivate the parameter, select Inactive.
Click Save.
You can specify tablespaces for indexes by table type in the Setup view in the DAC. For example, you could specify a tablespace for indexes of the dimension table type.
Note:
You must create the tablespace in the database before you can specify tablespaces for indexes.To specify tablespaces for indexes by table type
In the DAC toolbar, click Setup, then click the Physical Data Sources tab, then click the Index Spaces subtab.
In the Index Spaces toolbar, click Generate.
A list of available table types appears in the Table Type list.
In the Index Space column, enter an index space for each table type, and click Save.
If the Index Space property is left empty for a table type, the default index space for the default database connection will be used. The default index space is specified on the Edit subtab of the Physical Data Sources tab in the Setup view. If the Default Index Space property is also empty, the default tablespace assigned for the table owner will be used.
A configuration tag is an object that controls the inclusion of tasks in subject areas. When a task is tagged, it is not eligible to be included in the collection of tasks for any subject area, unless the tag is part of the subject area definition "Include Task" property.
A configuration tag can function in one of the following ways:
Remove tasks from all subject areas
If you assign a task to a configuration tag, the task will not be eligible to participate in any subject area. For instructions, see "To remove tasks from all subject areas".
Reassign autogenerated tasks to a specific subject area
An autogenerated task is a task that the DAC automatically assigns to a subject area when the subject area is assembled.
For autogenerated tasks that were removed from participating in a subject area, you can set up the configuration tag to reassign a task to participate in specific subject areas. You do this by associating the configuration tag with the desired subject area. This method only applies to tasks that are autogenerated tasks of a subject area. For instructions, see "To reassign autogenerated tasks to a subject area".
Add non-autogenerated tasks to a subject area
You can set up a configuration tag to add non-autogenerated tasks to a subject area. The non-autogenerated tasks will participate in the subject area along with the subject area's autogenerated tasks. For instructions, see "To add non-autogenerated tasks to a subject area".
Assign only configuration tag tasks to a subject area (excludes the subject area's autogenerated tasks)
You can also set up a configuration tag so that only tasks that were assigned to the configuration tag participate in a specific subject area. In this case, the subject area's autogenerated tasks do not participate. For instructions, see "To assign only configuration tag tasks to a subject area (excludes the subject area's autogenerated tasks)"
To remove tasks from all subject areas
In the DAC toolbar, select the appropriate source system container from the drop-down list in the toolbar.
In the Menu bar, select Views, then select Design, then select Configuration Tags.
Create a new configuration tag.
Click New in the top pane toolbar.
In the Edit child tab, enter a name.
Make sure the Include Tasks check box is not selected.
Click Save.
Add tasks to the configuration tag.
With the new configuration tag highlighted in the top pane, click the Tasks child tab.
In the bottom pane toolbar, click Add/Remove.
In the Tasks dialog, query for the tasks you want to add to the configuration tag.
Highlight the tasks, and then click Add.
The tasks appear in the right-hand window.
Click Save, and then click OK to close the window.
These tasks will not be eligible to participate in any subject area.
To reassign autogenerated tasks to a subject area
In the DAC toolbar, select the appropriate source system container from the drop-down list in the toolbar.
In the Menu bar, select Views, then Design, then select Configuration Tags.
Query for the configuration tag that contains the tasks you want to reassign to a subject area.
Verify the configuration tag contains the appropriate tasks by clicking the Tasks child tab and reviewing the list of tasks associated with this configuration tag.
Note:
Only a subject area's autogenerated tasks will be reassigned. If non-autogenerated tasks appear in the list, the DAC will ignore them.Associate the configuration tag with the subject areas to which you want to reassign the tasks.
With the configuration tag highlighted in the top pane, click the Subject Areas child tab.
Click Add/Remove in the bottom pane toolbar.
In the Subject Areas dialog, query for one or more subject areas to which you want to reassign the task or tasks.
Highlight the appropriate subject areas, and click Add.
Click Save, and then click OK to close the window.
Reassemble the subject area.
In the Subject Area tab, query for all the subjects areas you added to the configuration tag.
Highlight the subject areas, and click Reassemble.
To add non-autogenerated tasks to a subject area
In the DAC toolbar, select the appropriate source system container from the drop-down list in the toolbar.
In the Menu bar, select Views, then select Design, then select Configuration Tags.
Create a new configuration tag.
Click New in the top pane toolbar.
In the Edit child tab, enter a name.
Select the Include Tasks check box.
Click Save.
Add the non-autogenerated tasks to the configuration tag.
With the new configuration tag highlighted in the top pane, click the Tasks child tab.
In the bottom pane toolbar, click Add/Remove.
In the Tasks dialog, query for the extraneous tasks you want to add to the configuration tag.
Highlight the tasks, and then click Add.
Click Save, and then click OK to close the window.
Associate the configuration tag with the subject areas to which you want to add the non-autogenerated tasks.
With the configuration tag highlighted in the top pane, click the Subject Areas child tab.
Click Add/Remove in the bottom pane toolbar.
In the Subject Areas dialog, query for one or more subject areas to which you want to add the non-autogenerated tasks.
Highlight the appropriate subject areas, and click Add.
Click Save, and then click OK to close the window.
Reassemble the subject area.
In the Subject Area tab, query for all the subjects areas you added to the configuration tag.
Highlight the subject areas, and click Reassemble.
To assign only configuration tag tasks to a subject area (excludes the subject area's autogenerated tasks)
In the DAC toolbar, select the appropriate source system container from the drop-down list in the toolbar.
In the Menu bar, select Views, then select Design, then select Subject Areas.
Query for the subject area to which you want to add configuration tag tasks.
Note:
The autogenerated tasks for this subject area will be excluded.Select the Configuration Tag Tasks Only check box, and click Save.
Create a configuration tag.
Navigate to the Configuration Tags tab.
Click New in the top pane toolbar.
In the Edit child tab, enter a name.
Select the Include Tasks check box.
Click Save.
Add the tasks to the configuration tag.
With the new configuration tag highlighted in the top pane, click the Tasks child tab.
In the bottom pane toolbar, click Edit.
In the Tasks dialog, query for the tasks you want to add to the configuration tag.
Highlight the tasks, and then click Add.
Click Save, and then click OK to close the window.
Associate the configuration tag with the subject area.
With the configuration tag highlighted in the top pane, click the Subject Areas child tab.
Click Add/Remove in the bottom pane toolbar.
In the Subject Areas dialog, query for the appropriate subject area.
Highlight the subject area, and click Add.
Click Save, and then click OK to close the window.
Reassemble the subject area.
In the Subject Area tab, query for all the subjects areas you added to the configuration tag.
Highlight the subject areas, and click Reassemble.
Oracle Business Intelligence Applications provides preconfigured subject areas. You can change these preconfigured subject areas or create new subject areas to correspond to your particular business processes.
Note:
To change a preconfigured subject area or to create a new subject area, you must first make a copy of an existing source system container or create a new container. For instructions, see "Creating or Copying a Source System Container".In designing a subject area, you should consider the following questions:
Tables. Which tables need to be populated for the data warehouse? From which tables does your organization source data? What tables will create the star schemas.
Subject areas. Do the subject areas cover all the relevant tables?
Tasks. Are the tasks that load this table defined?
Indexes. Do the target tables have the correct indexes defined?
A subject area is a collection of tasks. When a subject area is defined, the DAC uses the following logic to assemble the collection of tasks:
Initial selection of tables.
Find all the fact tables that belong to the subject areas.
Recursive selection of related tables.
Recursively find all the tables directly related through foreign keys and all other logically related tables.
Initial selection of tasks.
Find all the tasks that load into the tables selected above, that is, tasks whose target tables are one of the tables identified above.
Recursive selection of all tasks.
Depending on the source and target table relationships, recursively figure out the prerequisite tasks.
An execution plan is a collection of subject areas and a unique collection of tasks. A task can have prerequisite tasks that need to be executed before its own execution. The DAC determines the order of tasks based on the following considerations:
A task's source and target table
The DAC server first looks at a task's source and target table. For example, suppose table A is populated by task T1 by reading from table B, and table B is populated by task T2 by reading from table C. The DAC server would determine task T2 should be executed before T1.
The DAC server next considers the following:
Task phase
An ETL process typically goes through several phases. An example of a typical order in which phases are executed is as follows:
Extract Dimension
Extract Fact
Load Dimension
Load Fact and Load Hierarchy (executed in parallel)
Load Aggregate tables
Update Dimensions
A table's Truncate Always properties
The order of execution based on Truncate Always properties is as follows:
Insert
Upsert
Physical data source
Priority
The DAC randomly organizes tasks that are the same in all properties. If some tasks need to be executed in a particular order, the DAC enables you to create a task group in which you can specify an execution order.
When you create a new subject area, you assign one or more fact tables to the subject area. The DAC then determines which dimension and other related tables are required as well as the tasks and their order of execution.
To create a new subject area
In the DAC, select the appropriate source system container from the drop-down list in the toolbar.
In the Menu bar, select Views, then select Design, then Subject Areas.
In the top pane toolbar, click New.
In the Edit child tab, enter a name for the subject area, and click Save.
Make sure the new subject area name is highlighted in the top pane, and click the Tables child tab.
Click Add/Remove in the child tab toolbar.
The Choose Tables dialog opens. The left-hand window lists all the tables held in the selected container.
Query for one or more fact tables.
Select the fact table (use Shift+click to select more than one table), and click Add.
The tables are added to the right-hand window, which represents the subject area.
Click OK to close the Choose Tables dialog.
In the top pane toolbar, click Assemble.
In the Assembling... dialog, select Selected Record Only.
If you select the option All Records in the List, the DAC will reassemble all the subject areas listed in the top pane.
The DAC assembles the selected subject area by determining what dimensions and other related tables are required and what tasks are needed to load these tables.
You will receive an informational message when the assemble process is completed.
Click the Tasks tab to view which tasks the DAC has determined are required for this subject area.
Tasks that are automatically assigned to the subject area by the DAC are indicated with the Autogenerated check mark.
You can inactivate a task from participating in the subject area by selecting the Inactive check box. When the Inactive check box is selected, the task remains inactive even if you reassemble the subject area.
You can also remove a task from the subject area using the Add/Remove command, but when you remove a task it is only removed from the subject area until you reassemble the subject area.
Execution plans are subject areas that are used to execute ETL processes. Before you attempt to run an execution plan, make sure you have completed the following:
Set database connections to the transactional and data warehouse databases (in the Physical Data Sources tab).
Registered the Informatica PowerCenter Services and Integration Service (in the Informatica Servers tab).
Note:
All subject areas in an execution plan must belong to the same source system container.To build and run an execution plan
Navigate to the Execute view, then select the Execution Plans tab.
Create a new execution plan.
In the top pane toolbar, click New.
In the Edit child tab, enter a name for the execution plan and other appropriate information.
For a description of the fields in this tab, see "Execution Plans Tab".
Click Save.
Associate one or more subject areas with the execution plan.
Click the Subject Areas child tab.
Click Add/Remove in the bottom pane toolbar.
In the Choose Subject Areas dialog, select the appropriate source system container.
Query for the subject area you want to associate with the execution plan.
Select the subject area and click Add.
You can associate multiple subject areas with an execution plan, but all the subject areas must be from the same source system container.
Click OK to close the window.
Generate the runtime parameters.
Click the Parameters child tab.
Click Generate in the bottom pane toolbar.
For each Datasource type, enter the appropriate name in the Value field.
For the data source type of FlatFileConnection, make sure you have copied all files into the directory specified in the DAC system property InformaticaParameterFileLocation.
In the top pane of the Execution Plans tab, make sure the new execution plan is highlighted, and click Build.
The DAC builds the execution plan.
To run the execution plan, select the execution plan in the top pane, and click Run Now.
Once the ETL process starts running you can monitor its progress in the Current Run tab.
For information about how refresh dates are tracked, see "About Refresh Dates".
To schedule an execution plan, see "Scheduling an Execution Plan".
Micro ETL execution plans are ETL processes that you schedule at very frequent intervals, such as hourly or half-hourly. They usually handle small subject areas or subsets of larger subject areas. The DAC tracks refresh dates for tables in micro ETL execution plans separately from other execution plans and uses these refresh dates in the change capture process.
After a micro ETL execution plan runs, the DAC populates refresh date values in the Refresh Dates child tab of the Execution Plans tab. If a subject area is used in a regular execution plan (an execution plan with the Keep Separate Refresh Dates option not selected) as well as a micro ETL execution plan, the DAC maintains refresh dates for the tables in the regular execution plan in the Refresh Dates child tab of the Physical Data Sources tab (Setup view).
In cases of a subject area being used in both a regular and micro ETL execution plan and the micro ETL execution plan is suspended for a few days but the regular execution plan runs nightly, the DAC automatically detects the last refresh date for the tables common to both execution plans and intelligently extracts only the most recent records for the micro ETL execution plan.
Caution:
Micro ETL processes can cause issues with data inconsistencies, data availability, and additional load on the transactional database. Therefore, you should consider the following factors before implementing a micro ETL process:For related star schemas, if one schema is omitted from a micro ETL execution plan, the cross-star reports may be inaccurate. For example, if the Person fact table is refreshed more frequently than the Revenue fact table, a report that spans the Person and Revenue dimensional schemas may produce inconsistent results.
If you omit dimension tables from a micro ETL execution plan, the foreign keys for the fact tables will point to Unspecified rows for the new dimension records. The foreign key references will be resolved when the Complete ETL execution plan is run, but users of the reports should be aware of such inconsistencies.
If you do not include aggregate tables in micro ETL execution plans, the reports that use data from these tables will be inconsistent with the reports that use data from the detailed fact tables. However, if aggregate tables are included in the micro ETL execution plan, the aggregate calculations are performed for each ETL process, which will take a constant amount of time and may be inefficient to perform at such frequent intervals.
Hierarchy tables are rebuilt during every ETL execution plan by querying the base dimension tables. This operation takes a constant amount of time. If the base tables are big, this operation may take a long time and may be inefficient if the micro ETL execution plan runs several times a day. However, if you avoid populating the hierarchy tables during micro ETL processes, data inconsistencies will occur.
With micro ETL execution plans, caching will occur more frequently, which may have performance implications.
Micro ETL execution plans will put more load on the transactional database because of the frequent extracts.
To create a micro ETL execution plan
In the DAC toolbar, select the appropriate source system container from the drop-down list in the toolbar.
In the Menu bar, select Views, then select Design, then select Subject Areas.
In the Subject Areas tab, assemble a small subject area.
In the Tasks child tab, inactivate all tasks that are not required for the execution plan.
Create a new execution plan.
Navigate to the Execute view, then select the Execution Plans tab.
Enter a name for the execution plan
Select the Keep Separate Refresh Dates check box.
Click Save.
Associate one or more subject areas with the execution plan.
Click the Subject Areas child tab.
Click Add/Remove in the bottom pane toolbar.
In the Choose Subject Areas dialog, select the appropriate source system container.
Query for the subject area you want to associate with the execution plan.
Select the subject area and click Add.
You can associate multiple subject areas with an execution plan, but all the subject areas must be from the same source system container.
Click OK to close the window.
Generate the runtime parameters.
Click the Parameters child tab.
Click Generate in the bottom pane toolbar.
For each Datasource type, enter the appropriate name in the Value field.
In the top pane of the Execution Plans tab, make sure the new execution plan is highlighted, and click Build.
The DAC builds the execution plan.
Navigate to the Ordered Tasks child tab and verify the sequence of tasks.
The execution plan is now ready to run as a micro ETL execution plan.
Create a schedule for the micro ETL execution plan. For instructions, see "Scheduling an Execution Plan".
Use this procedure to schedule an execution plan in the DAC.
To schedule an execution plan
In the DAC, navigate to the Scheduler tab.
The current list of schedules appears in the top pane.
Click New in the top pane toolbar.
The Edit tab in the bottom pane becomes active.
Enter a name for the schedule.
Select an execution plan.
If you want the schedule to run once, select the Run Only Once check box, and then select a start and end date and time.
To create a periodic schedule, select a recurrence pattern, and enter the appropriate date and time parameters.
Click Save.
Refresh dates refer to the date of the last ETL process (the last time data was extracted from tables in a given database or loaded into tables in a given database). The DAC uses the refresh dates to determine whether to run the incremental load commands or to run full load commands and whether to truncate the target tables.
Refresh dates are tracked only for tables that are either a primary source or a primary target on tasks in a completed run of an execution plan. The DAC runs the full load command for tasks on which a table is a primary source or target if the refresh date against the table is null. When there are multiple primary sources, the earliest of the refresh dates will trigger a full load or an incremental load. If any one of the primary source tables has no refresh date, then the DAC will run the full load command.
Table 5-1 shows the possible scenarios regarding refresh dates.
Table 5-1 Refresh Date Scenarios
Scenario | Table Type (in Tasks child tabs) | Refresh Date | Command DAC Will Use | Truncate Target Table? |
---|---|---|---|---|
1 |
Primary Source |
Null |
Full Load |
Yes |
1 |
Primary Target |
Null |
Not applicable |
Not applicable |
2 (See note below) |
Primary Source |
Null |
Full Load |
No |
2 |
Primary Target |
Not Null |
Not applicable |
Not applicable |
3 (See note below) |
Primary Source |
Not Null |
Full Load |
Yes |
3 |
Primary Target |
Null |
Not applicable |
Not applicable |
4 |
Primary Source |
Not Null |
Incremental Load |
No |
4 |
Primary Target |
Not Null |
Not applicable |
Not applicable |
Scenario 2. When two or more source tables load into the same target table as separate tasks, the source table in the second task may have refresh date as null while the target may have a refresh date.
Scenario 3. When a source loads into more than one target table in separate tasks, the refresh date may be null for the second target table while the source table may have refresh dates.
The Current Run tab in the Execute view provides predefined reports that enable you to monitor execution plan processes in order to isolate bottlenecks and enhance performance.
To monitor an execution plan process
In the DAC, navigate to the Current Run tab.
Right-click and select Get Run Information.
The following options are available:
Get log file
Analyze run
Get chart
Get phase chart
Get graph