Oracle® Fusion Middleware User's Guide for Oracle Business Intelligence Data Warehouse Administration Console
11g Release 1 (11.1.1)

Part Number E14849-06

7 Customizing ETL Processes

This chapter provides information about customizing DAC objects for inclusion in ETL processes.

This chapter contains the following topics:

  • Considerations When Defining Repository Objects

  • About Customizing the Data Warehouse

  • Adding a New Table and Columns to the Data Warehouse

  • Adding an Index to the Data Warehouse

  • Importing New Data Warehouse Objects into the Informatica Repository

  • Creating Informatica Mappings and Workflows

  • Creating Tasks in DAC for New or Modified Informatica Workflows

  • Setting a Task Phase Dependency

  • Creating a Task Group

  • Working with Configuration Tags

  • Using Actions to Manage Indexes, Tables and Tasks

Considerations When Defining Repository Objects

This section contains the following topics:

  • Container Behavior and Best Practices

  • Task Behavior and Best Practices

  • Task Group Behavior and Best Practices

  • Table Behavior and Best Practices

  • Index Behavior and Best Practices

  • Column Behavior and Best Practices

  • Configuration Tag Behavior and Best Practices

  • Source System Parameter Behavior and Best Practices

  • Subject Area Behavior and Best Practices

  • Execution Plan Behavior and Best Practices

Note:

You should be familiar with the information in "About Source System Containers" before considering best practices.

Container Behavior and Best Practices

The following behavior and best practices apply to containers:

  • When changes are made to objects in the container that owns them, the change is instantaneous.

  • Changes made to parent objects in the owner container are automatically pushed to the parent referenced objects.

  • When you add child objects to a parent object, you must use the Push to References right-click command (Design view) to push the changes to the child referenced objects. For example, if you add a column to a table that is registered in DAC, the new column is not automatically added to the references in the other containers that reference the parent object. You must use the Push to References command to propagate the column changes to the referenced tables in those containers.

  • When you delete a referenced object, only the referenced object is deleted. The original object is not deleted.

  • If you delete an object from the owner container, the object is deleted as well as referenced objects in other containers. This is referred to as a deep delete. For example, if you delete a table from the owner container, the table and columns are deleted from the owner container and all the containers that reference this object.

  • If you delete a column from the owner table, the column is deleted in all the referenced objects.

  • If you delete child objects from the owner object, the referenced child objects are automatically deleted.

Task Behavior and Best Practices

The following behavior and best practices apply to tasks:

  • Start your work with tasks in Informatica. After you create a workflow, do the following in the DAC Task tab:

    • Create a new task and assign it a logical (readable) name.

    • Enter the command for a full load or incremental load.

      The commands can be the same. If the Command for Incremental Load field is left blank, no action occurs for this task while in incremental mode. If the Command for Full Load field is left blank, no action occurs for this task while in full mode.

    • Ensure all the source and target tables are defined for the task.

      You can use the task synchronize functionality to import data from Informatica. You can also manually assign the source or target tables.

  • Select at least one primary table because the incremental and full mode properties are determined based on the refresh dates of the primary table.

  • Design tasks so that they load only one table at a time.

  • Define granular tasks rather than tasks that include bigger blocks of processes. Granular tasks are more efficient and have better restartability.

  • Do not truncate tables on the source system (for example, Oracle, Siebel, or PeopleSoft sources).

  • Ensure the truncate property for the target tables is set properly.

  • For tables that need to get truncated regardless of the mode of the run (Full or Incremental), set the Truncate Always property to True.

  • For tables that are incrementally loaded (truncated only on a full load), set the Truncate for Full Load property to True.

  • Select the Analyze Table option if the task should analyze the table. The default value for this option is True if either of the Truncate options is selected.

  • Do not truncate a table more than once within a single ETL run.

  • If a task that writes to a target table is contingent upon another table being loaded, use conditional tables. This ensures that the task qualifies only if the conditional table is part of the subject area design.

  • Assign an appropriate phase to the task. An understanding of task phases is essential to understanding ETL processes.

  • If you want to force a relationship where none exists, consider using phase dependencies. For example, suppose the extract fact and extract dimension tasks in an ETL process share no common source or target tables, but the design requires that facts be extracted before dimensions. In this case, for the task that extracts facts, add the extract dimension phase as the phase that waits. For more information about phase dependencies, see "Tasks Tab: Phase Dependency Subtab".

  • Ensure you do not introduce conflicting phase dependencies. This can cause the DAC Server to hang.

  • If the source qualifier needs to use a date parameter, always use the DAC date parameter, which can be formatted to the database-specific syntax.
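For example, an incremental extract filter in an Informatica source qualifier can reference the DAC date parameter instead of a hard-coded literal. This is a sketch only: the table S_EXAMPLE, column LAST_UPD, and parameter name $$LAST_EXTRACT_DATE are illustrative, not taken from a shipped container.

```sql
-- Incremental filter: extract only rows changed since the last ETL run.
-- DAC resolves the date parameter into database-specific date syntax at runtime.
SELECT ROW_ID, LAST_UPD
FROM S_EXAMPLE
WHERE LAST_UPD > $$LAST_EXTRACT_DATE
```

Because DAC formats the parameter per database platform, the same mapping logic can run unchanged against Oracle, DB2, SQL Server, or Teradata sources.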

Task Group Behavior and Best Practices

The following best practices apply to task groups:

  • Do not create task groups unnecessarily. Doing so can adversely impact the DAC's auto-dependency functionality, which automatically orders tasks for an execution plan. Create task groups only to satisfy the points listed below.

  • Avoid circular relationships among tasks if the tasks are of the same phase. For example, avoid situations in which Task 1 reads from Table A and writes to Table B, and Task 2 reads from Table B and writes to Table A. You can use task groups to avoid these situations. Note: If tasks belong to different phases, this situation is acceptable.

  • If you have many tasks belonging to the same phase that write to the same table, you can use task groups to run the tasks in parallel. If the target tables need to be truncated before the tasks are run, select the properties Truncate Always and Truncate for Full Load in the Task Group tab.

  • Do not mix extracts and loads under a single task group.

  • Do not create task groups for obvious ordering needs; DAC handles ordering in such cases.

  • If a source system container uses a task group, ensure that other containers that reference the tasks also include the task group.

Table Behavior and Best Practices

The following best practices apply to tables:

  • Always use all upper case characters for table names.

  • Ensure you set the Table Type property correctly in the Tables tab of the Design view.

  • For Teradata databases, pay attention to the Set/Multiset property. If you anticipate that the data will contain duplicate rows, choose Multiset as the value for this property.

  • DAC automatically associates foreign key tables with the referenced table. You can also define which other tables need to be loaded, such as aggregate tables, by associating these tables with the referenced table using the Related Tables subtab of the Tables tab.
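As an illustration of the Set/Multiset choice, a Teradata staging table that is expected to receive duplicate rows can be declared as MULTISET. The table and column definitions below are hypothetical, shown only to contrast the two table kinds.

```sql
-- Teradata: a MULTISET table accepts duplicate rows; a SET table silently
-- rejects exact duplicates, which can lose staging data during a load.
CREATE MULTISET TABLE W_SAMPLE_FS
(
  INTEGRATION_ID     VARCHAR(30),
  DATASOURCE_NUM_ID  DECIMAL(10,0),
  AMOUNT             DECIMAL(18,2)
)
PRIMARY INDEX (INTEGRATION_ID);
```

If the Set/Multiset property in DAC does not match the incoming data's duplicate profile, loads can fail or silently drop rows, so set it deliberately.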

Index Behavior and Best Practices

The following best practices apply to indexes:

  • Index names must be unique. Oracle, DB2 and Teradata databases enforce this rule by default. However, SQL Server databases do not enforce this rule. If you are using a SQL Server database, you must ensure all index names are unique.


  • For Teradata databases:

    • Pay attention to the Teradata Primary Index property.

    • Pay attention to which columns need to gather statistics. Note that column statistics are somewhat equivalent to indexes.

    • If you would have had indexes that span multiple columns for other databases, consider defining multi-column statistics for Teradata.
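Where another platform would use a multi-column index, the Teradata equivalent is a multi-column statistics definition. A sketch, with a hypothetical fact table and column pair:

```sql
-- Teradata: multi-column statistics stand in for the multi-column index
-- you would define on Oracle, DB2, or SQL Server.
COLLECT STATISTICS ON W_EXAMPLE_F COLUMN (DATASOURCE_NUM_ID, INTEGRATION_ID);
```

The optimizer then has joint-distribution information for the column pair, much as a composite index would provide on other databases.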

Column Behavior and Best Practices

The following best practices apply to columns:


  • Always use all upper case characters for column names.

  • If you have a foreign key column, associate the foreign key table with the join column. DAC uses this information to identify all the related tables to be loaded when a certain table needs to be loaded in the data warehouse.

  • For Teradata databases:

    • Pay attention to which columns need to gather statistics. Note that column statistics are somewhat equivalent to indexes.

    • If you would have had indexes that span multiple columns for other databases, consider defining multi-column statistics for Teradata.

    • Pay attention to the Teradata Primary Index property.

  • Do not register any columns for source system container tables.

  • Ensure you add all the appropriate system columns. For example, all tables should have the following:

    • ROW_WID in number format.

    • INTEGRATION_ID in varchar format.

    • DATASOURCE_NUM_ID in number format.

    • ETL_PROC_WID in number format.
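The system columns listed above might appear in a table definition as follows. This is a sketch in Oracle syntax; the table name W_EXAMPLE_D and the precisions and lengths are assumptions, not prescribed values.

```sql
-- Sketch of the system columns every warehouse table is expected to carry.
CREATE TABLE W_EXAMPLE_D
(
  ROW_WID            NUMBER(10),     -- surrogate key, number format
  INTEGRATION_ID     VARCHAR2(80),   -- source record identifier, varchar format
  DATASOURCE_NUM_ID  NUMBER(10),     -- identifies the source system
  ETL_PROC_WID       NUMBER(10)      -- identifies the ETL run that loaded the row
);
```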

Configuration Tag Behavior and Best Practices

The following best practices apply to configuration tags:

  • Use configuration tags to tag tasks that you do not want to be part of all the defined subject areas.

  • A tagged task can be re-associated with a subject area by assigning the configuration tag to the subject area.

Source System Parameter Behavior and Best Practices

The following best practices apply to source system parameters:

  • Use source system parameters when tasks in the source system container need a particular value.

Subject Area Behavior and Best Practices

The following best practices apply to subject areas:

  • To define a subject area, associate only fact tables with it. DAC automatically computes which additional aggregate tables and dimension tables to associate with the subject area based on the related tables you define and foreign key relationships.

  • If you delete a task from a subject area using the Delete button on the Task tab, the next time you assemble the subject area the task may be included. However, if you inactivate the task by selecting Inactive in the Task tab, the task will remain inactive when you re-assemble the subject area.

  • Avoid adding tasks or inactivating tasks manually.

Execution Plan Behavior and Best Practices

The following best practices apply to execution plans:

  • If tasks with the same name in different source system containers read from and write to the same data source, DAC considers them to be the same task.

  • If the logical-to-physical source mapping yields multiple records, DAC creates a corresponding number of runtime instances of the task.

  • DAC orders tasks in the most efficient manner possible based on the following:

    • Phase of task

    • Source and target tables

    • Truncate table properties

    • Data source priority

About Customizing the Data Warehouse

Note:

You cannot change the metadata for predefined containers. In order to customize the metadata in a predefined container, you must first make a copy of the container. For instructions, see "Creating or Copying a Source System Container".

You can add tables, columns, and indexes to the data warehouse, and you can modify the existing objects. Customizing the data warehouse in this way requires using DAC and Informatica client tools.

Figure 7-1 shows the major steps required for adding a new object to the data warehouse or modifying existing objects. As shown in Figure 7-1, you can begin the customization process by adding or modifying the new data warehouse object in DAC and then using the DAC's schema creation and upgrade functionality to propagate the changes to the data warehouse. Alternatively, you can add or modify the object directly in the data warehouse database and then use the DAC's Import from Database command to add the new object in DAC.

Figure 7-1 Process Flow to Add New Object to Data Warehouse

This image is described in the surrounding text.

Adding a New Table and Columns to the Data Warehouse

As shown in Figure 7-1, there are two alternative process flows for adding a new object to the data warehouse. You can enter the table and column definitions in DAC and then use the DAC's schema creation and upgrade functionality to propagate the changes to the data warehouse database. For this method, follow the procedure, "To add new tables and columns to DAC and propagate changes to the data warehouse:".

Alternatively, you can add the new table and column definitions directly in the data warehouse database and then use the DAC's Import from Database command to add the new table and columns in DAC. For this method, follow the procedure, "To add a new table and columns using the DAC's Import command:".

To add new tables and columns to DAC and propagate changes to the data warehouse:

  1. In the Design view, select the appropriate source system container from the drop-down list.

  2. In the Tables tab, create the new table:

    1. Click New.

    2. In the Edit subtab, enter the appropriate information about the table, and click Save.

      For a description of the fields in this tab, see "Tables Tab".

  3. Add the columns for the new table:

    1. In the Columns subtab, click New.

    2. Enter the appropriate column information for each column you want to add to the table, and click Save.

    3. Enter the appropriate foreign key table and column information.

      Note:

      For performance purposes, it is recommended that you do not add more than 254 columns to a dimension or fact table.

  4. Create the new tables and columns in the data warehouse database.

    DAC provides several ways to create and upgrade data warehouse schemas. Carefully review Chapter 10, "Managing Data Warehouse Schemas," and select the method that best suits your requirements.

To add a new table and columns using the DAC's Import command:

  1. Add the new table and column definitions into the data warehouse database.

  2. In the DAC Design view, select the appropriate source system container from the drop-down list.

  3. Import the new table definition.

    1. In the Tables tab, right-click and select Import from Database, Import Database Tables.

    2. In the Import Tables dialog, select DataWarehouse.

    3. Optionally, enter filter criteria to identify the table name you entered in Step 1.

      See "DAC Query Commands and Operators" for available filter commands and operators.

    4. Click Read Tables.

    5. In the list of tables displayed, select the Import check box for the tables you want to import.

    6. Click Import Tables.

      A message dialog indicates whether the process was successful.

  4. Import the new column definitions.

    1. In the Tables tab, query for the table you imported in Step 3.

    2. With the table highlighted, right-click and select Import from Database, Import Database Columns.

    3. In the Importing Columns... dialog, select Selected Record Only, and then click OK.

    4. In the Import Table Columns dialog, click Read Columns.

      The Changes column displays a description of column changes, as follows:

      • The object was added to the database. The column is in the database but not in the DAC repository. Importing it will add the column to the DAC repository.

      • The object was added to the repository. The column is in the DAC repository but not in the database. Importing it will delete the column from the DAC repository.

      • The object was modified. The column definition in the database does not match the definition in the DAC repository.


    5. In the list of columns displayed, select the Import check box for the columns you want to import.

    6. Click Import Columns.

      A message dialog indicates whether the process was successful.

Adding an Index to the Data Warehouse

Follow this procedure to add a new index to the data warehouse.

To add a new index to the data warehouse:

  1. Add the new index definition into the data warehouse database.

  2. In the DAC Design view, select the appropriate source system container from the drop-down list.

  3. In the Tables tab, query for the table for which you want to import index definitions.

  4. Right-click and select Import from Database, Import Indices.

  5. Choose to import indexes for a selected table or for all the records retrieved in the query, and click OK.

  6. In the Import Indices dialog, select the appropriate data warehouse from the Data Sources drop-down list.

  7. Click Read Indices.

    1. In the list of indexes displayed, select the Import check box for the indexes you want to import.

    2. Click Import Indices.

      A message dialog indicates whether the process was successful.

Importing New Data Warehouse Objects into the Informatica Repository

This step requires using Informatica client tools to import new data warehouse objects into the Informatica repository. For instructions on this step of customizing the data warehouse, see the Informatica documentation.

Creating Informatica Mappings and Workflows

The process of creating Informatica mappings and workflows for the data warehouse objects that you imported into the Informatica repository requires using Informatica client tools. For instructions on this process, see the Informatica documentation.

Creating Tasks in DAC for New or Modified Informatica Workflows

You need to perform this step for all new workflows you create in Informatica and for all workflows that you modify.

To create a task in the DAC for new or modified Informatica workflows:

  1. In the DAC Design view, select the appropriate source system container from the drop-down list.

  2. Create a custom logical folder for the custom folder you created in the Informatica repository.

    1. On the Tools menu, select Seed Data, and then select Task Logical Folders.

    2. To create a custom logical folder, click New.

    3. Enter a name for the custom logical folder.

    4. In the Type field, select Logical, and click Save.

  3. Create a custom physical folder for the custom folder you created in the Informatica repository.

    1. On the Tools menu, select Seed Data, and then select Task Physical Folders.

    2. To create a custom physical folder, click New.

    3. Enter a name for the custom physical folder.

    4. Enter a value for the Instance Priority, and click Save.

      See "Task Physical Folder Instance Priority" for information about this property.

  4. Register the custom logical and physical folders you created in steps 2 and 3 in the Source System Folders tab.

    1. In the Design view, select the Source System Folders tab.

    2. Click New.

    3. In the Edit subtab, enter the name of the custom logical folder in the Logical Folder field.

    4. Enter the name of the custom physical folder in the Physical Folder field, and click Save.

  5. Create new tasks for the workflows.

    1. In the Design view, select Tasks, and click New in the top pane toolbar.

    2. In the Edit subtab, enter the workflow name as it appears in Informatica Workflow Manager.

    3. Right-click, and select Synchronize Tasks.

    4. Select Selected Record Only, and click OK.

      This command imports the source and target table definitions for a task from Informatica into the DAC repository.

    5. In the Tasks tab, enter the remaining information required for the task.

      For a description of the fields in this tab, see "Tasks Tab".

    The new task is now ready to be associated with a subject area. For information about creating a subject area, see "Creating a Subject Area".

Setting a Task Phase Dependency

A task phase dependency enables you to dictate additional dependency hints for tasks that are completely unrelated to each other through source/target table relationships.

To set a task phase dependency:

  1. In the Design view, select the appropriate source system container from the drop-down list.

  2. In the Tasks tab, query for the task for which you want to add a phase dependency, and make sure it is highlighted.

  3. Click the Phase Dependency subtab.

  4. Click Add/Remove in the subtab toolbar.

  5. In the Choose Phases dialog, query for a task phase, select the phase, and click Add.

  6. Click OK in the message box that states the phase was added.

  7. In the window on the right side of the dialog, select the appropriate values for the following properties:

    Action. Possible values:

    • Wait. Indicates the task selected in the top pane will wait to be executed until all tasks of the phase specified in the Phase column have been executed.

    • Block. Indicates the task selected in the top pane will block all tasks of the phase specified in the Phase column from being executed.

    Grain. Possible values:

    • All. Indicates the action will affect all tasks.

    • Related. Indicates the action will affect related tasks.

    Scope. Specifies how the Block action of the phase dependency behaves in relation to multi-source execution plans. Possible values:

    • Both. Indicates the blocking action is active for tasks that have the same source and target physical data source connections.

    • Source. Indicates the blocking action is active for tasks that have the same source physical data source connection.

    • Target. Indicates the blocking action is active for tasks that have the same target physical data source connection.

    • None. Indicates the blocking action is active for all tasks regardless of the source and target physical data source connections.


    The task phase dependency is displayed in the Phase Dependency subtab.

  8. Reassemble the appropriate subject area.

    1. In the Subject Areas tab, query for the appropriate subject area.

    2. Click Assemble.

  9. Rebuild the execution plan.

    1. In the Execute view, query for the appropriate execution plan.

    2. In the Connectivity Parameters subtab, click Generate.

      The Generating Parameters... dialog lists the containers that are involved in this execution plan.

    3. Enter the number of copies of each container that are needed, and then click OK.

      DAC automatically generates the parameters required for each copy of the source system container. Note that not all copies require all of the possible parameters.

    4. In the Value column, for each folder or database connection, select the appropriate physical folder name or database connection.

      Note:

      For the data source type of FlatFileConnection, make sure you have copied all files into the directory specified in the DAC system property InformaticaParameterFileLocation.

    5. In the top pane toolbar, click Build.

Creating a Task Group

DAC automatically organizes tasks into a dependency structure based on dependency rules. For information about the DAC's dependency rules, see "Execution Plan Build Process Rules".

DAC does not allow parallel reads and writes on the same table; therefore, DAC randomly assigns priorities to tasks that have the same properties. You can group tasks by using the Task Group feature to enforce a particular behavior.

The task group feature can be useful in the following situations:

To create a task group:

  1. In the Design view, select the appropriate source system container from the drop-down list.

  2. In the Task Groups tab, click New in the top pane toolbar.

  3. In the Edit subtab, enter a name and select the appropriate properties.

  4. Click the Child Tasks subtab, and click Add/Remove in the toolbar.

  5. In the left-hand window of the Choose Child Tasks dialog, query for the tasks you want to add to the task group.

  6. Select the tasks, and click Add.

  7. In the window on the right side, enter a value in the Dependency Order field to specify an execution order.

  8. Specify whether the task is a Heuristic Driver.

    See "DAC Heuristics and Task Groups" for more information.

  9. Click Save, and then click OK to close the window.

Working with Configuration Tags

A configuration tag is an object that controls the inclusion of tasks in subject areas. When a task is tagged, it is not eligible to be included in the collection of tasks for a subject area. You can override this behavior by using the "Include Task" property in the subject area definition.

A configuration tag can function in one of the following ways:

To remove tasks from all subject areas:

  1. In the Design view, select the appropriate source system container from the drop-down list.

  2. Create a new configuration tag.

    1. In the Configuration Tags tab, click New in the top pane toolbar.

    2. In the Edit subtab, enter a name for the configuration tag.

    3. Make sure the Include Tasks check box is not selected.

    4. Click Save.

  3. Add tasks to the configuration tag.

    1. With the new configuration tag highlighted in the top pane, click the Tasks subtab.

    2. In the bottom pane toolbar, click Add/Remove.

    3. In the Tasks dialog, query for the tasks you want to add to the configuration tag.

    4. Highlight the tasks, and then click Add.

      The tasks appear in the right-hand window.

    5. Click Save, and then click OK to close the window.

      These tasks will not be eligible to participate in any subject area.

To reassign autogenerated tasks to a subject area:

  1. In Design view, select the appropriate source system container from the drop-down list.

  2. In the Configuration Tags tab, query for the configuration tag that contains the tasks you want to reassign to a subject area.

  3. Verify the configuration tag contains the appropriate tasks by clicking the Tasks subtab and reviewing the list of tasks associated with this configuration tag.

    Note:

    Only a subject area's autogenerated tasks will be reassigned. If non-autogenerated tasks appear in the list, the DAC will ignore them.

  4. Associate the configuration tag with the subject areas to which you want to reassign the tasks.

    1. With the configuration tag highlighted in the top pane, click the Subject Areas subtab.

    2. Click Add/Remove in the bottom pane toolbar.

    3. In the Subject Areas dialog, query for one or more subject areas to which you want to reassign the task or tasks.

    4. Highlight the appropriate subject areas, and click Add.

    5. Click Save, and then click OK to close the window.

  5. Reassemble the subject area.

    1. In the Subject Areas tab, query for all the subject areas you added to the configuration tag.

    2. Highlight the subject areas, and click Assemble.

To add non-autogenerated tasks to a subject area:

  1. In the Design view, select the appropriate source system container from the drop-down list.

  2. Create a new configuration tag.

    1. In the Configuration Tags tab, click New in the top pane toolbar.

    2. In the Edit subtab, enter a name for the configuration tag.

    3. Select the Include Tasks check box.

    4. Click Save.

  3. Add the non-autogenerated tasks to the configuration tag.

    1. With the new configuration tag highlighted in the top pane, click the Tasks subtab.

    2. In the bottom pane toolbar, click Add/Remove.

    3. In the Tasks dialog, query for the extraneous tasks you want to add to the configuration tag.

    4. Highlight the tasks, and then click Add.

    5. Click Save, and then click OK to close the window.

  4. Associate the configuration tag with the subject areas to which you want to add the non-autogenerated tasks.

    1. With the configuration tag highlighted in the top pane, click the Subject Areas subtab.

    2. Click Add/Remove in the bottom pane toolbar.

    3. In the Subject Areas dialog, query for one or more subject areas to which you want to add the non-autogenerated tasks.

    4. Highlight the appropriate subject areas, and click Add.

    5. Click Save, and then click OK to close the window.

  5. Reassemble the subject area.

    1. In the Subject Areas tab, query for all the subject areas you added to the configuration tag.

    2. Highlight the subject areas, and click Assemble.

To assign only configuration tag tasks to a subject area (excludes the subject area's autogenerated tasks):

  1. In the Design view, select the appropriate source system container from the drop-down list.

  2. In the Subject Areas tab, query for the subject area to which you want to add configuration tag tasks.

    Note:

    The autogenerated tasks for this subject area will be excluded.

  3. Select the Configuration Tag Tasks Only check box, and click Save.

  4. Create a configuration tag.

    1. In the Configuration Tags tab, click New in the top pane toolbar.

    2. In the Edit subtab, enter a name for the configuration tag.

    3. Select the Include Tasks check box.

    4. Click Save.

  5. Add the tasks to the configuration tag.

    1. With the new configuration tag highlighted in the top pane, click the Tasks subtab.

    2. In the bottom pane toolbar, click Add/Remove.

    3. In the Tasks dialog, query for the tasks you want to add to the configuration tag.

    4. Highlight the tasks, and then click Add.

    5. Click Save, and then click OK to close the window.

  6. Associate the configuration tag with the subject area.

    1. With the configuration tag highlighted in the top pane, click the Subject Areas subtab.

    2. Click Add/Remove in the bottom pane toolbar.

    3. In the Subject Areas dialog, query for the appropriate subject area.

    4. Highlight the subject area, and click Add.

    5. Click Save, and then click OK to close the window.

  7. Reassemble the subject area.

    1. In the Subject Areas tab, query for all the subject areas you added to the configuration tag.

    2. Highlight the subject areas, and click Assemble.

Using Actions to Manage Indexes, Tables and Tasks

The Actions feature enables you to define and run SQL scripts to carry out various actions in relation to indexes, tables, and tasks.

You can also use the Actions Template feature to:

This section contains the following topics:

Defining a SQL Script for an Action

The first step in the process of creating an action is to define the SQL script for the action. After completing this procedure, proceed to "Assigning an Action to a Repository Object".

To define a SQL statement for an action:

  1. In the DAC Client, on the Tools menu, select Seed Data, and then select one of the following:

    • Index Actions

    • Table Actions

    • Task Actions

    The Actions dialog is displayed.

  2. In the toolbar, click New.

  3. In the new record field, enter a descriptive name for the action, and then click Save.

  4. Double-click in the Value field.

    The Value dialog is displayed.

    This screen shot is described in the surrounding text.
  5. Select a format for the tree view.

    • Flat View displays the SQL entries in a list format in their order of execution.

    • Category View displays the entries by the categories SQL and Stored Procedure.

      You can reorder the entries in the tree by dragging and dropping them.

  6. Click Add.

  7. In the new record field, enter or select the appropriate information.

    Name. Logical name for the SQL block.

    Type. SQL or Stored Procedure.

    Database Connection Type. Should be used only for SQL types (not stored procedures). Defines which database the SQL statement will run against. Possible values are:

    • Source - SQL runs against the source connection defined for the task.

    • Target - SQL runs against the target connection defined for the task.

    • Both - SQL runs against both the source and target connections.

    • Table Connection - SQL runs against the table-specific connection if a separate table connection is available.

    Table Type. Specifies the table type against which the SQL will run. Possible values are:

    • All Source - SQL runs against all source tables defined for the task.

    • All Target - SQL runs against all target tables defined for the task.

    • Source Lookup - SQL runs against all the source lookup tables defined for the task.

    • Source Primary - SQL runs against all the source primary tables defined for the task.

    • Source Auxiliary - SQL runs against all the source auxiliary tables defined for the task.

    Continue on Fail. Specifies whether an execution should proceed if a given SQL block fails.

    Retries. Specifies how many retries are allowed. If the number is not positive, a default of one (1) is used.

    Valid Database Platforms. Specifies the valid database platforms against which the SQL will run. If this field is left empty, the SQL can be run against any database.


  8. In the lower-right side text box, enter a SQL statement.

    The SQL Statement tab to the left of the text box lists all the supported SQL functions and DAC source system parameters that you can use in constructing custom SQL statements. Double-click a function or source system parameter to move it into the text box.

    For a description of the available functions, see "Functions for Use with Actions".

    The source system parameters list contains the names of all source system parameters defined in the DAC repository, prefixed with @DAC_. At runtime, the DAC Server resolves each source system parameter and replaces its name with the runtime value.

    For an example of how to use a source system parameter in a SQL statement, see "Using a DAC Source System Parameter in an Action".

  9. (Optional) Enter a comment about the SQL in the Comment tab.

  10. Click OK.

    Note:

    You can add multiple SQL statements and stored procedures to a single action.

  11. To assign this action to a repository object, proceed to "Assigning an Action to a Repository Object".
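
The Continue on Fail and Retries properties described above govern how an action's SQL blocks behave at runtime. The following Python sketch illustrates one plausible reading of those semantics; it is an illustration only, not DAC's actual implementation, and the block and property names are hypothetical.

```python
# Illustrative sketch only (not DAC's implementation) of how the
# "Continue on Fail" and "Retries" properties could govern execution
# of an action's SQL blocks. Property names here are hypothetical.

def run_action(blocks, execute):
    """Run SQL blocks in order. A failed block is retried up to its
    retry count; if it still fails, the action stops unless the block
    allows execution to continue."""
    for block in blocks:
        retries = block.get("retries", 1)
        if retries < 1:
            retries = 1  # non-positive retry counts default to one
        succeeded = any(execute(block["sql"]) for _ in range(retries))
        if not succeeded and not block.get("continue_on_fail", False):
            return False  # abort the remaining blocks
    return True
```

Under this reading, a block with Retries set to 2 and Continue on Fail enabled is attempted twice and, if both attempts fail, the remaining blocks still run.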

Assigning an Action to a Repository Object

Follow this procedure to assign an action to a DAC repository object. Before you do this procedure, you must have defined a SQL script for an action. For instructions, see "Defining a SQL Script for an Action".

To assign an action to a repository object:

  1. In the Design view, navigate to one of the following tabs, depending on the object type for which you want to assign an action:

    • Indices tab

    • Tables tab

    • Tasks tab

  2. Select or query for the object for which you want to assign an action.

  3. With the appropriate object selected in the top pane, select the Actions subtab.

  4. Click New in the subtab toolbar.

  5. In the new record field, do the following:

    1. Select an Action Type.

      For a description of the available Action Types, see the following: "Indices Tab: Actions Subtab", "Tables Tab: Actions Subtab", and "Tasks Tab: Actions Subtab".

    2. Select the Load Type:

      Full - To assign the action to a full load command.

      Incremental - To assign the action to an incremental load command.

      Both - To assign the action to both full and incremental load commands.

    3. Double-click in the Action field to open the Choose Action dialog, and select an action.

    4. Click OK to close the Choose Action dialog.

    5. Click Save in the subtab toolbar.

Functions for Use with Actions

This section includes a list of functions that are available for Index, Table, and Task actions.

Table 7-1 Functions for Index Actions

getAdditionalColumns() - Returns a comma-separated list of included index columns. This function is related to the #Unique Columns index property. For DB2 and DB2-390 databases, the list may include a combination of unique and included columns.

getAnalyzeStatement() - Returns the default DAC index analyze statement (for this particular index).

getAnalyzeTableStatement() - Returns the default DAC table analyze statement (for the parent table).

getBitMapString() - Resolves to the string BITMAP if the index is a bitmap index. Otherwise, the string is empty.

getClusteredString() - Resolves to the string CLUSTERED if the index is a clustered index. Otherwise, the string is empty.

getCreateIndexStatement() - Returns the default DAC index creation statement.

getDBType() - Returns the physical data source connection type (Oracle OCI8, Oracle Thin, DB2, DB2-390, MSSQL, Teradata, or BI Server).

getDropIndexStatement() - Returns the default DAC index drop statement.

getHashString() - Resolves to the string HASH if the index is a hash index. Otherwise, the string is empty.

getImageSuffix() - Resolves to the table image suffix if one is specified. Otherwise, the string is empty.

getIndexColumns() - Returns a comma-separated list of index columns.

getIndexName() - Returns the index name.

getIndexTableSpace() - Resolves to the index space name if one exists. Otherwise, the string is empty.

getNamedSource() - Returns the DAC physical connection name.

getRvrsScanString() - Resolves to the string ALLOW REVERSE SCANS if the index supports reverse scans. Otherwise, the string is empty.

getTableName() - Returns the table name.

getTableOwner() - Returns the table owner name.

getTableSpace() - Returns the table space name if one exists. Otherwise, the string is empty.

getTruncateTableStatement() - Returns the default DAC table truncate statement.

getUniqueColumns() - Returns a comma-separated list of unique index columns. This function is a counterpart of the getAdditionalColumns() function.

getUniqueString() - Resolves to the string UNIQUE if the index is unique. Otherwise, the string is empty.


Table 7-2 Functions for Table Actions

getAnalyzeTableStatement() - Returns the default DAC table analyze statement.

getDBType() - Returns the physical data source connection type (Oracle OCI8, Oracle Thin, DB2, DB2-390, MSSQL, Teradata, or BI Server).

getImageSuffix() - Returns the table image suffix if one exists. Otherwise, the string is empty.

getCreateIndexStatement() - Returns the default DAC index creation statement.

getNamedSource() - Returns the DAC physical connection name.

getDropIndexStatement() - Returns the default DAC index drop statement.

getTableName() - Returns the table name.

getTableOwnerName() - Returns the table owner name.

getTableSpace() - Returns the table space name if one exists. Otherwise, the string is empty.

getTruncateTableStatement() - Returns the default DAC table truncate statement.


Table 7-3 Functions for Task Actions

getAnalyzeTableStatement() - Returns the default DAC analyze table statement.

getDBType() - Returns the physical data source connection type (Oracle OCI8, Oracle Thin, DB2, DB2-390, MSSQL, Teradata, or BI Server).

getImageSuffix() - Returns the table image suffix if one exists. Otherwise, the string is empty.

getNamedSource() - Returns the DAC physical connection name.

getTableName() - Returns the table name.

getTableOwner() - Returns the table owner name.

getTableSpace() - Returns the table space name if one exists. Otherwise, the string is empty.

getTruncateTableStatement() - Returns the default DAC truncate table statement.


Note:

Table-related task action functions should be used only for SQL blocks with a specified table type. For these blocks, DAC will loop through the tables of the type you specify and execute custom SQL for each table. Functions will be substituted with table-specific values for each iteration.

For example, if you wanted to gather statistics in a particular way after creating specific indexes, you would need to create index actions with two SQL blocks:

getCreateIndexStatement()
begin
DBMS_STATS.GATHER_INDEX_STATS(ownname => '@TABLEOWNER',
indname => 'getIndexName()', estimate_percent => 50);
end;
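
As the note explains, for a SQL block with a specified table type, DAC loops through the matching tables and substitutes table-specific values for each function on every iteration. The following Python sketch is a simplified illustration of that token expansion; the template, owner name, and table names are hypothetical, and this is not DAC's actual substitution code.

```python
# Illustrative only: mimics how a function token such as getTableName()
# could be replaced with a table-specific value on each loop iteration.
# The owner name "DWH" and the table names below are hypothetical.

def expand_tokens(sql_template, values):
    """Replace each function token in the template with its value."""
    out = sql_template
    for token, value in values.items():
        out = out.replace(token, value)
    return out

template = "TRUNCATE TABLE getTableOwner().getTableName()"
for table in ["W_PERSON_D", "W_ORG_D"]:
    print(expand_tokens(template, {"getTableOwner()": "DWH",
                                   "getTableName()": table}))
# TRUNCATE TABLE DWH.W_PERSON_D
# TRUNCATE TABLE DWH.W_ORG_D
```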

Using a DAC Source System Parameter in an Action

The following example illustrates how to use a source system parameter in defining an action.

Assume there is a source system parameter called COUNTRY (note that source system parameter names are case sensitive), and you use this parameter in an action with the following SQL statement:

DELETE FROM TABLE1 WHERE COUNTRY_NAME='@DAC_COUNTRY'

Assume that during the ETL process, COUNTRY gets resolved to Canada. The resulting SQL that is executed by the DAC Server would be the following:

DELETE FROM TABLE1 WHERE COUNTRY_NAME='Canada'
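
Conceptually, this resolution is a text substitution of @DAC_-prefixed parameter names. The following Python sketch approximates the idea; it is not the DAC Server's actual logic.

```python
# Simplified sketch (not the DAC Server's actual logic) of resolving
# @DAC_-prefixed source system parameters at runtime.

def resolve_parameters(sql, params):
    """Replace each @DAC_<name> reference with its runtime value.
    Longer names are substituted first so that a parameter such as
    COUNTRY_CODE is not clipped by one named COUNTRY."""
    for name in sorted(params, key=len, reverse=True):
        sql = sql.replace("@DAC_" + name, params[name])
    return sql

sql = "DELETE FROM TABLE1 WHERE COUNTRY_NAME='@DAC_COUNTRY'"
print(resolve_parameters(sql, {"COUNTRY": "Canada"}))
# DELETE FROM TABLE1 WHERE COUNTRY_NAME='Canada'
```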

Using a Task Action to Enable Failure Restarts When Extracting From Multiple Sources

Typically, extract tasks truncate the target table and, therefore, do not have update strategies. To enable failure restarts when extracting from multiple sources, create an action that predeletes all the records loaded from a given source. For example, you can create a task action called "Predelete Staging Data" using the following SQL in the action definition. This example assumes the data from that source is always loaded with a certain data source number, which is defined as a DAC parameter called $$DATASOURCE_NUM_ID.

delete from getTargetTableName()
where datasource_num_id = @DAC_$$DATASOURCE_NUM_ID

Associate this action with all the extract tasks from all the sources using the action types "Preceding Action" and "Upon Failure Restart Action." This ensures that the task, if restarted after a failure, always predeletes the records pertinent to that data source before running the command. You can also mass associate such actions to all extract tasks using the right-click command Add Actions.

Mapping Multiple Database-Specific Informatica Workflows to the Same DAC Task

You can map multiple, database-specific Informatica workflows to the same DAC task by parameterizing the Informatica workflow command. At runtime, DAC determines which workflow to run based on the parameterization.

This procedure uses SIL_PersonDimension_Full and SIL_PersonDimension as examples of full and incremental commands on an Oracle database, and SIL_PersonDimension_Full_TD and SIL_PersonDimension_TD as the corresponding full and incremental commands on a Teradata database.

To map multiple database-specific Informatica workflows to the same DAC task:

  1. In the DAC Design view, go to the Tasks tab.

  2. Query for the task to which you want to add multiple workflows.

  3. Select the task, and then click the Parameters subtab.

  4. Create a new parameter for a full load command:

    1. Click New in the subtab toolbar.

    2. In the Name field, enter $$workflow_CMD_PARAMETER.

    3. In the Data Type field, select DB Specific Text.

    4. In the Load Type field, select Full.

    5. Click in the Value field to open the Enter Parameter Value dialog.

    6. In the Connection Type field, select @DAC_TARGET_DBTYPE.

    7. In the appropriate database fields, enter the full command name for both database types.

      For example, enter SIL_PersonDimension_Full in the Oracle field and SIL_PersonDimension_Full_TD in the Teradata field.

  5. Create a new parameter for an incremental load command:

    1. Click New in the subtab toolbar.

    2. In the Name field, enter $$workflow_CMD_PARAMETER.

    3. In the Data Type field, select DB Specific Text.

    4. In the Load Type field, select Incremental.

    5. Click in the Value field to open the Enter Parameter Value dialog.

    6. In the Connection Type field, select @DAC_TARGET_DBTYPE.

    7. In the appropriate database fields, enter the incremental command name for both database types.

      For example, enter SIL_PersonDimension in the Oracle field and SIL_PersonDimension_TD in the Teradata field.

  6. With the same task selected, click the Edit subtab.

  7. In the Command for Incremental Load field, enter @DAC_$$workflow_CMD_PARAMETER.

  8. In the Command for Full Load field, enter @DAC_$$workflow_CMD_PARAMETER.

  9. Click Save.
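
The net effect of this procedure can be modeled as a two-level lookup: the load type selects which parameter definition applies, and the target database type selects the DB-specific workflow name. The following Python sketch is a hypothetical model using the example values from this procedure, not DAC's internal implementation.

```python
# Hypothetical model of how @DAC_$$workflow_CMD_PARAMETER resolves:
# the load type picks the parameter definition (Full vs. Incremental),
# and the target database type picks the DB-specific workflow name.

WORKFLOW_CMD_PARAMETER = {
    "Full": {
        "Oracle": "SIL_PersonDimension_Full",
        "Teradata": "SIL_PersonDimension_Full_TD",
    },
    "Incremental": {
        "Oracle": "SIL_PersonDimension",
        "Teradata": "SIL_PersonDimension_TD",
    },
}

def resolve_workflow(load_type, target_db_type):
    """Return the Informatica workflow name DAC would invoke."""
    return WORKFLOW_CMD_PARAMETER[load_type][target_db_type]

print(resolve_workflow("Incremental", "Teradata"))  # SIL_PersonDimension_TD
```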