| Oracle® Fusion Middleware User's Guide for Oracle Business Intelligence Data Warehouse Administration Console 11g Release 1 (11.1.1) Part Number E14849-01 | 
 | 
| 
 | View PDF | 
DAC provides functionality related to indexes, tables, tasks, and workflows that enables you to enhance performance.
This chapter contains the following topics:
Using Actions to Optimize Indexes and Collect Statistics on Tables
Customizing customsql.xml to Drop and Create Indexes and Analyze Tables
In general, during the ETL process, before a table is truncated, all the indexes, as defined in the repository, will be dropped before the data is loaded and recreated automatically after the data is loaded. The dropping of indexes before the data load and recreating them after the data load improves ETL performance.
If you create an index on the database and it is not registered in the DAC repository, the index will not be dropped and the load will fail.
For Teradata databases, only secondary indexes should be registered in DAC. You should not register primary indexes or the more complex indexes, such as single- and multi-table indexes, because they cannot be dropped and recreated.
In the DAC Design view, the Indices tab lists all the indexes associated with the selected source system container. It is recommended that you do not register in DAC any indexes for source tables.
This section contains the following topics:
The general sequence of actions related to indexes when a task runs is as follows:
Tables are truncated.
Indexes are dropped.
Data is loaded into target tables.
ETL indexes and unique indexes are recreated.
Successor tasks are notified that ETL indexes were created.
Query indexes and additional unique indexes are recreated.
Tables are analyzed.
Note: Successor tasks can start before query-related and unique indexes are created.
You can specify index spaces for indexes by table type. If you do not specify an index space by table type, then DAC will create indexes in the index space specified by the Default Index Space property in the Physical Data Sources tab of the Setup view. If you do not specify a value for the Default Index Space property, then DAC will create indexes in the default tablespace for the table owner of the database.
Note:
You must create the index space in the database before you can specify index spaces for indexes in DAC.To specify an index space for indexes by table type
In the Physical Data Sources tab of the Setup view, click the Index Spaces subtab.
In the subtab toolbar, click Generate.
When asked if you want to generate database properties based on table type, click Yes.
A list of available table types is displayed in the Table Type list.
Click OK to close the Generating Properties dialog.
In the Index Space column, enter an index space for each table type, and click Save.
You can specify how many indexes can be created in parallel for a specific table or how many indexes can be created in parallel for all tables associated with a specific physical data source connection.
To specify how many indexes can be created in parallel for a specific table
In the Physical Data Sources tab of the Setup view, select the appropriate physical data source in the top pane list.
Click the Parallel Indexes subtab.
Click New in the Parallel Indexes toolbar.
In the Name field, query for and enter the names of the table for which you want to specify how many indexes can be created in parallel.
For each table, in the Number of Parallel Indexes field, enter the number of indexes you want to create in parallel.
Click Save.
To specify how many indexes can be created in parallel for all tables associated with a physical data source
In the Physical Data Sources tab of the Setup view, select the appropriate physical data source in the top pane list.
Click the Edit subtab.
In the Num Parallel Indexes Per Table field, enter a numeric value to specify how many indexes can be created in parallel for a table on the specified physical data source.
Click Save.
This section applies to incremental loads only.
When an index is marked in DAC as Drop & Create Always or Drop & Create Bitmap Always (for Oracle databases), DAC drops the indexes before the data load and creates them after the data load. Performance can be improved by customizing and tuning how indexes are dropped and created during incremental loads.
When large fact tables are involved in an incremental load, dropping and creating indexes on the whole table can impact performance. Table partitioning can be a performance enhancing alternative to dropping and creating indexes. For example, if the data being loaded is mostly based on updates to recent data, you can partition the table based on time. If the incremental data affects the current year, it may be more beneficial to disable the current year partition than to drop the index on the whole table; and, similarly, it may be more beneficial to rebuild the indexes for the partition rather than create the indexes on the whole table. This can be achieved by using different syntaxes for full and incremental loads, as described in the following example.
This example illustrates, for incremental runs, how to create a parameter that can be used to override the drop and create index syntaxes. It also describes how to override the analyze table behavior so that only the current year partition is analyzed rather than the whole table.
This example assumes the table in the underlying schema has already been partitioned, with the partitioned names as PART_2001, PART_2002, and so on.
Create a source system parameter called CURRENT_YEAR, using the Timestamp data type and the runtime variable @DAC_ETL_START_TIME_FOR_TARGET.
For instructions on creating a source system parameter, see "Defining a Timestamp Type Parameter".
Create a custom index action to override the drop index behavior.
Create an index action called Drop Partitioned Index.
Enter the following SQL:
alter index getIndexName() modify partition PART_@DAC_$$CURRENT_YEAR unusable
For instructions on defining an index action, see "Defining a SQL Script for an Action".
Assign the Drop Partitioned Index action to indexes.
In the Indices tab, query for the table name and indexes that need to be dropped and created during the incremental runs.
For each record in the list of query results, right-click and select Add Actions.
Select Drop Index as the Action Type, Incremental as the Load Type, and Drop Partitioned Index as the Action.
Create a custom index action to override the create index behavior.
Create an index action called Enable Partitioned Index.
Enter the following SQL:
alter index getIndexName() rebuild partition PART_@DAC_$$CURRENT_YEAR nologging
Assign the Enable Partitioned Index action to indexes.
In the Indices tab, query for the table name and indexes that need to be dropped and created during the incremental runs.
For each record in the list of query results, right-click and select Add Actions.
Select Drop Index as the Action Type, Incremental as the Load Type, and Enable Partitioned Index as the Action.
Create a custom index action to override the analyze table behavior.
Create an index action called Analyze Current Partition.
Enter the following SQL:
DBMS_STATS.GATHER_TABLE_STATS( NULL, TABNAME => 'getTableName()' CASCADE => FALSE, PARTNAME => PART_@DAC_CURRENT_YEAR, ESTIMATE_PERCENT => DBMS_STATS.AUTO_SAMPLE_SIZE, GRANULARITY => 'FOR ALL INDEXED COLUMNS SIZE AUTO', DEGREE => DBMS_STATS.DEFAULT_DEGREE)
Select Stored Procedure as the Type.
Assign the Analyze Current Partition action to tables.
In the Tables tab, query for the table names that need to be analyzed during the incremental runs.
For each record in the list of query results, right-click and select Add Actions.
Select Analyze Table as the Action Type, Incremental as the Load Type, and Analyze Current Partition as the Action.
For a more detailed example of using partitions, see "Oracle Business Intelligence Applications Version 7.9.6.x Performance Recommendations [ID 870314.1]" on My Oracle Support.
The DAC heuristics functionality enables you to gather intelligence about the amount of incremental data that will be processed during an ETL execution. To optimize performance, you can use this intelligence to decide whether a task will run, whether tables are analyzed, and whether indexes are dropped and created.
This chapter contains the following topics:
Note:
DAC heuristics applies only to incremental data loads and not full loads.Typically, during an incremental ETL process, DAC drops indexes on target tables before a data load, recreates the indexes after the data load, and then analyzes the tables. These processes can impact overall ETL process times. DAC heuristics enables you to optimize performance by deciding whether tasks will run, tables are analyzed, and indexes dropped and created.
There are three parts to defining heuristics:
Defining the heuristics rule.
Associating the heuristics rule with a task.
Providing the means for computing the source and target counts to be used in the computation of heuristics.
You can determine whether or not you want a task to run as part of an ETL process. DAC heuristics looks at the primary source tables (or other tables you specify by writing custom SQL) before a data load to see if there are any incremental data records. The decision about whether a task will run is based on a threshold value you set. If the incremental record count is equal or less than the threshold value, DAC will not execute the task.
For example, assume you use DAC heuristics to count the records in a primary staging table for a dimension load task, and you set a threshold value of zero. If DAC heuristics finds no records, DAC will not execute this task when the ETL process runs. If DAC heuristics finds one or more records, the task will be executed.
You can determine whether indexes are dropped and created during an ETL process. For this action, you configure a heuristics rule that either counts the number of incremental records in the primary staging table (or other tables you specify by writing custom SQL) or that provides a ratio of the incremental records in the primary source table and the total records in the target table. This value is then compared to the threshold value you set. If the count or ratio is equal or less than the threshold value, DAC will not drop and create indexes.
To determine an appropriate threshold value, you need to have an understanding of the impact that dropping and recreating indexes has on your database.
Example Using the Count Option: Assume you configure DAC heuristics to count the incremental records in the primary source table. You have determined performance is only impacted during an incremental load with indexes in place if there are more than ten records to be loaded, so you set the threshold value at 10. If DAC heuristics finds between zero and nine incremental records in the primary source table, DAC will not drop and recreate indexes when the data is loaded into the target tables. The data will be loaded with indexes in place. If DAC heuristics finds ten or more incremental records in the primary staging table, DAC will drop and recreate indexes during the load process.
Example Using the Ratio Option: Assume you configure DAC heuristics to provide a ratio of the number of incremental records in the primary source table and the total number of records in the target table. You have determined that if incoming records are 5% or more of the total target table record count, then performance can be impacted with indexes in place. Therefore, you set the threshold value at 5. If DAC heuristics finds the ratio of incremental records in the primary source table to be 5% or more of the total records in the target table, then DAC will drop and recreate indexes during the load process. If the percentage of incremental records is equal or less than 5%, DAC will load the new records with the indexes in place.
You can determine whether or not you want tables to be analyzed after the ETL process runs. For this action, you configure a heuristics rule that either counts the number of incremental records in the primary staging table (or other tables you specify by writing custom SQL) or that provides a ratio of the incremental records in the primary source table and the total records in the target table. This value is then compared to the threshold value you set. If the count or ratio is equal or less than the threshold value, DAC will not analyze tables.
To determine an appropriate threshold value, you need to have an understanding of the impact that analyzing tables has on your database.
The count and ratio options work the same way for the action of analyzing tables as for dropping and creating indexes. For examples of these two options as they relate to dropping and creating indexes, see "Using Heuristics to Determine Whether Indexes Are Dropped and Created".
Note: This DAC heuristics setting overrides the Analyze Frequency (in Days) system property.
If a child task of a task group has heuristics defined, the task can optionally be specified as a heuristics driver. Any number of child tasks within a task group can be specified as a heuristics driver. Before a task group is executed, DAC computes the task heuristics of each of the child tasks. If the computation of any of the heuristics driver tasks falls under the threshold value, none of the child tasks are executed. The status of these tasks in the Current Run tab of the Execute view is "Not Executed." The Status Description is updated with the cumulative summary of all the driver children as to why the task was not executed.
A heuristics rule defines how the DAC heuristics functionality will manage the behavior of tasks, tables and indexes. Heuristics rules can apply to any DAC heuristics action; that is, you can use the same rule to indicate whether a task should run, whether tables should be analyzed, or indexes dropped and recreated. Heuristics rules apply to all containers. However, if you write custom SQL ("Writing Custom SQL for a Heuristics Rule") to specify tables other than the primary tables, the SQL is specific to the container in which you write it.
You use the Heuristics dialog to define heuristics rules. The Heuristics dialog contains the following fields:
Heuristic
The possible values are Count and Ratio. These options provide flexibility in how you gather intelligence. The Count option provides a count of the number of incremental records in the primary staging table that will be included in the ETL process. The Ratio option provides the number of incremental records in the primary source table and the total number of records in the target table. This value is expressed in the threshold field as a percentage.
The Count option can be useful if you are deciding whether or not to run a task. For example, if you only want a task to run if there is one or more records in a staging table, you would select the Count option and set the Threshold value at 1. DAC heuristics counts the number of records in the staging table. If it finds one or more records, the task will run. If it finds no records, the task will not run.
The Ratio option can be useful if you are deciding whether to drop and create indexes or analyze tables. For examples of using the count and ratio options, see "Using Heuristics to Determine Whether Indexes Are Dropped and Created".
Metric
Provides options for how multiple primary source tables are handled (that is, when more than one primary source table is associated with the same task). If you are defining a heuristics rule that involves a single primary source table, the Metric field does not apply.
The possible values are the following:
Sum
Aggregates the total of new or changed records in all of the primary source tables.
Average
Returns an average of the new or changed records in all of the primary source tables. That is, it will divide the sum of all the new or changed records in all of the primary tables by the number of primary tables.
Max
Returns the largest count of new or changed records of all of the primary source tables.
Min
Returns the smallest count of new or changed records of all of the primary source tables.
Source Table
The possible values are the following:
Primary
Specifies that intelligence will be gathered on one or more primary tables associated with the task that this heuristics rule is assigned to.
With SQL Overrides
Specifies that you will write custom SQL to include source tables other than the default primary source tables.
Threshold
The numerical value above which DAC heuristics will allow a particular action to take place.
For example, if you are doing a count of a primary staging table to determine whether a task should run, and if you want the task to run if one or more records appears in the table, you would set the Threshold value at 1.
If you use the Ratio option, the threshold value is a percentage. For example, suppose you were defining a heuristics rule for determining whether tables should be analyzed after an ETL process, and you knew you wanted to analyze tables if the incoming rows were more than 5% of the total target rows. In this case, you would set the threshold value at 5.
Follow this procedure to create a heuristics rule. After you create the rule, you need to assign it to a task. For instructions, see "Assigning a Heuristics Rule to a Task".
For an overview of DAC heuristics, see "About DAC Heuristics". For detailed information about the Heuristics dialog, see "About Heuristics Rules and the Heuristics Dialog".
To create a heuristics rule
In the DAC Client, on the Tools menu, select Seed Data, Heuristics.
The Heuristics dialog is displayed.
In the toolbar, click New.
In the Name field, enter a descriptive name for the heuristics rule.
In the Heuristics field, select one of the following:
Count
Provides a count of the number of records in the primary source table that will be included in the ETL process (based on the task to which you assign the heuristics rule).
Ratio
Provides a count of the number of records in the primary source table and the total number of records in the target table. If you select the Ratio option, the value of the Threshold field is a percentage.
If this heuristics rule applies to a task with multiple primary tables, select one of the following:
Sum. Aggregates the total of new or changed records in all of the primary source tables.
Average. Returns an average of the new or changed records in all of the primary source tables.
Max. Returns the largest count of new or changed records of all of the primary source tables.
Min. Returns the smallest count of new or changed records of all of the primary source tables.
In the Source Tables field, select one of the following:
Primary. Specifies that intelligence will be gathered on one or more primary tables associated with the task that this heuristics rule is assigned to.
With SQL Overrides. Specifies that you will write custom SQL to include tables other than the primary tables.
For instructions on writing custom SQL to create a heuristics rule, see "Writing Custom SQL for a Heuristics Rule".
In the Threshold field, enter a numerical value above which DAC heuristics will allow a particular action to take place.
For examples, see "About Heuristics Rules and the Heuristics Dialog".
(Optional) If you want to create this heuristics rule but leave it inactive, select Inactive.
Click Save.
After you create the heuristics rule, you need to assign the rule to a task. For instructions, see "Assigning a Heuristics Rule to a Task".
Before you perform this procedure, you need to have created a heuristics rule by following the instructions in "Creating a Heuristics Rule".
In this procedure, you will assign a heuristics rule to a task.
To assign a heuristics rule to a task
In the Design view, select the Tasks tab.
In the Tasks tab, select the task to which you want to assign the heuristics rule.
Select the Extended Properties subtab.
In the toolbar, click New.
A new, empty record is created.
Click in the Name field to expose the drop-down list, and then select Heuristics.
Double-click in the Value field to open the Property Value dialog.
The Property Value dialog enables you to select a heuristics rule that you created in the procedure "Creating a Heuristics Rule".
Assign the heuristics rule to a task by doing one of the following:
For a rule that determines whether a task should run, click in the Task Heuristic field, and then select the appropriate heuristics rule.
For a rule that determines whether tables should be analyzed, click in the Table Heuristic field, and then select the appropriate heuristics rule.
For a rule that determines whether indexes should be dropped and recreated, click in the Index Heuristic field, and then select the appropriate heuristics rule.
Click OK in the Select Value dialog, and then click OK in the Property Value dialog.
The heuristics rule is now assigned to a task and will be processed when the task to which it is assigned is called by an execution plan.
Heuristics rules are reflected in the Status Description field in the Task Details tab of the Details dialog, which is accessible by going to the Tasks subtab in the Current Runs or Run History tabs and clicking the Details button on subtab toolbar. The Status Description will indicate whether a step was skipped because of a heuristics rule, and the task status will be Not Executed.
When you create a heuristics rule, you have the option of gathering intelligence by looking at the primary source tables, which is the default option, or writing custom SQL to gather intelligence on tables other than primary source tables.
When you write custom SQL for a heuristics rule, it applies only to the source system container you choose before you write the SQL.
Note:
Before you perform this procedure, you must have already created a heuristics rule, by following the procedure "Creating a Heuristics Rule", and selected the option With SQL Overrides in the Source Tables field.To write custom SQL to create a heuristics rule
In the Design view, select the appropriate source system container from the drop-down list.
Select the Container Specific Actions tab.
Select Heuristic from the rightmost drop-down list on the toolbar.
In the toolbar, click New.
A new, empty record is created.
In the Name field, enter a descriptive name for the SQL, and then click Save.
Click in the Value field to open the Value dialog.
Select a format for the tree view on the left side of the window.
Flat view displays the SQL entries in a list format in their order of execution.
Category view displays the entries by the categories Custom SQL and Stored Procedure.
You can reorder the entries in the tree by dragging and dropping them.
Click Add in the right-hand toolbar.
A new, empty record is created in the top pane.
Enter or select the appropriate information.
| Field | Description | 
|---|---|
| Name | Enter a logical name for the SQL block. | 
| Type | Select one of the following: 
 | 
| Continue on Fail | Specifies whether an execution should proceed if a given SQL block fails. | 
| Retries | Specifies how many retries are allowed. If the number is not positive, a default number of one (1) will be used. | 
| Valid Database Platform | Specifies the valid database platforms against which the SQL will run. If this field is left empty, the SQL can be run against any database. | 
In the lower-right side text box, enter a SQL statement.
The SQL Statement tab to the left of the text box lists the supported SQL functions and DAC source system parameters that you can use in constructing custom SQL statements. Double-click a function or source system parameter to move it into the text box.
The source systems parameters list contains the applicable source system parameters defined in the DAC repository, with the prefix @DAC_. During runtime, the DAC Server resolves the source system parameter and replaces its name with the runtime value.
See Table 11-1 for a list of supported SQL functions.
Table 11-1 Functions for Heuristics SQL
| Function | Description | 
|---|---|
| getAnalyzeTableStatement() | Returns the default DAC analyze table statement. | 
| getDBType() | Returns the physical data source connection type (Oracle OCI8, Oracle Thin, DB2, DB2-390, MSSQL, Teradata, or BI Server). | 
| getImageSuffix() | Returns the table image suffix if one exists. Otherwise, the string is empty. | 
| getNamedSource() | Returns the physical connection name in DAC. | 
| getTableName() | Returns the table name. | 
| getTableOwner() | Returns the table owner. | 
| getTableSpace() | Returns the table space name if one exists. Otherwise, the string is empty. | 
| getTruncateTableStatement() | Returns the default DAC truncate table statement. | 
| toString() | Returns a string value. | 
(Optional) Enter a comment about the SQL in the Comment tab.
Click OK.
If the SQL statement contains source tables, assign the heuristics SQL to the source tables.
In the Design view, click the Tasks tab.
Select the task whose source table is the one to which you want to assign the heuristics SQL.
Click the Source Tables subtab.
Select or query for the appropriate source table.
Click in the Task Count Override field.
The Choose SQL Override dialog is displayed.
Select or query for the SQL heuristics rule you created in the preceding steps, then click OK.
Assign the heuristics SQL to a target table.
In the Design view, click the Tasks tab.
Select the task whose target table is the one to which you want to assign the heuristics SQL.
Click the Target Tables subtab.
Select or query for the appropriate target table.
Click in the Count Override field.
The Choose SQL Override dialog is displayed.
Select or query for the SQL heuristics rule you created in the preceding steps, then click OK.
The heuristics SQL is now assigned to a task and will be processed when the task to which it is assigned is called by an execution plan.
You can configure the full and incremental load commands for tasks to repeat (or loop) multiple times during the execution of an ETL process. This feature can be useful to enhance performance in the following ways:
To configure workflows to read and write in parallel.
To repeat application logic that applies to all repetitions of a task, for example the flattening of hierarchies.
The process for setting up workflows for looping involves two steps: First you define the looping properties, and then you create parameters to access the looping properties you defined.
This section includes the following topics:
Follow this procedure to define the properties for looping a workflow.
To define a workflow for looping
Enable the Informatica workflow property Configure Concurrent Execution:
In Informatica Workflow Manager, open the appropriate workflow in the Workflow Designer.
On the menu bar, select Workflows, and then select Edit.
In the Edit Workflow dialog, select the Enabled check box for the Configure Concurrent Execution property.
Click the Configure Concurrent Execution button.
In the Configure Concurrent Execution dialog, select Allow concurrent run with same instance name.
Click OK.
In the DAC Design view, select the appropriate container from the drop-down list.
In the Tasks tab, query for the task for which you want to configure looping.
With the appropriate task selected in the top pane, click the Extended Properties subtab.
In the bottom pane toolbar, click New.
A new record is created with the name Loops. You cannot change the name of this record.
Click in the Value field of the new record.
The Property Value dialog is displayed.
In the Loops field, enter a numeric value to specify how many times you want the workflow to repeat.
The number of loops that you specify is applicable to all task execution types, that is, Informatica, SQL File, Stored Procedure, and External Program.
(Optional) Select the Parallel check box to have the run instances of the same task execute in parallel. If you do not select the Parallel check box, the run instances of the same task will be executed serially.
Note:
Run instances will be executed in parallel only if resources are available. For example, if your environment is set up to run 10 workflows in parallel, and a task's Number of Loops property is set to 20, only the first 10 instances will be executed. These 10 instances will have a Current Runs status of Running. The remaining 10 instances will have a status of Runnable. For a description of all Current Runs statuses, see "Current Runs Tab".
When the instances are executed in serial, the first failure encountered stops the rest of the executions.
When the instances are executed in parallel, any failure in the running batch stops the rest of the executions.
(Optional) Select the Restart All check box to do the following:
Reissue the truncate command (if it exists as a task detail).
Mark all task details pertaining to the run instances as Queued. The Queued status enables them to be executed again.
If you do not select the Restart All check box, the execution of the task will resume from the point of failure. If any runtime instances of the workflow fails, the task itself is marked as Failed. If you enable the Restart All property, the truncate command (if it exists as a task detail) is reissued, and all the task details pertaining to the run instances are marked as Queued and get re-executed. If you do not enable the Restart All property, the execution resumes from the point of failure
Click OK in the Property Value dialog.
Click Save in the subtab toolbar.
When the task executes, DAC assigns an instance number to each instance of the workflow. You can view the instance number in the Task Details subtab of the Current Runs tab.
The instance number is accessible as a predefined DAC variable named @DAC_TASK_RUN_INSTANCE_NUMBER. You can use this DAC variable as a runtime value for parameters that you create. The DAC variable @DAC_TASK_RUN_INSTANCE_NUMBER is available for use with parameters at the source system and task level.
To access the information defined in the looping properties, you need to create parameters in DAC that will be consumed by Informatica at runtime. You can create static or runtime parameters at either the task or source system level. The parameter values can be any of the following:
Total number of loops
Current instance number
Current instance number in reverse
Conditional statement
For example, if the loop number is one, provide value x, or else provide the value y.
For more information about how parameters are used in DAC, see "Defining and Managing Parameters".
To create a parameter with a static value
In the DAC Design view, select the appropriate container from the drop-down list.
Do one of the following:
To set up a parameter at the task level, go to the Task tab, and click the Parameters subtab.
To set up a parameter at the source system level, go to the Source System Parameters tab.
Click New in the toolbar.
In the Name field, enter a name for the parameter.
In the Data Type field, select Text.
In the Load Type field, select one of the following.
Full. To specify a full load.
Incremental. To specify an incremental load.
Both. To specify both full and incremental loads.
Click Save to save the record.
You must save the record before proceeding.
Click in the Value field to open the Enter Parameter Value dialog.
Select Static.
Enter the appropriate value.
Click Save in the toolbar.
To parameterize the Loops property as a runtime value
In the Design view, select the appropriate container from the drop-down list.
Do one of the following:
To set up a parameter at the task level, go to the Task tab, and click the Parameters subtab.
To set up a parameter at the source system level, go to the Source System Parameters tab.
Click New in the toolbar.
In the Name field, enter a name for the parameter.
In the Data Type field, select Text.
In the Load Type field, select one of the following.
Full. To specify a full load.
Incremental. To specify an incremental load.
Both. To specify both full and incremental loads.
Click Save to save the record.
You must save the record before proceeding.
Click in the Value field to open the Enter Parameter Value dialog.
Select Runtime.
Select the appropriate runtime variable.
Click OK.
Click Save in the toolbar.
Typically, a long running task is constrained by single reader and writer threads on Informatica. The looping of workflows parallelizes the reads and writes such that mutually exclusive records are read and written without having overlaps.
For example, if a fact table load takes a long time to run because the volume of incremental data is high, then you can parallelize the load process by doing the following:
Modify the schema for the staging table and introduce a numerical column called LOOP_NUMBER.
In DAC, create a task level parameter for the extract process called NUMBER_OF_LOOPS, and set the value to 5.
Modify the extract mapping for the Informatica task so that each of the records loaded in the staging table gets a value of from 1 to 5.
Create a task level parameter in DAC for the load process called NUMBER_OF_LOOPS, and set it to the same value as the extract mapping.
Create a new parameter at the task level called CURRENT_LOOP_NUMBER, and set the runtime variable as @DAC_CURRENT_INSTANCE_NUMBER. This value will be populated with a different number, from 1 to 5, for each invocation of the task.
In the extended properties for the task, create a new record for "Loops" and set the value as @DAC_NUMBER_OF_LOOPS.
Modify the Informatica mapping to include a where clause for the load logic: "WHERE LOOP_NUMBER = $$CURRENT_LOOP
WHERE LOOP_NUMBER = $$current_LOOP
When DAC runs the load process, it will spawn off five processes in parallel, and will send a specific value for the CURRENT_LOOP_NUMBER parameter to Informatica. Because DAC is reading and writing in parallel, the data load time can be reduced.
The customsql.xml file, located in the <DAC_Config_Location>\CustomSQLs directory, contains the default syntax DAC uses for dropping and creating indexes and analyzing tables. You can edit the customsql.xml file to change the behavior of these operations.
Note: Modifying the customsql.xml file affects the syntax on a global level, for all execution plans. If you want to do customizations at the table or index level, you should consider using the Action functionality. See "Using Actions to Optimize Indexes and Collect Statistics on Tables" for more information.
To edit the Analyze Table syntax
Open the customsql.xml file located in the <DAC_Config_Location>\CustomSQLs directory.
Locate the Analyze Table syntax for the appropriate database type.
For example, the syntax for an Oracle database is as follows:
<SqlQuery name = "ORACLE_ANALYZE_TABLE" STORED_PROCEDURE = "TRUE"> DBMS_STATS.GATHER_TABLE_STATS(ownname => '@TABLEOWNER', tabname => '%1', estimate_percent => 30, method_opt => 'FOR ALL COLUMNS SIZE AUTO',cascade => true ) </SqlQuery>
Edit the syntax.
For example, to gather statistics for only the indexed columns, edit the syntax as follows:
<SqlQuery name = "ORACLE_ANALYZE_TABLE" STORED_PROCEDURE = "TRUE"> DBMS_STATS.GATHER_TABLE_STATS(ownname => '@TABLEOWNER', tabname => '%1', estimate_percent => 30, method_opt => 'FOR ALL INDEXED COLUMNS',cascade => true ) </SqlQuery>
Note:
The variables @TABLEOWNER, %1, %2, and so on, will be substituted appropriately by the DAC when the statement is executed.To edit the Create Index syntax
Open the customsql.xml file located in the <DAC_Config_Location>\CustomSQLs directory.
Locate the Create Index syntax for the appropriate database type, and edit the syntax.