
5 Building and Running Execution Plans

This chapter provides information about building and running execution plans.

This chapter contains the following topics:

  • Introduction to Execution Plans and Load Processes

  • About Single-Source Execution Plans

  • About Multi-Source Execution Plans

  • About Micro ETL Execution Plans

  • Execution Plan Build Process Rules

  • Building and Running Execution Plans

  • Running Execution Plans Concurrently

  • Setting Up Extract Delays, Event Delays and Data Source Notifications

  • How the DAC Server Handles Requests to Start and Stop Execution Plans

Introduction to Execution Plans and Load Processes

An execution plan is the unit of work you use to organize, schedule, execute, and monitor ETL processes. To generate an execution plan, you specify one or more subject areas, connectivity parameters, and, optionally, pre- and post-ETL tasks. DAC then generates an ordered set of tasks, with dependencies, that will carry out the ETL process.

When you run an execution plan, data is extracted from one or more tables in the source system database, loaded into staging tables, and then transformed and loaded into tables in the data warehouse. The ETL process is carried out in either full or incremental mode, depending on the refresh dates for the tables associated with the execution plan.

If the source or target table refresh dates for a task are null, then DAC invokes a full load workflow command. If both the source and target tables have refresh dates, then DAC invokes the incremental workflow command. For a detailed description of refresh dates, see "About Refresh Dates and DAC's Incremental Load Strategy".

DAC supports the following extract and load combinations:

  • Single extract and single load

  • Multiple extract and single load

  • Multiple extract and multiple loads

About Single-Source Execution Plans

A single-source execution plan extracts data from a single instance of a single source system container. As illustrated in Figure 5-1, in this scenario data is extracted from a single source, loaded into staging tables, and then loaded into the data warehouse.

Figure 5-1 Single Extract and Single Load Option

This diagram is described in the surrounding text.

Truncate Table Behavior in a Single Extract Scenario

When DAC truncates a table, it also drops and recreates the table's indexes (if any) and analyzes the table after loading the data. In a single-extract scenario, the extract task's truncation process for a table is as follows:

  1. Truncate table.

  2. Drop indexes.

  3. Run the Informatica mapping.

  4. Create indexes.

  5. Analyze table.
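In database terms, this sequence corresponds roughly to the following statements. This is only an illustrative sketch, assuming an Oracle target database and hypothetical table and index names (W_ORG_DS and W_ORG_DS_U1 are placeholders, not names taken from this guide); DAC generates the actual statements and index definitions from its metadata.

    -- Illustrative sketch only; table, index, and column names are hypothetical.
    TRUNCATE TABLE W_ORG_DS;

    DROP INDEX W_ORG_DS_U1;

    -- (The Informatica extract mapping loads W_ORG_DS at this point.)

    CREATE UNIQUE INDEX W_ORG_DS_U1
      ON W_ORG_DS (INTEGRATION_ID, DATASOURCE_NUM_ID);

    -- Analyze the table (Oracle syntax shown; DAC issues the statement
    -- appropriate to the target database).
    BEGIN
      DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'W_ORG_DS');
    END;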

For instructions on building a single-source execution plan, see "Building and Running Execution Plans".

About Multi-Source Execution Plans

When you have a multi-source environment, you can use a single execution plan to extract data for a given subject area from all the sources. When you build the execution plan, DAC creates as many instances of the extract tasks as there are sources, without the need to duplicate workflows or DAC metadata.

There are two types of multi-source execution plans:

For instructions on building a multi-source execution plan, see "Building and Running Execution Plans".

Considerations for Multi-Source Execution Plans

This section describes properties and behaviors you should consider when setting up multi-source execution plans.

Source Priority

The order in which DAC loads data from the different sources that are participating in the ETL process is determined by the priority of the physical data source connection, as set in the Source Priority field in the Physical Data Sources tab of the Setup view. This property ensures that tasks attempting to write to the same target table will not be in conflict. If two sources are given the same priority, DAC will randomly stagger the tasks.

Truncate Table Behavior

If a target table is shared across different sources, it will be truncated only once. The priority of the data source (specified by the Source Priority property) determines which of the extracts truncates the tables. For more information about how tables are truncated, see "Execution Plan Build Process Rules".

Note that truncate properties should be the same across all source system containers associated with a multi-source execution plan.

In a multiple extract scenario, the truncation process for a table is as follows:

  1. Truncate table (first extract task).

  2. Drop indexes (first extract task).

  3. Run the Informatica mapping (first extract task).

  4. Run the Informatica mapping (second extract task).

  5. Create indexes (second extract task).

  6. Analyze table.

Enabling Failure Restarts When Extracting From Multiple Sources

Typically, extract tasks truncate the target tables, and, therefore, will not have update strategies. To enable failure restarts when extracting from multiple sources, you can create an action that will predelete all the records that are loaded from the source. See "Using a Task Action to Enable Failure Restarts When Extracting From Multiple Sources" for instructions.
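For example, a predelete action might run a statement similar to the following. This is a hedged sketch: the staging table name W_PERSON_DS and the data source number 310 are hypothetical placeholders, and an actual action would use the table and DATASOURCE_NUM_ID value appropriate to the task.

    -- Hypothetical predelete action: remove the rows previously loaded from
    -- this source so the extract can safely be re-run after a failure.
    DELETE FROM W_PERSON_DS
     WHERE DATASOURCE_NUM_ID = 310;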

Task Groups

Tasks should belong to the same task groups across all source system containers associated with the multi-source execution plan. For example, if a task belongs to a task group in one container but is not associated with a task group in a different container, then an exception will be thrown. Also, if a task with the same name belongs to two different task groups, an exception will be thrown.

Full and Incremental Workflow Commands

Full and incremental workflow commands need to match for tasks across all source system containers.

Configuring Extracts and Notifications

You can delay extracts for a particular source system by using the Delay property in the Connectivity Parameters subtab of the Execution Plans tab. You can also use the Event Delay feature to configure the extracts for different source systems to occur independently.

The Notifications feature enables you to initiate email notifications and the execution of custom SQL based on the first or last read or write operations on a particular data source. See "Setting Up Extract Delays, Event Delays and Data Source Notifications" for more information.

Task Physical Folder Instance Priority

DAC deduplicates tasks from multiple source containers, or from multiple instances of the same container, based on the task name, the primary physical source, and the primary physical target, and executes only a single instance of the task. If multiple containers contain the same task with the same source and target databases but different task physical folders, DAC picks one folder: if the folders have different priorities, the folder with the lowest priority value is picked; if they have the same priority, DAC selects a folder at random.

In Oracle BI Applications, the Instance Priority for task physical folders is predefined. In certain situations, such as if you are using a vertical application, you may want to execute a task from a different folder. You can specify the preference of task physical folders by modifying the Instance Priority. To do so, on the Tools menu, select Seed Data, and then select Task Physical Folders.

DATASOURCE_NUM_ID

  • All tables should contain the column DATASOURCE_NUM_ID, which describes the source from which the data is coming.

  • Unique indexes should always include the column DATASOURCE_NUM_ID (see the sketch following this list).

  • All the extract mappings should populate the DATASOURCE_NUM_ID column from the parameter file produced by DAC.

  • All the load mappings should extract the value of the DATASOURCE_NUM_ID column from the staging tables.
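To illustrate the unique index guideline above, a warehouse table whose natural key is unique only within a single source would include DATASOURCE_NUM_ID in its unique index. This is a hedged sketch using hypothetical table and column names (W_ORG_D, INTEGRATION_ID), not definitions taken from this guide.

    -- INTEGRATION_ID alone is unique only within one source system, so
    -- DATASOURCE_NUM_ID is part of the unique index.
    CREATE UNIQUE INDEX W_ORG_D_U1
      ON W_ORG_D (INTEGRATION_ID, DATASOURCE_NUM_ID);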

Multi-Source Execution Plan Extract and Load Scenarios

Multi-source execution plans support two load scenarios:

  • Multiple extract and single load. In this scenario, as illustrated in Figure 5-2, data is extracted from multiple sources and loaded into central staging tables. After all of the extract tasks have completed, the data is loaded into the data warehouse in a single load process.

  • Multiple extract and multiple loads. In this scenario, as illustrated in Figure 5-3, data is extracted from multiple sources and loaded into non-centralized staging tables. After all of the extract tasks have completed, the data is loaded into multiple data warehouses using multiple load processes.

Figure 5-2 Multiple Extract and Single Load Option

This diagram is described in the surrounding text.

Figure 5-3 Multiple Extract and Multiple Load Scenario

This diagram is described in the surrounding text.

About Micro ETL Execution Plans

Micro ETL execution plans are ETL processes that you schedule at very frequent intervals, such as hourly or half-hourly. They usually handle small subject areas or subsets of larger subject areas.

For instructions on building a micro ETL execution plan, see "Building and Running Execution Plans".

This section contains the following topics:

  • Why Use a Micro ETL Execution Plan?

  • About Refresh Dates and Micro ETL Execution Plans

  • Important Considerations When Using Micro ETL Execution Plans

  • Designing a Micro ETL Execution Plan

Why Use a Micro ETL Execution Plan?

A regular execution plan includes one or more fact tables and all content associated with the fact tables, such as dimension tables and related tables, and so on. The more associated content that is added to an execution plan, the longer the ETL process runs. If a subject area is small, for example, with one fact table and two dimension tables, there is no need to create a micro ETL execution plan. Instead, you can schedule an execution plan with a single, small subject area to run at frequent intervals.

Micro ETL execution plans, on the other hand, accommodate a business need for specific users to access updated subsets of data at frequent intervals. For example, some users may find their reports are stale if data is only refreshed once nightly. In such cases, a micro ETL execution plan can refresh the results of a small subject area or a subset of a star schema hourly or half-hourly. If you have a fact table with 50 dimension tables as well as multiple other related stars, an execution plan containing this fact table would run for a long period of time and could not be scheduled to run at frequent intervals. You could create a micro ETL execution plan by removing the unnecessary tables and tasks, leaving only the fact and dimension tables required for a specific report. See "Designing a Micro ETL Execution Plan" for instructions.

You should be aware that inconsistencies can occur when running micro ETL execution plans, such as dimension keys pointing to "Unspecified" records and discrepancies between facts and aggregates. However, the subset of data extracted for a micro ETL execution plan will be updated and patched during a regularly scheduled nightly ETL process. See "Important Considerations When Using Micro ETL Execution Plans" for other points to consider.

Note: All tasks that need to participate in the micro ETL execution plan should have update strategies that can update the data even though it has already been loaded into the data warehouse.

About Refresh Dates and Micro ETL Execution Plans

DAC tracks refresh dates for tables in micro ETL execution plans separately from other execution plans and uses these refresh dates in the change capture process. After a micro ETL execution plan runs, DAC populates refresh date values in the Micro ETL Refresh Dates subtab of the Execution Plans tab. If a subject area is used in a regular execution plan (an execution plan with the Micro ETL option in the Execution Plans tab not selected) as well as a micro ETL execution plan, DAC maintains refresh dates for the tables in the regular execution plan in the Refresh Dates subtab of the Physical Data Sources tab (Setup view).

In cases where a subject area is used in both a regular and micro ETL execution plan and the micro ETL execution plan is suspended for a few days but the regular execution plan runs nightly, DAC automatically detects the last refresh date for the tables common to both execution plans and intelligently extracts only the most recent records for the micro ETL execution plan. Micro ETL refresh dates do not affect the refresh dates of regular execution plans.

As an example, suppose a micro ETL execution plan runs every hour and the complete (regular) execution plan runs once a day at 1:00 PM. When the micro ETL execution plan runs at 10:00 PM, it gets a refresh timestamp of 9:00 PM. When the complete execution plan runs at 1:00 PM, it gets a refresh timestamp of 1:00 PM the previous day, even though the micro ETL execution plan ran at 12:00 PM. As a result, all the records that were already loaded into the data warehouse by the micro ETL execution plan are re-extracted. As long as all tasks have proper update strategies, DAC will re-update the records that were loaded during the day.

Note:

The image sync for Siebel sources does not occur during micro ETL execution plans. This means that the records from the I image tables do not get moved into the R image tables. Therefore, the records are re-extracted during the complete (regular) execution plan.

Important Considerations When Using Micro ETL Execution Plans

Micro ETL processes can cause issues with data inconsistencies, data availability, and additional load on the transactional database. Therefore, you should consider the following factors before implementing a micro ETL process:

  • For related star schemas, if one schema is omitted from a micro ETL execution plan, the cross-star reports may be inaccurate. For example, if the Person fact table is refreshed more frequently than the Revenue fact table, a report that spans the Person and Revenue dimensional schemas may produce inconsistent results.

  • If you omit dimension tables from a micro ETL execution plan, the foreign keys for the fact tables will point to Unspecified rows for the new dimension records. The foreign key references will be resolved when the Complete ETL execution plan is run, but users of the reports should be aware of such inconsistencies.

  • If you do not include aggregate tables in micro ETL execution plans, the reports that use data from these tables will be inconsistent with the reports that use data from the detailed fact tables. However, if aggregate tables are included in the micro ETL execution plan, the aggregate calculations are performed for each ETL process, which will take a constant amount of time and may be inefficient to perform at such frequent intervals.

  • Ensure any tasks or tables for soft-deletes are included in the micro ETL execution plan so that deleted records in the source system are identified.

  • Ensure all downstream post-load processes, as well as aggregates, are included. These can involve actual business dependencies, such as the GL Reconciliation process in Financial Analytics. For example, suppose a customer wants to refresh just the AP (Payables) fact three times a day. The whole GL Reconciliation process and its tasks must also run to match the GL with AP; otherwise, there will be a mismatch between the balance and transaction reports.

  • Hierarchy tables are rebuilt during every ETL execution plan by querying the base dimension tables. This operation takes a constant amount of time. If the base tables are big, this operation may take a long time and may be inefficient if the micro ETL execution plan runs several times a day. However, if you avoid populating the hierarchy tables during micro ETL processes, data inconsistencies will occur.

  • With micro ETL execution plans, BI Server caching will occur more frequently, which may have performance implications.

  • Micro ETL execution plans will put more load on the transactional database because of the frequent extracts.

  • Ensure the business is aware of the frequency of micro ETL execution plans and how dashboard results and critical reports are impacted throughout the day.

Designing a Micro ETL Execution Plan

Follow this procedure to design a micro ETL execution plan.

To design a micro ETL execution plan

  1. In the Design view, select the appropriate source system container from the drop-down list in the toolbar.

  2. Go to the Subject Areas tab, and click New.

  3. Enter a name for the subject area, and click Save.

  4. In the Tables subtab, click Add/Remove.

    The Choose Tables dialog is displayed. The left-hand window lists all the tables held in the selected container.

  5. Query for one or more fact tables.

  6. Select the fact table (use Shift+click to select more than one table), and click Add.

  7. Click OK to close the Choose Tables dialog.

  8. Click Assemble in the Subject Areas tab toolbar.

  9. In the Assembling... dialog, select Selected record only.

    The tree view on the left side of the Subject Area Assembly dialog displays the fact tables that belong to the subject area. You can expand the fact table node to view its related tables.

  10. Deselect the unnecessary fact and dimension tables.

  11. Click Calculate Task List to assemble the tasks needed to load the tables displayed in the tree view.

    A list of tasks is displayed in the Task List tab on the right side of the window. Also, the Table List tab displays the tables included in the subject area.

  12. Click Accept to complete the subject area assembly process.

  13. Inactivate any unnecessary tasks:

    1. With the appropriate subject area selected in the Subject Areas tab, click the Tasks subtab.

    2. Select the Inactive check box for any unnecessary task.

    3. Reassemble the subject area by clicking Assemble in the top pane toolbar.

Note:

You can also create a micro ETL execution plan by creating a new subject area and then using the Tasks subtab to associate specific tasks with it.

Execution Plan Build Process Rules

DAC builds an execution plan (generates a task dependency graph) based on the following rules for the metadata in the Design and Setup views.

Building and Running Execution Plans

Follow this procedure to run a single-source, multi-source, or micro ETL execution plan.

Before you attempt to run an execution plan, make sure you have completed the following:

Before you attempt to run a multi-source execution plan, you must first define the priority for each source. The priority specifies the order in which DAC will load the data from the different sources.

To define a source priority for multi-source execution plans

  1. In the Setup view, select the Physical Data Sources tab.

  2. For each of the physical data sources that will participate in the multi-source execution plan, enter a numerical value in the Source Priority field.

    The lower the numerical value, the higher the priority. For example, if you enter a value of 1, data from this source will be loaded first.

To build and run an execution plan

  1. In the Execute view, select the Execution Plans tab.

  2. Create a new execution plan.

    1. In the top pane toolbar, click New.

    2. In the top pane window or in the Edit subtab, enter the following information.

      Name: Enter a name for the execution plan.

      Full Load Always: Select to indicate the execution plan will always execute a full load. Note that DAC does not store refresh dates when this option is selected.

      Micro ETL: Select to indicate the execution plan is a micro ETL execution plan. See "About Micro ETL Execution Plans" for more information.

      Analyze: Select to indicate the tables associated with this execution plan will be analyzed.

      Analyze Truncated Tables Only: Select to indicate only truncated tables will be analyzed.

      Drop/Create Indices: Select to indicate indexes on the tables associated with this execution plan will be dropped and created.

      Inactive: Select to inactivate the execution plan.


    3. Click Save.

  3. For a micro ETL execution plan, create a subject area in the Subject Areas tab of the Design view by doing one of the following:

    • Create a new subject area by adding the appropriate tasks to it.

    • Create a new subject area by adding the appropriate tables and then deleting the unnecessary tasks.

    • Modify an existing subject area by adding or removing tasks.

    Assemble the subject area by clicking Assemble in the top pane toolbar. See "Creating a Subject Area" for more information.

  4. Associate one or more subject areas with the execution plan.

    1. Click the Subject Areas subtab.

    2. Click Add/Remove in the bottom pane toolbar.

    3. In the Choose Subject Areas dialog, select the appropriate source system container from the drop-down list.

    4. Query for the subject area you want to associate with the execution plan. For micro ETL execution plans, query for the subject area you created in step 3.

    5. Select the subject area and click Add.

      You can repeat this process to associate multiple subject areas from any available source system container with an execution plan.

    6. Click OK to close the window.

  5. Generate the runtime connectivity parameters.

    1. Click the Connectivity Parameters subtab, and then click Generate in the bottom pane toolbar.

      The Generating Parameters... dialog lists the containers that are involved in this execution plan.

    2. Enter the number of copies of each container that are needed, and then click OK.

      DAC automatically generates the parameters required for each copy of the source system container. Note that not all copies require all of the possible parameters.

    3. In the Value column, for each folder or database connection, select the appropriate physical folder name or database connection.

      Note:

      For the data source type of FlatFileConnection, make sure you have copied all files into the directory specified in the DAC system property InformaticaParameterFileLocation.

    4. (Optional) If you are extracting data from more than one source system and want to stagger the data extracts, in the Delay field for the appropriate data source, enter a value for the number of minutes you want to delay this extract after the first extract process has started.

    5. (Optional) Set the Prune Time property. This setting subtracts the Prune Time value from the LAST_REFRESH_DATE and supplies this value as the value for the $$LAST_EXTRACT_DATE parameter. See "Prune Time Column" for more information.

  6. (Optional) Add one or more tasks that will run before the ordered tasks generated by DAC.

    1. Click the Preceding Tasks subtab, and then click Add/Remove.

    2. Select the appropriate container from the drop-down list.

    3. Query for and select the task you want to add, and then click Add.

  7. (Optional) Add one or more tasks that will run after the ordered tasks generated by DAC.

    1. Click the Following Tasks subtab, and then click Add/Remove.

    2. Select the appropriate container from the drop-down list.

    3. Query for and select the task you want to add, and then click Add.

  8. In the top pane of the Execution Plans tab, make sure the new execution plan is highlighted, and click Build.

  9. In the Building... dialog, select the Selected Record Only option to build only the selected execution plan.

  10. Click the Ordered Tasks subtab, and verify the following:

    1. Click Details in the toolbar, and review each task's predecessor and successor tasks to confirm tasks common to multiple sources are ordered in a manner consistent with the priority of the source connection.

    2. Confirm that load tasks appear only once even if there are multiple extracts for tables common to multiple sources.

    3. For tasks common to multiple sources, click Unit Test in the toolbar, and confirm that the first common task truncates the common target table and the following tasks do not. For instructions on unit testing a task, see "Unit Testing Execution Plan Tasks".

  11. Make sure the DAC Server is running. For instructions on starting the DAC Server, see "Managing the DAC Server".

  12. Start the execution plan:

    1. Click Run Now.

    2. In the Starting ETL dialog, click Yes.

      Note: The Honor Time Delays property determines whether the Delay property in the Connectivity Parameters subtab will be active.

    Once the execution plan starts running, you can monitor its progress in the Current Runs tab. For instructions, see "Monitoring Execution Plan Processes".

    For information about how refresh dates are tracked, see "About Refresh Dates and DAC's Incremental Load Strategy".

    To schedule an execution plan, see "Scheduling an Execution Plan".

Running Execution Plans Concurrently

This section contains the following topics:

  • Introduction to Running Execution Plans Concurrently

  • About Resource Usage

  • Viewing Execution Plan Concurrent Dependencies

  • Configuring DAC to Run Execution Plans Concurrently

  • Enabling Informatica Workflows to Be Shared by Multiple Execution Plans

  • Explicitly Defining Execution Plans as Independent or Dependent

Introduction to Running Execution Plans Concurrently

You can configure DAC to run multiple execution plans concurrently if the execution plans are not dependent on one another. To avoid data inconsistency, execution plans that have dependencies with other execution plans cannot run concurrently (unless you override the default behavior).

Dependent Execution Plans

Execution plans are considered dependent if 1) they load data into tables that have the same name on the same physical data source; and 2) one execution plan writes to a table and the other execution plan reads from the same table on the same physical data source. Execution plans that have dependencies with other execution plans are not eligible to run concurrently if any of the following scenarios are true:

  • The execution plans share one or more target tables on the same physical data source.

  • An execution plan has a target table that is a source for another execution plan. For example, execution plan A writes to table 1 on the physical data source DW and execution plan B reads from table 1 on DW.

  • An execution plan has a source table that is a target table for another execution plan. For example, execution plan C reads from table 2 on the physical data source DW and execution plan D writes to table 2 on DW.

Execution Plans That Are Eligible to Run Concurrently

DAC allows execution plans to run concurrently as long as they write to unique tables. Tables are considered to be unique in relation to other tables if the table names are unique or if tables have the same name but reside in different physical data sources.

DAC does not distinguish whether the same Informatica workflows are used by multiple execution plans. Therefore, as long as execution plans write to unique tables, they can run concurrently even if they share the same Informatica workflows.

For example, consider a scenario in which you have development and QA environments that each have a different OLTP and data warehouse but share the same DAC and Informatica repositories, and, therefore, use the same Informatica workflows. Execution plans for these two environments that share Informatica workflows can run concurrently.

DAC provides two options for enabling Informatica workflows to be shared by multiple execution plans:

  • Defining an execution plan prefix

    You can define a prefix for each execution plan using the DAC_EP_PREFIX execution plan level parameter. When you use this option you are able to use single instances of the DAC Server, Informatica Integration Service, and DAC and Informatica repositories to run execution plans from multiple sources. That is, you can run multiple execution plans at the same time using the same metadata.

    This solution is useful if you have multiple sources with execution plans that have the same subject area. It can also be useful if the sources are in the same time zone and the execution plans need to run simultaneously or the sources are in different time zones but have overlapping execution plan schedules.

    The workflows that are common to different execution plans must be enabled in Informatica Workflow Manager to run concurrently. DAC will invoke the workflows with an instance name that is in the format <prefix>_workflowname. DAC captures the Informatica pmcmd and pmrep outputs and stores them as log files for each ETL process run. For more information about log file naming conventions, see "DAC Session Log File Naming Conventions" and "Informatica Log File Naming Conventions".

    For instructions on defining an execution plan prefix, see "Defining an Execution Plan Prefix".

  • Copying the Informatica workflow folders and creating new folder mappings

    You can make copies of the Informatica workflow folders so that there is one extract and load folder combination for each execution plan, and then create new mappings in DAC for the physical to logical folders. For instructions, see "Copying Informatica Workflow Folders and Creating New Folder Mappings".

About Resource Usage

When you run execution plans concurrently, the following resource configurations apply to each ETL process. You should consider these properties when deciding how many execution plans you will configure to run concurrently.

  • Num Parallel Workflows per EP

    This property is located on the Informatica Servers tab in the Setup view. It specifies the maximum number of workflows that can be executed in parallel on the Informatica Server for each ETL run. For example, if this value is set to 10 and three execution plans are running concurrently, 30 Informatica workflows may be running at any given time.

  • Num Connections per EP

    This property is located on the Physical Data Sources tab in the Setup view. It specifies the number of database connections that each ETL will open to the data source.

  • Generic Task Concurrency Limit per EP

    This system property is located on the DAC System Properties tab in the Setup view. It specifies for each execution plan the number of tasks with an Execution Type other than Informatica that can run concurrently.

Note:

When execution plans run concurrently, they do not share resources. Keep in mind that the values you set for these properties apply to each ETL process.

For example, if you configure DAC to run two execution plans concurrently, your environment must have the resources for twice the value you set for the Num Parallel Workflows per EP, Num Connections per EP, and Generic Task Concurrency Limit per EP properties.

Make sure you have reviewed the system requirements and certification documentation, available on Oracle Technology Network, for information about hardware and software requirements, platforms, and databases, as well as the Oracle BI Applications configuration information in Oracle Fusion Middleware Installation and Configuration Guide for Oracle Business Intelligence Applications.

Viewing Execution Plan Concurrent Dependencies

In considering whether to run execution plans concurrently, you should have an understanding of the concurrent dependencies (common target tables) between the execution plans.

To view execution plan concurrent dependencies

  1. In the Execute view, select the Execution Plans tab.

  2. Right-click one of the execution plans you want to compare, and then select Concurrent Dependency.

  3. From the drop-down list in the Concurrent Dependency dialog, select the execution plan you want to compare with the execution plan you selected in step 2.

    The common target tables between the two execution plans are displayed as well as the tables that one execution plan writes to and the tables that the other execution plan reads from.

Configuring DAC to Run Execution Plans Concurrently

To configure DAC to run execution plans concurrently, you need to do the following:

  • Set the Concurrency Level system property, which controls the number of independent execution plans that DAC will allow to run in parallel. See the instructions below.

    When the value of the Concurrency Level system property is more than one, each execution plan starts in a separate Java process. An identifier for each process is stored in the OS Process Name column on the Current Runs tab in the Execute view.

  • Add the path of the JAVA_HOME directory to the PATH environment variable.

To set the Concurrency Level system property

  1. In the DAC Setup view, select the DAC System Properties tab.

  2. In the list of system properties, find Concurrency Level.

  3. Click in the Value column, and enter a value that specifies the number of execution plans you want to run in parallel.

  4. Click Save.

  5. Restart the DAC Server by selecting Tools, DAC Server Management, Restart DAC Server.

Enabling Informatica Workflows to Be Shared by Multiple Execution Plans

DAC provides two options for enabling Informatica workflows to be shared by multiple execution plans:

  • Defining an execution plan prefix. See "Defining an Execution Plan Prefix".

  • Copying the Informatica workflow folders and creating new folder mappings. See "Copying Informatica Workflow Folders and Creating New Folder Mappings".

For information about the requirements for running execution plans concurrently, see "Execution Plans That Are Eligible to Run Concurrently".

Defining an Execution Plan Prefix

Follow this procedure to define an execution plan prefix, which enables multiple execution plans to use the same Informatica workflows.

To define an execution plan prefix

  1. For each workflow associated with an execution plan that will run concurrently, enable the Informatica workflow property Configure Concurrent Execution:

    1. In Informatica Workflow Manager, open the appropriate workflow in the Workflow Designer.

    2. On the menu bar, select Workflows, and then select Edit.

    3. In the Edit Workflow dialog, select the Enabled check box for the Configure Concurrent Execution property.

    4. Click the Configure Concurrent Execution button.

    5. In the Configure Concurrent Execution dialog, select Allow concurrent run with same instance name.

    6. Click OK.

  2. In the DAC Execute view, go to the Execution Plans tab.

  3. Query for and select the execution plan for which you want to define a prefix.

  4. Click the Execution Parameters subtab.

  5. Click New in the subtab toolbar.

  6. In the Name field, enter DAC_EP_PREFIX.

    Note: This value is case sensitive. You must enter the value exactly as it appears, using upper-case characters.

  7. Click Save in the subtab toolbar.

  8. Leave the default values as Text for the Data Type and Both as the Load Type.

  9. Click in the Value field to open the Enter Parameter Value dialog.

  10. Make sure Static is selected.

  11. In the text box, enter a text string of one to four characters that defines the value of the DAC_EP_PREFIX parameter.

    This value is added as a prefix to the execution plan name. For example, if you define the parameter value as AMER and the execution plan name is Extract_Forecast, then when the execution plan runs, its name will appear in the Current Runs tab and the log files as AMER_Extract_Forecast.

Copying Informatica Workflow Folders and Creating New Folder Mappings

Follow this procedure to copy Informatica workflow folders and create new folder mappings in order to configure Informatica workflows so that they can be shared by multiple execution plans.

To make copies of the Informatica workflow folders and create new folder mappings

  1. In Informatica Workflow Manager, make copies of the SDE and SIL folders. You will need one copy of each folder for each execution plan.

  2. Define the new physical folders in DAC:

    1. In the DAC menu bar, select Tools, Seed Data, Task Physical Folders.

    2. In the Task Physical Folders dialog, click New.

    3. Enter a name and instance priority for the physical folder.

      For information about the instance priority property, see "Task Physical Folder Instance Priority".

      Repeat this step for each copy of the physical folder you made in Informatica Workflow Manager.

    4. Click OK to close the Task Physical Folders dialog.

  3. Define the new logical folders in DAC:

    1. In the DAC menu bar, select Tools, Seed Data, Task Logical Folders.

    2. In the Task Logical Folders dialog, click New.

    3. Enter a name for the logical folder, and click OK.

      Repeat this step for each copy of the logical folder you made in Informatica Workflow Manager.

  4. Map the logical and physical folders:

    1. In the DAC Design view, go to the Source System Folders tab.

    2. In the toolbar, click New.

    3. Click in the Logical Folder column and select the appropriate folder from the drop-down list.

    4. Click in the Physical Folder column and select the appropriate folder from the drop-down list.

    5. Click Save.

      Repeat this step for each new combination of physical and logical folders.

Explicitly Defining Execution Plans as Independent or Dependent

You can override the DAC execution engine to define dependent execution plans as independent and independent execution plans as dependent.

Defining Execution Plans as Independent

You should exercise caution when overriding the DAC execution engine to define two dependent execution plans as independent of each other. You should perform this procedure only if: 1) you have an advanced understanding of the DAC ETL process; and 2) you know the load mappings handle mutually exclusive sets of data across execution plans, and the execution plans do not truncate tables; or 3) you have two execution plans that use time-based dimensions and are only loaded during full loads and not incremental loads.

Caution:

If your incremental ETL process involves dropping and recreating indexes, you should not declare dependent execution plans as independent.

Defining Execution Plans as Dependent

You may want to define independent execution plans as dependent if you have resource issues. For information about potential resource issues when running execution plans concurrently, see "About Resource Usage".

To define execution plans as independent or dependent

  1. In the Execute view, select the Execution Plans tab.

  2. Select the execution plan for which you want to define the dependency attribute of another execution plan.

    For example, if you want to define the execution plan Oracle 11.5.10 Enterprise Sales as independent from the execution plan Procurement and Spend: Oracle 11.5.10, select Procurement and Spend: Oracle 11.5.10 in this step.

  3. Select the Concurrent Dependency subtab.

  4. Locate the execution plan whose dependency attribute you want to set:

    1. In the bottom pane toolbar, click Add/Remove.

    2. Select the execution plan, and click Add.

    3. Click OK to return to the Concurrent Dependency subtab.

  5. In the Concurrent Dependency list, select the appropriate execution plan.

  6. Click in the Type column, and select one of the following:

    • Independent

      Defines the execution plan selected in the Concurrent Dependency subtab as independent of the execution plan selected in the top window.

    • Dependent

      Defines the execution plan selected in the Concurrent Dependency subtab as dependent on the execution plan selected in the top window.

  7. Click Save in the bottom pane toolbar.

Setting Up Extract Delays, Event Delays and Data Source Notifications

This section contains the following topics, which apply to multi-source execution plans:

  • Setting Up Extract Delays

  • Setting Up Event Delays

  • Setting Up Data Source Usage Notifications

Setting Up Extract Delays

You can delay the extract of a source system in a multi-source environment by setting the Delay property in the Connectivity Parameters subtab of the Execution Plans tab. See "Building and Running Execution Plans" for instructions on setting the Delay property.

Setting Up Event Delays

The Event Delay feature applies to multi-source environments. It enables you to configure the extracts for the different data sources to occur independently.

The following are examples of how you might use the Event Delay feature:

  • In a multi-source environment, one data source is not available at the beginning of the ETL. For example, suppose an ETL process extracts CRM data from one source system and financial data from another source system. Nightly maintenance of the CRM system is scheduled for 11:00 PM, and the system is expected to be back up at 11:30 PM. The ETL process is scheduled to begin at 12:00 AM. If the CRM system is still down at the ETL start time, the ETL process will fail immediately because the connection pool to the CRM source will not be created.

    In such situations, you can define an event delay that would allow the tasks that read and write to the financial data source to be executed while the tasks for the CRM source wait until a connection pool can be created. You define the event delay using the Lazy Initialization property, which prevents a connection pool from being created until the connection itself is available. When you enable Lazy Initialization, you also set the following properties:

    • Polling Frequency - Specifies in minutes how often DAC will try to create the connection.

    • Timeout - A period of time (specified in minutes) indicating how long DAC will continue to poll to see if the connection is available.

    • Upon Timeout - The possible action DAC will take if the Timeout period elapses without a connection being available. The options are:

      Fail - DAC will fail all of the tasks that read and write to the unavailable source. The failed tasks are marked as Stopped.

      Skip - DAC will skip the tasks that read and write to the unavailable source. The refresh dates for primary and auxiliary source and target tables will not be persisted. This allows for the next ETL process to pick up the data since the prior ETL event was not successful. The skipped tasks are marked as Completed, and the task details reflect a Not Executed status.

    In this example, the Lazy Initialization property is enabled on the CRM source. The Polling Frequency is set at one minute, and the Timeout period is set to 120 minutes. Therefore, DAC will poll the CRM source every minute for 120 minutes.

    If the connection becomes available within 120 minutes, DAC will create the connection pool and the ETL will proceed. If the Timeout period elapses, and Fail is selected for the Upon Timeout property, the tasks that read and write to the unavailable source will fail. If the Timeout period elapses, and Skip is selected, the tasks will be skipped, and the extract from the financial source will proceed.

  • Specifying a Condition Using SQL. For example, suppose an automated process inserts and updates contact information into a CRM source, but this data load process does not complete before the scheduled ETL process. If the ETL process starts to extract data, only some of the data will be available for extraction, which will result in incomplete data loading into the data warehouse, and, consequently, inaccurate reports.

    In such cases, you can write custom SQL to define a condition that must be met before the ETL process starts; for example, you can define a condition under which the ETL process will not start until all of the records have been inserted or updated. The process that loads the contact information can be enhanced to insert a completed status into a relational table, indicating that the process has finished. This information then signals that the extract from the CRM system can start (see the sketch following this list). Two properties support this approach:

    • Condition SQL - A SQL statement that is executed against the data source to verify whether the data source is ready.

    • Cleanup SQL - A SQL statement that is executed upon the successful occurrence of the event for the data source, after the Condition SQL has been satisfied. This property applies only if a Condition SQL is defined.

You can set up an event delay for situations similar to those described above, or for a combination of them.
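For example, here is a minimal sketch of a Condition SQL and Cleanup SQL pair for the contact-load scenario above, assuming a hypothetical signal table ETL_SIGNAL that the CRM load process populates when it finishes. The table, columns, and status value are illustrative and not part of DAC.

    -- Condition SQL: returns a non-zero count once the CRM load has signaled
    -- completion; a result of 0 means the data source is not ready yet.
    SELECT COUNT(*)
      FROM ETL_SIGNAL
     WHERE PROCESS_NAME = 'CRM_CONTACT_LOAD'
       AND STATUS = 'COMPLETED';

    -- Cleanup SQL: runs after the condition is met, resetting the signal
    -- for the next ETL run.
    DELETE FROM ETL_SIGNAL
     WHERE PROCESS_NAME = 'CRM_CONTACT_LOAD';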

To set up an event delay

  1. Go to the Physical Data Sources tab in the Setup view.

  2. In the top pane, select the physical data source for which you want to define an event delay.

  3. Click the Extended Properties subtab.

  4. Click New in the bottom pane toolbar.

  5. Click in the Name field to expose the drop-down list, and select EventBasedDelay.

  6. Click Save in the toolbar.

  7. Double-click in the Value field to display the Property Value dialog.

  8. Complete the fields in the Property Value dialog using the following information:

    Lazy Initialization: When this check box is selected, DAC prevents a connection pool from being created until the connection itself is available. When you select Lazy Initialization, you also need to set the Timeout, Upon Timeout, and Polling Frequency properties.

    Condition SQL: Click in the field to open a text box where you can enter a SQL statement that DAC executes against the data source to verify whether the data source is ready. The SQL should return a numeric value. A result of 0 indicates the data source is not ready; a non-zero result means the event has occurred.

    Cleanup SQL: A SQL statement that is executed upon the successful occurrence of the event defined in the Condition SQL. This property applies only if a Condition SQL is defined.

    Timeout (min): The period of time (specified in minutes) for which DAC will continue to poll to see whether the connection or source is available.

    Upon Timeout: The action DAC takes if the Timeout period elapses without the connection becoming available or the source being ready. The options are:

    • Fail. DAC will fail all of the tasks that read and write to the unavailable source. The failed tasks are marked as Stopped.

      For example, using the scenario described in the introduction to this section, the financial data extract will occur and the financial-specific load tasks will run, but the CRM data extract will not occur, and load tasks that are specific to both financial and CRM data will not run.

    • Skip. DAC will skip the tasks that read and write to the unavailable source. The refresh dates for primary and auxiliary source and target tables will not be persisted. This allows the next ETL process to pick up the data since the prior ETL event was not successful. The skipped tasks are marked as Completed, and the task details reflect a Not Executed status.

      For example, using the scenario described in the introduction to this section, the tasks related to CRM and to both CRM and financial data will be skipped. Extracts and loads related to the financial source will occur.

    Polling Frequency (min): Specifies in minutes how often DAC will try to create the connection.


  9. Click OK to close the Property Value dialog.

  10. Click Save in the toolbar.

Setting Up Data Source Usage Notifications

The Data Source Usage Notification feature enables you to initiate email notifications about data source usage and to define custom SQL to be executed based on the usage.

The data source usage categories about which DAC will send notifications are the following:

  • Notification Before First Read

  • Notification Before First Write

  • Notification Before First Read or Write

  • Notification After Last Read

  • Notification After Last Write

  • Notification After Last Read and Write

The email notification contains the execution plan name, the ETL run name and timestamp, the data source name, and the data source usage category. The recipients of the email are the email accounts set up in the procedure "Setting Up Email Notifications in the DAC Client and Server".

When you define the data source usage notification properties, you can also define custom SQL that will be executed upon the occurrence of the specified data source usage category. In the custom SQL, you can use the DAC variable @DAC_NOTIFICATION_RECORD in a parameter.
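For example, the custom SQL might record the event in an audit table. This is a hedged sketch assuming a hypothetical table ETL_USAGE_AUDIT, and assuming the @DAC_NOTIFICATION_RECORD variable can be referenced directly in the SQL text; only the variable name comes from this guide.

    -- Hypothetical audit insert executed when the selected data source usage
    -- event (for example, Notification After Last Write) occurs.
    INSERT INTO ETL_USAGE_AUDIT (EVENT_DETAIL, EVENT_TIME)
    VALUES (@DAC_NOTIFICATION_RECORD, CURRENT_TIMESTAMP);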

To set up a notification

  1. Go to the Physical Data Sources tab in the Setup view.

  2. In the top pane, select the physical data source for which you want to set up the notification.

  3. Click the Extended Properties subtab.

  4. Click New in the bottom pane toolbar.

  5. Click in the Name field to expose the drop-down list, and select the appropriate data source usage category.

  6. Click Save in the toolbar.

  7. Double-click in the Value field to display the Property Value dialog.

  8. Complete the fields in the Property Value dialog using the following information:

    Send Email: Select this check box to indicate an email should be sent when the selected operation occurs. Email recipients with a Notification Level of 5 or higher will receive the email. The DAC email administrator account and email recipients must be set up before DAC can send data source usage notifications. For instructions, see "Setting Up Email Notifications in the DAC Client and Server".

    Execute SQL: Click in the field to open a text box where you can enter a SQL statement that DAC will execute against the data source when the selected operation occurs.

    Is Stored Procedure: Select this check box if the custom SQL is a stored procedure.


  9. Click OK to close the Property Value dialog.

  10. Click Save in the toolbar.

How the DAC Server Handles Requests to Start and Stop Execution Plans

Starting an Execution Plan

When the DAC Server receives a request to start an execution plan, it performs a series of checks to verify that the execution plan can be started. It first checks that an execution plan with the requested name exists and that the execution plan is active.

Next, it checks the status of the execution plan that last ran. If an execution plan is still running and the DAC Server receives a request to start the same execution plan, the request will be rejected. If an execution plan failed, a request to run the same execution plan again will be executed to restart the failed execution plan.

However, a request to run a different execution plan will be put on hold if the execution plan has dependencies on the failed execution plan (dependencies such as having the same source and target tables). If the execution plans are mutually exclusive, the new request will result in a new execution plan run.

When the DAC Server receives a request to start an execution plan, it will issue a warning if any of the following conditions are true. (A warning is for informational purposes and does not mean the execution plan will not start.)

Stopping an Execution Plan

When the DAC Server receives a request to stop the operation of a running execution plan, the request will fail in the following cases: