Oracle® Fusion Middleware User's Guide for Oracle Business Intelligence Data Warehouse Administration Console
11g Release 1 (11.1.1)

Part Number E14849-02

5 DAC Quick Start

This chapter provides the essential information you need to get started using DAC to run ETL processes.

Note:

Before you can perform the operations in this chapter, you need to have completed certain administrative setup tasks. If you installed and set up Oracle BI Applications by following the steps in Oracle Fusion Middleware Configuration Guide for Oracle Business Intelligence Applications, you have already completed the necessary tasks, and you are ready to begin running ETL processes, as described in this chapter.

In this guide, Chapter 4, "Setup Tasks and Concepts for DAC Administrators," describes the administrative tasks that must be performed for a DAC environment to be operable, including the key configurations you must complete before running an ETL process.

Before running ETL processes, you should also be familiar with the concepts discussed in "About Source System Containers".

This chapter contains the following topics:

  • Logging into DAC for the First Time as a DAC User

  • Running an Execution Plan

  • Creating or Copying a Source System Container

  • Monitoring Execution Plan Processes

  • Scheduling an Execution Plan

  • Unit Testing Execution Plan Tasks

  • About Refresh Dates and DAC's Incremental Load Strategy

Logging into DAC for the First Time as a DAC User

To perform this procedure you must have a DAC user account and you must know the location of the authentication file that you will use to connect to the DAC repository.

If you are a DAC administrator and are logging into DAC for the first time, see "Logging Into DAC for the First Time as an Administrator".

When you log into DAC for the first time as a DAC user, you configure a connection to the DAC repository. DAC stores this connection information for subsequent logins.

To log into DAC for the first time as a DAC user

  1. Start the DAC Client by navigating to the <Domain_Home>\dac directory and double-clicking the startclient.bat file.

    The Login... dialog is displayed.

  2. Click Configure.

  3. In the Configuring... dialog, select Create Connection, and then click Next.

  4. Enter the appropriate connection information:

    Name. Enter a unique name for the connection to the DAC repository.

    Connection type. Select the type of database in which the DAC repository will be stored.

    Connection String, Database name, TNS Name, or Service Name. Enter the database name or database account name of the DAC repository. If you are using:

    • Oracle (OCI8), use the tnsnames entry.

    • Oracle (Thin), use the service name.

    • SQL Server, use the database name.

    • DB2-UDB, use the connect string as defined in the DB2 configuration.

    Database Host. Enter the name of the machine where the DAC repository resides.

    Database Port. Enter the port number on which the database listens. The default port for an Oracle database is 1521. The default port for a SQL Server database is 1433.

    DB URL. (Optional) Use this option to specify a unique URL for this connection.

    DB Driver. (Optional) Use this option to specify a unique driver for this particular connection.

    Authentication File. Click in this field to select an existing authentication file. Proceed to step 5 for detailed instructions.

    SSL Trust Store File. (Optional) For deployments in Web mode, the location of the SSL Trust Store file.

    SSL JKS Password File. (Optional) For deployments in Web mode, the location of the SSL JKS password file.

    Log Level. Specifies a client log level for the client session. The logs are saved in <Domain_Home>\dac\log\client <client's logical connection name>.

    (Hypothetical example values for these fields appear after this procedure.)

  5. To select an existing authentication file, do the following:

    1. Click in the Authentication File field.

    2. In the Authentication File dialog, select Choose existing authentication file, and click OK.

    3. Navigate to the appropriate directory, and select the appropriate cwallet.sso file. Click OK.

    4. In the Configuring... dialog, click Test Connection to confirm the connection works.

    5. Click Apply, and then click Finish.

      Note:

      A DAC administrator must distribute this authentication file to all user accounts that need to access this DAC repository.
  6. Click Apply, and then click Finish.

  7. To log in using Web mode, do the following:

    1. In the Login... dialog, select the appropriate Connection from the drop-down list.

    2. Enter your User Name and Password.

      This must match the user name and password stored in the WebLogic Server identity store.

    3. Select FMW as the Authentication Type.

    4. If you want DAC to remember the password for this connection, select Remember Password.

    5. Click Login.

      The DAC Client is displayed.

  8. To log in using DAC standalone mode, do the following:

    1. In the Login... dialog, select the appropriate Connection from the drop-down list.

    2. Enter your DAC user account User Name and Password.

    3. Select DAC as the Authentication Type.

    4. If you want DAC to remember the password for this connection, select Remember Password.

    5. Click Login.

  9. In the Starting tab dialog, specify the initial view and tab that you want to appear each time you log in, or leave the default. Click OK.
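For reference, the following sketch shows one plausible set of values for the connection fields in step 4, using an Oracle (Thin) repository connection. This is a hypothetical illustration only; every name, host, and path is an example, and the JDBC URL simply follows the standard Oracle thin-driver format.

    # Hypothetical example values for an Oracle (Thin) DAC repository
    # connection. All names, hosts, and paths are illustrative.
    connection = {
        "Name": "DAC_Repository_Dev",
        "Connection type": "Oracle (Thin)",
        "Service Name": "orcl.example.com",
        "Database Host": "dbhost.example.com",
        "Database Port": 1521,  # Oracle default port
        # Optional override; normally left blank (see the DB URL field
        # description above). The format is the standard Oracle thin JDBC URL.
        "DB URL": "jdbc:oracle:thin:@//dbhost.example.com:1521/orcl.example.com",
        "Authentication File": r"C:\domain_home\dac\cwallet.sso",  # site-specific
    }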

Running an Execution Plan

An execution plan is an ETL process that comprises one or more subject areas, connectivity parameters for the physical source and target data sources, and pre- and post-ETL tasks. Before an execution plan runs, DAC checks the connections to the data sources and establishes connections to the ETL tool.

Predefined execution plans are available with an Oracle BI Applications installation. You can run predefined execution plans from the predefined container if you do not make any changes to the predefined source system container.

Predefined execution plans should be run mainly for proof-of-concept purposes. After running a predefined execution plan, you can then review the predefined dashboards and reports in order to perform a gap analysis that will help you understand what customizations are required for your organization. With this knowledge, you can then create a copy of the source system container and modify the predefined execution plan or create a new one by selecting the appropriate subject areas for your ongoing ETL requirements.

Note: You cannot change any of the metadata in a predefined container. You must first make a copy of the container, and then modify objects within the custom container. See "Creating or Copying a Source System Container" for instructions.

For more detailed information about execution plans, see "Building and Running Execution Plans".

To run a predefined execution plan

  1. Make sure the DAC Server is running. For instructions on starting the DAC Server, see "Starting and Stopping the DAC Server (Standalone Mode)".

  2. In the Execute view, go to the Execution Plans tab.

  3. Select the execution plan you want to run.

  4. Start the execution plan:

    1. Click Run Now.

    2. In the Starting ETL dialog, click Yes.

      Note: The Honor Time Delays property determines whether the Delay property in the Connectivity Parameters subtab will be active. The Delay property indicates how many minutes the extract of a data source will be delayed after the first extract of a multi-source extract process has started. For information about the Delay property, see "Setting Up Extract Delays". (A sketch illustrating the Delay semantics follows this procedure.)

    Once the execution plan starts running you can monitor its progress in the Current Runs tab. See "Monitoring Execution Plan Processes" for instructions.
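To make the Delay property concrete, the following is a minimal sketch (a hypothetical illustration, not DAC code) of how a per-source delay postpones each source's extract relative to the first extract of a multi-source run:

    from datetime import datetime, timedelta

    # Each source's extract is postponed by its configured Delay (minutes)
    # after the first extract of the multi-source run starts. The source
    # names and delay values below are examples.
    first_extract_start = datetime(2011, 1, 1, 22, 0)
    delays_minutes = {"CRM": 0, "ERP": 30}

    for source, delay in delays_minutes.items():
        eligible = first_extract_start + timedelta(minutes=delay)
        print(f"{source} extract eligible at {eligible:%Y-%m-%d %H:%M}")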

To run a customized execution plan

  1. Create a custom container based on the existing predefined container. See "Creating or Copying a Source System Container" for instructions.

  2. In the Execute view, go to the Execution Plans tab, and click New.

  3. Enter a name for the execution plan, and click Save.

  4. Associate one or more subject areas with the execution plan.

    1. Click the Subject Areas subtab.

    2. Click Add/Remove in the bottom pane toolbar.

    3. In the Choose Subject Areas dialog, select from the drop-down list the custom container that you created in step 1.

    4. Query for the subject area you want to associate with the execution plan.

    5. Select the subject area and click Add.

      You can repeat this process to associate multiple subject areas from any available source system container with an execution plan.

    6. Click OK to close the window.

  5. Generate the runtime connectivity parameters.

    1. Click the Connectivity Parameters subtab, and then click Generate in the bottom pane toolbar.

      The Generating Parameters... dialog lists the containers that are involved in this execution plan.

    2. In the Generating Parameters dialog, enter the number of copies for each source system container. Unless you are running a multi-source execution plan, enter the value 1, and then click OK.

      DAC automatically generates the parameters required for the source system container.

    3. In the Value column, for each folder or database connection, select the appropriate physical folder name or database connection.

      Note:

      For the data source type of FlatFileConnection, make sure you have copied all files into the directory specified in the DAC system property InformaticaParameterFileLocation.
    4. Leave the default values in the Delay and Prune Time fields.

      For a description of all the fields in this subtab, see "Execution Plans Tab: Connectivity Parameters Subtab".

  6. (Optional) Add one or more tasks that will run before the ordered tasks generated by DAC.

    1. Click the Preceding Tasks subtab, and then click Add/Remove.

    2. Select the appropriate container from the drop-down list.

    3. Query for and select the task you want to add, and then click Add. Note: Only tasks that have the task phase Pre ETL Process will appear in this list.

  7. (Optional) Add one or more tasks that will run after the ordered tasks generated by DAC.

    1. Click the Following Tasks subtab, and then click Add/Remove.

    2. Select the appropriate container from the drop-down list.

    3. Query for and select the task you want to add, and then click Add. Note: Only tasks that have the task phase Post ETL Process will appear in this list.

  8. In the top pane of the Execution Plans tab, make sure the new execution plan is highlighted, and click Build.

  9. In the Building... dialog, select the option Selected Record Only to build only the selected execution plan.

  10. Review the ordered tasks for the execution plan:

    1. Click the Ordered Tasks subtab.

    2. Click Refresh in the bottom pane toolbar to populate the list of ordered tasks.

    3. To view details about a specific task, select the task, and then click Details in the bottom pane toolbar.

      For instructions on unit testing a task, see "Unit Testing Execution Plan Tasks".

  11. Make sure the DAC Server is running. For instructions on starting the DAC Server, see "Starting and Stopping the DAC Server (Standalone Mode)".

  12. Start the execution plan:

    1. Click Run Now.

    2. In the Starting ETL dialog, click Yes.

    Once the execution plan starts running, you can monitor its progress in the Current Runs tab. For instructions, see "Monitoring Execution Plan Processes".

    For information about how refresh dates are tracked, see "About Refresh Dates and DAC's Incremental Load Strategy".

    To schedule an execution plan, see "Scheduling an Execution Plan".

Creating or Copying a Source System Container

In DAC, the metadata for a source system is held in a container. Included with Oracle BI Applications are predefined containers that hold metadata specific for the supported source systems. An important feature of DAC is that the metadata for predefined containers cannot be changed. To customize the metadata in a predefined container, you must first make a copy of the container. DAC keeps track of all customizations in the copied container, so that at any time you can find the newly created objects and modified objects, as well as the original objects. DAC is also able to compare the modified objects with the predefined object definitions in the predefined container. This feature enables you to selectively rollback changes to the objects if necessary.

For information about managing object ownership, see "Ownership Right-Click Commands".

You can also create a new, empty container if you want to build your own container with customized metadata.

To create a new container or copy an existing container

  1. In the DAC menu bar, select File, and then New Source System Container.

  2. Enter an ID and a Name for the container.

    The ID and Name fields are alphanumeric. The Name can contain spaces but the ID cannot. The Name field must be at least five characters long. It is strongly recommended that you use a descriptive Name and ID. For example:

    • Name: Customized Fusion V1

    • ID: CUST_FUSN_V1

  3. Select one of the following:

    • Create Empty New Source System Container

    • Create as a Copy of Source System Container

  4. If you are creating an empty, new container, click OK.

  5. If you are making a copy of an existing container, select the existing container from the drop-down list, and then click OK.

    Note: All records in the base container will be referenced to the newly created custom container, and parent-child relationships between the containers will be established.

Monitoring Execution Plan Processes

The Current Runs tab in the Execute view provides a number of predefined reports that enable you to monitor execution plan processes and diagnose errors and performance issues.

The Current Runs tab lists all execution plans that are currently running or that have not yet completed. For each execution plan the tab displays important information about the ETL process.

The Tasks and Task Details subtabs of the Current Runs tab provide detailed information about the status of tasks associated with the execution plan.

This section contains the following topics:

  • Generating Run Reports

  • Viewing the Life Span of a Task

  • Identifying Why a Task Failed

Generating Run Reports

Follow this procedure to generate reports for execution plans that are running or have not yet completed.

To generate run reports

  1. Go to the Current Runs tab in the Execute view.

  2. Right-click and select Get Run Information.

  3. The following options are available:

    • Get Log File. Saves a log file in the log\statistics directory.

    • Analyze Run. Saves a description of the run in HTML format in the <Domain_Home>\dac\log\statistics directory. For the period of the run, the file summarizes the number of queued, running, runnable, completed, failed, and stopped tasks. You can review this report to determine whether any tasks are creating performance issues that could be resolved to reduce the overall ETL process time.

    • Get Task Gantt Chart. Displays a Gantt chart in a separate window showing all tasks based on the start and end times. This chart can be useful to identify any long running ETL phases or to summarize the ETL run.

    • Get Phase Gantt Chart. Displays a Gantt chart in a separate window showing all the execution plan phases based on the start and end times. This chart can be useful to identify any long running execution plan phases or to summarize the ETL run.

    • Get Run Graph. Displays a graph showing how the number of tasks belonging to different statuses varies across the ETL run.

    • Statistics. Displays the number of times the execution plan attempted to run, the time from the start of the first attempt until the end of the last attempt, and the cumulative running time of all attempts.

Viewing the Life Span of a Task

Once an execution plan is started, the life span of a task consists of the following statuses. You can view the status and other detailed information about the task in the Tasks and Task Details subtabs of the Current Runs tab. The list of tasks in these subtabs is color coded based on the status.

  1. Queued. Task is waiting for one or more predecessor tasks to complete. Appears as light yellow.

  2. Runnable. Task is waiting on a resource token to be available. All predecessor tasks have completed. Appears as yellow.

  3. Waiting. Task is eligible to run but is waiting because a time delay for the source was defined in the connectivity parameters or because an event delay was defined at the data source level (see "Setting Up Extract Delays, Event Delays and Data Source Notifications"). Appears as white.

  4. Running. Task obtained a resource token and has started running. Appears as blue.

  5. Paused. Task is a member of a task group and has paused until the child tasks of the task group are completed. Appears as blue.

  6. Completed. All task details of the task have executed successfully. Appears as green.

  7. Failed. One or more of the task details of the task have failed. Appears as red.

  8. Stopped. Task has stopped because one or more predecessor tasks failed. Appears as ochre.

  9. Not Executed. Task has a heuristics definition that prevents it from running (see "Using Heuristics to Manage Tasks, Tables and Indexes"). Appears as white.

In general, when an execution plan starts, the first tasks to run are those with no dependencies (or predecessors). A task becomes "Runnable" if all of its predecessor tasks are completed. When all of the tasks of an execution plan are completed, the status of the execution plan becomes "Completed."
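For reference, the statuses and display colors listed above can be summarized in a small lookup table. This is a hypothetical sketch for illustration only, not a DAC API:

    # Task life-span statuses and the colors used in the Tasks and
    # Task Details subtabs of the Current Runs tab.
    TASK_STATUS_COLORS = {
        "Queued":       "light yellow",  # waiting on predecessor tasks
        "Runnable":     "yellow",        # predecessors done; awaiting a resource token
        "Waiting":      "white",         # time or event delay in effect
        "Running":      "blue",          # resource token obtained
        "Paused":       "blue",          # task group's child tasks still running
        "Completed":    "green",         # all task details succeeded
        "Failed":       "red",           # one or more task details failed
        "Stopped":      "ochre",         # a predecessor task failed
        "Not Executed": "white",         # prevented by a heuristics definition
    }

    print(TASK_STATUS_COLORS["Failed"])  # red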

Identifying Why a Task Failed

When one or more tasks in an execution plan fail, the status of the execution plan becomes "Failed."

To identify why a task failed

  1. In the Execute view, go to the Current Runs tab.

  2. Click the Tasks subtab.

  3. In the drop-down list on the right side of the subtab toolbar, select Failed.

    The failed tasks are displayed.

  4. Select a task, and click Details.

    Information about the task details is displayed. (Task details are the building blocks of the task.) The Status Description field provides extensive information about the cause of failure of a task detail. After you have understood and fixed the problem, you can restart the execution plan. DAC will attempt to re-execute the failed task.

    If you fixed the problem and executed the task outside of DAC (for example, by running the task in Informatica), within DAC you can set the task details status to "Completed." Then, when you re-run the execution plan, this task will be ignored.

    See "Restarting an Execution Plan That Failed" for more information.

Scheduling an Execution Plan

Follow this procedure to schedule an execution plan.

To schedule an execution plan

  1. Go to the Scheduler tab in the Execute view.

    The current list of schedules is displayed in the top pane.

  2. Click New in the top pane toolbar.

    The Edit tab in the bottom pane becomes active.

  3. Enter a name for the schedule.

  4. Select an execution plan from the drop-down list.

  5. Do one of the following:

    • If you want the schedule to run once, select the Run Only Once check box, and then select a start date.

    • To create a periodic schedule, select a recurrence pattern, and enter the appropriate date and time parameters. (A sketch of what a periodic schedule means follows this procedure.)

  6. Click Save.
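As a rough model of what a periodic schedule implies (the sketch below is hypothetical; DAC's scheduler is configured entirely through the dialog described above), a daily recurrence pattern simply yields a run at the same time each day from the start date onward:

    from datetime import datetime, timedelta

    # A daily recurrence starting January 1, 2011 at 10:00 PM: the
    # schedule fires at the same time each subsequent day.
    start = datetime(2011, 1, 1, 22, 0)
    for n in range(3):
        print((start + timedelta(days=n)).strftime("%Y-%m-%d %H:%M"))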

Unit Testing Execution Plan Tasks

You can test how DAC will execute an individual task (and its details) without running a fully functional execution plan. You can test any task regardless of its position in the dependency graph. For example, you can test a task at depth 10, without running the execution plan. You can also test tasks that belong to single-source or multi-source execution plans. From the Unit Test dialog, you can execute the task after reviewing the task details. The unit test occurs within the DAC Client; the command is not sent to the DAC Server.

Note: When you test a task, DAC generates the required parameter files and issues the proper pmcmd command to execute the appropriate workflow. You cannot re-run the corresponding workflow in Informatica because DAC generates parameter files and log files using variable names.
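As a rough picture of what happens during a unit test, the sketch below shows an Informatica workflow being started with pmcmd and a generated parameter file. This is a hypothetical illustration; the exact command line DAC builds is internal to DAC, and the service, domain, credential, folder, file, and workflow names here are examples only.

    import subprocess

    # Start an Informatica workflow with a generated parameter file,
    # in the spirit of the pmcmd command DAC issues. All names below
    # are illustrative.
    cmd = [
        "pmcmd", "startworkflow",
        "-sv", "INT_SERVICE",
        "-d", "Domain_Example",
        "-u", "infa_user",
        "-p", "infa_password",
        "-f", "SDE_ORA_Folder",
        "-paramfile", r"C:\domain_home\dac\Informatica\parameters\example.txt",
        "-wait",
        "SDE_ORA_ExampleWorkflow",
    ]
    subprocess.run(cmd, check=True)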

To unit test an execution plan task

  1. Go to the Execution Plans tab in the Execute view, and select the appropriate execution plan.

  2. Click the Ordered Tasks subtab.

  3. Query for and select the task that you want to test.

  4. Click Unit Test in the toolbar.

    The Unit Test dialog displays the steps that DAC will carry out if the task is executed.

  5. To execute the task, click Execute in the Unit Test dialog.

    Note: The DAC system property Dryrun must be set to False for the task to run.

About Refresh Dates and DAC's Incremental Load Strategy

In DAC, refresh dates refer to the timestamp on the source and target tables that are associated with an execution plan. DAC uses refresh dates (indicated in the Refresh Dates subtab of the Physical Data Sources tab) to determine whether to issue full or incremental load commands. DAC automatically populates and updates refresh dates for all tables involved in the ETL process. DAC stores refresh dates for the appropriate tables by data source.

DAC tracks refresh dates for primary source and auxiliary tables and primary target tables on tasks in a successful, completed run of an execution plan. Note that DAC updates refresh dates only when the ETL process completed successfully.

DAC runs the full load command for a task if the refresh date against the table is null. When there are multiple primary source or auxiliary tables, the earliest of the refresh dates triggers a full or incremental load. If any one of the primary source or auxiliary tables has no refresh date, DAC runs the full load command. The source refresh date is the minimum of the refresh dates across both the source and target tables; the target refresh date is the minimum of the refresh dates of the target tables only.
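A minimal sketch of this decision logic, consistent with Table 5-1 below (a hypothetical illustration, not DAC's actual implementation):

    from datetime import datetime

    def load_decision(source_dates, target_dates):
        """Full-vs-incremental decision from refresh dates: a null refresh
        date on any primary source, auxiliary, or target table forces a
        full load; otherwise the earliest refresh date is the incremental
        baseline. The target is truncated only when a target refresh date
        is null (Table 5-1, scenarios 1 and 3)."""
        if any(d is None for d in source_dates + target_dates):
            truncate = any(d is None for d in target_dates)
            return "Full Load", truncate, None
        baseline = min(source_dates + target_dates)  # the "source refresh date"
        return "Incremental Load", False, baseline

    # Scenario 2 from Table 5-1: null source date, non-null target date.
    print(load_decision([None], [datetime(2011, 1, 1, 22, 0)]))
    # -> ('Full Load', False, None)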

At the successful end of the ETL process, DAC updates the refresh dates for all the source (primary and auxiliary) and target tables with the actual ETL start time as the refresh date. Even if the ETL process fails multiple times, the timestamp of the first attempt of the ETL process is used as the last refresh timestamp.

For example, suppose an ETL process started on January 1, 2011 at 10:00 PM Central time and ran for one hour before some tasks failed, causing the ETL process to fail. The ETL process was then restarted at 8:00 AM on January 2, 2011, and completed successfully. At the end of the ETL process, DAC updates the refresh dates for all primary/auxiliary source and target tables that participated in the ETL process to January 1, 2011, 10:00 PM, adjusted to the time zone of the transactional system. This means that if the transactional system from which the data is sourced is in the Pacific time zone (-2 hours), the tables for the source database will have a refresh timestamp of January 1, 2011, 8:00 PM. And if the data warehouse is in the Eastern time zone (+1 hour), the data warehouse tables will have a refresh timestamp of January 1, 2011, 11:00 PM.

Note: If an extract task fails and is restarted any number of times, the last_extract_timestamp value provided by DAC will always be the same as the last ETL process start time. The last_extract_timestamp is not the last time the task was attempted. This behavior is an intentional design feature in DAC. If a task fails N number of times, the value of this variable should not fluctuate, because there could be a potential for data loss.

Refresh Date Scenarios

Table 5-1 shows the possible scenarios regarding refresh dates and load and truncate commands.

Table 5-1 Refresh Date Scenarios

Scenario   Source Refresh Date (Primary and Auxiliary Only)   Target Refresh Date   Command DAC Will Use   Truncate Target Table?
1          Null                                               Null                  Full Load              Yes
2          Null                                               Not Null              Full Load              No
3          Not Null                                           Null                  Full Load              Yes
4          Not Null                                           Not Null              Incremental Load       No


Note: Refresh dates for micro ETL execution plans are kept separate from the refresh dates for regular execution plans. DAC saves refresh dates for micro ETL execution plans in the Micro ETL Refresh Dates subtab of the Execution Plans tab (Execute view) and refresh dates for regular execution plans in the Refresh Dates subtab of the Physical Data Sources tab (Setup view).

Adding or Resetting Refresh Dates

You can manually add refresh dates to tables in the DAC repository or reset refresh dates before running an execution plan. The Add Refresh Dates command in the Execution Plans tab right-click menu prepopulates records with the appropriate table name in the Refresh Dates subtab of the Physical Data Sources tab. After you execute the Add Refresh Dates right-click command, you can manually add refresh dates to the prepopulated records.

How DAC Computes Timestamps for Refresh Dates

The baseline timestamp for a refresh date is the ETL process start time of the DAC Server host machine. DAC computes and uses timestamps in the following ways:

  • For timestamps that appear in the Run History tab of the Execute view, DAC uses the ETL start time of the DAC Server host.

  • For refresh dates, which are stored in the Refresh Dates subtab of the Physical Data Sources tab in the Setup view, DAC stores the timestamp of the data source database.

    For example, an ETL process is started on 1/1/2011 at 10:00 PM Central standard time. It runs for an hour, some tasks fail, and, therefore, the ETL process fails. The ETL process is restarted on 1/2/2011 at 8:00 AM and this time completes successfully. At the end of the ETL process, DAC updates the refresh dates for all the primary and auxiliary source tables and the target tables that participated in the ETL process to 1/1/2011 10:00 PM Central standard time, adjusted for the time zones of the transactional system and the data warehouse. In this example, the data is sourced in Pacific standard time (-2 hours), so the timestamp of the source tables is 1/1/2011 8:00 PM. The data warehouse is in Eastern standard time (+1 hour), so the timestamp of the target tables is 1/1/2011 11:00 PM.

  • For flat file sources, DAC uses the timestamp of the DAC Server host.

  • For incremental ETL processes, the ETL logic uses two types of timestamps:

    • LAST_REFRESH_DATE. The last time data was extracted from the table.

    • PRUNED_LAST_REFRESH_DATE. The last refresh timestamp minus the prune date period, to ensure that no records are missed from previous extracts.
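The time zone arithmetic described above, together with the prune adjustment used for PRUNED_LAST_REFRESH_DATE, can be sketched as follows. This is a hypothetical illustration using Python's zoneinfo module; the time zone names and the 30-minute prune period are example values (the prune period corresponds to the Prune Time field in the Connectivity Parameters subtab).

    from datetime import datetime, timedelta
    from zoneinfo import ZoneInfo

    # ETL start time as recorded on the DAC Server host (Central time,
    # per the example above).
    etl_start = datetime(2011, 1, 1, 22, 0, tzinfo=ZoneInfo("America/Chicago"))

    # Refresh dates are stored in each database's own time zone.
    source_refresh = etl_start.astimezone(ZoneInfo("America/Los_Angeles"))
    target_refresh = etl_start.astimezone(ZoneInfo("America/New_York"))
    print(source_refresh.strftime("%Y-%m-%d %H:%M"))  # 2011-01-01 20:00 (Pacific)
    print(target_refresh.strftime("%Y-%m-%d %H:%M"))  # 2011-01-01 23:00 (Eastern)

    # PRUNED_LAST_REFRESH_DATE: the last refresh date minus the prune
    # period, so records near the boundary are not missed.
    prune_period = timedelta(minutes=30)
    pruned_last_refresh = source_refresh - prune_period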