11 Common Tasks Performed in the DAC

This chapter provides instructions for performing common DAC tasks.

This chapter contains the following topics:

Importing DAC Metadata
Exporting DAC Metadata
Distributing DAC Metadata
Running the DAC Server Automatically
Command Line Access to the DAC Server
DAC Repository Command Line Options
Replacing an Informatica Workflow with a Custom SQL File
Determining the Informatica Server Maximum Sessions Parameter Setting
Determining the Number of Transactional and Data Warehouse Database Connections
Running Two DAC Servers on the Same Machine
Customizing Index and Analyze Table Syntaxes
Overview of Change Capture Process (Siebel Sources Only)
Using the Change Capture Filter
Tracking Deleted Records
Pointing Multiple Informatica PowerCenter Services to a Single Informatica Repository
Handling ETL Failures with the DAC
Forcing Password Encryption When DAC Interacts With Informatica
Upgrading the Data Warehouse Schema on Teradata
Using JDBC Connection URL Override to Handle Database Connections
Using Parameters to Specify Full or Incremental Loads
Mapping Multiple Database-Specific Informatica Workflows to the Same DAC Task
Connecting to the DAC Repository When Using Oracle RAC

Importing DAC Metadata

The DAC's Import/Export feature enables you to import or export source system-specific DAC metadata into or out of the DAC repository. You can use this feature to migrate DAC metadata from one environment to another, such as from a development environment to test or production environments.

To import DAC metadata

In the DAC menu bar, select Tools, then select DAC Repository Management, then select Import.
Select the directory from which you want to import DAC metadata, or accept the default directory.
Select the appropriate source system containers.
Select the appropriate categories of metadata you want to import:
- Logical. Imports all information contained in the Design view and database connection information.
- System. Imports all information contained in the Setup view, except passwords for servers and database connections.
- Run Time. Imports information about ETL runs (contained in the Execute view)
If you are importing metadata into a blank repository or to completely replace the current metadata in the repository, select Truncate Repository Tables.

This action overwrites the content in the current repository. Selecting the Truncate Repository Tables option greatly increases the speed of the import process.
(Optional) Select Enable Batch Mode to insert the imported metadata into the repository as an array insert.

This action increases the speed of the import process.
Click OK.
Verify the import process by reviewing the log file \OracleBI\DAC\log\import.log.

Exporting DAC Metadata

To export DAC metadata

In the DAC menu bar, select Tools, then select DAC Repository Management, then select Export.
Select the directory to which you want to export DAC metadata, or accept the default directory.
Select the appropriate source system containers.
Select the appropriate categories of metadata you want to export:
- Logical. Exports all information contained in the Design view and database connection information.
- System. Exports all information contained in the Setup view, except passwords for servers and database connections.
- Run Time. Exports information about ETL runs (contained in the Execute view).
Click OK.
Verify the export process by reviewing the log file \OracleBI\DAC\log\export.log.

Distributing DAC Metadata

Typically, you may have multiple environments, such as development, QA, production, and so on. When you make changes to the development environment, you test it, and then deliver it, exporting the whole environment and distributing it to the other environments. The data is exported as XML files, which are stored in the DAC\export directory on the client machine where the export is done.

To apply changes from the development environment to any other, you copy all of the XML files into the DAC\export folder and then import the data. To export the DAC metadata, follow the instructions in the procedure, "Exporting DAC Metadata". To import the DAC metadata, follow the instructions in the procedure, "Importing DAC Metadata".

Running the DAC Server Automatically

Follow this procedure to set up the DAC server to be run automatically when your machine reboots.

To set up the DAC server to run automatically upon rebooting the machine

Navigate to Programs, then select Accessories, then select System Tools, then select Scheduled Tasks.
Double-click Add Scheduled Task.
In the Scheduled Task Wizard, browse to the startserver.bat file, and click Open.
Select the option "When my computer starts," and click Next.
Enter the domain user account to start the DAC server and a password, and click Finish.

The startserver task appears in the Scheduled Task window.
Right-click the task and select Properties.
In the Settings tab, remove the check from the "Stop the task if it runs for 72 hours" check box.

To start the DAC server as a scheduled task

Navigate to Programs, then select Accessories, then select System Tools, then select Scheduled Tasks.
Right-click startserver, and then click Run.

To stop the DAC server as a scheduled task

Navigate to Programs, then select Accessories, then select System Tools, then select Scheduled Tasks.
Right-click startserver, and then click End Task.

To check if the DAC server is running

Navigate to Programs, then select Accessories, then select System Tools, then select Scheduled Tasks.
Select the startserver task.
In the Windows menu bar, select View, then select Details.

Command Line Access to the DAC Server

This section covers accessing the DAC server through a command line. It includes the following topics:

Setting Up Command Line Access to the DAC Server
Using the Command Line to Access the DAC Server

You can access the DAC server through a command line to start and stop execution plans and to get status information for servers, databases, and execution plans. This feature enables you to access the DAC server using third-party administration or management tools, without using the DAC client.

Refer to the file daccommandline.bat/.sh for usage.

Command Line Operations

The command line feature enables you to start an execution plan and stop the operation of a running execution plan.

Starting an Execution Plan

When the DAC server receives a request to start an execution plan, it performs a series of checks to verify that the execution plan can be started. It first checks that an execution plan with the requested name exists and that the execution plan is active. Next, it checks the status of the execution plan that last ran. If an execution plan is still running and the DAC server receives a request to start another execution plan, the request will be rejected. If an execution plan failed, a request to run the same execution plan again will be executed; however, a request to run a different execution plan will be rejected. If the execution plan that last ran completed successfully, a request to run a new execution plan will be executed.

When the DAC server receives a request to start an execution plan, it will issue a warning if any of the following conditions are true. (A warning is for informational purposes and does not mean the execution plan will not start.)

The Generic Task Concurrency Limit value in the DAC System Properties tab is not a positive number.
There are no active Informatica PowerCenter Services or Integration Service registered in the Informatica Servers tab.
One or more Informatica PowerCenter Services or Integration Service do not have the passwords defined in the Informatica Servers tab.
One or more Informatica PowerCenter Services or Integration Service do not have a Maximum Sessions number properly defined in the Informatica Servers tab.
One or more data sources do not have the Table Owner or Table Owner Password values properly defined in the Physical Data Sources tab.
One or more data sources do not have a maximum number of connections (Max Num Connections) value properly defined in the Physical Data Sources tab.
One or more data sources do not have a Data Source Number defined in the Physical Data Sources tab.

Stopping the Operation of a Running Execution Plan

When the DAC server receives a request to stop the operation of a running execution plan, the request will fail in the following cases:

The name of the execution plan that is running is different from the name in the request.
There is no execution plan currently running.

Command Line Status Monitoring Queries

The command line feature enables you to get the following status information:

Summary of the requested execution plan. If there are multiple instances of the same execution plan, a summary of the instance that last ran is returned. Below is an example of the information contained in the summary.

(c) 2003 Siebel Systems, Inc.
Siebel DAC Server comprising the etl execution-management,  scheduler, logger, and network server.
ETL details for the last run:
ETL Process Id : 255 ETL Name : Complete ETL Run Name : DRY RUN OF Complete ETL: ETL Run - 2004-06-17 18:30:13.201 DAC Server : (aqamarD510) DAC Port : 3141 Status: Stopped Log File Name: Complete_ETL.255.log Database Connection(s) Used : 


OLTP jdbc:microsoft:sqlserver://vranganaw8:1433;DatabaseName=OLTP Data Warehouse jdbc:microsoft:sqlserver://vranganaw8:1433;DatabaseName=olap

Informatica Server(s) Used  : 

InformaticaServer4-vranganaw8:(4)  InformaticaServer2-vranganaw8:(2)  InformaticaServer3-vranganaw8:(3)  InformaticaServer1-vranganaw8:(10) 

Start Time: 2004-06-17 19:00:06.885 Message: ETL was interrupted Actual Start Time: 2004-06-17 18:30:13.357 End Time: 2004-06-17 19:05:56.781 Total Time Taken: 35 Minutes
Start Time For This Run: 2004-06-17 19:00:06.885 Total Time Taken For This Run: 5 Minutes
Total steps: 212 Running steps: 0 Complete steps: 142 Failed/Stopped steps:70

Summary of connection status to all active databases and Informatica servers.

Setting Up Command Line Access to the DAC Server

The Command Line utility enables you to invoke commands on a DAC server running on a remote or local machine. The Command Line utility does not need the entire DAC environment. The machine on which you invoke the commands for the DAC server requires only the files DAWSystem.jar, dac.properties, and dacCmdLine.bat.

To set up command line access to the DAC server

Make sure you have installed the supported version of the Java SDK.
Copy the following files from the OracleBI\DAC directory to a local directory:
- DAWSystem.jar
- dac.properties
- dacCmdLine.bat
In the dacCmdLine.bat file, do the following:
1. Edit the JAVA_HOME variable to point to the directory where the Java SDK is installed.
Make sure there are no spaces in the path reference.
1. Edit the DAC_HOME variable to point to the directory where the DAC is installed.

In the dac.properties file, edit the following parameter values.

Parameter	Value
ServerHost=	Host name of the DAC server.
ServerPort=	Port of the DAC server. The default is 3141.
RepositoryStampVal=	Repository stamp that appears in the DAC client Login Details screen. To find this value, in the DAC client navigate to Help, then select Login Details.

Parameter

Value

ServerHost=

Host name of the DAC server.

ServerPort=

Port of the DAC server. The default is 3141.

RepositoryStampVal=

Repository stamp that appears in the DAC client Login Details screen.

To find this value, in the DAC client navigate to Help, then select Login Details.

Your dac.properties file should look similar to the following:

ServerHost=vranganaw8 ServerPort=3141 RepositoryStampVal=851E0677D5E1F6335242B49FCCd6519

Using the Command Line to Access the DAC Server

Follow this procedure to use the command line to access the DAC server.

To use the command line to access the DAC server

At the command prompt, enter the following:

dacCmdLine <method name> <optional execution plan name>

where method name is one of the following:

Method Name	Description
StartETL	Starts an execution plan. You must specify an execution plan name. -wait option lets you start the execution plan in synchronous mode.
StopETL	Stops the operation of an execution plan. You must specify an execution plan name.
ETLStatus	If you do not specify an execution plan name, the status of the execution plan that last ran is returned. If you specify an execution plan name, the status of the specified execution plan is returned.
DatabaseStatus	Verifies whether the DAC server can connect to all active database connections. You do not need to specify an execution plan name.
InformaticaStatus	Verifies whether the DAC server is able to ping all active Informatica PowerCenter Services machines.

Note:

The method names are case insensitive. Execution plan names are case sensitive. Also, if the execution plan name contains spaces, place beginning and ending double quotes around the name.

For example:

Command Line	Description
dacCmdLine EtlStatus	Returns the status of the execution plan that last ran.
dacCmdLine EtlStatus Forecast	Returns the status of the last instance of the Forecast execution plan.
dacCmdLine StopEtl Forecast	If the execution plan currently running is Forecast, the operation will be terminated. Otherwise, the request is ignored.
dacCmdLine databasestatus	Returns the health status of all the database connections as defined in the DAC repository from the DAC server.
dacCmdLine InformaticaStatus	Returns the health status of all the Informatica PowerCenter Services connections as defined in the DAC client on the Informatica Services tab.

DAC Repository Command Line Options

This section describes the DAC repository command line parameters that are exposed by the AutomationUtils.bat file, which is located in the OracleBI\DAC folder.

Import DAC Metadata by Application

The IMPORT option imports DAC metadata into the DAC repository for specified source system containers. The import process truncates all imported tables. You cannot perform an incremental import with this command.

Syntax:

IMPORT <folderName> <contName1> <contName2> ...

where:

Parameter	Description
`folderName`	Full path to the root of the import file structure.
`contName`	(Optional) Name of the source system container for which you want to import DAC metadata. If no container is named, all containers that are found in the file structure will be imported.

Export DAC Metadata by Application

The EXPORT option exports DAC metadata from the DAC repository for specified source system containers.

Syntax:

EXPORT <folderName> <contName1> <contName2> ...

where:

Parameter	Description
`folderName`	Full path to the root of the export file structure.
`contName`	(Optional) Name of the source system container for which you want to export DAC metadata. If no container is named, all containers that are found in the file structure will be exported.

Import DAC Metadata by Categories

The IMPORTCATEGORY option imports DAC metadata into the DAC repository based on the Logical, Run Time, or System categories. The import process truncates all imported tables. You cannot perform an incremental import with this command.

Syntax:

IMPORTCATEGORY <folderName> <logical> <runtime> <system>

where:

Parameter	Description
`folderName`	Full path to the root of the import file structure.
`logical`	Imports all data categorized as logical (information contained in the DAC Design view).
`runtime`	Imports all data categorized as run time (information contained in the DAC Execute view).
`system`	Imports all data categorized as run time (information contained in the DAC Setup view).

Export DAC Metadata by Categories

The EXPORTCATEGORY option exports DAC metadata from the DAC repository based on the Logical, Run Time, or System categories.

Syntax:

EXPORTCATEGORY <folderName> <logical> <runtime> <system>

where:

Parameter	Description
`folderName`	Full path to the root of the import file structure.
`logical`	Exports all data categorized as logical (information contained in the DAC Design view).
`runtime`	Exports all data categorized as run time (information contained in the DAC Execute view).
`system`	Exports all data categorized as run time (information contained in the DAC Setup view).

Create Schema

The CREATESCHEMA option creates the schema of a new DAC repository.

Syntax:

CREATESCHEMA <unicodeFlag> <workSpace name>

where:

Parameter	Description
`unicodeFlag`	If the value of this parameter is `true`, the schema is created as unicode. If the value is `false`, it is not created as unicode.
`workSpace name`	The name of the workspace in which the schema is created.

Drop Schema

The DROPSCHEMA option drops the schema of the DAC repository.

Syntax:

DROPSCHEMA

Analyze

The ANALYZE option analyzes the DAC repository tables.

Syntax:

ANALYZE

Upgrade

The UPGRADE option upgrades the DAC repository.

Syntax:

UPGRADE

Set Password

The SETPASSWORD option sets the passwords for the PowerCenter Services, Integration Service, and physical data sources in the DAC repository.

Syntax:

SETPASSWORD <type> <logicalName> <password>

where:

Parameter	Description
`type`	Possible values are server or dbconn.
`logicalName`	Logical name of the server or data source record in the DAC.

Note: :

If the logical name or password contains spaces, quotes are required.

Replacing an Informatica Workflow with a Custom SQL File

You can improve the performance of loads sometimes by replacing Informatica workflows with the Actions feature. For more information about actions, see "About Index, Table and Task Actions".

To replace an Informatica workflow with a custom SQL file

Create a SQL action to be used to load the table, and unit test it.

You can create one action for a full load and one for an incremental load, or you can use the same file for both full and incremental loads.
In the Tasks tab of the DAC Design view, query for the task for which you want to replace the Informatica workflow.
Change the Execution Type to SQL.
Replace the workflow name in the Command for Incremental Load or Command for Full Load fields with the SQL action.

Determining the Informatica Server Maximum Sessions Parameter Setting

You set the Maximum Sessions parameter value when you register the Informatica PowerCenter Services in the Informatica Servers tab of the DAC client. This parameter specifies the maximum number of workflows that can be executed in parallel on the PowerCenter Services. If the number of sessions is zero or is not specified, the DAC server assigns the default value of 10.

You should consider the following factors when determining the Maximum Sessions parameter value:

How powerful the machine is that hosts the PowerCenter Services.
How many instances of PowerCenter Services are available.
The number of Runnable tasks in the queue. A Runnable task is a task for which the Depends On tasks have completed and is ready to be run but is waiting for an Informatica slot to be available. For information about the different task run statuses, see "Current Run Tab".

For an optimal run, the runnable queue should be at zero or should reach zero in a short time. For example, Figure 11-1 shows an ideal performance run when 15 sessions were run in parallel. There were many runnable tasks before the process began, but the queue soon reached zero.

You can generate a run analysis such as Figure 11-1 from the right-click menu (select Get Run Information, then select Get Graph) on the DAC Current Run and Run History tabs. If you find that the runnable curve does not reach zero in a short time, you should increase the Maximum Sessions parameter value to make more Informatica slots available.

Figure 11-1 Sample Performance Run

This figure is a graph showing various ETL runs.

Determining the Number of Transactional and Data Warehouse Database Connections

This section describes how to determine the maximum number of database connections you need between the DAC server and the transactional database and the DAC server and the data warehouse database. You set the Max Num Connections parameter when you create the transactional and data warehouse database connections.

For the transactional database, the DAC server uses these connections to perform change capture. The number of connections you set for this connection pool determines how many change capture processes can run concurrently. If you have a powerful transactional database server and are going to run ETL processes during off-peak times, you can increase the Max Num Connections setting to 15 or 20 (10 is the default). If you have a less powerful transactional database server, you should not overload the operational system with ETL processes. Therefore, you should set the value below 10.

For the data warehouse database, the DAC server uses these connections to perform processes such as truncate tables, drop and create indexes, and analyze tables. You should not set the Max Num Connections value higher than the Maximum Sessions parameter value (the maximum number of workflows that can be executed in parallel in the PowerCenter Services) because these values have a one to one relationship.

Running Two DAC Servers on the Same Machine

You can run two DAC servers on the same machine as long as they are listening on different ports and pointing to two different repositories.

To run two DAC servers on the same machine

Copy the OracleBI\DAC folder to a different folder on the same machine.

For example, you might copy the C:\OracleBI\DAC folder to C:\DAC_SERVER2\DAC.
Edit the config.bat file to set the DAC_HOME variable appropriately for each instance.

For example if you copy the C:\OracleBI\DAC folder to C:\DAC_SERVER2\DAC, make sure that the C:\DAC_SERVER2\DAC\config.bat file is configured correctly.
Launch each of the DAC clients by navigating to the DAC directories and double-clicking the startclient.bat file.
For each instance, configure the DAC repository connection.
1. Navigate to Tools, then select DAC Server Management, then select DAC Server Setup.
  
  An informational dialog states this operation should be performed on the machine running the DAC server. It asks whether you want to continue.
2. Click Yes.
3. In the Repository Connection Information tab, enter the appropriate information for each instance. The Database Host should be the same for each instance, and the Database Port should be different.
For each instance, set up the DAC server system properties.
1. Navigate to Setup, then select DAC System Properties.
2. Set the DAC Server Host, OS, and Port properties.
Start each DAC server from its directory.

Customizing Index and Analyze Table Syntaxes

The customsql.xml file, located in the OracleBI\DAC\CustomSQLs directory, contains the syntax for dropping and creating indexes and analyzing tables. You can edit the customsql.xml file to change the behavior of these operations.

To edit the Analyze Table syntax

Open the customsql.xml file located in the OracleBI\DAC\CustomSQLs directory.

Locate the Analyze Table syntax for the appropriate database type.

For example, the syntax for an Oracle database is as follows:

<SqlQuery name = "ORACLE_ANALYZE_TABLE" STORED_PROCEDURE = "TRUE"> DBMS_STATS.GATHER_TABLE_STATS(ownname => '@TABLEOWNER', tabname => '%1', estimate_percent => 30, method_opt => 'FOR ALL COLUMNS SIZE AUTO',cascade => true ) </SqlQuery>

Edit the syntax.

For example, to gather statistics for only the indexed columns, edit the syntax as follows:

<SqlQuery name = "ORACLE_ANALYZE_TABLE" STORED_PROCEDURE = "TRUE"> DBMS_STATS.GATHER_TABLE_STATS(ownname => '@TABLEOWNER', tabname => '%1', estimate_percent => 30, method_opt => 'FOR ALL INDEXED COLUMNS',cascade => true ) </SqlQuery>

Note:

The variables @TABLEOWNER, %1, %2, and so on, will be substituted appropriately by the DAC when the statement is executed.

To edit the Create Index syntax

Open the customsql.xml file located in the OracleBI\DAC\CustomSQLs directory.
Locate the Create Index syntax for the appropriate database type, and edit the syntax.

Overview of Change Capture Process (Siebel Sources Only)

This section describes the change capture process used to extract data from the Siebel transactional database. It includes the following topics:

Initial Data Capture

For each Siebel transactional source table (S_) from which data is extracted, there is one S_ETL_I_IMG_ table and one S_ETL_R_IMG_ table.

The first time a staging table is extracted, rows that are in the specified period are selected and inserted into the appropriate S_ETL_R_IMG_ table. The specified period is set in the Prune Days parameter, in the Execution Plans tab. For more information about the Prune Days parameter, see "Execution Plans Tab".

Change Capture Mechanisms

There are two kinds of change capture mechanisms used:

Change capture using tables

This is the most common method. It uses S_ETL_I_IMG_ and S_ETL_R_IMG_ table types and the LAST_UPD column of the source tables.
Change capture using the date column

In some cases, a predefined date column is used to enable change capture (without use of S_ETL_I_IMG_ and S_ETL_R_IMG_ tables).

Change Capture Using Tables

When S_ tables are extracted, the process looks for row ID information (the combination of ROW_ID and MODIFICATION_NUM from the S_ table) that does not exist in the row image table (S_ETL_R_IMG_) and in columns where the value for LAST_UPD is more recent than that of LAST_REFRESH_DATE minus the Prune Days parameter setting. This information is inserted into S_ETL_I_IMG table. The S_ETL_I_IMG_ table is joined with the base table during the SDE extraction process to extract only the change capture rows during refresh.

S_ETL_R_IMG tables store the ROW_ID, MODIFICATION_NUM, and LAST_UPD for the rows in the S_ table that have the LAST_UPD in the defined Prune Days period. The LAST_UPD column is used to delete the records from the S_ETL_R_IMG table. Records are deleted as soon as they go beyond the Prune Days period. This table is used to make sure that records that fall in the Prune Days period are not captured as updates unless they have actually been updated. This guarantees an efficient and faster change capture refresh. For information about tracking deleted records, see

Once the ETL process is completed, the data from the change capture image tables (S_ETL_I_IMG_) is pushed into the row image (S_ETL_R_IMG) table. The S_ETL_R_IMG_ information is subsequently used in the next refresh.

Although the LAST_UPD column in Siebel transactional tables is used for change capture, the timestamp reflects the time the data is committed in the database, rather than the actual transaction event time. This may happen because of remote synchronization, handheld synchronization, UTC conversion, and other processes in which there may be a significant lag between the time of a transaction and its commitment to the database. It is possible, therefore, for a data row to get committed to the transactional database with a LAST_UPD date that is older than the date on which last refresh was executed. Consequently, if the extract is based purely on the date in LAST_UPD, some rows might be missed during extraction.

The LAST_UPD date column, however, still permits change capture process efficiency by limiting the number of rows that have to be compared. The rows from transactional tables are filtered based on the LAST_UPD date being more recent than the LAST_REFRESH_DATE, minus the prune days. Then the ROW_ID and MODIFICATION_NUM combination is compared with the row image table to discover the changed records.

The Prune Days parameter ensures that the rows having LAST_UPD values older than LAST_REFRESH_DATE are not missed. This is a parameter that customers can set based on experience with processes (such as remote synchronization) that may cause records to be missed.

Primary and Auxiliary Tables

The DAC performs change capture for both primary and auxiliary tables. When more than one source table is involved, then both the auxiliary and primary table records need to be marked as changed. For auxiliary tables, you need to write auxiliary mappings to mark the primary tables as changed. The SQL queries that do this are part of the mapping SDEINC_FindAux_.

The extract logic sometimes requires that rows be considered as changed, even if the record has not been updated in the primary table (and therefore extracted during the SDE process). This situation occurs when child table rows have changed and the header/master rows need to be extracted so that data warehouse tables are loaded with a consistent set.

When the S_CONTACT_X row is changed, the corresponding S_CONTACT also needs to be extracted. In this case, rows in S_CONTACT are also marked as changed by inserting rows in the change capture row image table.

When the S_ORDERITEM row is changed, the corresponding S_DOC_ORDER also needs to be extracted. In this case, rows in S_DOC_ORDER are also marked as changed (by inserting rows in the change capture row image table).

These auxiliary changes capture processes are heavily dependent on the data warehouse data model and are required to support the ETL logic.

Example: Building S_ETL_I_IMG_ Table for Loading Account Dimension

This section gives an extended example of the process of change capture using tables.

Load image tables for all relevant source tables.

The content of this entity comes from the S_ORG_EXT and S_ORG_EXT_X tables. Whenever any of the rows change in either of these tables, the record is marked as changed.

The image table for S_ORG_EXT is S_ETL_I_IMG_26. The image table prefix can be found using the DAC to view any source table. This table is truncated before loading with fresh data during every refresh.

During the ETL, process rows are inserted into S_ETL_I_IMG_26 by selecting ROW_ID information from S_ORG_EXT, for rows (combined ROW_ID and MODIFICATION_NUM) that do not exist in the S_ETL_R_IMG_26 and for which LAST_UPD is more recent than LAST_REFRESH_DATE minus the Prune Days setting. This is done during the ETL execution by the DAC's internal image building tasks.

Similarly, the image table S_ETL_I_IMG_27 for S_ORG_EXT_X is loaded.
Load the image table for auxiliary table-based changes.

In addition to the basic change capture, extra processing might be required due to special ETL requirements. In this example, it happens that S_ORG_EXT needs to be extracted for processing even if only S_ORG_EXT_X changes. This is because both the tables are joined to form W_ORG_D, and the extract process of W_ORG_D (a SDE mapping) looks for a changed ROW_ID in the change capture row image table for the primary table S_ORG_EXT only. Therefore, the extract happens only when the ROW_ID for S_ORG_EXT exists in the row image table.

In this case, the SDEINC_FindAux_ mapping is needed to insert corresponding rows of S_ORG_EXT.ROW_ID in the change capture row image table whenever S_ORG_EXT_X changes. The following logical statement shows the method:

Identify the records that have changed in the S_ORG_EXT_X (rows in S_ETLI_IMG_27) table and then find the corresponding rows in S_ORG_EXT. Insert the ROW_ID and MODIFICATION_NUM of those corresponding rows from S_ORG_EXT into S_ETL_I_IMG_26 table.

Using Informatica, the auxiliary mapping SDEINC_FindAux_ has to be written for each primary table that requires it, depending on data warehouse extract logic. Using the DAC, this auxiliary task has to be linked as a parent to the extract mapping for the base table (S_ORG_EXT in this case).

This is the SQL override for the SDEINC_FindAux Informatica mapping:
```
SELECT
   S_ORG_EXT.ROW_ID,
   S_ORG_EXT.MODIFICATION_NUM,
   S_ORG_EXT.LAST_UPD
FROM
   S_ORG_EXT,
   S_ORG_EXT_X,
   S_ETL_I_IMG_27 IMG
WHERE
   (
   IMG.ROW_ID = S_ORG_EXT_X.ROW_ID
   AND
   S_ORG_EXT_X.PAR_ROW_ID = S_ORG_EXT.ROW_ID
   )
   AND NOT EXISTS
   (  SELECT 'X'
      FROM
      S_ETL_I_IMG_26 IMG1
      WHERE
      IMG1.ROW_ID = S_ORG_EXT.ROW_ID
   )
```
Extract source table information using change capture image information.

After the records that are new or modified are identified, those rows are loaded into the staging tables. The Informatica mappings that load the staging tables use the ROW_ID information captured in the image tables.

This example shows the loading of staging table W_ORG_DS. The main logic of populating this table lies in the SQL override of the mapping SDE_OrganizationDimension.

The DAC creates views on tables that are being extracted. The views are different, depending on whether a table is extracted the first time or is a change capture extract.
- If extracting for the first time, the view is created as SELECT * FROM S_ORG_EXT.
- If it is a change capture extract, the view is created as SELECT * FROM S_ORG_EXT, S_ETL_I_IMG_26 IMG WHERE S_ORG_EXT.ROW_ID = IMG.ROW_ID.
The SQL override in the mapping uses the view to extract the data.
```
SELECT
   S_ORG_EXT.ROW_ID,
   S_ORG_EXT.NAME, …..
   ….
   ….
FROM
   V_ORG_EXT,
   S_ORG_EXT_X,
   …..
WHERE
   {
      V_ORG_EXT S_ORG_EXT
      LEFT OUTER JOIN S_ORG_EXT_X ON
      S_ORG_EXT.ROW_ID = S_ORG_EXT_X.PAR_ROW_ID
      …..
   }
AND
   S_ORG_EXT.ROW_ID <> 'INT_COMPANY_ID'
```

Change Capture Using the Date Column

Forecasts are extracted without using the image table. The value S_FCSTSER_DATE is tracked using the date column ARCHIVE_TS. The administrator sets the ARCHIVE_TS for forecasts that are submitted, frozen, and ready to be loaded into the Oracle Business Analytics Warehouse. S_ETL_RUN stores the previous ETL date when forecasts were extracted and the current ETL date when the forecasts are being extracted. All forecasts with ARCHIVE_TS values greater than that of the previous ETL date and ARCHIVE_TS (less than the current ETL date) are extracted in the current ETL. Both ETL date and ARCHIVE_TS (less than the current ETL date) are stored in S_ETL_CURR_RUN.

Note:

Forecasts in the Oracle Business Analytics Warehouse are never updated. Once loaded, they are frozen.

SELECT
…..
FROM
   S_FCSTSER_DATE,
   S_FCSTSER,
   S_ETL_CURR_RUN,
   ……..
WHERE
   S_FCSTSER_DATE.FCSTSER_ID =  S_FCSTSER.ROW_ID
   AND S_FCSTSER_DATE.ARCHIVE_TS > S_ETL_CURR_RUN.PREV_LOAD_DT 
   AND S_FCSTSER_DATE.ARCHIVE_TS <= S_ETL_CURR_RUN.LOAD_DT

Using the Change Capture Filter

The change capture filter enables you to selectively load data from the Siebel transactional database into the data warehouse. You can set a condition in the ChangeCaptureFilter.xml file to filter data from specific tables. This file is located in the OracleBI\DAC\CustomSQLs directory. It provides an XML sample that you can copy and alter to fit your needs. Detailed instructions for using this feature are included at the top of the file.

Tracking Deleted Records

The Oracle Business Analytics Warehouse change capture process uses delete triggers to identify records for deletion on the Siebel transactional database. The deleted records are stored in S_ETL_D_IMG tables. During the change capture process, the DAC server moves the data from the S_ETL_D_IMG tables to the S_ETL_I_IMG tables, where D appears in the OPERATION column to show the records were deleted. During the change capture sync process, the records in the S_ETL_D_IMG tables that were moved to the S_ETL_I_IMG tables are flushed. In the DAC, you can view the SQL that runs during the change capture and change capture sync processes by using the Output to File right-click command in the Tasks tab of the Design view.

The preconfigured ETL process captures deleted records for the target tables W_ORG_D and W_PERSON_D, the source tables for which are S_ORG_EXT, S_CONTACT, and S_PRSP_CONTACT. These source tables need to have delete triggers created in the Siebel transactional database in order for deleted records to be tracked.

For vertical applications, the preconfigured ETL process captures deleted records for W_FUND_F and W_ALIGNMT_DH. You need to create delete triggers in the transactional database for the following additional tables: S_MDF_TXN, S_ASGN_GRP_POSTN, S_ASGN_RULE_ITEM.

In the Oracle Business Analytics Warehouse, preconfigured visibility tables are inactivated. If you activate visibility tables, you should also create delete triggers on the optional tables.

The preconfigured SIA Account and Contact visibility tables are activated by default for vertical applications. If your organization is not going to use any of the visibility tables, you need to inactivate them in the DAC.

On the target tables for which deleted records are tracked, a D appears in the INACTIVE_FLG column to show the records as deleted when the source records are deleted. This method of flagging a record as deleted is known as a soft delete, as compared to a hard delete when the record is physically deleted. When deleted records are tracked on visibility-related data warehouse tables, the records are physically deleted. The general rule is that soft deletes should be used for tables that are referenced by other tables. If a table is not referenced by any other table, then you can use hard deletes.

Aggregate tables are rebuilt during each ETL process. Therefore, records can be physically deleted from the base tables without consequence. If you want to use the soft delete method, you should consider changing the aggregate building mappings so that the deleted records are omitted.

Note:

The Oracle BI Server does not recognize soft deletes. Therefore, you have to modify the .rpd file so that it does not pick up soft-deleted records for reporting.

To create delete triggers for preconfigured ETL change capture

From the DAC menu bar, select Tools, then select ETL Management, then select Configure.
In the Sources dialog, select the database platform for the target and transactional databases, and click OK.
In the Data Warehouse Configuration Wizard, select the Create Delete Triggers in Transaction Database check box, and click Next.

The Delete Triggers tab is active.

Select one of the following:

Option	Description
Create Triggers	Executes the trigger statements directly.
Write Script to File	Writes the trigger statements to a file, which can be executed by a database administrator.

Select the database type as defined in the DAC.
For DB2 zSeries databases, enter the base table owner.
(Optional) Select the Include Optional Triggers check box to create triggers for the optional tables.
Click Start.

To create delete triggers for new source tables

In the DAC, navigate to the Design view, then select Tables.
Select the table for which you want to track deleted records.

Make sure the table has an image suffix.
Right-click the table and select Change Capture Scripts, then select Generate Image and Trigger Scripts.
In the Triggers and Image Tables dialog, select the database type of the source database.
Make sure the Generate Image Table Scripts and Generate Trigger Script(s) options are selected.
Execute the script on the database.

To track deleted records

Make sure the delete triggers are enabled for the appropriate tables.
Write custom Informatica workflows with a clause WHERE operation = 'D' to the appropriate I_IMG table to take them across to the dimension and fact tables.
In the DAC, register the workflows as tasks.
Define the appropriate dependencies.

For an example of such a workflow, see the preconfigured task SDE_OrganizationDimension_LoadDeletedRows.

Pointing Multiple Informatica PowerCenter Services to a Single Informatica Repository

You can install multiple Informatica PowerCenter Services and point them to a single Informatica repository. You need to register each instance of PowerCenter Services in the DAC and specify a unique machine name and server name. For instructions on registering PowerCenter Services in the DAC, see the Oracle Business Intelligence Applications Installation and Configuration Guide.

Handling ETL Failures with the DAC

This section includes the following topics:

When the Execution of an Execution Plan Fails
Discarding the Current Run Execution Plan
Failure of Aggregator Transformation Tasks with Sorted Input

When the Execution of an Execution Plan Fails

When an execution plan is executed, if a task fails, the status of the tasks that are dependent on the failed task is changed to Stopped. While tasks are still running, the execution plan's status is Running. When all the tasks have been run, and if one or more tasks have failed, the execution plan's status is changed to Failed.

You can check the tasks that have failed in the Current Run tab of the Execute view, fix the problems, and then requeue the failed tasks by changing the status to Queued. You can then restart the ETL. All the tasks will then be rerun. You can also manually run a task, change its status to Completed, and then restart the ETL. Tasks with a Completed status are skipped.

Caution:

The DAC server does not validate tasks that have been run manually.

To restart a failed ETL, click Run Now from the Current Run tab of the Execute View.

In Case of Abnormal Termination of the DAC Server

If the DAC server fails during the execution of the ETL, the status of the ETL execution will remain as Running. When the DAC server is started again, it will automatically run the ETL if the Auto Restart ETL DAC system property is set to True. If the same system property is set to False, when the server restarts, it will set the correct status as Failed. In order to execute the ETL from the point of failure, submit the request to the server again.

The DAC server will automatically terminate if it looses connection to the DAC repository.

Discarding the Current Run Execution Plan

You can discard an execution plan that failed by navigating to the Current Run tab, right-clicking on the execution plan and changing its status to Mark as Completed. This will force the run status to be updated as Completed. When you submit a request for another run, the DAC server creates another instance of it.

Caution:

Perform this procedure in a development or testing environment only, since it might leave the data in an inconsistent state, causing you to have to reload all of the data.

Failure of Aggregator Transformation Tasks with Sorted Input

Tasks that use Informatica Aggregator transformation can fail when the Sorted Input option is active. The tasks SDE_DTLFORECASTFACT and SDE_COSTLIST are examples of tasks that can fail in such a situation.

To prevent such tasks from failing, in PowerCenter Designer, navigate to Mapping Designer, open the corresponding mapping, and in the Aggregator transformation, remove the check from the Sorted Input check box.

Forcing Password Encryption When DAC Interacts With Informatica

DAC sends the Informatica repository and server passwords un-encrypted when communicating with Informatica through pmcmd and pmrep commands.

You can force password encryption by following the procedure below:

Note:

In the examples included in the following procedure, the Informatica server and Informatica Repository server use the password Administrator.

Open a command window and type the following command to create an encrypted Informatica password for pmcmd and pmrep

pmpasswd Administrator -e CRYPT_SYSTEM

This step will produce something similar to the following text:

Informatica PowerMart Encryption Utility, 32-bit
Copyright (c) Informatica Corporation 1996-2008
All Rights Reserved

Encrypted string 
–>dMGpMvpsuQwXD5UvRmq0OZxhppTWK0Y7fzBtxHL04Gg=<–
Will decrypt to –>Administrator<–

Create a new environment variable with the following properties.

Name– INFA_PASSWORD (Or any name that you choose.)

Value– dMGpMvpsuQwXD5UvRmq0OZxhppTWK0Y7fzBtxHL04Gg=

Note:

The value should be exactly as shown in the encrypted message in the Command window (the value between --> and <--).
Modify the file DAC\conf\infa_command.xml by replacing all occurrences of <-p> with <-pv> and <-x> with <-X>.
Stop the DAC server.
Log into the DAC client and navigate to the Setup menu and choose the Informatica Servers tab.
Highlight the Informatica Server record and enter the name of the environment variable that you created in Step 2 of this procedure as the value of Password. Save the record.
Repeat the previous step for the Informatica Repository Server.
Close and re-open the client to test the connections.
If the DAC server is located on the same machine, start the DAC server and run ETL.

Verify that DAC issues the following pmcmd command.

pmcmd startworkflow -sv BI_DW_Server -d <Domain> -u Administrator -pv **** -f
<folder> -lpf <filename><workflow>
INFORMATICS TASK:<workflow> has finished execution with Completed status.

If the DAC server is on another Windows machine, do the following:
1. Set the environmental variable on the server machine and modify the DAC\conf\infa_command.xml.
2. Shut down the DAC server when testing Informatica connectivity or unit testing a DAC task via the DAC client.
If the DAC server is on a non-Windows machine, do the following:
1. Set the environmental variable on the server machine and modify the DAC\conf\infa_command.xml.
2. Shut down the DAC server when testing Informatica connectivity or unit testing a DAC task via the DAC client.

Upgrading the Data Warehouse Schema on Teradata

When the data warehouse schema is on a Teradata database and you use the Data Warehouse Configuration Wizard to upgrade the data warehouse schema, DAC generates four files: two SQL files and two log files. DAC stores these files in DAC\conf\sqlgen\sql\teradata. A description of the files follows.

upgrade-regular.sql. This file contains scripts that DAC has verified are necessary to upgrade the data warehouse schema. The scripts handle new or altered tables and columns. For example, they may increase the size of simple data types or change a column from null to not null.
upgrade-questionable.sql. This file contains scripts that DAC suggests may be required to upgrade the data warehouse schema. These scripts are intended to be an aid in determining upgrade requirements and must be reviewed by a Teradata DBA and corrected if necessary.
upgradedwtables_sql.log. This file is the log file that corresponds to the upgrade-regular.sql file. It is for information purposes only.
upgrade-issues.log. This file is the log file that corresponds to the file upgrade-questionable.sql. It is for information purposes only.

Using JDBC Connection URL Override to Handle Database Connections

DAC provides a mechanism for overriding the JDBC connection strings for some database types. This is done by modifying the connection_template.xml document. The location of this document is DAC\conf\connection_template.xml. Note that these changes to the connection_template.xml document apply to all instances of the database type.

For example, if the URL to connect to MSSQL is modified in the file connection_template.xml, the modifications apply to all MSSQL connections that appear in DAC (for example, source connections, target connections, and connection to the DAC Server). Note that in the case where there are multiple source connections of type MSSQL and users need to connect to each of them using a different URL, this process will not work.

You can override connection strings with different JDBC URLs for each connection. You can configure the connections using the following methods:

You can configure connections to different databases by using the fields JDBC URL Override and JDBC Driver Override to add connection specific JDBC URLs. To access these fields, go to the Setup menu and choose Physical Data Sources tab. When entering text into these fields, you must be sure to conform to the database specifications.
You can configure the DAC Client connection to create or connect to the DAC repository with user specified URLs in the DAC client login screen.
The DAC Server can be configured using specific JDBC URLs.
- If the DAC Server is installed on Windows, you can configure specific JDBC URLs by accessing the client and using the Server Set Up menu.
- If the DAC Server is installed on Unix or Linux, you can configure the JDBC URLs by executing ServerSetupPrompt.sh and following the on screen instructions.

Using Parameters to Specify Full or Incremental Loads

Parameters can be registered in DAC as either Source System Parameters or Task Level Parameters.

Source System Parameters – A list of parameter names and the values applicable to each source system can be registered by accessing DAC, then Design View, and using the Source System Parameter tab.

Task Level Parameters – Parameters that apply to all tasks under a source system may be registered under the Source System Parameters. If there are parameters that are specific to particular tasks, developers can create such task specific parameters by accessing the Task tab and using the Parameters subtab. Task level parameters have priority over source system parameters. If the same parameter is registered as a source system parameter and as a task level parameter, DAC will evaluate the parameter with the task level value.

You can specify different values for both Source System Parameters and Task Level Parameters for full load ETL runs and/or for incremental load ETL runs. The "Load Type" is for specifying a value as it applies to the parameter.

For example, if you have a parameter called "SourceSystemParameter" at the source system level, you can define this parameter as shown in Table 11-1.

Table 11-1 Example of Source System Parameter


SourceSystemParameter	Full	Source_System_Parameter_Full_Value
SourceSystemParameter	Incremental	Source_System_Parameter_Incremental_Value
SourceSystemParameter	Both	Source_System_Parameter_Both_Value

During ETL runtime, DAC automatically evaluates this parameter to "Source_System_Parameter_Full_Value" if the ETL is in Full mode and to "Source_System_Parameter_Incremental_Value" if the ETL is in an incremental run.

In this example, load type "Both" is redundant, as there is a value defined for full and incremental values. If a value is undefined for Full or Incremental, then DAC picks the value defined for both.

Note that the behavior is the same for parameters at the task level.

Mapping Multiple Database-Specific Informatica Workflows to the Same DAC Task

This section includes instructions for mapping multiple, database-specific Informatica workflows to the same DAC task. This is accomplished by parameterizing the Informatica workflow command. At runtime, DAC determines which workflow to run based on the parameterization.

Follow the instructions below to map multiple, database-specific workflows to the same DAC task. These instructions use SIL_PersonDimension_Full as an example of a full command and SIL_PersonDimension as an example of an incremental command on an Oracle database and SIL_PersonDimension_Full_TD and SIL_PersonDimension_TD as full and incremental commands, respectively, on a Teradata database.

To map multiple database-specific Informatica workflows to the same DAC task:

In the DAC Design view, go to the Tasks tab.
Query for the task to which you want add multiple workflows.
Select the task, and then click the Parameters subtab.
Create a new parameter for a full load command:
1. Click New in the subtab toolbar.
2. In the Name field, enter $$workflow_CMD_PARAMETER.
3. In the Data Type field, select DB Specific Text.
4. In the Load Type field, select Full.
5. Click in the Value field to open the Enter Parameter Value dialog box.
6. In the Connection Type field, select @DAC_TARGET_DBTYPE.
7. In the appropriate database fields, enter the full command name for both database types.
  
  For example, enter SIL_PersonDimension_Full in the Oracle field and SIL_PersonDimension_Full_TD in the Teradata field.
Create a new parameter for an incremental load command:
1. Click New in the subtab toolbar.
2. In the Name field, enter $$workflow_CMD_PARAMETER.
3. In the Data Type field, select DB Specific Text.
4. In the Load Type field, select Incremental.
5. Click in the Value field to open the Enter Parameter Value dialog box.
6. In the Connection Type field, select @DAC_TARGET_DBTYPE.
7. In the appropriate database fields, enter the incremental command name for both database types.
  
  For example, enter SIL_PersonDimension in the Oracle field and SIL_PersonDimension_TD in the Teradata field.
With the same task selected, click the Edit subtab.
In the Command for Incremental Load field, enter @DAC_$$workflow_CMD_PARAMETER.
In the Command for Full Load field, enter @DAC_$$workflow_CMD_PARAMETER.
Click Save.

Connecting to the DAC Repository When Using Oracle RAC

When the DAC repository is on an Oracle database that uses Oracle Real Application Cluster (RAC), the standard URL for configuring the connections between the DAC Client and the DAC repository and the DAC Server and the DAC repository does not work. (The standard URL is jdbc:oracle:thin:@<HOST>:1521:<SID>.)

To configure the connection between the DAC client and the DAC repository and between the DAC Server and DAC repository use the following URL:

jdbc:oracle:thin:@(DESCRIPTION=(LOAD_BALANCE=on)
(ADDRESS=(PROTOCOL=TCP)(HOST=<host1>) (PORT=<port1>))
(ADDRESS=(PROTOCOL=TCP)(HOST=<host2>) (PORT=<port2>))
(CONNECT_DATA=(SERVICE_NAME=<service name>)))
...

Enter this URL in the "DB URL" field in the Configuring… dialog. This dialog is used to configure the connection between the DAC Client and the DAC repository.
Repository Connection Information tab in the Server Configuration dialog. This dialog is used to configure the connection between the DAC Server and the DAC Client.