Oracle® Business Intelligence Data Warehouse Administration Console Guide, Version 7.9.5, Part Number E12085-01

6 Common Tasks Performed in the DAC

This chapter contains the following topics:

  • Importing DAC Metadata
  • Exporting DAC Metadata
  • Distributing DAC Metadata
  • Running the DAC Server Automatically
  • Command Line Access to the DAC Server
  • DAC Repository Command Line Options
  • Replacing an Informatica Workflow with a Custom SQL File
  • Determining the Informatica Server Maximum Sessions Parameter Setting
  • Determining the Number of Transactional and Data Warehouse Database Connections
  • Running Two DAC Servers on the Same Machine
  • Customizing Index and Analyze Table Syntaxes
  • Using SQL Files as an Execution Type in the DAC
  • Overview of Change Capture Process (Siebel Sources Only)
  • Using the Change Capture Filter
  • Tracking Deleted Records
  • Pointing Multiple Informatica PowerCenter Services to a Single Informatica Repository
  • Handling ETL Failures with the DAC

Importing DAC Metadata

The DAC's Import/Export feature enables you to import or export source system-specific DAC metadata into or out of the DAC repository. You can use this feature to migrate DAC metadata from one environment to another, such as from a development environment to test or production environments.

To import DAC metadata

  1. In the DAC menu bar, select Tools, then select DAC Repository Management, then select Import.

  2. Select the directory from which you want to import DAC metadata, or accept the default directory.

  3. Select the appropriate source system containers.

  4. Select the appropriate categories of metadata you want to import:

    • Logical. Imports all information contained in the Design view and database connection information.

    • System. Imports all information contained in the Setup view, except passwords for servers and database connections.

    • Run Time. Imports information about ETL runs (contained in the Execute view).

  5. If you are importing metadata into a blank repository or to completely replace the current metadata in the repository, select Truncate Repository Tables.

    This action overwrites the content in the current repository. Selecting the Truncate Repository Tables option greatly increases the speed of the import process.

  6. (Optional) Select Enable Batch Mode to insert the imported metadata into the repository as an array insert.

    This action increases the speed of the import process.

  7. Click OK.

  8. Verify the import process by reviewing the log file \OracleBI\DAC\log\import.log.

Exporting DAC Metadata

The DAC's Import/Export feature enables you to import or export source system-specific DAC metadata into or out of the DAC repository. You can use this feature to migrate DAC metadata from one environment to another, such as from a development environment to test or production environments.

To export DAC metadata

  1. In the DAC menu bar, select Tools, then select DAC Repository Management, then select Export.

  2. Select the directory to which you want to export DAC metadata, or accept the default directory.

  3. Select the appropriate source system containers.

  4. Select the appropriate categories of metadata you want to export:

    • Logical. Exports all information contained in the Design view and database connection information.

    • System. Exports all information contained in the Setup view, except passwords for servers and database connections.

    • Run Time. Exports information about ETL runs (contained in the Execute view).

  5. Click OK.

  6. Verify the export process by reviewing the log file \OracleBI\DAC\log\export.log.

Distributing DAC Metadata

Typically, you may have multiple environments, such as development, QA, and production. When you make changes in the development environment, you test them, and then deliver them by exporting the entire environment and distributing it to the other environments. The data is exported as XML files, which are stored in the DAC\export directory on the client machine where the export is performed.

To apply changes from the development environment to any other environment, copy all of the XML files into the DAC\export folder of the target environment and then import the data. To export the DAC metadata, follow the instructions in the procedure "Exporting DAC Metadata". To import the DAC metadata, follow the instructions in the procedure "Importing DAC Metadata".

Running the DAC Server Automatically

Follow this procedure to set up the DAC server to be run automatically when your machine reboots.

To set up the DAC server to run automatically upon rebooting the machine

  1. Navigate to Programs, then select Accessories, then select System Tools, then select Scheduled Tasks.

  2. Double-click Add Scheduled Task.

  3. In the Scheduled Task Wizard, browse to the startserver.bat file, and click Open.

  4. Select the option "When my computer starts," and click Next.

  5. Enter the domain user account to start the DAC server and a password, and click Finish.

    The startserver task appears in the Scheduled Task window.

  6. Right-click the task and select Properties.

  7. In the Settings tab, remove the check from the "Stop the task if it runs for 72 hours" check box.

To start the DAC server as a scheduled task

  1. Navigate to Programs, then select Accessories, then select System Tools, then select Scheduled Tasks.

  2. Right-click startserver, and then click Run.

To stop the DAC server as a scheduled task

  1. Navigate to Programs, then select Accessories, then select System Tools, then select Scheduled Tasks.

  2. Right-click startserver, and then click End Task.

To check if the DAC server is running

  1. Navigate to Programs, then select Accessories, then select System Tools, then select Scheduled Tasks.

  2. Select the startserver task.

  3. In the Windows menu bar, select View, then select Details.

Command Line Access to the DAC Server

You can access the DAC server through a command line to start and stop execution plans and to get status information for servers, databases, and execution plans. This feature enables you to access the DAC server using third-party administration or management tools, without using the DAC client.

This section includes the following topics:

  • Command Line Operations
  • Command Line Status Monitoring Queries
  • Setting Up Command Line Access to the DAC Server
  • Using the Command Line to Access the DAC Server

Command Line Operations

The command line feature enables you to start an execution plan and stop the operation of a running execution plan.

Starting an Execution Plan

When the DAC server receives a request to start an execution plan, it performs a series of checks to verify that the execution plan can be started. It first checks that an execution plan with the requested name exists and that the execution plan is active. Next, it checks the status of the execution plan that last ran. If an execution plan is still running and the DAC server receives a request to start another execution plan, the request will be rejected. If an execution plan failed, a request to run the same execution plan again will be executed; however, a request to run a different execution plan will be rejected. If the execution plan that last ran completed successfully, a request to run a new execution plan will be executed.

When the DAC server receives a request to start an execution plan, it will issue a warning if any of the following conditions are true. (A warning is for informational purposes and does not mean the execution plan will not start.)

  • The Generic Task Concurrency Limit value in the DAC System Properties tab is not a positive number.

  • There are no active Informatica PowerCenter Services or Integration Service registered in the Informatica Servers tab.

  • One or more Informatica PowerCenter Services or Integration Service do not have the passwords defined in the Informatica Servers tab.

  • One or more Informatica PowerCenter Services or Integration Service do not have a Maximum Sessions number properly defined in the Informatica Servers tab.

  • One or more data sources do not have the Table Owner or Table Owner Password values properly defined in the Physical Data Sources tab.

  • One or more data sources do not have a maximum number of connections (Max Num Connections) value properly defined in the Physical Data Sources tab.

  • One or more data sources do not have a Data Source Number defined in the Physical Data Sources tab.

Stopping the Operation of a Running Execution Plan

When the DAC server receives a request to stop the operation of a running execution plan, the request will fail in the following cases:

  • The name of the execution plan that is running is different from the name in the request.

  • There is no execution plan currently running.

Command Line Status Monitoring Queries

The command line feature enables you to get the following status information:

  • Summary of the requested execution plan. If there are multiple instances of the same execution plan, a summary of the instance that last ran is returned. Below is an example of the information contained in the summary.

    (c) 2003 Siebel Systems, Inc.
    Siebel DAC Server comprising the etl execution-management, scheduler, logger, and network server.

    ETL details for the last run:
    ETL Process Id : 255
    ETL Name : Complete ETL
    Run Name : DRY RUN OF Complete ETL: ETL Run - 2004-06-17 18:30:13.201
    DAC Server : (aqamarD510)
    DAC Port : 3141
    Status : Stopped
    Log File Name : Complete_ETL.255.log

    Database Connection(s) Used :
    OLTP            jdbc:microsoft:sqlserver://vranganaw8:1433;DatabaseName=OLTP
    Data Warehouse  jdbc:microsoft:sqlserver://vranganaw8:1433;DatabaseName=olap

    Informatica Server(s) Used :
    InformaticaServer4-vranganaw8:(4)  InformaticaServer2-vranganaw8:(2)
    InformaticaServer3-vranganaw8:(3)  InformaticaServer1-vranganaw8:(10)

    Start Time : 2004-06-17 19:00:06.885
    Message : ETL was interrupted
    Actual Start Time : 2004-06-17 18:30:13.357
    End Time : 2004-06-17 19:05:56.781
    Total Time Taken : 35 Minutes
    Start Time For This Run : 2004-06-17 19:00:06.885
    Total Time Taken For This Run : 5 Minutes
    Total steps : 212
    Running steps : 0
    Complete steps : 142
    Failed/Stopped steps : 70
    
  • Summary of connection status to all active databases and Informatica servers.

Setting Up Command Line Access to the DAC Server

The Command Line utility enables you to invoke commands on a DAC server running on a remote or local machine. The Command Line utility does not need the entire DAC environment. The machine on which you invoke the commands for the DAC server requires only the files DAWSystem.jar, dac.properties, and dacCmdLine.bat.

To set up command line access to the DAC server

  1. Make sure you have installed the supported version of the Java SDK.

  2. Copy the following files from the OracleBI\DAC directory to a local directory:

    • DAWSystem.jar

    • dac.properties

    • dacCmdLine.bat

  3. In the dacCmdLine.bat file, do the following:

    1. Edit the JAVA_HOME variable to point to the directory where the Java SDK is installed.

    Make sure there are no spaces in the path reference.

    2. Edit the DAC_HOME variable to point to the directory where the DAC is installed.

  4. In the dac.properties file, edit the following parameter values.

    Parameter              Value
    ServerHost=            Host name of the DAC server.
    ServerPort=            Port of the DAC server. The default is 3141.
    RepositoryStampVal=    Repository stamp that appears in the DAC client Login Details screen. To find this value, in the DAC client navigate to Help, then select Login Details.

    Your dac.properties file should look similar to the following:

    ServerHost=vranganaw8
    ServerPort=3141
    RepositoryStampVal=851E0677D5E1F6335242B49FCCd6519
    

Using the Command Line to Access the DAC Server

Follow this procedure to use the command line to access the DAC server.

To use the command line to access the DAC server

  • At the command prompt, enter the following:

    dacCmdLine <method name> <optional execution plan name>
    

    where method name is one of the following:

    Method Name         Description
    StartETL            Starts an execution plan. You must specify an execution plan name.
    StopETL             Stops the operation of an execution plan. You must specify an execution plan name.
    ETLStatus           If you do not specify an execution plan name, the status of the execution plan that last ran is returned. If you specify an execution plan name, the status of the specified execution plan is returned.
    DatabaseStatus      Verifies whether the DAC server can connect to all active database connections. You do not need to specify an execution plan name.
    InformaticaStatus   Verifies whether the DAC server is able to ping all active Informatica PowerCenter Services machines.

    Note:

    The method names are case insensitive. Execution plan names are case sensitive. Also, if an execution plan name contains spaces, enclose the name in double quotes.

    For example:

    Command Line                     Description
    dacCmdLine EtlStatus             Returns the status of the execution plan that last ran.
    dacCmdLine EtlStatus Forecast    Returns the status of the last instance of the Forecast execution plan.
    dacCmdLine StopEtl Forecast      If the execution plan currently running is Forecast, the operation will be terminated. Otherwise, the request is ignored.
    dacCmdLine databasestatus        Returns the health status of all the database connections as defined in the DAC repository from the DAC server.
    dacCmdLine InformaticaStatus     Returns the health status of all the Informatica PowerCenter Services connections as defined in the DAC client on the Informatica Services tab.

DAC Repository Command Line Options

This section describes the DAC repository command line parameters that are exposed by the AutomationUtils.bat file, which is located in the OracleBI\DAC folder.

Import DAC Metadata by Application

The IMPORT option imports DAC metadata into the DAC repository for specified source system containers. The import process truncates all imported tables. You cannot perform an incremental import with this command.

Syntax:

IMPORT <folderName> <contName1> <contName2> ...

where:

Parameter     Description
folderName    Full path to the root of the import file structure.
contName      (Optional) Name of the source system container for which you want to import DAC metadata. If no container is named, all containers that are found in the file structure will be imported.

Export DAC Metadata by Application

The EXPORT option exports DAC metadata from the DAC repository for specified source system containers.

Syntax:

EXPORT <folderName> <contName1> <contName2> ...

where:

Parameter     Description
folderName    Full path to the root of the export file structure.
contName      (Optional) Name of the source system container for which you want to export DAC metadata. If no container is named, all containers that are found in the file structure will be exported.

Import DAC Metadata by Categories

The IMPORTCATEGORY option imports DAC metadata into the DAC repository based on the Logical, Run Time, or System categories. The import process truncates all imported tables. You cannot perform an incremental import with this command.

Syntax:

IMPORTCATEGORY <folderName> <logical> <runtime> <system>

where:

Parameter     Description
folderName    Full path to the root of the import file structure.
logical       Imports all data categorized as logical (information contained in the DAC Design view).
runtime       Imports all data categorized as run time (information contained in the DAC Execute view).
system        Imports all data categorized as system (information contained in the DAC Setup view).

Export DAC Metadata by Categories

The EXPORTCATEGORY option exports DAC metadata from the DAC repository based on the Logical, Run Time, or System categories.

Syntax:

EXPORTCATEGORY <folderName> <logical> <runtime> <system>

where:

Parameter     Description
folderName    Full path to the root of the export file structure.
logical       Exports all data categorized as logical (information contained in the DAC Design view).
runtime       Exports all data categorized as run time (information contained in the DAC Execute view).
system        Exports all data categorized as system (information contained in the DAC Setup view).

Create Schema

The CREATESCHEMA option creates the schema of a new DAC repository.

Syntax:

CREATESCHEMA <unicodeFlag> <workSpace name>

where:

Parameter        Description
unicodeFlag      If the value of this parameter is true, the schema is created as Unicode. If the value is false, the schema is not created as Unicode.
workSpace name   The name of the workspace in which the schema is created.

Drop Schema

The DROPSCHEMA option drops the schema of the DAC repository.

Syntax:

DROPSCHEMA

Analyze

The ANALYZE option analyzes the DAC repository tables.

Syntax:

ANALYZE

Upgrade

The UPGRADE option upgrades the DAC repository.

Syntax:

UPGRADE

Set Password

The SETPASSWORD option sets the passwords for the PowerCenter Services, Integration Service, and physical data sources in the DAC repository.

Syntax:

SETPASSWORD <type> <logicalName> <password>

where:

Parameter     Description
type          Possible values are server or dbconn.
logicalName   Logical name of the server or data source record in the DAC.
password      The password to set for the specified server or data source.

Note:

If the logical name or password contains spaces, enclose it in double quotes.

Replacing an Informatica Workflow with a Custom SQL File

You can improve the performance of loads by replacing Informatica workflows with custom SQL files.

To replace an Informatica workflow with a custom SQL file

  1. Create a SQL file to be used to load the table, and unit test it.

  2. Create an XML or SQL file with one or more SQL statements in the format that the DAC can understand. (A minimal example of such a file appears after this procedure.)

    For more information about creating an XML or SQL file, see "Using SQL Files as an Execution Type in the DAC".

    You can create one file for a full load and one for an incremental load, or you can use the same file for both full and incremental loads.

  3. Save the file in the OracleBI\DAC\CustomSQLs directory.

  4. In the Tasks tab of the DAC Design view, query for the task for which you want to replace the Informatica workflow.

  5. Replace the workflow name in the Command for Incremental Load or Command for Full Load fields with the XML or SQL file.

  6. Change the Execution Type to SQL.
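
The following is a minimal sketch of what a plain-text SQL file for such a task might contain, assuming a hypothetical aggregate table W_EXAMPLE_A loaded from a hypothetical fact table W_EXAMPLE_F; it is illustrative only and is not a shipped file. See "Using SQL Files as an Execution Type in the DAC" for the supported formats.

-- Illustrative full-load logic for a hypothetical aggregate table
TRUNCATE TABLE W_EXAMPLE_A
;
-- Rebuild the aggregate from the fact table
INSERT INTO W_EXAMPLE_A (ORG_WID, TOTAL_AMT)
SELECT ORG_WID, SUM(AMT)
FROM W_EXAMPLE_F
GROUP BY ORG_WID
;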

Determining the Informatica Server Maximum Sessions Parameter Setting

You set the Maximum Sessions parameter value when you register the Informatica PowerCenter Services in the Informatica Servers tab of the DAC client. This parameter specifies the maximum number of workflows that can be executed in parallel on the PowerCenter Services. If the number of sessions is zero or is not specified, the DAC server assigns the default value of 10.

You should consider several factors when determining the Maximum Sessions parameter value; Figure 6-1 shows a sample performance run of the kind that can help guide this setting.

Figure 6-1 Sample Performance Run

This figure is a graph showing various ETL runs.

Determining the Number of Transactional and Data Warehouse Database Connections

This section describes how to determine the maximum number of database connections you need between the DAC server and the transactional database and the DAC server and the data warehouse database. You set the Max Num Connections parameter when you create the transactional and data warehouse database connections.

For the transactional database, the DAC server uses these connections to perform change capture. The number of connections you set for this connection pool determines how many change capture processes can run concurrently. If you have a powerful transactional database server and are going to run ETL processes during off-peak times, you can increase the Max Num Connections setting to 15 or 20 (10 is the default). If you have a less powerful transactional database server, you should not overload the operational system with ETL processes. Therefore, you should set the value below 10.

For the data warehouse database, the DAC server uses these connections to perform processes such as truncate tables, drop and create indexes, and analyze tables. You should not set the Max Num Connections value higher than the Maximum Sessions parameter value (the maximum number of workflows that can be executed in parallel in the PowerCenter Services) because these values have a one to one relationship.

Running Two DAC Servers on the Same Machine

You can run two DAC servers on the same machine as long as they are listening on different ports and pointing to two different repositories.

To run two DAC servers on the same machine

  1. Copy the OracleBI\DAC folder to a different folder on the same machine.

    For example, you might copy the C:\OracleBI\DAC folder to C:\DAC_SERVER2\DAC.

  2. Edit the config.bat file to set the DAC_HOME variable appropriately for each instance.

    For example, if you copy the C:\OracleBI\DAC folder to C:\DAC_SERVER2\DAC, make sure that the C:\DAC_SERVER2\DAC\config.bat file is configured correctly.

  3. Launch each of the DAC clients by navigating to the DAC directories and double-clicking the startclient.bat file.

  4. For each instance, configure the DAC repository connection.

    1. Navigate to Tools, then select DAC Server Management, then select DAC Server Setup.

      An informational dialog states this operation should be performed on the machine running the DAC server. It asks whether you want to continue.

    2. Click Yes.

    3. In the Repository Connection Information tab, enter the appropriate information for each instance. The Database Host should be the same for each instance, and the Database Port should be different.

  5. For each instance, set up the DAC server system properties.

    1. Navigate to Setup, then select DAC System Properties.

    2. Set the DAC Server Host, OS, and Port properties.

  6. Start each DAC server from its directory.

Customizing Index and Analyze Table Syntaxes

The customsql.xml file, located in the OracleBI\DAC\CustomSQLs directory, contains the syntax for dropping and creating indexes and analyzing tables. You can edit the customsql.xml file to change the behavior of these operations.

To edit the Analyze Table syntax

  1. Open the customsql.xml file located in the OracleBI\DAC\CustomSQLs directory.

  2. Locate the Analyze Table syntax for the appropriate database type.

    For example, the syntax for an Oracle database is as follows:

    <SqlQuery name = "ORACLE_ANALYZE_TABLE" STORED_PROCEDURE = "TRUE"> DBMS_STATS.GATHER_TABLE_STATS(ownname => '@TABLEOWNER', tabname => '%1', estimate_percent => 30, method_opt => 'FOR ALL COLUMNS SIZE AUTO',cascade => true ) </SqlQuery>
    
  3. Edit the syntax.

    For example, to gather statistics for only the indexed columns, edit the syntax as follows:

    <SqlQuery name = "ORACLE_ANALYZE_TABLE" STORED_PROCEDURE = "TRUE"> DBMS_STATS.GATHER_TABLE_STATS(ownname => '@TABLEOWNER', tabname => '%1', estimate_percent => 30, method_opt => 'FOR ALL INDEXED COLUMNS',cascade => true ) </SqlQuery>
    

    Note:

    The variables @TABLEOWNER, %1, %2, and so on, will be substituted appropriately by the DAC when the statement is executed.

To edit the Create Index syntax

  1. Open the customsql.xml file located in the OracleBI\DAC\CustomSQLs directory.

  2. Locate the Create Index syntax for the appropriate database type, and edit the syntax.

Using SQL Files as an Execution Type in the DAC

There are two types of custom SQL files that can be executed through the DAC: XML formatted .xml files and plain text .sql files. For examples of XML and SQL files, see the files named samplesql.sql and samplesql.xml in the OracleBI\DAC\CustomSQLs folder.

XML Formatted Files

An XML file consists of a set of SQL statements for which various options are defined using XML attributes. The tag names and values are case sensitive. The following tags are available:

name

The name of the SQL block. The DAC server uses this name to report errors.

type

The type of SQL block. Possible values are SQL and Stored Procedure.

  • The value SQL indicates a typical SQL block that does not return resultsets. For example, you can use insert, update, and delete statements but not select statements.

  • The value Stored Procedure indicates a stored procedure name with its appropriate parameters.

The default value is SQL.

ContinueOnFail

Indicates whether the ETL execution will continue if an error occurs. Possible values are true and false.

  • If the value is set to true, the execution will continue with the next statement (if any exist). If there are no additional statements, the task will be marked as Completed.

  • If the value is set to false, the execution will stop and additional statements will not be executed. The task will be marked as Failed.

The default value is false.

dbConnType

Indicates the connection type against which the statement will be executed. Possible values are source, target, and both.

  • If the value is set to source, the statement will be executed against the source connection.

  • If the value is set to target, the statement will be executed against the target connection.

  • If the value is set to both, the statement will be executed against the source and target connections.

If no value is specified for dbConnType, the statement will be executed against the target. If a @DAC_TABLE keyword (see "Runtime Substitution of Keywords") is specified in the SQL statement, the statement will be executed once for each source or target table.

If the table specified by the @DAC_TABLE keyword has a connection override (indicated in the Data Source field on the Source Tables and Target Tables subtabs of the Tasks tab), then the override connection will be used.

validDBPlatforms

Indicates the database platform types against which the SQL block can be executed. Possible values are DB2, DB2-390, MSSQL, Oracle, and Teradata. If this value is left empty or is not specified, then the SQL block is eligible to run against all database types.

You enter multiple values separated by a comma.

retries

Indicates how many times the DAC server will attempt to execute the statement if an error occurs. The value must be an integer. If no value is specified, or a non-integer is specified, the value defaults to 1. If the statement executes successfully after a failed attempt, the remaining retries are skipped.

Use of CDATA Section

SQL statements appear in CDATA sections, which allows special characters (such as <, >, and &) to be used without breaking the XML structure. There should be only one SQL statement per CDATA section.

Runtime Substitution of Keywords

Keywords allow you to substitute values in the tags at runtime. You can use any of the source system parameters as keywords by placing the prefix @DAC_ before the parameter name.

For example, if you have a source system parameter called $$LAST_REFRESH_DT, you can refer to the parameter by using @DAC_$$LAST_REFRESH_DT.

Some common keywords are the following:

@DAC_TABLE

Depending on the value of the dbConnType tag, the substitution for this keyword is either the source or target tables of the task. The statement will be executed as many times as there are source or target tables.

@DAC_DATASOURCE_ID

The substitution for this keyword is the Data Source Number that appears in the Physical Data Sources tab.

@DAC_SYSDATE

The substitution for this keyword is the current timestamp.

@DAC_TBL_REFRESH_DATE

The substitution for this keyword is the refresh date of the connection name and table combination. This value will be substituted only if the @DAC_TABLE keyword is also used.

@DAC_CURRENT_PROCESS_ID

The substitution for this keyword is the Process ID, located in the Process ID column of the Current Run tab in the Execute view.

Example of XML Formatted File

Below is an example of an XML formatted file.

<CustomSQLs>
  <!-- The following statement will be executed once on the target connection. -->
  <sql name="Insert Statement" type="SQL">
    <![CDATA[ insert into test_trgt (id) values (999) ]]>
  </sql>

  <!-- The following stored procedure will be executed once per target table, on the
       table's connection or the task's target connection if the table doesn't have
       its own defined.
       Even if there is an error, the execution will continue.
       Database platform type does not matter. -->
  <sql name="Stored Procedure1" type="Stored Procedure" continueOnFail="true">
    <![CDATA[ test_procedure ('abc', 2, '@DAC_TABLE') ]]>
  </sql>

  <!-- The following statement will be executed once per target table, on the table's
       connection or the task's target connection if the table doesn't have its own
       defined. The statement will be executed as many times as there are target tables.
       The statement will be executed only if the target table connection is DB2-390
       or Oracle. -->
  <sql name="insert new row" type="SQL" dbConnType="target" continueOnFail="true" validDBPlatforms="DB2-390, Oracle">
    <![CDATA[ insert into @DAC_TABLE (INTEGRATION_ID, DATASOURCE_NUM_ID, ETL_PROC_WID) values ('1', @DAC_DATASOURCE_NUM_ID, @DAC_CURRENT_PROCESS_ID) ]]>
  </sql>

  <!-- The following statement will be executed once per source table, on the table's
       connection or the task's source connection if the table doesn't have its own
       defined. Database platform type does not matter. -->
  <sql name="UPDATE ETL_PROCESS_ID" type="SQL" dbConnType="source" continueOnFail="true">
    <![CDATA[ UPDATE @DAC_TABLE SET DATASOURCE_NUM_ID = @DAC_DATASOURCE_NUM_ID ]]>
  </sql>

  <!-- The following statement uses one of the DAC source system parameters. Assume a
       parameter named $$LAST_REFRESH_DATE that needs to reflect the last refresh date
       of the target table; this is how the SQL looks.
       Database platform type does not matter. -->
  <sql name="UPDATE ETL_PROCESS_ID SINCE LAST REFRESH" type="SQL" dbConnType="target" continueOnFail="true">
    <![CDATA[ UPDATE @DAC_TABLE SET DATASOURCE_NUM_ID = @DAC_DATASOURCE_NUM_ID WHERE LAST_UPD = @DAC_$$LAST_REFRESH_DATE ]]>
  </sql>
</CustomSQLs>

Plain Text SQL Files

Plain text SQL files consist of a set of SQL statements (no stored procedure calls). The SQL statements are separated by a semicolon (;), and comment tags are allowed (//, /* comment */, --). If any of the SQL statements fail, the Task Run status will be Failed.

An example of a plain text SQL file follows:

CREATE TABLE w_etl_temp (name varchar(50))
;
UPDATE w_etl_temp 
SET name = 'that's right' //this line demonstrates the use of ' in a text area
WHERE name LIKE 'gone fishing%';

/*
*some 
*query 
*statement
*/
 SELECT * FROM w_etl_temp
;
DROP TABLE w_etl_temp
;
/*end of file*/

Overview of Change Capture Process (Siebel Sources Only)

This section describes the change capture process used to extract data from the Siebel transactional database. It includes the following topics:

  • Initial Data Capture
  • Change Capture Mechanisms
  • Change Capture Using Tables
  • Primary and Auxiliary Tables
  • Example: Building S_ETL_I_IMG_ Table for Loading Account Dimension
  • Change Capture Using the Date Column

Initial Data Capture

For each Siebel transactional source table (S_) from which data is extracted, there is one S_ETL_I_IMG_ table and one S_ETL_R_IMG_ table.

The first time a staging table is extracted, rows that are in the specified period are selected and inserted into the appropriate S_ETL_R_IMG_ table. The specified period is set in the Prune Days parameter, in the Execution Plans tab. For more information about the Prune Days parameter, see "Execution Plans Tab".
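
As a rough sketch, assuming the S_ORG_EXT source table and its S_ETL_R_IMG_26 row image table from the example later in this section, and illustrative :ETL_START_DATE and :PRUNE_DAYS placeholders, the initial population resembles the following (the actual statement is generated internally by the DAC and varies by database platform):

-- Initial population of the row image table: rows whose LAST_UPD falls within the prune days window
INSERT INTO S_ETL_R_IMG_26 (ROW_ID, MODIFICATION_NUM, LAST_UPD)
SELECT ROW_ID, MODIFICATION_NUM, LAST_UPD
FROM S_ORG_EXT
WHERE LAST_UPD > :ETL_START_DATE - :PRUNE_DAYS;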

Change Capture Mechanisms

There are two kinds of change capture mechanisms used:

  • Change capture using tables

    This is the most common method. It uses S_ETL_I_IMG_ and S_ETL_R_IMG_ table types and the LAST_UPD column of the source tables.

  • Change capture using the date column

    In some cases, a predefined date column is used to enable change capture (without use of S_ETL_I_IMG_ and S_ETL_R_IMG_ tables).

Change Capture Using Tables

When S_ tables are extracted, the process looks for row ID information (the combination of ROW_ID and MODIFICATION_NUM from the S_ table) that does not exist in the row image table (S_ETL_R_IMG_) and in columns where the value for LAST_UPD is more recent than that of LAST_REFRESH_DATE minus the Prune Days parameter setting. This information is inserted into S_ETL_I_IMG table. The S_ETL_I_IMG_ table is joined with the base table during the SDE extraction process to extract only the change capture rows during refresh.

S_ETL_R_IMG_ tables store the ROW_ID, MODIFICATION_NUM, and LAST_UPD for the rows in the S_ table that have a LAST_UPD value within the defined Prune Days period. The LAST_UPD column is used to delete records from the S_ETL_R_IMG_ table; records are deleted as soon as they fall outside the Prune Days period. This table ensures that records within the Prune Days period are not captured as updates unless they have actually been updated, which guarantees an efficient and faster change capture refresh. For information about tracking deleted records, see "Tracking Deleted Records".

Once the ETL process is completed, the data from the change capture image tables (S_ETL_I_IMG_) is pushed into the row image tables (S_ETL_R_IMG_). The S_ETL_R_IMG_ information is subsequently used in the next refresh.
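
The following SQL is a simplified sketch of this image-table maintenance, again using S_ORG_EXT and its image tables S_ETL_I_IMG_26 and S_ETL_R_IMG_26 as an example; the real statements are generated internally by the DAC, differ by database platform, and use the task's actual refresh date and Prune Days values (the :LAST_REFRESH_DATE and :PRUNE_DAYS placeholders below are illustrative only):

-- 1. Image build: capture rows changed since (last refresh date - prune days)
--    that are not already recorded with the same MODIFICATION_NUM
INSERT INTO S_ETL_I_IMG_26 (ROW_ID, MODIFICATION_NUM, LAST_UPD)
SELECT S.ROW_ID, S.MODIFICATION_NUM, S.LAST_UPD
FROM S_ORG_EXT S
WHERE S.LAST_UPD > :LAST_REFRESH_DATE - :PRUNE_DAYS
AND NOT EXISTS
    (SELECT 'X'
     FROM S_ETL_R_IMG_26 R
     WHERE R.ROW_ID = S.ROW_ID
     AND R.MODIFICATION_NUM = S.MODIFICATION_NUM);

-- 2. After the ETL completes: push the captured rows into the row image table
--    (in practice, existing entries for the same ROW_ID are replaced)
INSERT INTO S_ETL_R_IMG_26 (ROW_ID, MODIFICATION_NUM, LAST_UPD)
SELECT ROW_ID, MODIFICATION_NUM, LAST_UPD
FROM S_ETL_I_IMG_26;

-- 3. Prune row image entries whose LAST_UPD has fallen outside the Prune Days window
DELETE FROM S_ETL_R_IMG_26
WHERE LAST_UPD < :LAST_REFRESH_DATE - :PRUNE_DAYS;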

Although the LAST_UPD column in Siebel transactional tables is used for change capture, the timestamp reflects the time the data is committed in the database, rather than the actual transaction event time. This may happen because of remote synchronization, handheld synchronization, UTC conversion, and other processes in which there may be a significant lag between the time of a transaction and its commitment to the database. It is possible, therefore, for a data row to get committed to the transactional database with a LAST_UPD date that is older than the date on which last refresh was executed. Consequently, if the extract is based purely on the date in LAST_UPD, some rows might be missed during extraction.

The LAST_UPD date column, however, still permits change capture process efficiency by limiting the number of rows that have to be compared. The rows from transactional tables are filtered based on the LAST_UPD date being more recent than the LAST_REFRESH_DATE, minus the prune days. Then the ROW_ID and MODIFICATION_NUM combination is compared with the row image table to discover the changed records.

The Prune Days parameter ensures that the rows having LAST_UPD values older than LAST_REFRESH_DATE are not missed. This is a parameter that customers can set based on experience with processes (such as remote synchronization) that may cause records to be missed.

Primary and Auxiliary Tables

The DAC performs change capture for both primary and auxiliary tables. When more than one source table is involved, then both the auxiliary and primary table records need to be marked as changed. For auxiliary tables, you need to write auxiliary mappings to mark the primary tables as changed. The SQL queries that do this are part of the mapping SDEINC_FindAux_.

The extract logic sometimes requires that rows be considered as changed, even if the record has not been updated in the primary table (and therefore extracted during the SDE process). This situation occurs when child table rows have changed and the header/master rows need to be extracted so that data warehouse tables are loaded with a consistent set.

For example, when an S_CONTACT_X row is changed, the corresponding S_CONTACT row also needs to be extracted. In this case, rows in S_CONTACT are also marked as changed by inserting rows into the change capture row image table.

Similarly, when an S_ORDERITEM row is changed, the corresponding S_DOC_ORDER row also needs to be extracted; rows in S_DOC_ORDER are likewise marked as changed (by inserting rows into the change capture row image table).

These auxiliary change capture processes are heavily dependent on the data warehouse data model and are required to support the ETL logic.

Example: Building S_ETL_I_IMG_ Table for Loading Account Dimension

This section gives an extended example of the process of change capture using tables.

  1. Load image tables for all relevant source tables.

    The content of this entity comes from the S_ORG_EXT and S_ORG_EXT_X tables. Whenever any of the rows change in either of these tables, the record is marked as changed.

    The image table for S_ORG_EXT is S_ETL_I_IMG_26. The image table prefix can be found using the DAC to view any source table. This table is truncated before loading with fresh data during every refresh.

    During the ETL process, rows are inserted into S_ETL_I_IMG_26 by selecting ROW_ID information from S_ORG_EXT for rows (combined ROW_ID and MODIFICATION_NUM) that do not exist in S_ETL_R_IMG_26 and for which LAST_UPD is more recent than LAST_REFRESH_DATE minus the Prune Days setting. This is done during ETL execution by the DAC's internal image-building tasks.

    Similarly, the image table S_ETL_I_IMG_27 for S_ORG_EXT_X is loaded.

  2. Load the image table for auxiliary table-based changes.

    In addition to the basic change capture, extra processing might be required due to special ETL requirements. In this example, it happens that S_ORG_EXT needs to be extracted for processing even if only S_ORG_EXT_X changes. This is because both the tables are joined to form W_ORG_D, and the extract process of W_ORG_D (a SDE mapping) looks for a changed ROW_ID in the change capture row image table for the primary table S_ORG_EXT only. Therefore, the extract happens only when the ROW_ID for S_ORG_EXT exists in the row image table.

    In this case, the SDEINC_FindAux_ mapping is needed to insert corresponding rows of S_ORG_EXT.ROW_ID in the change capture row image table whenever S_ORG_EXT_X changes. The following logical statement shows the method:

    Identify the records that have changed in the S_ORG_EXT_X table (rows in S_ETL_I_IMG_27), and then find the corresponding rows in S_ORG_EXT. Insert the ROW_ID and MODIFICATION_NUM of those corresponding rows from S_ORG_EXT into the S_ETL_I_IMG_26 table.

    Using Informatica, the auxiliary mapping SDEINC_FindAux_ has to be written for each primary table that requires it, depending on data warehouse extract logic. Using the DAC, this auxiliary task has to be linked as a parent to the extract mapping for the base table (S_ORG_EXT in this case).

    This is the SQL override for the SDEINC_FindAux Informatica mapping:

    SELECT
       S_ORG_EXT.ROW_ID,
       S_ORG_EXT.MODIFICATION_NUM,
       S_ORG_EXT.LAST_UPD
    FROM
       S_ORG_EXT,
       S_ORG_EXT_X,
       S_ETL_I_IMG_27 IMG
    WHERE
       (
       IMG.ROW_ID = S_ORG_EXT_X.ROW_ID
       AND
       S_ORG_EXT_X.PAR_ROW_ID = S_ORG_EXT.ROW_ID
       )
       AND NOT EXISTS
       (  SELECT 'X'
          FROM
          S_ETL_I_IMG_26 IMG1
          WHERE
          IMG1.ROW_ID = S_ORG_EXT.ROW_ID
       )
    
  3. Extract source table information using change capture image information.

    After the records that are new or modified are identified, those rows are loaded into the staging tables. The Informatica mappings that load the staging tables use the ROW_ID information captured in the image tables.

    This example shows the loading of staging table W_ORG_DS. The main logic of populating this table lies in the SQL override of the mapping SDE_OrganizationDimension.

    The DAC creates views on tables that are being extracted. The views are different, depending on whether a table is extracted the first time or is a change capture extract.

    • If extracting for the first time, the view is created as SELECT * FROM S_ORG_EXT.

    • If it is a change capture extract, the view is created as SELECT * FROM S_ORG_EXT, S_ETL_I_IMG_26 IMG WHERE S_ORG_EXT.ROW_ID = IMG.ROW_ID.
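
    As a sketch, assuming the view is named V_ORG_EXT (the name used in the SQL override below) and that the DAC re-creates it for each run, the two cases look like the following:

    -- Full (first-time) extract: the view exposes the entire source table
    CREATE OR REPLACE VIEW V_ORG_EXT AS
    SELECT * FROM S_ORG_EXT;

    -- Incremental extract: the view exposes only the rows captured in the image table
    CREATE OR REPLACE VIEW V_ORG_EXT AS
    SELECT S_ORG_EXT.*
    FROM S_ORG_EXT, S_ETL_I_IMG_26 IMG
    WHERE S_ORG_EXT.ROW_ID = IMG.ROW_ID;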

    The SQL override in the mapping uses the view to extract the data.

    SELECT
       S_ORG_EXT.ROW_ID,
       S_ORG_EXT.NAME, …..
       ….
       ….
    FROM
       V_ORG_EXT,
       S_ORG_EXT_X,
       …..
    WHERE
       {
          V_ORG_EXT S_ORG_EXT
          LEFT OUTER JOIN S_ORG_EXT_X ON
          S_ORG_EXT.ROW_ID = S_ORG_EXT_X.PAR_ROW_ID
          …..
       }
    AND
       S_ORG_EXT.ROW_ID <> 'INT_COMPANY_ID'
    

Change Capture Using the Date Column

Forecasts are extracted without using image tables. The S_FCSTSER_DATE table is tracked using the date column ARCHIVE_TS. The administrator sets ARCHIVE_TS for forecasts that are submitted, frozen, and ready to be loaded into the Oracle Business Analytics Warehouse. The previous ETL date (when forecasts were last extracted) and the current ETL date are stored in S_ETL_CURR_RUN. All forecasts with an ARCHIVE_TS value greater than the previous ETL date and less than or equal to the current ETL date are extracted in the current ETL.

Note:

Forecasts in the Oracle Business Analytics Warehouse are never updated. Once loaded, they are frozen.

The extract SQL is similar to the following:
SELECT
…..
FROM
   S_FCSTSER_DATE,
   S_FCSTSER,
   S_ETL_CURR_RUN,
   ……..
WHERE
   S_FCSTSER_DATE.FCSTSER_ID =  S_FCSTSER.ROW_ID
   AND S_FCSTSER_DATE.ARCHIVE_TS > S_ETL_CURR_RUN.PREV_LOAD_DT 
   AND S_FCSTSER_DATE.ARCHIVE_TS <= S_ETL_CURR_RUN.LOAD_DT

Using the Change Capture Filter

The change capture filter enables you to selectively load data from the Siebel transactional database into the data warehouse. You can set a condition in the ChangeCaptureFilter.xml file to filter data from specific tables. This file is located in the OracleBI\DAC\CustomSQLs directory. It provides an XML sample that you can copy and alter to fit your needs. Detailed instructions for using this feature are included at the top of the file.

Tracking Deleted Records

The Oracle Business Analytics Warehouse change capture process uses delete triggers to identify records for deletion on the Siebel transactional database. The deleted records are stored in S_ETL_D_IMG tables. During the change capture process, the DAC server moves the data from the S_ETL_D_IMG tables to the S_ETL_I_IMG tables, where D appears in the OPERATION column to show the records were deleted. During the change capture sync process, the records in the S_ETL_D_IMG tables that were moved to the S_ETL_I_IMG tables are flushed. In the DAC, you can view the SQL that runs during the change capture and change capture sync processes by using the Output to File right-click command in the Tasks tab of the Design view.
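
For illustration only, such a delete trigger might look like the following for S_ORG_EXT, assuming Oracle trigger syntax and an image table named S_ETL_D_IMG_26 (the trigger name and column list are hypothetical); in practice, use the trigger scripts generated by the DAC, as described in the procedures below.

-- Hypothetical delete trigger: records each deleted row so the change capture process can pick it up
CREATE OR REPLACE TRIGGER S_ORG_EXT_DEL_T
AFTER DELETE ON S_ORG_EXT
FOR EACH ROW
BEGIN
  INSERT INTO S_ETL_D_IMG_26 (ROW_ID, MODIFICATION_NUM, LAST_UPD)
  VALUES (:OLD.ROW_ID, :OLD.MODIFICATION_NUM, SYSDATE);
END;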

The preconfigured ETL process captures deleted records for the target tables W_ORG_D and W_PERSON_D, the source tables for which are S_ORG_EXT, S_CONTACT, and S_PRSP_CONTACT. These source tables need to have delete triggers created in the Siebel transactional database in order for deleted records to be tracked.

For vertical applications, the preconfigured ETL process captures deleted records for W_FUND_F and W_ALIGNMT_DH. You need to create delete triggers in the transactional database for the following additional tables: S_MDF_TXN, S_ASGN_GRP_POSTN, S_ASGN_RULE_ITEM.

In the Oracle Business Analytics Warehouse, preconfigured visibility tables are inactivated. If you activate visibility tables, you should also create delete triggers on the optional tables.

The preconfigured SIA Account and Contact visibility tables are activated by default for vertical applications. If your organization is not going to use any of the visibility tables, you need to inactivate them in the DAC.

On the target tables for which deleted records are tracked, a D appears in the INACTIVE_FLG column to show the records as deleted when the source records are deleted. This method of flagging a record as deleted is known as a soft delete, as compared to a hard delete when the record is physically deleted. When deleted records are tracked on visibility-related data warehouse tables, the records are physically deleted. The general rule is that soft deletes should be used for tables that are referenced by other tables. If a table is not referenced by any other table, then you can use hard deletes.

Aggregate tables are rebuilt during each ETL process. Therefore, records can be physically deleted from the base tables without consequence. If you want to use the soft delete method, you should consider changing the aggregate building mappings so that the deleted records are omitted.

Note:

The Oracle BI Server does not recognize soft deletes. Therefore, you have to modify the .rpd file so that it does not pick up soft-deleted records for reporting.

To create delete triggers for preconfigured ETL change capture

  1. From the DAC menu bar, select Tools, then select ETL Management, then select Configure.

  2. In the Sources dialog, select the database platform for the target and transactional databases, and click OK.

  3. In the Data Warehouse Configuration Wizard, select the Create Delete Triggers in Transaction Database check box, and click Next.

    The Delete Triggers tab is active.

  4. Select one of the following:

    Option Description
    Create Triggers Executes the trigger statements directly.
    Write Script to File Writes the trigger statements to a file, which can be executed by a database administrator.

  5. Select the database type as defined in the DAC.

  6. For DB2 zSeries databases, enter the base table owner.

  7. (Optional) Select the Include Optional Triggers check box to create triggers for the optional tables.

  8. Click Start.

To create delete triggers for new source tables

  1. In the DAC, navigate to the Design view, then select Tables.

  2. Select the table for which you want to track deleted records.

    Make sure the table has an image suffix.

  3. Right-click the table and select Change Capture Scripts, then select Generate Image and Trigger Scripts.

  4. In the Triggers and Image Tables dialog, select the database type of the source database.

  5. Make sure the Generate Image Table Scripts and Generate Trigger Script(s) options are selected.

  6. Execute the script on the database.

To track deleted records

  1. Make sure the delete triggers are enabled for the appropriate tables.

  2. Write custom Informatica workflows with a WHERE operation = 'D' clause against the appropriate I_IMG table to carry the deleted records across to the dimension and fact tables. (A simplified SQL sketch of this logic appears after this procedure.)

  3. In the DAC, register the workflows as tasks.

  4. Define the appropriate dependencies.

    For an example of such a workflow, see the preconfigured task SDE_OrganizationDimension_LoadDeletedRows.
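
The following is a simplified SQL sketch of the soft-delete logic such a workflow implements for the organization dimension, assuming that W_ORG_D.INTEGRATION_ID holds the source ROW_ID and that S_ETL_I_IMG_26 is the image table for S_ORG_EXT. Because the image table is in the transactional database and the dimension is in the data warehouse, the preconfigured task SDE_OrganizationDimension_LoadDeletedRows implements the equivalent logic as an Informatica mapping rather than as a single SQL statement.

-- Flag dimension rows as soft-deleted when the corresponding source rows were deleted
UPDATE W_ORG_D
SET INACTIVE_FLG = 'D'
WHERE INTEGRATION_ID IN
      (SELECT ROW_ID
       FROM S_ETL_I_IMG_26
       WHERE OPERATION = 'D');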

Pointing Multiple Informatica PowerCenter Services to a Single Informatica Repository

You can install multiple Informatica PowerCenter Services and point them to a single Informatica repository. You need to register each instance of PowerCenter Services in the DAC and specify a unique machine name and server name. For instructions on registering PowerCenter Services in the DAC, see the Oracle Business Intelligence Applications Fusion Edition Installation and Configuration Guide.

Handling ETL Failures with the DAC

This section includes the following topics:

  • When the Execution of an Execution Plan Fails
  • In Case of Abnormal Termination of the DAC Server
  • Discarding the Current Run Execution Plan
  • Failure of Aggregator Transformation Tasks with Sorted Input

When the Execution of an Execution Plan Fails

When an execution plan is executed, if a task fails, the status of the tasks that are dependent on the failed task is changed to Stopped. While tasks are still running, the execution plan's status is Running. When all the tasks have been run, and if one or more tasks have failed, the execution plan's status is changed to Failed.

You can check the tasks that have failed in the Current Run tab of the Execute view, fix the problems, and then requeue the failed tasks by changing the status to Queued. You can then restart the ETL. All the tasks will then be rerun. You can also manually run a task, change its status to Completed, and then restart the ETL. Tasks with a Completed status are skipped.

Caution:

The DAC server does not validate tasks that have been run manually.

To restart a failed ETL, click Run Now from the Current Run tab of the Execute View.

In Case of Abnormal Termination of the DAC Server

If the DAC server fails during the execution of an ETL, the status of the ETL execution remains Running. When the DAC server is started again, it automatically resumes the ETL if the Auto Restart ETL DAC system property is set to True. If the property is set to False, the server sets the status to Failed when it restarts. To execute the ETL from the point of failure, submit the request to the server again.

The DAC server will automatically terminate if it loses its connection to the DAC repository.

Discarding the Current Run Execution Plan

You can discard a failed execution plan by navigating to the Current Run tab, right-clicking the execution plan, and selecting Mark as Completed. This forces the run status to be updated to Completed. When you submit a request for another run, the DAC server creates a new instance of the execution plan.

Caution:

Perform this procedure in a development or testing environment only, since it might leave the data in an inconsistent state, causing you to have to reload all of the data.

Failure of Aggregator Transformation Tasks with Sorted Input

Tasks that use the Informatica Aggregator transformation can fail when the Sorted Input option is active. The tasks SDE_DTLFORECASTFACT and SDE_COSTLIST are examples of tasks that can fail in this situation.

To prevent such tasks from failing, in PowerCenter Designer, navigate to Mapping Designer, open the corresponding mapping, and in the Aggregator transformation, remove the check from the Sorted Input check box.