Oracle® Fusion Middleware User's Guide for Oracle Business Intelligence Data Warehouse Administration Console 11g Release 1 (11.1.1) Part Number E14849-06
This chapter provides instructions for performing common DAC tasks.
This chapter contains the following topics:
Pointing Multiple Informatica Integration Services to a Single Informatica Repository
Viewing DAC Metrics Using Fusion Middleware Control MBean Browser
Copying Data From a Regular Warehouse to an In-Memory Database
You can access the DAC Server through a command line to start and stop execution plans and to get status information for servers, databases, and execution plans. This feature enables you to access the DAC Server using third-party administration or management tools, without using the DAC Client. Refer to the file dacCmdLine.bat/.sh for usage.
This section includes the following topics:
The Command Line utility enables you to invoke commands on a DAC Server running on a remote or local machine.
To set up command line access to the DAC Server:
Make sure you have installed the supported version of the Java SDK.
Copy the following files from the <Domain_Home>\dac directory to a local directory:
DAWSystem.jar
dac.properties
dacCmdLine.bat or dacCmdLine.sh
In the dacCmdLine.bat file, do the following:
Edit the JAVA_HOME variable to point to the directory where the Java SDK is installed.
Make sure there are no spaces in the path reference.
Edit the DAC_HOME variable to point to the directory where the DAC is installed.
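For dacCmdLine.sh on UNIX, the equivalent settings might look like the following sketch. The paths shown are illustrative; point them at your actual JDK and DAC installation, and remember that neither path may contain spaces:

```shell
# Hypothetical paths -- substitute your own JDK and DAC locations.
JAVA_HOME=/usr/java/jdk160
DAC_HOME=/u01/obiee/domains/bifoundation_domain/dac

# Mirror the "no spaces in the path reference" requirement with a quick check.
case "$JAVA_HOME$DAC_HOME" in
  *" "*) echo "error: path contains spaces" ;;
  *)     echo "paths ok" ;;
esac
```

In dacCmdLine.bat the same variables are set with the batch syntax (for example, `set JAVA_HOME=...`).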
In the dac.properties file, edit the following parameter values.
Parameter | Value |
---|---|
ServerHost= |
Host name of the DAC Server. |
ServerPort= |
Port of the DAC Server. The default is 3141. |
RepositoryStampVal= |
Repository stamp that appears in the DAC Client Login Details screen. To find this value, in the DAC Client navigate to Help, then select Login Details. |
The dac.properties file should look similar to the following:
ServerHost=<host name>
ServerPort=3141
RepositoryStampVal=851E0677D5E1F6335242B49FCCd6519
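On UNIX, you could generate dac.properties from the shell as a quick sketch. The host name and repository stamp below are placeholders; substitute your own values, taking the stamp from the DAC Client Login Details screen:

```shell
# Placeholder values -- replace with your DAC Server host and repository stamp.
cat > dac.properties <<'EOF'
ServerHost=dacserver.example.com
ServerPort=3141
RepositoryStampVal=851E0677D5E1F6335242B49FCCd6519
EOF

# Sanity check: all three properties should be present.
grep -c '^[A-Za-z]*=' dac.properties
```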
Follow this procedure to use the command line to access the DAC Server.
To use the command line to access the DAC Server:
At the command prompt, enter the following:
dacCmdLine <method name> <optional execution plan name>
where method name is one of the following:
Method Name | Description |
---|---|
StartServer |
Starts the DAC Server (Web mode only). |
StopServer |
Stops the DAC Server (Web mode only). |
StartETL |
Starts an execution plan. You must specify an execution plan name. The -wait option lets you start the execution plan in synchronous mode. |
StopETL |
Stops the operation of an execution plan. You must specify an execution plan name. |
ETLStatus |
If you do not specify an execution plan name, the status of the execution plan that last ran is returned. If you specify an execution plan name, the status of the specified execution plan is returned. |
DatabaseStatus |
Verifies whether the DAC Server can connect to all active database connections. You do not need to specify an execution plan name. |
InformaticaStatus |
Verifies whether the DAC Server is able to ping all active Informatica PowerCenter Services machines. |
Note:
The method names are case insensitive. Execution plan names are case sensitive. Also, if the execution plan name contains spaces, enclose the name in double quotes.
For example:
Command Line | Description |
---|---|
dacCmdLine EtlStatus |
Returns the status of the execution plan that last ran. |
dacCmdLine EtlStatus Forecast |
Returns the status of the last instance of the Forecast execution plan. |
dacCmdLine StopEtl Forecast |
If the execution plan currently running is Forecast, the operation will be terminated. Otherwise, the request is ignored. |
dacCmdLine databasestatus |
Returns the health status of all the database connections as defined in the DAC repository from the DAC Server. |
dacCmdLine InformaticaStatus |
Returns the health status of all the Informatica PowerCenter Services connections as defined in the DAC Client on the Informatica Services tab. |
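The quoting rule for execution plan names that contain spaces can be illustrated with a stub function standing in for dacCmdLine.sh (the stub simply counts the arguments it receives):

```shell
# Stub in place of dacCmdLine.sh, used only to show how the shell splits arguments.
count_args() { echo $#; }

count_args StartETL "Complete ETL" -wait   # quoted: 3 arguments (method, plan name, option)
count_args StartETL Complete ETL -wait     # unquoted: 4 arguments -- the plan name is split
```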
The command line feature enables you to get the following status information:
Summary of the requested execution plan. If there are multiple instances of the same execution plan, a summary of the instance that last ran is returned. Below is an example of the information contained in the summary.
(c) 2009 Oracle
Oracle DAC Server comprising the etl execution-management, scheduler, logger, and network server.
ETL details for the last run:
ETL Process Id : 255
ETL Name : Complete ETL
Run Name : DRY RUN OF Complete ETL: ETL Run - 2004-06-17 18:30:13.201
DAC Server : (<host name>) DAC Port : 3141
Status: Stopped
Log File Name: Complete_ETL.255.log
Database Connection(s) Used :
OLTP jdbc:microsoft:sqlserver://<host name>:1433;DatabaseName=OLTP
Data Warehouse jdbc:microsoft:sqlserver://<host name>:1433;DatabaseName=olap
Informatica Server(s) Used :
InformaticaServer4-<host name>:(4)
InformaticaServer2-<host name>:(2)
InformaticaServer3-<host name>:(3)
InformaticaServer1-<host name>:(10)
Start Time: 2004-06-17 19:00:06.885
Message: ETL was interrupted
Actual Start Time: 2004-06-17 18:30:13.357
End Time: 2004-06-17 19:05:56.781
Total Time Taken: 35 Minutes
Start Time For This Run: 2004-06-17 19:00:06.885
Total Time Taken For This Run: 5 Minutes
Total steps: 212
Running steps: 0
Complete steps: 142
Failed/Stopped steps: 70
Summary of connection status to all active databases and Informatica servers.
This section describes the DAC repository command line utilities that are exposed in the AutomationUtils.bat file, which is located in the <Domain_Home>\dac directory.
Note: Before you can use the command line utilities, you must first configure DAC repository database and authentication properties in the automationUtils.properties file, which is located in the <Domain_Home>\dac directory. Alternatively, you can create your own connection file that contains the repository database and authentication properties. Many command line utilities discussed in this section require the path to the connection file (either automationUtils.properties or a user-defined connection file). The path to the connection file must include the file name.
The following DAC repository command line utilities are available in the AutomationUtils.bat file:
The analyzeDACSchema property analyzes the DAC repository tables.
Syntax:
<path to connection file> analyzeDACSchema
The applyDistributedDevPatch property applies a patch to the DAC repository.
Syntax:
<path to connection file> applyDistributedDevPatch <file name>
where:
Parameter | Description |
---|---|
file name |
Name of the XML file. |
The assembleSubjectArea property assembles a subject area.
Syntax:
<path to connection file> assembleSubjectArea <container name> <subject area name>
The buildExecutionPlan property builds an execution plan.
Syntax:
<path to connection file> buildExecutionPlan <execution plan name>
The changeEncryptionKey property changes the encryption key in the credentials file and re-encrypts all encrypted information in the DAC repository.
Syntax to change encryption key with a randomly generated key:
<path to connection file> changeEncryptionKey -randomkey
Syntax to change encryption key by explicitly specifying the key from command line input:
<path to connection file> changeEncryptionKey
The clearEncryptedData property removes all encrypted data from the DAC repository.
Syntax:
<path to connection file> clearEncryptedData <schema owner>
Note: An authentication file is not required for this operation. The repository password will be taken from command line input.
Caution:
This operation clears all database, Informatica, and external executor passwords, as well as all DAC user data. Use this command only when the authentication file containing the encryption key has been lost and cannot be recovered. After removing the encrypted data from the DAC repository, generate an authentication file with a new encryption key, log in to the DAC repository using the DAC Client, and add the encryption key when prompted. Then update all passwords in the DAC repository and distribute the new authentication file to all required DAC users.
The cmdCredentials parameter stores credentials for the DAC command line utility for both the Fusion Middleware and DAC standalone authentication modes.
Syntax:
cmdCredentials <path to cwallet.sso file> <user name>
Note: The password will be taken from command line input. Also, make sure to include the cwallet.sso file name in the path.
The createDACSchema property creates the schema of a new DAC repository.
Syntax:
<path to connection file> createDACSchema <unicode> <tablespace>
where:
Parameter | Description |
---|---|
unicode |
Specifies whether the schema is created as unicode. Possible values are true and false. |
tablespace |
The name of the default tablespace in which the schema is created. Surround the name with double quotation marks. Note: If the value for the AuthenticationType property in the connection file is |
The createDACUser property creates a DAC schema user.
Syntax for FMW authentication:
<path to connection file> createDACUser <user name> [role1] [role2] ...
Syntax for database authentication:
<path to connection file> createDACUser <user name> [role1] [role2] ... [-dbVerification <schema owner>]
Note: The user password will be taken from command line input.
The createDWSchema property creates the data warehouse schema using the schema definition file.
Syntax:
<path to connection file> createDWSchema <dwSchemaDefinitionFileName> <unicode> <tablespace>
where:
Parameter | Description |
---|---|
dwSchemaDefinitionFileName |
Specifies the data warehouse schema definition file location and name. This file is generated by the generateDWSchemaDefinitionFile command line parameter. |
tablespace |
The name of the default tablespace in which the schema is created. Surround the name with double quotation marks. |
unicode |
Specifies whether the schema is created as unicode. Possible values are true and false. |
Note: The file create.sql is generated and stored in the <Domain_Home>/dac/conf/sqlgen/sql/oracle directory of an enterprise installation, and the dac/conf/sqlgen/sql/oracle directory in a standalone or client installation. User authentication is not required for this command.
The createPatchOfObjectsBetweenTimeStamps property creates a DAC repository metadata patch containing all eligible objects that are new or changed between two timestamps.
Syntax:
<path to connection file> createPatchOfObjectsBetweenTimeStamps <patch name> <from timestamp> <to timestamp>
where:
Parameter | Description |
---|---|
from timestamp |
The beginning timestamp for the timestamp range in format yyyymmdd-hh:mm:ss:ss. The hh value must be between 00-23. |
to timestamp |
The ending timestamp for the timestamp range. The format is the same as that of the "from timestamp." |
-now |
A special string to specify the current timestamp. |
The dbCredentials property generates an authentication file (cwallet.sso) with the DAC repository credentials and encryption key.
Syntax for explicit key and credentials:
dbCredentials <path to cwallet.sso file> <user name> -withKey
Syntax for randomly generated encryption key and credentials:
dbCredentials <path to cwallet.sso file> <user name> -randomKey
Syntax to update credentials in existing file without modifying the encryption key:
dbCredentials <path to cwallet.sso file> <user name> -noKey
Note: The user password will be taken from command line input.
The deleteObjects ExecutionPlan property deletes execution plans from the DAC repository.
Syntax:
<path to connection file> deleteObjects ExecutionPlan <name 1> [name 2] ...
The deleteObjects SubjectArea property deletes subject areas from the DAC repository.
Syntax:
<path to connection file> deleteObjects SubjectArea <container name> <subject area name 1> [subject area name 2] ...
The dropDACSchema property drops the schema of the DAC repository.
Syntax:
<path to connection file> dropDACSchema
The dropDWSchemaFromSchemaDefinitionFile property drops the data warehouse schema using the schema definition file.
Syntax:
<path to connection file> dropDWSchemaFromSchemaDefinitionFile <dwSchemaDefinitionFileName> <execute>
where:
Parameter | Description |
---|---|
dwSchemaDefinitionFileName |
Specifies the data warehouse schema definition file location and name. This file is generated by the generateDWSchemaDefinitionFile command line parameter. |
execute |
Possible values are true and false. If true, the generated drop.sql script is executed against the data warehouse; if false, the script is generated but not executed. |
Note: The file drop.sql is generated and stored in the <Domain_Home>/dac/conf/sqlgen/sql/oracle directory of an enterprise installation, and the dac/conf/sqlgen/sql/oracle directory in a standalone or client installation. User authentication is not required for this command.
The Export property exports DAC metadata from the DAC repository for specified source system containers.
Syntax:
<path to connection file> export <folder name> <container name 1> <container name 2> ...
where:
Parameter | Description |
---|---|
folder name |
Full path to the location of the export file structure. |
container name |
(Optional) Name of the source system container for which you want to export DAC metadata. If no container is named, all containers that are found in the file structure will be exported. |
The exportCategory property exports DAC metadata from the DAC repository based on the Logical, Run Time, or System categories.
Syntax:
<path to connection file> exportCategory <folder name> <category>
where:
Parameter | Description |
---|---|
folder name |
Full path to the root of the export file structure. |
category |
The category of metadata to export. Possible values are logical, runtime, and system. |
The exportPatch property exports a DAC metadata repository patch in XML format.
Syntax:
<path to connection file> exportPatch <file name> <patch name>
where:
Parameter | Description |
---|---|
file name |
Name of the XML file. |
The jksPassword parameter stores the password for the Java key store.
Syntax:
jksPassword <path to cwallet.sso file>
Note: The password will be taken from command line input. Also, make sure to include the cwallet.sso file name in the path.
The generateDWSchemaDefinitionFile property generates a data warehouse schema definition file.
Syntax:
<path to connection file> generateDWSchemaDefinitionFile <dwSchemaDefinitionFileName> <container 1> <container 2> ...
Note: The definition file will be generated for the specified containers. If you do not specify a container, the file will be generated for all containers.
The Import property imports DAC metadata into the DAC repository for specified source system containers.
Syntax:
<path to connection file> import <folder name> [-noTruncate] [-noUpdate] <container name 1> <container name 2> ...
where:
Parameter | Description |
---|---|
folder name |
Full path to the root of the import file structure. |
container name |
(Optional) Name of the source system container for which you want to import DAC metadata. If no container is named, all containers that are found in the file structure will be imported. |
-noTruncate |
By default the repository will be truncated upon import. Use the -noTruncate option for incremental import. |
-noUpdate |
By default existing records will be updated during incremental import. Use the -noUpdate option if you only want to insert new records. |
The importCategory property imports DAC metadata into the DAC repository based on the Logical, Run Time, or System categories.
Syntax:
<path to connection file> importCategory <folder name> <category>
where:
Parameter | Description |
---|---|
folder name |
Full path to the root of the import file structure. |
category |
The category of metadata to import. Possible values are logical, runtime, and system. |
-noTruncate |
By default the repository will be truncated upon import. Use the -noTruncate option for incremental import. |
-noUpdate |
By default existing records will be updated during incremental import. Use the -noUpdate option if you only want to insert new records. |
The repositoryXMLSnapshot property exports a snapshot of the DAC repository in XML format.
Syntax:
<path to connection file> repositoryXMLSnapshot <file name>
where:
Parameter | Description |
---|---|
file name |
Name of the XML file. |
The serverSetup property sets up the DAC Server connection to the DAC repository.
Syntax:
<path to connection file> serverSetup <database type> <database name/instance name/tns> <host> <port>
Note: If Fusion Middleware authentication is specified in the connection file, the DAC Server URL should be specified in the properties file and the DAC Server will be set to where the URL points (location can be a remote server). If DAC standalone authentication is specified in the connection file, you must configure the repository connection properties in the properties file. The DAC Server will be set up in the local installation. The user name, password, and encryption key will be taken from the database credentials (cwallet.sso) file.
The setPassword parameter sets the password for the Informatica Integration Service, Informatica Repository Service, and physical data sources in the DAC repository.
Syntax:
<path to connection file> setPassword <type> <logical name>
where:
Parameter | Description |
---|---|
type |
The type of record for which the password is being set: the Informatica Integration Service, the Informatica Repository Service, or a physical data source. |
logical name |
Logical name of the server or physical data source record in DAC. |
Note: The password will be taken from command line input. If the logical name or password contains spaces, surround the entry with double quotes.
The upgradeDACSchema parameter upgrades the DAC repository.
Syntax:
<path to connection file> upgradeDACSchema
The upgradeDWSchemaFromSchemaDefinitionFile property upgrades the data warehouse schema using the schema definition file.
Syntax:
<path to connection file> upgradeDWSchemaFromSchemaDefinitionFile <dwSchemaDefinitionFileName> <unicode> <execute>
where:
Parameter | Description |
---|---|
dwSchemaDefinitionFileName |
Specifies the data warehouse schema definition file location and name. This file is generated by the generateDWSchemaDefinitionFile command line parameter. |
unicode |
Specifies whether the schema is created as unicode. Possible values are true and false. |
execute |
Possible values are true and false, indicating whether the generated SQL script is executed against the data warehouse. |
Note: User authentication is not required for this command.
Follow this procedure to set up the DAC Server to be run automatically when your machine reboots.
Note: This procedure applies to the DAC Server in standalone mode only.
To set up the DAC Server to run automatically upon rebooting the machine:
From the Start menu, select Programs, Accessories, System Tools, Scheduled Tasks.
Double-click Add Scheduled Task.
In the Scheduled Task Wizard, browse to startserver.bat, and click Open.
Select the option When my computer starts, and click Next.
Enter the domain user account to start the DAC Server and a password, and click Finish.
The startserver task is displayed in the Scheduled Task window.
Right-click the task and select Properties.
In the Settings tab, deselect the Stop the task if it runs for 72 hours check box.
To start the DAC Server as a scheduled task
From the Programs menu, select Accessories, System Tools, Scheduled Tasks.
Right-click startserver, and then click Run.
To stop the DAC Server as a scheduled task
From the Programs menu, select Accessories, System Tools, Scheduled Tasks.
Right-click startserver, and then click End Task.
To check if the DAC Server is running
From the Programs menu, select Accessories, System Tools, Scheduled Tasks.
Select the startserver task.
In the Windows menu bar, select View, Details.
In the DAC standalone mode, you can run two DAC Servers on the same machine as long as they are listening on different ports and pointing to two different repositories.
Note: This procedure does not apply when the DAC Server is running as a service of the DACServer application on WebLogic Server.
To run two DAC Servers on the same machine:
Copy the <Domain_Home>\dac directory to a different directory on the same machine.
For example, you might copy the C:\<Domain_Home>\dac directory to C:\<Domain_Home>\DAC_SERVER2\dac.
Edit the config.bat file to set the DAC_HOME variable appropriately for each instance.
For example if you copy the C:\<Domain_Home>\dac directory to C:\<Domain_Home>\DAC_SERVER2\dac, make sure that the C:\<Domain_Home>\DAC_SERVER2\dac config.bat file is configured correctly.
Launch each of the DAC Clients by navigating to the DAC directories and double-clicking the startclient.bat file.
For each instance, configure the DAC Server connection.
Navigate to Tools, DAC Server Management, DAC Server Setup.
A message dialog states this operation should be performed on the machine running the DAC Server. It asks whether you want to continue.
Click Yes.
In the Repository Connection Information tab, enter the appropriate information for each instance. The Database Host should be the same for each instance, and the Database Port should be different.
Click Save.
For each instance, configure the DAC repository connection.
Navigate to Tools, DAC Server Management, Repository Configuration.
In the Repository Configuration dialog, enter the following information:
Field | Description |
---|---|
Mode |
Select Standalone. |
Standalone Mode Configuration: |
If you are using standalone mode configuration, enter the following information: |
Host |
Name of the host machine where the DAC Server resides. |
Alternative Hosts |
(Optional) Names of additional machines hosting DAC Servers that can be used to connect to the DAC repository if the main DAC Server host machine fails. Separate multiple hosts with commas. |
Port |
DAC Server port. |
Start each DAC Server from its directory.
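The initial copy of the DAC directory can be sketched as follows; the layout is illustrative, with a temporary directory standing in for <Domain_Home>:

```shell
# Illustrative layout only -- in practice, copy your real <Domain_Home>\dac directory.
DOMAIN_HOME=$(mktemp -d)
mkdir -p "$DOMAIN_HOME/dac"
echo 'SET DAC_HOME=%~dp0' > "$DOMAIN_HOME/dac/config.bat"

# Clone the DAC directory for the second server instance.
mkdir -p "$DOMAIN_HOME/DAC_SERVER2"
cp -r "$DOMAIN_HOME/dac" "$DOMAIN_HOME/DAC_SERVER2/dac"

# Each copy now carries its own config.bat, to be edited with its own DAC_HOME.
ls "$DOMAIN_HOME/DAC_SERVER2/dac"
```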
You can use one DAC Client to access multiple DAC Servers. The DAC Servers can reside on the same machine, or they can each reside on a different machine. Each DAC Server must listen on its own port and point to its own DAC repository.
You can install multiple Informatica Integration Services and point them to a single Informatica repository. You need to register each instance of the Integration Service in DAC and specify a unique machine name and service name. For instructions on registering Integration Services, see "Registering Informatica Services in DAC".
When you reset a data warehouse, all tables are truncated and refresh dates are deleted. This enables you to perform a full load of data into the data warehouse.
To reset the data warehouse:
On the Tools menu, select ETL Management, and then select Reset Data Warehouse.
The Reset Data Warehouse dialog is displayed and requests you to confirm you want to reset the data warehouse.
Enter the confirmation text in the text box.
Click Yes.
Fusion Middleware Control MBean Browser is an Oracle Web application (based on JMX MBean containers) that you can use to view information about running, failed, and queued execution plans and the status of the DAC Server.
Information about execution plans is available through the MBean Browser only if the DAC Server is running.
To display DAC metrics using Fusion Middleware Control MBean Browser:
Display Fusion Middleware Control.
In the Navigator window, expand the WebLogic Domain folder and the bifoundation_domain node, and select the AdminServer node.
Display the WebLogic Server menu and select System MBean Browser from the menu to display the Fusion Middleware Control System MBean Browser.
Display the DACMetrics MBean by selecting the following: ApplicationDefinedMBeans, com.oracle.dac, Server:<server name>, Application: DACServer, EMIntegration, and DACMetrics MBean.
The attributes and operations that are available to be invoked are displayed.
The DAC Server runs as a service of the DACServer application on WebLogic Server. The WebLogic Server Administration Console enables you to monitor the status of the DAC Server.
Note: The DAC Server can also run in standalone mode. For information about the differences between the standalone and WebLogic Server modes, see "About DAC Authentication Modes".
To monitor the DAC Server using the WebLogic Server Administration Console:
Display the WebLogic Server Administration Console in one of two ways:
On the Overview page in Fusion Middleware Control, expand the WebLogic Domain node and select the bifoundation_domain. Locate the WebLogic Server Administration Console link in the Summary region.
Enter the following URL into a Web browser:
http://<host>:<port>/console
where host is the DNS name or IP address of the DAC Server and port is the listen port on which the WebLogic Server is listening for requests. The default port is 7001.
Enter the system administrator user name and password and click Login.
This system-wide administration user name and password were specified during the installation process, and you can use them to log in to the WebLogic Server Administration Console and Fusion Middleware Control. Alternatively, enter a user name that belongs to one of the following security groups:
Administrators
Operators
Deployers
Monitors
Click on the Deployments link in the navigation tree on the left to display the list of deployed applications.
Select the DACServer application.
The DAC Server status is displayed.
Follow this procedure to export the logical data model in a text or CSV format.
To export the logical data model in text or CSV formats:
In the DAC Design view, select the DataWarehouse container from the drop-down list.
Go to the Tables tab.
Right-click on any record in the top pane window.
Select Flat Views, and then select Table Columns.
In the Table Columns dialog, select Warehouse, and then click Go to query for Warehouse tables and columns.
A list of tables and columns in the data warehouse is displayed.
Output the list to a text or CSV file:
Right-click on any record, and select Output to File.
In the Output Table dialog, enter a file name, including the extension .txt or .csv, and a directory path where you want the file saved. If you do not enter a directory path, the file is stored in the <Domain_Home>\dac directory.
Select the properties you want to export to the file.
Click OK.
Follow this procedure to add foreign key table names and column names into DAC.
To add foreign key metadata into DAC:
In the DAC Design view, select the appropriate container from the drop-down list.
Go to the Tables tab.
Query for and select the appropriate table.
Right-click and select Import Foreign Keys.
Click OK in the Updating Foreign Keys dialog.
Enter the column name, foreign key table name, and foreign key column name in the text field. Separate the values with a comma.
Alternatively, click Read from File to import the metadata from a file. Browse for and select the appropriate file. After the file is loaded into the text field, click OK to proceed with the import.
When you are using an in-memory database, such as Oracle TimesTen, for analytical reporting, a scenario to consider is to maintain a full-size warehouse, with staging and temporary tables, in a regular database and run ETL processes against this warehouse. Then, you would copy into the TimesTen database only the tables you need for reporting; or you could copy a subset of these tables. Note that this is a bulk copy functionality and not a trickle feed load strategy.
The DAC Data Copy Framework enables you to copy data from a regular database into an in-memory database by configuring an external executor and DAC parameters and creating data copy tasks.
This section contains the following topics:
For additional information about using a TimesTen database, see "Oracle TimesTen Database Guidelines".
The first step in copying data from a regular warehouse into an in-memory database is to create an external executor with the type Data Copy.
To create an external executor of type Data Copy:
In the DAC Setup view, go to the External Executors tab.
Click New in the top pane toolbar.
Enter a name for the external executor.
Select Data Copy as the Type.
Click Save.
Click Generate to generate the required properties.
Click the Properties subtab to view and edit the properties:
Num Parallel Jobs per EP. Specifies how many data copy processes can run in a given execution plan in parallel. Note: One process copies data for one table.
Num Writers per Job. Specifies how many writers a data copy process has. You can figure out how many writers you need by comparing read and write throughput published by the data copy tasks.
For example, if the read throughput is 3000 records per second and the write throughput is 1000 records per second, it makes sense to have three writers per table. However, if the read throughput is 20,000 and the write throughput is 1,000, it does not necessarily mean that 20 writers will give the best performance. You might need 10 or 15 writers; to determine the correct number, you will need to compare the throughput and the overall run time. As the number of writers increases, performance might start to drop at some point, because the writers can delay each other.
Batch Size. Specifies how many records are written to the target at one time using array inserts and updates. Generally, a bigger batch size means faster writes (up to a point), but it also requires more memory.
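As a rough sketch of the writer heuristic described above (the throughput numbers are illustrative, and the result is only a starting point to refine by comparing overall run times):

```shell
# Published throughputs from a data copy task, in records per second (illustrative).
read_tp=3000
write_tp=1000

# Starting estimate for Num Writers per Job: ceiling of read/write throughput.
writers=$(( (read_tp + write_tp - 1) / write_tp ))
echo "$writers"   # 3 writers as a first guess
```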
After you create an external executor with the type Data Copy, the next step is to create data copy tasks.
To create data copy tasks:
In the DAC Design view, go to the Tables tab.
Identify and query for the warehouse tables you want to copy.
Note: All tables you wish to copy must have the In Memory check box selected.
Right-click and select Generate Data Copy Tasks.
In the Generate Data Copy Tasks dialog, select one of the following:
- Selected record only. To generate tasks only for the selected record.
- All records in the list. To generate tasks for all records in the list of query results.
In the next screen of the Generate Data Copy Tasks dialog, select the following:
Primary Logical Source
Primary Logical Target
Task Phase
Enable Incremental Data Copy
If you want to always run data copies in full mode (pre-deleting and inserting all records), deselect this check box. If you want to use incremental mode after the initial population, to insert missing records and update existing records, select this check box. This should be decided based on the expected rate of modifications compared to the table size. When modifications are light, incremental mode is preferable. When a significant portion of the records in a data warehouse table are updated or new, it will be faster to use full mode, which will truncate the target table and insert all records, without checking which ones already exist.
Note:
You can also create tasks manually by doing the following:
In DAC, create a new task.
Select Data Copy as the execution type.
Create exactly one source table and one target table. The names of the source and target tables can be different.
Enter INSERT_ONLY_DATA_COPY for the Command for Full Mode.
Enter INSERT_UPDATE_DATA_COPY for the Command for Incremental Mode.
Set other required task properties.
To run a data copy process, you need to configure the parameters listed below. For instructions on configuring parameters, see "Defining and Managing Parameters".
DataCopyIncrementalFilter. Required for incremental mode. It narrows down the records being read from source tables to only the ones created or updated since the last time the table was copied. The recommended value is:
ETL_PROC_ID > @DAC_LAST_ETL_PROC_ID
It is assumed that the warehouse table has the ETL_PROC_ID column, which stores the ID of the ETL run that inserts or updates a given record. The LAST_ETL_PROC_ID is a suggested parameter of type Text, which uses the runtime DAC variable @DAC_LAST_PROCESS_ID. If this parameter is configured correctly, every data copy task will read the new and modified records since the previous invocation of this task, regardless of how many regular ETLs occurred in between.
DataCopyUniqueKey. Required for incremental mode. This is the primary index column, such as ROW_ID, which is used to identify records. The unique key columns need to exist on both the source and target tables, and the column types on the source and target should be similar. For example, a source column of type Long can map to a target column of type Integer as long as the target value does not require more space. However, you cannot map a source type of Integer to a target type of Timestamp.
DataCopyExtraReadFilter. Optional for full and incremental modes. Specifies additional filters to be used when reading from the source table.
Note:
Values that are common for all or most of the tables can be specified in the Source System Parameters tab of the Design view. Values specific to a particular table should be set in the Parameters subtab of the Tasks tab.
After you have completed the procedures "Creating an External Executor of Type Data Copy", "Creating Data Copy Tasks", and "Configuring DAC Parameters for Data Copy", you can run a data copy process by doing one of the following:
Add the data copy tasks to an existing subject area. For instructions, see "Modifying an Existing Subject Area". You will need to rebuild the execution plan to which the subject area belongs.
Create a new subject area and add the data copy tasks to it. Then, add the new subject area to an existing execution plan or create a new execution plan specifically for the data copy process.
For instructions, see:
Each data copy task requires one connection to the source, so at least as many connections as there are tasks specified in parallel are required; otherwise, the ETL might hang.
Each data copy task requires one connection to the target table per writer. In incremental mode with a single-column unique key, additional connections are required to determine which records already exist in the target; because these connections are shared, their number does not have to match the number of tasks. For example, if ten tasks are allowed in parallel and each has five writers, at least fifty target connections are required for full mode, plus a few more for incremental mode.
Each data copy task requires one worker thread for management, one to read from the source, and one more per writer. Therefore, at least seventy worker threads would be needed in a case where you have ten tasks in parallel and five writers per task. Note that the number of worker threads is specified in the DAC system property Worker Pool Size, which has a default value of 50.
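The sizing arithmetic in the preceding points can be summarized as follows, using the ten-tasks, five-writers example:

```shell
tasks=10    # data copy tasks allowed in parallel
writers=5   # writers per task

src_conns=$(( tasks ))                    # one source connection per task
tgt_conns=$(( tasks * writers ))          # one target connection per writer (full mode)
threads=$(( tasks * (1 + 1 + writers) ))  # per task: one manager, one reader, plus writers

echo "source=$src_conns target=$tgt_conns threads=$threads"
```

Incremental mode needs a few more shared target connections on top of this, and the Worker Pool Size system property must be raised above its default of 50 to cover the seventy threads.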
DAC creates a log file for every data copy task, which is stored in the ETL log directory. The file name is in the format <task_name>_<source_name>_<target_name>.log, where all spaces are replaced with the underscore character (_).
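For example, with hypothetical task, source, and target names, the log file name is derived like this:

```shell
# Hypothetical names -- spaces become underscores in the resulting file name.
task="Copy W_ORG_D"
src="DW Oracle"
tgt="TimesTen Target"

printf '%s_%s_%s.log\n' "$task" "$src" "$tgt" | tr ' ' '_'
```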
Data copy tasks are restartable. If a data copy process fails, you can restart the run after making the necessary fixes, or you can mark the task as completed and start another run when needed. The data copy process will resume processing records from the last successful point.
The following points explain how DAC handles the data copy process:
DAC carries out the data copy process by reading from source tables and writing to target tables using the JDBC connectivity layer.
When writing to target tables, DAC uses array inserts, updates and deletes to improve performance. If an array operation fails, then the records from the failed array are processed one by one.
Because writing is usually slower than reading (particularly in incremental mode), you can use multiple writers to modify the same table.
In full mode, the table is truncated (if the corresponding truncate flag is specified) and all records are inserted from the source to the target without checking which ones already exist. Incremental mode is handled batch by batch. For single-column unique keys, the target table is scanned to see which records in a batch are already there and which are not. Then, all new records are inserted at one time, and all existing records are updated in one batch. For multi-column unique keys, all records from a batch are deleted and inserted.
The SQL statements used to read from the source and write to the target (and to check for existing records) are specified in the data_copy.xml file located in the <DAC_Config_Location>\CustomSQLs directory. You can modify these SQL statements, but you should not change the name of the SQL block or the %<value>% tag. Only advanced DAC users should modify the data_copy.xml file.