Migrating Data Using Siebel Enterprise Integration Manager
This chapter describes special considerations if you are using Siebel Enterprise Integration Manager (EIM) to import, export, update, merge, or delete data within Siebel Business Applications on DB2 for z/OS. Before using EIM, familiarize yourself with Siebel Enterprise Integration Manager Administration Guide. This chapter discusses only those elements of EIM operation that are specific to running EIM on DB2.
This chapter consists of the following topics:
About Setting Up EIM for DB2
How to Improve EIM Performance When Importing Data
About Improving EIM Performance During Initial Loads
Resolving Performance Degradation During the Load Process
Resolving Errors in the EIM Process
About Setting Up EIM for DB2
There are many ways you can optimize EIM data throughput. EIM is similar in function to other DB2 applications, so use regular performance monitoring tools to monitor and fine-tune performance.
Review and tune performance for each set of business data you load with EIM, for example, Opportunities, Contacts, or Products.
Choosing an Appropriate Partitioning Scheme
To achieve the best system throughput, partition base tables and EIM tables and set up parallel EIM processes to efficiently exploit partitioned EIM table spaces.
When a table is created on a partitioned table space, the table is assigned a partitioning key composed of one or more columns. By default, EIM tables use the clustering U1 index (IF_ROW_BATCH_NUM, ROW_ID) as the partitioning key. In some cases, you can obtain better performance by partitioning EIM tables based on the partitioning key of the target base table, for example, on IF_ROW_BATCH_NUM plus the partitioning key of the corresponding target table. For more information on partitioning EIM tables, see EIM Tables and Partitioning.
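The following is a minimal sketch of this approach, assuming a hypothetical EIM_ACCOUNT table whose target base table is partitioned on NAME. The real EIM table contains many more columns and is created by the Siebel installation scripts; the table space name, columns, and partition limit values shown here are illustrative only:

CREATE TABLE SIEBEL.EIM_ACCOUNT
( IF_ROW_BATCH_NUM INTEGER      NOT NULL
, ROW_ID           VARCHAR(15)  NOT NULL
, NAME             VARCHAR(100) NOT NULL
)
IN SIDB0401.H0401000
PARTITION BY (IF_ROW_BATCH_NUM, NAME)
( PARTITION 1 ENDING AT (100, 'MZZZZ')
, PARTITION 2 ENDING AT (200, 'ZZZZZ')
);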
EIM and Table Partitioning
The mechanism by which EIM generates the ROW_ID can result in an uneven distribution of data if you use EIM with one of the default partitioning schemes to import data into a base table. For recommendations about how to avoid uneven data distribution in a partitioned table containing data imported by EIM, see Prepartitioned Siebel Tables.
Considerations in Planning Data Mapping
Data mapping, an important part of preparing to run EIM, is particularly important on DB2 because of the way DB2 stores data in tables and indexes. When planning your data mapping, take into account such factors as partitioning, lookups during the load phase, and searches for online transactions.
About Deciding What Data to Import
The Siebel Business Applications Data Model provides for many possible business needs and configurations. Many of the tables, columns, and indexes provided with Siebel Business Applications might not be relevant to your business. Because DB2 stores information in all columns, including NULL columns, consider carefully what data you want to import into your Siebel database. Importing unnecessary data increases the size of tables and indexes, which can affect performance. For information on removing columns and indexes and reducing column sizes using Siebel Tools, see Customizing a Development Environment.
MS_IDENT Column for DB2 for z/OS
The MS_IDENT column exists in every EIM table and is specific to DB2 for z/OS. It allows DB2 for z/OS to automatically generate unique and sequential values for each row in a table. This column is the unique identity column created using the Generated Always with Identity option.
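A minimal sketch of how such an identity column is defined follows. The table name, table space, and second column are hypothetical; the actual EIM DDL is generated by the Siebel installation scripts:

CREATE TABLE SIEBEL.EIM_EXAMPLE
( MS_IDENT INTEGER NOT NULL
    GENERATED ALWAYS AS IDENTITY (START WITH 1, INCREMENT BY 1)
, ROW_ID   VARCHAR(15) NOT NULL
)
IN SIDB0401.H0401000;

Because the column is defined as GENERATED ALWAYS, DB2 assigns the value itself; do not supply a value for MS_IDENT when loading the EIM table.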
How to Presort Data During an Initial Load
When loading the EIM tables with legacy data, sort this data on the partitioning key that is used on the target base table. When sorting string data, use the same character-collating sequence used in the target database. During this load, also preassign and load batch numbers corresponding to the partitioning scheme of the target base tables when possible. This method is particularly useful for those processes, such as Accounts or Contacts, where the partitioning key is based on business data rather than on a generated key. Preassigning and loading batch numbers generally improves EIM throughput, as the data is then loaded sequentially and by partition into the target base table.
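For example, if the data is already staged in the EIM table, you can assign batch numbers aligned with the target partitioning using SQL similar to the following sketch, which assumes a hypothetical EIM_ACCOUNT table and a target base table partitioned on NAME:

UPDATE SIEBEL.EIM_ACCOUNT SET IF_ROW_BATCH_NUM = 1 WHERE NAME < 'H';
UPDATE SIEBEL.EIM_ACCOUNT SET IF_ROW_BATCH_NUM = 2 WHERE NAME >= 'H';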
Optimal EIM Batch Size
You can regulate EIM commit frequency in several ways, including:
Setting COMMIT EACH TABLE to the value TRUE
Setting COMMIT EACH PASS to the value TRUE
Adjusting batch size
Of these, the batch size most directly influences EIM performance.
The size of EIM batches can significantly affect throughput because batch size determines the number of commits that EIM must execute. While each commit is CPU-intensive, commits release page locks, thereby releasing memory and avoiding lock escalation and timeouts.
While every installation is different, perform initial testing with large batch sizes (except DELETE EXACT, which normally runs better with smaller batch sizes). You might want to use this large-batch testing as a starting point for your own processes and modify batch size based on results.
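The following sketch shows where the commit parameters appear in an EIM configuration file; the process name, EIM table, batch range, and base table are hypothetical:

[Siebel Interface Manager]
PROCESS = Import Accounts

[Import Accounts]
TYPE = IMPORT
BATCH = 1-100
TABLE = EIM_ACCOUNT
ONLY BASE TABLES = S_ORG_EXT
COMMIT EACH TABLE = FALSE
COMMIT EACH PASS = FALSE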
Optimal Number of Parallel Streams
As a rule of thumb, run enough parallel streams to keep the CPU 100 percent busy. These streams can either be multiple EIM processes running simultaneously, or the same EIM process repeated against multiple partitions of the same table.
If you run EIM processes in parallel on a DB2 database with the default setting of UPDATE STATISTICS set to TRUE in the EIM configuration file (.ifb), deadlocks can occur when multiple EIM processes access the same interface table simultaneously. To avoid a deadlock, set UPDATE STATISTICS to FALSE in the EIM configuration file.
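For example, in a hypothetical process section:

[Parallel Import Accounts]
TYPE = IMPORT
BATCH = 101-200
TABLE = EIM_ACCOUNT
UPDATE STATISTICS = FALSE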
A similar problem might occur if parallel processes access the same child tables while updating foreign keys. Therefore, analyze your EIM mappings, and do not run processes that update the same child tables in parallel. Alternatively, you can try altering the locking level from Page to Row.
DB2 Database Logging Performance
The number of parallel streams that the DB2 for z/OS environment can support is often limited more by the DB2 logging throughput rate than by anything else. The commit process waits for a successful write to the active log and, if there is contention on the logs, performance suffers. Possible solutions to ease bottlenecks caused by logging performance include:
Increase the number of output buffers for writing active log data sets. The size of the output buffer can be changed using the DSNZPARM parameter, OUTBUFF.
Increase the active log size. If logs fill up faster than they can be archived, performance suffers.
Determine the optimal number of DB2 active logs based on peak EIM loads, plus a cushion.
Place the active log data sets on separate dedicated volumes.
Place archive logs on a virtual tape system (VTS), or on disk, if VTS is not available.
Define all table spaces (base tables and interface tables) with compression, which reduces logging, and decreases I/O.
Consider data sharing if a single subsystem cannot handle the logging tasks.
Implement data striping.
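While tuning, you can check the status of the active log data sets and the offload task by issuing the DB2 DISPLAY LOG command:

-DISPLAY LOG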
RUNSTATS Utility
If EIM performance degrades over time, consider running RUNSTATS against the base tables to update the optimizer statistics.
If you set the parameter UPDATE STATISTICS to TRUE in your EIM configuration file (.ifb), the EIM process invokes the DSNUTILS stored procedure. This procedure allows EIM to perform RUNSTATS dynamically on temporary columns to obtain optimal statistics numbers in the DB2 catalog. You must install the IBM DSNUTILS stored procedure and enable the z/OS WLM manager.
To execute EIM with UPDATE STATISTICS set to TRUE, the user account must have EXECUTE privileges on the DSNUTILS stored procedure and SYSADM, STATS, or DBADM privileges on the tables accessed. For more information on RUNSTATS, see DB2 Statistics for Siebel Business Applications.
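A minimal sketch of running RUNSTATS against a base table space follows; substitute your own database and table space names:

RUNSTATS TABLESPACE SIDB0401.H0401000
  TABLE(ALL) INDEX(ALL)
  SHRLEVEL CHANGE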
SQLPROFILE Log
The SQLPROFILE parameter in the header of your EIM configuration file (.ifb) designates the file to which EIM writes an analysis of the worst-performing SQL queries overall (by total time), and the worst-performing statements in each step of the EIM process (also by total time), for example:
[Siebel Interface Manager]
PROCESS = Import Products
SQLPROFILE = topsql.log
Review this log file after the test run of each EIM process to determine where potential bottlenecks exist in your process. Based on the results of this log, either adjust mapping and IFB parameters, or perform normal DB2 tuning (such as adding an index to improve the access path).
How to Improve EIM Performance When Importing Data
There are a number of techniques you can use to improve EIM performance during import processes, particularly during your initial load of data into Siebel Business Applications.
Follow the recommended import order, as described in Siebel Enterprise Integration Manager Administration Guide, for both initial and ongoing EIM loads.
Also, if certain values in your database are constant across all rows of a table, use the DEFAULT COLUMN parameter to fill these values rather than loading them through the EIM tables, as shown in the following sketch.
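This sketch assumes a hypothetical currency column and value; consult Siebel Enterprise Integration Manager Administration Guide for the exact parameter syntax:

[Import Products]
TYPE = IMPORT
BATCH = 1
TABLE = EIM_PROD_INT
DEFAULT COLUMN = CURCY_CD, USD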
Tuning the EIM Import Process
Review and tune your EIM process for each set of business data you load, for example, Opportunities, Contacts, or Products.
To tune each EIM import process
Load a single batch with the following flags set for the EIM task (see the srvrmgr sketch following this procedure):
Error flag set to 1
SQL flag set to 8
Trace flag set to 3
Setting these flags produces a log file containing SQL statements and information about the length of time each statement took to execute.
Identify SQL statements that might be taking too long to execute.
Using the DB2 EXPLAIN utility, populate the explain table with information on how the DB2 optimizer executes each of these SQL statements (see the EXPLAIN sketch following this procedure).
Carefully review any changes in the default optimization level in the EIM log.
Based on the Access Plan and other information from the EXPLAIN output, determine the required indexes for EIM and base tables.
Run EIM with parameters to reach each partition and repeat single process tuning.
Perform .ifb file tuning for each process, and remove unnecessary foreign keys.
Execute the REORG utility on base tables to build a compression dictionary.
Perform parallel process tuning.
Perform buffer pool tuning.
Execute the STOSPACE utility on base tables. This collects storage information on DB2 objects.
Run the RUNSTATS utility on base tables and EIM tables.
Use SQL explain tools to verify access paths (required indexes).
Modify statistics if required.
Remove unused indexes on base and EIM tables using the SQL DROP command.
Check space and resize table spaces if necessary.
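The following sketch shows one way to start an EIM task with the flags described earlier using srvrmgr; the configuration file name is hypothetical:

start task for comp EIM with Config=import.ifb, ErrorFlags=1, SQLFlags=8, TraceFlags=3

The next sketch shows one way to examine the access path of a suspect statement with EXPLAIN. The SELECT statement is a hypothetical example, and PLAN_TABLE must already exist under your authorization ID:

EXPLAIN PLAN SET QUERYNO = 1 FOR
  SELECT ROW_ID FROM SIEBEL.S_CONTACT WHERE LAST_NAME = 'SMITH';

SELECT * FROM PLAN_TABLE WHERE QUERYNO = 1 ORDER BY PLANNO;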
About Improving EIM Performance During Initial Loads
Some considerations relating to improving EIM performance apply only during an initial load of data into your Siebel application. These considerations are described in the following sections.
Unused, Nonunique Indexes
The initial load of data is typically a database-intensive process. Each row that is inserted into the base table requires modifications, not only to the table itself, but to all its affected indexes. However, most of these indexes are never used during an initial load process.
You can usually improve performance by determining which base table indexes are unused by EIM during the import process and removing them prior to the initial load using SQL DROP commands. You must later re-create these indexes.
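For example, assuming the index names from the free space example later in this chapter, and assuming you have first saved the DDL needed to re-create the index after the load:

DROP INDEX SIDB0401.S_ADDR_ORG_M6;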
Unused Triggers
Removing unused triggers can improve import performance by reducing the database system overhead used to verify trigger constraints. You can use the Generate Triggers (GenTrig) component from Siebel Server Tasks to remove the triggers and to re-create them later on. This component must be in the Enabled state. For details on how to work with Generate Triggers, see Siebel Business Process Framework: Workflow Guide.
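A sketch of removing the triggers with srvrmgr follows, assuming that GenTrig accepts the EXEC and Remove parameters as described in the workflow documentation:

start task for comp GenTrig with EXEC=TRUE, Remove=TRUE

To re-create the triggers later, run the GenTrig task again without the Remove parameter.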
Free Space Parameters
If you use your Siebel application primarily for queries, updates, and deletions, alter your table spaces and indexes to provide optimal insert and update performance.
To improve the EIM import and update SQL performance, set PCTFREE for table spaces and indexes to a value of 20 or 30 prior to the EIM initial load, and maintain these settings at 20 or 30 for subsequent loads.
Insert Performance on Base Tables
You can improve the insert performance on base tables during an initial load of data into your Siebel application by following these guidelines:
Set PCTFREE to zero for data and sequential indexes.
Set FREEPAGE to zero for data and indexes.
Use page-level locking if you are inserting many rows or pages sequentially.
For a data sharing environment, set the MEMBER CLUSTER option to reduce space map and data page P-lock contention.
Alter base table spaces to PCTFREE and FREEPAGE values of 20 or 30 for subsequent import processes. It is also recommended that you load or reorganize any altered objects so that the new values take effect. The following example demonstrates how to set these values for your table spaces and indexes; substitute your own table space and index names for your implementation.
ALTER TABLESPACE SIDB0401.H0401000 PART 1 PCTFREE 20;
ALTER TABLESPACE SIDB0401.H0401000 PART 2 PCTFREE 20;
ALTER INDEX SIDB0401.S_ADDR_ORG_M6 PCTFREE 20;
ALTER INDEX SIDB0401.S_ADDR_ORG_P1 PCTFREE 20;
ALTER INDEX SIDB0401.S_ADDR_ORG_P99 PART 1 PCTFREE 20;
ALTER INDEX SIDB0401.S_ADDR_ORG_P99 PART 2 PCTFREE 20;
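Conversely, before the initial load you might set free space to zero, in line with the guidelines above; a minimal sketch using the same example names:

ALTER TABLESPACE SIDB0401.H0401000 PART 1 PCTFREE 0 FREEPAGE 0;
ALTER INDEX SIDB0401.S_ADDR_ORG_M6 PCTFREE 0 FREEPAGE 0;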
Transaction Logging
Siebel transaction logging is unnecessary during an initial EIM load; disable it by setting LOG TRANSACTIONS to FALSE in your .ifb file. For more information on transaction logging in EIM, see Siebel Enterprise Integration Manager Administration Guide.
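For example, in a hypothetical process section:

[Initial Load Accounts]
TYPE = IMPORT
BATCH = 1
TABLE = EIM_ACCOUNT
LOG TRANSACTIONS = FALSE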
Resolving Performance Degradation During the Load Process
When executing the EIM load process, performance might degrade noticeably after a number of batch loads have executed. This degradation often results from outdated statistics on the tables. To resolve the problem, update statistics on the target base tables.
You can also improve performance during the load process by updating the S_LST_OF_VAL table. The BU_ID column in the S_LST_OF_VAL base table can have only one or very few distinct values. When this happens, the DB2 optimizer often needlessly performs a table scan through all rows in the S_LST_OF_VAL table.
To avoid this problem and speed up the query, modify the statistics data by running the following SQL statements:
update sysibm.sysindexes set firstkeycardf=1000 where name='S_LST_OF_VAL_M2';
update sysibm.syscolumns set colcardf=1000 where tbname='S_LST_OF_VAL' and name='BU_ID';
Resolving Errors in the EIM Process
Use the following information to resolve errors that can occur during your EIM process:
If EIM aborts with the following generic error, then the user account running EIM does not have EXECUTE privileges on the DSNUTILS stored procedure:
DSNU060I database_alias DSNUGMAP - USER username NOT AUTHORIZED FOR RUNSTATS UTILITY ON
For information about the required permissions, see RUNSTATS Utility. Alternatively, you can run EIM with the UPDATE STATISTICS parameter set to FALSE.
If your EIM process fails with the following error message, then the DSNZPARM parameter RETVLCFK might be set incorrectly:
EIM-00205 Failed To Load the Application Dictionary
Siebel Business Applications require that the parameter RETVLCFK is set to NO. For information on DSNZPARM settings, see DSNZPARM Parameter Settings for Siebel Business Applications.
If EIM terminates during a DELETE EXACT process involving deletion of child records, the following error indicates that the maximum allowable number of locks has been exceeded:
EIM Fails - Max Locks Exceeded on DELETE EXACT
This termination causes the child records to be orphaned, because the delete to the parent table has already been committed.
You can avoid this error by specifying the following in your .ifb file:
COMMIT EACH PASS = FALSE
COMMIT EACH TABLE = FALSE
ROLLBACK ON ERROR = TRUE