Oracle® Business Intelligence Data Warehouse Administration Console Guide > Common Tasks Performed in the DAC > Overview of Change Capture Process (Siebel Sources Only) >

Change Capture Mechanisms


There are two kinds of change capture mechanisms used:

  • Change capture using tables

    This is the most common method. It uses S_ETL_I_IMG_ and S_ETL_R_IMG_ table types and the LAST_UPD column of the source tables.

  • Change capture using the date column

    In some cases, a predefined date column is used to enable change capture (without use of S_ETL_I_IMG_ and S_ETL_R_IMG_ tables).

Change Capture Using Tables

When S_ tables are extracted, the process looks for row ID information (the combination of ROW_ID and MODIFICATION_NUM from the S_ table) that does not exist in the row image table (S_ETL_R_IMG_) and in columns where the value for LAST_UPD is more recent than that of LAST_REFRESH_DATE minus the Prune Days parameter setting. This information is inserted into S_ETL_I_IMG table. The S_ETL_I_IMG_ table is joined with the base table during the SDE extraction process to extract only the change capture rows during refresh.

S_ETL_R_IMG tables store the ROW_ID, MODIFICATION_NUM, and LAST_UPD for the rows in the S_ table that have the LAST_UPD in the defined Prune Days period. The LAST_UPD column is used to delete the records from the S_ETL_R_IMG table. Records are deleted as soon as they go beyond the Prune Days period. This table is used to make sure that records that fall in the Prune Days period are not captured as updates unless they have actually been updated. This guarantees an efficient and faster change capture refresh. For information about tracking deleted records, see

Once the ETL process is completed, the data from the change capture image tables (S_ETL_I_IMG_) is pushed into the row image (S_ETL_R_IMG) table. The S_ETL_R_IMG_ information is subsequently used in the next refresh.

Although the LAST_UPD column in Siebel transactional tables is used for change capture, the timestamp reflects the time the data is committed in the database, rather than the actual transaction event time. This may happen because of remote synchronization, handheld synchronization, UTC conversion, and other processes in which there may be a significant lag between the time of a transaction and its commitment to the database. It is possible, therefore, for a data row to get committed to the transactional database with a LAST_UPD date that is older than the date on which last refresh was executed. Consequently, if the extract is based purely on the date in LAST_UPD, some rows might be missed during extraction.

The LAST_UPD date column, however, still permits change capture process efficiency by limiting the number of rows that have to be compared. The rows from transactional tables are filtered based on the LAST_UPD date being more recent than the LAST_REFRESH_DATE, minus the prune days. Then the ROW_ID and MODIFICATION_NUM combination is compared with the row image table to discover the changed records.

The Prune Days parameter ensures that the rows having LAST_UPD values older than LAST_REFRESH_DATE are not missed. This is a parameter that customers can set based on experience with processes (such as remote synchronization) that may cause records to be missed.

Primary and Auxiliary Tables

The DAC performs change capture for both primary and auxiliary tables. When more than one source table is involved, then both the auxiliary and primary table records need to be marked as changed. For auxiliary tables, you need to write auxiliary mappings to mark the primary tables as changed. The SQL queries that do this are part of the mapping SDEINC_FindAux_.

The extract logic sometimes requires that rows be considered as changed, even if the record has not been updated in the primary table (and therefore extracted during the SDE process). This situation occurs when child table rows have changed and the header/master rows need to be extracted so that data warehouse tables are loaded with a consistent set.

When the S_CONTACT_X row is changed, the corresponding S_CONTACT also needs to be extracted. In this case, rows in S_CONTACT are also marked as changed by inserting rows in the change capture row image table.

When the S_ORDERITEM row is changed, the corresponding S_DOC_ORDER also needs to be extracted. In this case, rows in S_DOC_ORDER are also marked as changed (by inserting rows in the change capture row image table).

These auxiliary changes capture processes are heavily dependent on the data warehouse data model and are required to support the ETL logic.

Example: Building S_ETL_I_IMG_ Table for Loading Account Dimension

This section gives an extended example of the process of change capture using tables.

  1. Load image tables for all relevant source tables.

    The content of this entity comes from the S_ORG_EXT and S_ORG_EXT_X tables. Whenever any of the rows change in either of these tables, the record is marked as changed.

    The image table for S_ORG_EXT is S_ETL_I_IMG_26. The image table prefix can be found using the DAC to view any source table. This table is truncated before loading with fresh data during every refresh.

    During the ETL, process rows are inserted into S_ETL_I_IMG_26 by selecting ROW_ID information from S_ORG_EXT, for rows (combined ROW_ID and MODIFICATION_NUM) that do not exist in the S_ETL_R_IMG_26 and for which LAST_UPD is more recent than LAST_REFRESH_DATE minus the Prune Days setting. This is done during the ETL execution by the DAC's internal image building tasks.

    Similarly, the image table S_ETL_I_IMG_27 for S_ORG_EXT_X is loaded.

  2. Load the image table for auxiliary table-based changes.

    In addition to the basic change capture, extra processing might be required due to special ETL requirements. In this example, it happens that S_ORG_EXT needs to be extracted for processing even if only S_ORG_EXT_X changes. This is because both the tables are joined to form W_ORG_D, and the extract process of W_ORG_D (a SDE mapping) looks for a changed ROW_ID in the change capture row image table for the primary table S_ORG_EXT only. Therefore, the extract happens only when the ROW_ID for S_ORG_EXT exists in the row image table.

    In this case, the SDEINC_FindAux_ mapping is needed to insert corresponding rows of S_ORG_EXT.ROW_ID in the change capture row image table whenever S_ORG_EXT_X changes. The following logical statement shows the method:

    Identify the records that have changed in the S_ORG_EXT_X (rows in S_ETLI_IMG_27) table and then find the corresponding rows in S_ORG_EXT. Insert the ROW_ID and MODIFICATION_NUM of those corresponding rows from S_ORG_EXT into S_ETL_I_IMG_26 table.

    Using Informatica, the auxiliary mapping SDEINC_FindAux_ has to be written for each primary table that requires it, depending on data warehouse extract logic. Using the DAC, this auxiliary task has to be linked as a parent to the extract mapping for the base table (S_ORG_EXT in this case).

    This is the SQL override for the SDEINC_FindAux Informatica mapping:

    SELECT

       S_ORG_EXT.ROW_ID,

       S_ORG_EXT.MODIFICATION_NUM,

       S_ORG_EXT.LAST_UPD

    FROM

       S_ORG_EXT,

       S_ORG_EXT_X,

       S_ETL_I_IMG_27 IMG

    WHERE

       (

       IMG.ROW_ID = S_ORG_EXT_X.ROW_ID

       AND

       S_ORG_EXT_X.PAR_ROW_ID = S_ORG_EXT.ROW_ID

       )

       AND NOT EXISTS

       (  SELECT 'X'

          FROM

          S_ETL_I_IMG_26 IMG1

          WHERE

          IMG1.ROW_ID = S_ORG_EXT.ROW_ID

       )

  3. Extract source table information using change capture image information.

    After the records that are new or modified are identified, those rows are loaded into the staging tables. The Informatica mappings that load the staging tables use the ROW_ID information captured in the image tables.

    This example shows the loading of staging table W_ORG_DS. The main logic of populating this table lies in the SQL override of the mapping SDE_OrganizationDimension.

    The DAC creates views on tables that are being extracted. The views are different, depending on whether a table is extracted the first time or is a change capture extract.

    • If extracting for the first time, the view is created as SELECT * FROM S_ORG_EXT.
    • If it is a change capture extract, the view is created as SELECT * FROM S_ORG_EXT, S_ETL_I_IMG_26 IMG WHERE S_ORG_EXT.ROW_ID = IMG.ROW_ID.

      The SQL override in the mapping uses the view to extract the data.

    SELECT

       S_ORG_EXT.ROW_ID,

       S_ORG_EXT.NAME, .....

       ....

       ....

    FROM

       V_ORG_EXT,

       S_ORG_EXT_X,

       .....

    WHERE

       {

          V_ORG_EXT S_ORG_EXT

          LEFT OUTER JOIN S_ORG_EXT_X ON

          S_ORG_EXT.ROW_ID = S_ORG_EXT_X.PAR_ROW_ID

          .....

       }

    AND

       S_ORG_EXT.ROW_ID <> 'INT_COMPANY_ID'

Change Capture Using the Date Column

Forecasts are extracted without using the image table. The value S_FCSTSER_DATE is tracked using the date column ARCHIVE_TS. The administrator sets the ARCHIVE_TS for forecasts that are submitted, frozen, and ready to be loaded into the Oracle Business Analytics Warehouse. S_ETL_RUN stores the previous ETL date when forecasts were extracted and the current ETL date when the forecasts are being extracted. All forecasts with ARCHIVE_TS values greater than that of the previous ETL date and ARCHIVE_TS (less than the current ETL date) are extracted in the current ETL. Both ETL date and ARCHIVE_TS (less than the current ETL date) are stored in S_ETL_CURR_RUN.

NOTE:  Forecasts in the Oracle Business Analytics Warehouse are never updated. Once loaded, they are frozen.

SELECT

.....

FROM

   S_FCSTSER_DATE,

   S_FCSTSER,

   S_ETL_CURR_RUN,

   ........

WHERE

   S_FCSTSER_DATE.FCSTSER_ID = S_FCSTSER.ROW_ID

   AND S_FCSTSER_DATE.ARCHIVE_TS > S_ETL_CURR_RUN.PREV_LOAD_DT

   AND S_FCSTSER_DATE.ARCHIVE_TS <= S_ETL_CURR_RUN.LOAD_DT

Oracle® Business Intelligence Data Warehouse Administration Console Guide Copyright © 2007, Oracle. All rights reserved.