4       Dimension Loading Process

This chapter provides information about Dimension Loading Process in the Oracle Financial Services Data Foundation application.

Topics:

·        Dimension Table Population

·        Overview of Seeded Dimensions

·        Overview of SCD Process

·        Prerequisites

·        Tables Used by the SCD Component

·        Guidelines for Configuring User Defined (Custom) SCDs

·        Executing the SCD Component

·        Checking the Execution Status

·        Different Methods of Loading or Executing the Dimensions

Dimension Table Population

Dimension Tables in Data Foundation Solutions are of two types:

·        Seeded Dimensions

·        SCD Dimensions

Data Foundation solutions use the SCD component to handle dimensional data changes.

Overview of Seeded Dimensions

The Seeded Dimensions are those Dimension tables which are pre-populated with data (which are standard and OFSAA specific codes). The Banks are supposed to refer these dimension tables and its values to configure the ETL Layer of Staging area wherever there is a corresponding reference.

Overview of SCD Process

A Slowly Changing Dimension (SCD) is a dimension that stores and manages both current and historical data over time in a data warehouse. SCDs are dimensions that have data that changes slowly, rather than changing on a time-based, regular schedule. It is considered and implemented as one of the most critical ETL tasks in tracking the history of dimension records. There are three types of SCDs and you can use Warehouse Builder to define, deploy, and load all three types of SCDs.

Type 1 SCDs - Overwriting

The Type 1 methodology overwrites old data with new data, and therefore does not track historical data. This is useful for making changes to dimension data.

Example:

In this example, N_PRODUCT_SKEY is the surrogate key column which is a unique key for each record in the dimension table. V_PRODUCT_NAME is the product name. D_START_DATE indicates the date from which this product record is valid.

 

Table 11: Type 1 SCD methodology example after the execution during a processing period

N_PRODUCT_S KEY

V_PRODUCT_NAME

D_START_DATE

D_END_DATE

F_LATEST_RECORD_ INDICATOR

1

PL

5/31/2010

12/31/9999

Y

 

D_END_DATE indicates the date till which this product record is valid.

F_LATEST_RECORD_INDICATOR with value 'Y', which indicates this is the latest record in the dimension table for this product and 'N' indicates it is not. If the V_PRODUCT_NAME column is set as a Type 1 SCD column and if there is a change in the product name to 'Personal Loan' from 'PL' in the above example, in the next processing period, then when SCD is executed for the new processing period the record in the above example changes to the following values represented by an example.

Record Change Example:

 

Table 12: Type 1 SCD methodology example after the execution and record change during a new processing period

N_PRODUCT_SK EY

V_PRODUCT_NAME

D_START_DATE

D_END_DATE

F_LATEST_RECORD_INDICATOR

1

Personal Loan

6/30/2010

12/31/9999

Y

 

Type 2 SCDs - Creating another dimension record

The Type 2 method tracks historical data by creating multiple records for a given natural key in the dimensional tables with separate surrogate keys. With Type 2, the historical changes in dimensional data are preserved. In the above example for the change in product name from 'PL' to 'Personal Loan' if history has to be preserved, then the V_PRODUCT_NAME column has to be set as Type 2 when SCD is processed for the processing period and the change inserts a new record as shown in the following example:

 

Table 13: Type 2 SCD methodology example after the execution during a processing period

N_PRODUCT_SKEY

V_PRODUCT_NAME

D_START_DATE

D_END_DATE

F_LATEST_RECORD_INDICATOR

1

PL

5/31/2010

12/31/9999

N

2

Personal Loan

6/30/2010

12/31/9999

Y

 

A new record is inserted to the product dimension table with the new product name. The latest record indicator for this is set as 'Y', indicating this is the latest record for the personal loan product. The same flag for the earlier record was set to 'N'.

Type 3 SCDs - Creating a current value field

A Type 3 SCD stores two versions of values for certain selected level attributes. Each record stores the previous value and the current value of the selected attribute.

When the value of any of the selected attributes changes, the current value is stored as the old value and the new value becomes the current value.

For more information on SCDs, see:

·        Oracle Data Integrator Best Practices for a Data Warehouse at

§       http://www.oracle.com/technetwork/middleware/data-integrator/overview/odi-bestpractices-datawarehouse-whi-129686.pdf 

·        Oracle® Warehouse Builder Data Modeling, ETL, and Data Quality Guide at

§       http://docs.oracle.com/cd/E11882_01/owb.112/e10935.pdf  

Additional online sources include:

·        http://en.wikipedia.org/wiki/Slowly_changing_dimension

·        http://www.oracle.com/webfolder/technetwork/tutorials/obe/db/10g/r2/owb/owb10gr2_gs/owb/lesson3/slowlychangingdimensions.htm

An excellent published resource that covers SCD in detail is "The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling" by Ralph Kimball and Margy Ross.

The SCDs used in Data Foundation solutions are listed in the Oracle Financial Services Data Foundation - SCD Technical Metadata file.

Prerequisites

·        The SCD executable should be present under <installation home>ficdb/bin. The file name is scd.

·        The user executing the SCD component should have execute rights on the file mentioned as prerequisite in point 2.

·        The setup tables accessed by SCD component are SYS_TBL_MASTER and SYS_STG_JOIN_MASTER.

SYS_TBL_MASTER stores the information like which is the source stage table and the target dimension tables. The source sometimes can be the database views which could be simple or a complex view.

SYS_STG_JOIN_MASTER stores the information like which source column is mapped to which column of a target dimension table. It makes use of data base sequence to populate into surrogate key columns of dimension tables.

Tables Used by the SCD Component

The database tables used by the SCD component are:

·        SYS_TBL_MASTER

·        SYS_STG_JOIN_MASTER

SYS_TBL_MASTER

The solution installer will populate one row per dimension for the seeded dimensions in this table. SYS_TBL_MASTER:

 

Table 14: Rows populated for the seeded dimensions in the SYS_TBL_MASTER table

Column Name

Data Type

Column Description

MAP_REF_NUM

NUMBER(3) NOT NULL

The Mapping Reference Number for this unique mapping of a Source to a Dimension Table.

TBL_NM

VARCHAR2(30) NOT NULL

Dimension Table Name

STG_TBL_NM

VARCHAR2(30) NOT NULL

Staging Table Name

SRC_PRTY

SRC_PRTY NUMBER(2) NULL

Priority of the Source when multiple sources are mapped to the same target.

SRC_PROC_SEQ

NUMBER(2) NOT NULL

The sequence in which the various sources for the DIMENSION will be taken up for processing.

SRC_TYP

VARCHAR2(30) NULL

The type of the Source for a Dimension, that is, Transaction Or Master Source.

DT_OFFSET

NUMBER(2) NULL

The offset for calculating the Start Date based on the Functional Requirements Document (FRD).

SRC_KEY

NUMBER(3) NULL

 

 

Sample Data: This is the row put in by the solution installer for the Line of Business dimension.

 

Table 15: Sample data in the rows populated for the Line of Business dimension in the SYS_TBL_MASTER table

Parameter

Value

MAP_REF_NUM

6

TBL_NM

DIM_LOB

STG_TBL_NM

STG_LOB_MASTER

SRC_PRTY

 

SRC_PROC_SEQ

23

SRC_TYP

MASTER

DT_OFFSET

0

SRC_KEY

 

 

NOTE:   

For any new dimension added, a row will have to be inserted to this table manually.

SYS_STG_JOIN_MASTER

The solution installer will populate this table for the seeded dimensions. Table for Seeded Dimension:

 

Table 16: Rows populated for the seeded dimensions in the SYS_STG_JOIN_MASTER table

Column Name

Data Type

Column Description

MAP_REF_NUM

NUMBER(3) NOT NULL

The Mapping Reference

Number for this unique mapping of a Source to a Dimension Table.

COL_NM

VARCHAR2(30) NOT NULL

Name of the column in the Dimension Table.

COL_TYP

VARCHAR2(30) NOT NULL

Type of column. The possible values are given in the following section.

STG_COL_NM

VARCHAR2(60) NULL

Name of the column in the Staging Table.

SCD_TYP_ID

NUMBER(3) NULL

SCD type for the column.

PRTY_LOOK? UP_REQD_FLG

CHAR(1) NULL

Column to determine whether Lookup is required for Priority of Source against the Source Key

Column or not.

COL_DATATYPE

VARCHAR2(15) NULL

The list of possible values are VARCHAR, DATE, NUMBER based on the underlying column datatype.

COL_FORMAT

VARCHAR2(15) NULL

 

 

The possible values for column type (the COL_TYPE column) in SYS_STG_JOIN_MASTER are:

·        PK – Primary Dimension Value (may be multiple for a given ?Mapping Reference Number")

·        SK – Surrogate Key

·        DA – Dimensional Attribute (may be multiple for a given ?Mapping Reference Number")

·        SD – Start Date

·        ED – End Date

·        LRI – Latest Record Indicator (Current Flag)

·        CSK – Current Surrogate Key

·        PSK – Previous Surrogate Key

·        SS – Source Key

·        LUD – Last Updated Date / Time

·        LUB – Last Updated By

·        NN – Not Null

Sample Data: This is the row put in by the solution installer for the Line of Business dimension. Sample Data:

 

Table 17: Sample data in the rows populated for the Line of Business dimension in the SYS_STG_JOIN_MASTER table

Parameter

Value

MAP_REF_NUM

6

COL_NM

V_LOB_CODE

COL_TYP

PK

STG_COL_NM

V_LOB_CODE

SCD_TYP_ID

 

PRTY_LOOKUP_REQD_FLG

N

COL_DATATYPE

VARCHAR

COL_FORMAT

61

 

NOTE:   

For any new dimension added, the column details will have to be inserted to this table manually.

 

DIM_<dimensionname>_V – The database view which SCD uses as the source. Example: Dim_Bands_V

These views come as part of install for the dimensions seeded with the application.

NOTE:   

For any new dimension added, a view will have to be created similar to DIM_BANDS_V.

 

 

DIM_<dimensionname> – Output table to which SCD writes the dimension data. A sequence should be added for every user-defined dimension.

NOTE:   

Example:

Create sequence SEQ_DIM_<DIM> minvalue 1 maxvalue 999999999999999999999999999 increment by 1.

 

Guidelines for Configuring User Defined (Custom) SCDs

All positive numbering series of Map Reference Numbers ranging from 1 to 999 are reserved for OFSAA Out Of Box Application usage. For custom SCDs, you must use negative numbering series of Map Reference Numbers ranging from -2 to -999.

Executing the SCD Component

To execute the SCD component from Operations module of OFSAAI, create a batch according to the following steps:

NOTE:   

For a more comprehensive coverage of configuration and execution of a batch, see Oracle Financial Services Advanced Analytical Applications Infrastructure User Guide Release 8.1.1.0.0

 

1.     From the Home menu, select Operations, then select Batch Maintenance.

2.     Click New Batch (?+? symbol in Batch Name container) and enter the Batch Name and Description.

3.     Click Save.

4.     Select the Batch you created in the earlier step by clicking the check box in the Batch Name container.

5.     Click New Task (?+? symbol in Task Details container).

6.     Enter the Task ID and Description.

7.     Select Run Executable, from the Component ID list.

8.     Click Parameters. Select the following from the Dynamic Parameters List and then click Save:

§       Datastore Type ? Select the appropriate datastore from the list

§       Datastore Name ? Select the appropriate name from the list

§       IP address ? Select the IP address from the list

§       Executable ? scd,<map ref num>

NOTE:   

Example:

scd, 61 (Refer the following sections for details).

 

Wait: When the file is being executed you have the choice to either wait till the execution is complete or proceed with the next task. Click the list box of the field provided for Wait in the Value field to select 'Yes' or 'No'. Clicking Yes confirms that you wish to wait for the execution to be complete. Clicking No indicates that you wish to proceed.

Batch Parameter: Clicking Yes would mean that the batch parameters are also passed to the executable being started; else the batch parameters will not be passed to the executable.

ATTENTION:   

Always select Y in Batch Parameter.

 

For the Parameter Executable earlier mentioned, the map ref num values are

·        ?1 (if you want to process all the dimensions). The Executable parameter mentioned earlier would be scd,-1

·        If you want to process for a single dimension, query the database table SYS_TBL_MASTER and give the number in the map_ref_num column for the dimension you want to process. These are the ones which come seeded with the install.

·        Execute the batch from Batch Execution by choosing the batch created following the steps mentioned in the preceding sections for a date.

NOTE:   

A seeded batch FSDFINFO_DATA_FOUNDATION_SCD is provided which consists of all the required dimensions as different tasks that are part of SCD.

 

Checking the Execution Status

The status of execution can be monitored using the Batch Monitor screen. You can access this from the Left Hand Side (LHS) menu as follows:

From the Home menu, select Operations, then select Batch Monitor.

NOTE:   

For a more comprehensive coverage, see Oracle Financial Services Advanced Analytical Applications Infrastructure User Guide Release 8.1.1.0.0.

 

The status messages in Batch Monitor are:

·        N - Not Started

·        O - On Going

·        F - Failure

·        S – Success

The ICC execution log can be accessed on the application server in the following directory ftpshare/logs/<Run_Date>/FSDFINFO/RUN EXECUTABLE. The file name will have the batch execution id.

Sample

/dbfiles/home/oracle/OFSAAI/ficdb/log/ficgen

The detailed SCD component log can be accessed on the application server in the directory

$FIC_HOME, go one folder up from there and then accessing the following

ftpshare/logs/<Run_Date>/FSDFINFO/RUN EXECUTABLE

The file name will have the batch execution id.

Check the .profile file in the installation home if you are not able to find the paths mentioned earlier.

Different Methods of Loading or Executing the Dimensions

This section provides information about different methods of loading or executing the Dimension tables.

LOAD DIM TABLES THROUGH SCD

Batch FSDFINFO_DATA_FOUNDATION_SCD has been introduced with 177 tasks under it. These 177 tasks represent the 177 SCD processes where different staging tables would be the source and Dimension Tables would be the targets. The required SCDs have been introduced into SYS_TBL_MASTER table, and subsequently into SYS_STG_JOIN_MASTER.

Depending on the requirement by an application, a task can be excluded or included from the batch execution.

LOAD KEY DIMENSIONS USING DRMLOADER AND HIERARCHY FLATTENING

The Dimension Loader functionality in FSDF, enables you to load dimension tables such as DIM_ORG_UNIT,DIM_GL_ACCOUNT,DIM_COMMON_COA, DIM_PRODUCT, DIM_ORG_STRUCTURE.

Loading Data from STG_INTF Tables to DIM_INTF Tables

The following tables are part of staging. Dimension data is stored in the following set of tables:

·        STG _<DIMENSION>_B_INTF - Stores leaf and node member codes within the dimension.

·        STG_<DIMENSION>_ TL_INTF - Stores names of leaf and node and their translations.

·        STG_<DIMENSION>_ ATTR_INTF - Stores attribute values for the attributes of the dimension.

·        STG_<DIMENSION>_ HIER_INTF - Stores parent-child relationship of members and nodes that are part of hierarchies.

·        STG_HIERARCHIES_INTF - Stores master information related to hierarchies. Data present in the above set of staging dimension tables are loaded into the below set of dimension tables.

·        DIM_<DIMENSION>_ B - Stores leaf and node member codes within the dimension.

·        DIM_<DIMENSION>_TL - Stores names of leaf and node and their translations.

·        DIM_<DIMENSION>_ATTR - Stores attribute values for the attributes of the dimension.

·        DIM_<DIMENSION>_HIER - Stores parent-child relationship of members and nodes that are part of hierarchies.

·        REV_HIERARCHIES - Stores hierarchy related information.

·        REV_HIERARCHY_LEVELS - Stores levels of the hierarchy.

For more information on viewing the structure of staging tables, refer to the Oracle Financial Services Analytical Applications Data Model Data Dictionary.

Staging tables are present for all key dimensions that are configured within the OFSAAI framework. The Dimension IDs used for key dimensions are:

Table 18: The Dimension IDs used for the key Dimensions

Dimension ID

Dimension Table

Staging Tables

1

DIM_ORG_UNIT

STG_ORG_UNIT_**_INTF

2

DIM_GL_ACCOUNT

STG_GL_**_INTF

3

DIM_COMMON_COA

STG_COMMON_COA_**_INTF

4

DIM_PRODUCT

STG_PRODUCTS_**_INTF

5

DIM_ORG_STRUC- TURE

STG_LEGAL_ENTITY_**_INTF

 

Executing The Loading Procedure using Batch Framework

The batch for Key Dimension Loader can be executed by executing the task (Task for Loading Data from DRM) present in the seeded batch FSDFINFO_DATA_FOUNDATION_SCD.

Below are the input parameters:

·        pDIMENSIONID: This is the dimension ID.

·        pSynchFlag: This parameter is used to identify if a complete synchronization of data between staging and fusion table is required. The default value is 'Y'.

For more details refer to Dimension Loaders Section, in Chapter 4 Data Loaders, of the Oracle Financial Services Analytical Applications Data Model Utilities User Guide.

Executing The Hierarchy Flattening Procedure using Batch Framework

The batch for Dimension Hierarchy flattening can be executed by executing the task (DT for DRM Dimension Hierarchy Flattening) present in the seeded batch FSDFINFO_DATA_FOUNDATION_SCD.

Below are the input parameters:

·        pDIMENSIONID: This is the dimension ID.

·        pHierarchyId: Enter the Hierarchy ID. If all the hierarchies belonging to a dimension are to be processed then, provide NULL as the parameter value. Else, provide the System Identifier of the hierarchy that needs to be transformed.

For more details refer to Hierarchy Transformation Section, in Chapter 4 Data Loaders, of the Oracle Financial Services Analytical Applications Data Model Utilities User Guide.

Executing The SCD for Loading Key Dimensions using Batch Framework

The batch for Key Dimension Loading into final Dimension tables can be executed by executing the seeded batch <Infodom>_POP_KEY_DIMENSION_SCD.

Improving SCD Performance

SCD performance can be improved by providing hints and session alter statements. This requires the presence of the following four columns in SYS_TBL_MASTER:

·        merge_hint

·        select_hint

·        session_enable_statement

·        session_disable_statement

These columns are present in the OFSAAI versions 7.3.2.4.0 and higher. If these have to be used in OFSAAI versions 7.3.2.2.0 or 7.3.2.3.0 and higher, execute the following SQL queries:

ALTER TABLE SYS_TBL_MASTER ADD MERGE_HINT VARCHAR2(255)

/

ALTER TABLE SYS_TBL_MASTER ADD SELECT_HINT VARCHAR2(255)

/

ALTER TABLE SYS_TBL_MASTER ADD SESSION_ENABLE_STATEMENT VARCHAR2(255)

/

ALTER TABLE SYS_TBL_MASTER ADD SESSION_DISABLE_STATEMENT VARCHAR2(255)

/

During upgrade to OFSAAI 7.3.2.4.0, ensure to backup SYS_TBL_MASTER table and to drop the preceding four columns, if these scripts are executed in any of the OFSAAI versions prior to 7.3.2.4.0. Otherwise, an upgrade to OFSAAI 7.3.2.4.0 may throw an error, since the columns are existing.

·        For improving performance, hints for the MERGE query which is generated internally by the SCD can be provided under MERGE_HINT. Session alters could be mentioned under SESSION_ENABLE_STATEMENT and SESSION_DISABLE_STATEMENT columns.

·        SESSION_ENABLE_STATEMENTs will be executed before the MERGE in the SCD and SESSION_DISABLE_STATEMENTs will be executed after the SCD MERGE.

·        Since all the tasks under the SCD batch for DIM_ACCOUNT works on the same target, the SESSION_DISABLE_STATEMENTs in SYS_TBL_MASTER cannot be provided when tasks are executed. In this case, there can be a separate SQL file to contain all the SESSION_DISABLE_STATEMENTs to be executed once after all the tasks in the SCD are done. The SESSION_DISABLE_STATEMENT will hold a null in SYS_TBL_MASTER table.

·        SESSION_ENABLE_STATEMENTs are required to be mentioned only for the first task in the batch. Here the target is the same for all the tasks under a batch. In case any of the tasks are to be executed separately, then the SESSION_ENABLE_STATEMENTs should be mentioned for any one of the tasks which is included in the batch for the execution.

Example

MERGE_HINT and SESSION_ENABLE_STATEMENT in SYS_TBL_MASTER

Table 19: Merge Hint and Session Enable Statement details

Table Name

Stage Table Name

Merge Hint

Session Enable Statement

DIM_ACCOUNT

STG_LOAN_CONTRACTS_V

/*+parallel

?alter session enable

 

 

(DIM_ACCOUNT,10)

*/

parallel dml query?, ?alter table DIM_ACCOUNT

 

 

 

nologging parallel 10?

 

·        Execute the SQL file with all the SESSION_DISABLE_STATEMENTs, after the successful completion of the SCD batch.