This chapter provides information about Dimension Loading Process in the Oracle insurance data foundation application.
Topics:
· Populating Data in Dimension Table
· Tables Used by the SCD Component
· Supplementary Information for Dimension Loading Process
A Slowly Changing Dimension (SCD) is a dimension that stores and manages both current and historical data over time in a data warehouse. SCDs are dimensions that have data that changes slowly, rather than changing on a time-based, regular schedule. It is considered and implemented as one of the most critical ETL tasks in tracking the history of dimension records. There are three types of SCDs and you can use Warehouse Builder to define, deploy, and load all three types of SCDs.
Topics:
· Type 2 SCDs - Creating another dimension record
· Type 3 SCDs - Creating a current value field
The Type 1 SCD overwrites old data with new data and therefore does not track historical data. This is useful for making changes to dimension data. This is the default type of dimension that is created. You do not need to specify any additional information to create a Type 1 SCD.
In this example, N_PRODUCT_SKEY is the surrogate key column which is a unique key for each record in the dimension table. V_PRODUCT_NAME is the product name. D_START_DATE indicates the date from which this product record is valid.
N_PRODUCT_ SK EY |
V_PRODUCT_ NAME |
D_START_DATE |
D_END_DAT E |
F_LATEST_RECORD_ INDICATOR |
---|---|---|---|---|
1 |
PL |
5/31/2010 |
12/31/9999 |
Y |
D_END_DATE indicates the date till which this product record is valid.
F_LATEST_RECORD_INDICATOR with value Y, which indicates that this is the latest record in the dimension table for this product and N indicates it is not. If the V_PRODUCT_NAME column is set as a Type 1 SCD column and if there is a change in the product name to Personal Loan from PL in the above example; in the next processing period, when SCD is executed for the new processing period, the record in the above example changes to the following.
Record Change Example:
N_PRODUCT_SK EY |
V_PRODUCT_NAME |
D_START_DATE |
D_END_DATE |
F_LATEST_RECORD_INDICATOR |
---|---|---|---|---|
1 |
Personal Loan |
6/30/2010 |
12/31/9999 |
Y |
The Type 2 method tracks historical data by creating multiple records for a given natural key in the dimensional tables with separate surrogate keys. With Type 2, the historical changes in dimensional data are preserved.
Therefore, a Type 2 SCD retains the full history of values. When the value of a chosen attribute changes, the current record is closed. A new record is created with the changed data values and this new record becomes the current record. Each record contains the effective time and expiration time to identify the time period between which the record was active.
In the above example for the change in product name from PL to Personal Loan if history has to be preserved, then the V_PRODUCT_NAME column has to be set as Type 2 when SCD is processed for the processing period and the change inserts a new record as shown in the following example:
N_PRODUCT_SKEY |
V_PRODUCT_NAME |
D_START_DATE |
D_END_DATE |
F_LATEST_RECORD_INDICATOR |
---|---|---|---|---|
1 |
PL |
5/31/2010 |
12/31/9999 |
N |
2 |
Personal Loan |
6/30/2010 |
12/31/9999 |
Y |
A new record is inserted to the product dimension table with the new product name. The latest record indicator for this is set as 'Y', indicating this is the latest record for the personal loan product. The same flag for the earlier record was set to 'N'.
A Type 3 SCD stores two versions of values for certain selected level attributes. Each record stores the previous value and the current value of the selected attribute.
When the value of any of the selected attributes changes, the current value is stored as the old value and the new value becomes the current value.
For more information on SCDs, see:
· Oracle Data Integrator Best Practices for a Data Warehouse at
· Oracle® Warehouse Builder Data Modeling, ETL, and Data Quality Guide at
§ http://docs.oracle.com/cd/E11882_01/owb.112/e10935.pdf
Additional online sources include:
· http://en.wikipedia.org/wiki/Slowly_changing_dimension
An excellent published resource that covers SCD in detail is "The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling" by Ralph Kimball and Margy Ross.
The SCDs used in Data Foundation solutions are listed in the OIDF SCD Technical Metadata file of this release.
Topics:
Data Foundation solutions use the SCD component to handle dimensional data changes. For more details about the SCD component and SCD process, see the Slowly Changing Dimension (SCD) Process and the Execute the SCD Batch sections respectively.
· The SCD executable should be present under <installation home>ficdb/bin. The file name is scd.
· The user executing the SCD component should have execute rights on the file mentioned as prerequisite in point 2.
· The setup tables accessed by SCD component are SYS_TBL_MASTER and SYS_STG_JOIN_MASTER.
SYS_TBL_MASTER stores the information like which is the source stage table and the target dimension tables. The source sometimes can be the database views which could be simple or a complex view.
SYS_STG_JOIN_MASTER stores the information like which source column is mapped to which column of a target dimension table. It makes use of data base sequence to populate into surrogate key columns of dimension tables.
The database tables used by the SCD component are as follows:
The solution installer populates one row per dimension for the seeded dimensions in the SYS_TBL_MASTER table.
Column Name |
Data Type |
Column Description |
---|---|---|
MAP_REF_NUM |
NUMBER(3) NOT NULL |
The Mapping Reference Number for this unique mapping of a Source to a Dimension Table. |
TBL_NM |
VARCHAR2(30) NOT NULL |
Dimension Table Name |
STG_TBL_NM |
VARCHAR2(30) NOT NULL |
Staging Table Name |
SRC_PRTY |
SRC_PRTY NUMBER(2) NULL |
Priority of the Source when multiple sources are mapped to the same target. |
SRC_PROC_SEQ |
NUMBER(2) NOT NULL |
The sequence in which the various sources for the DIMENSION will be taken up for processing. |
SRC_TYP |
VARCHAR2(30) NULL |
The type of the Source for a Dimension, that is, Transaction Or Master Source. |
DT_OFFSET |
NUMBER(2) NULL |
The offset for calculating the Start Date based on the Functional Requirements Document (FRD). |
SRC_KEY |
NUMBER(3) NULL |
|
Sample Data: This is the row put in by the solution installer for the Line of Business dimension.
Parameter |
Value |
---|---|
MAP_REF_NUM |
6 |
TBL_NM |
DIM_LOB |
STG_TBL_NM |
STG_LOB_MASTER |
SRC_PRTY |
|
SRC_PROC_SEQ |
23 |
SRC_TYP |
MASTER |
DT_OFFSET |
0 |
SRC_KEY |
|
NOTE:
For any new dimension added, a row will have to be inserted to this table manually.
The solution installer will populate this table for the seeded dimensions. Table for Seeded Dimension:
Column Name |
Data Type |
Column Description |
---|---|---|
MAP_REF_NUM |
NUMBER(3) NOT NULL |
The Mapping Reference Number for this unique mapping of a Source to a Dimension Table. |
COL_NM |
VARCHAR2(30) NOT NULL |
Name of the column in the Dimension Table. |
COL_TYP |
VARCHAR2(30) NOT NULL |
Type of column. The possible values are given in the following paragraph. |
STG_COL_NM |
VARCHAR2(60) NULL |
Name of the column in the Staging Table. |
SCD_TYP_ID |
NUMBER(3) NULL |
SCD type for the column. |
PRTY_LOOKUP_ REQD_FLG |
CHAR(1) NULL |
Column to determine whether Lookup is required for Priority of Source against the Source Key Column or not. |
COL_DATATYPE |
VARCHAR2(15) NULL |
The list of possible values is VARCHAR, DATE, NUMBER based on the underlying column data type. |
COL_FORMAT |
VARCHAR2(15) NULL |
Column format. |
The possible values for column type (the COL_TYPE column) in SYS_STG_JOIN_MASTER are:
· PK – Primary Dimension Value (may be multiple for a given ?Mapping Reference Number")
· SK – Surrogate Key
· DA – Dimensional Attribute (may be multiple for a given ?Mapping Reference Number")
· SD – Start Date
· ED – End Date
· LRI – Latest Record Indicator (Current Flag)
· CSK – Current Surrogate Key
· PSK – Previous Surrogate Key
· SS – Source Key
· LUD – Last Updated Date / Time
· LUB – Last Updated By
· NN – Not Null
Sample Data: This is the row put in by the solution installer for the Line of Business dimension.
Sample Data:
Parameter |
Value |
---|---|
MAP_REF_NUM |
6 |
COL_NM |
V_LOB_CODE |
COL_TYP |
PK |
STG_COL_NM |
V_LOB_CODE |
SCD_TYP_ID |
|
PRTY_LOOKUP_REQD_FLG |
N |
COL_DATATYPE |
VARCHAR |
COL_FORMAT |
61 |
NOTE:
For any new dimension added, the column details will have to be inserted to this table manually.
The database view which SCD uses as the source.
Example:
DIM_BANDS_V
These views come as part of the install for the dimensions seeded with the application.
NOTE:
For any new dimension added, a view will have to be created similar to DIM_BANDS_V.
DIM_<dimensionname>: Output table to which SCD writes the dimension data. A sequence must be added for every user-defined dimension.
Example:
create sequence SEQ_DIM_<DIM> minvalue 1
maxvalue 999999999999999999999999999
increment by 1
Load data into the Dimension tables using one of the following suitable processes.
Topics:
· Loading Policy Tables through Insurance SCD
· Loading Dimension Tables through Data Foundation SCD
· Loading Key Dimensions using AMHM and Hierarchy Flattening
· Loading Data from STG_INTF Tables to DIM_INTF Tables
· Executing the Loading Procedure using Batch Framework
· Executing the Hierarchy Flattening Procedure using Batch Framework
· Checking the SCD Batch Execution Status
This is an automatic process wherein the Seeded data for Dimensions, which are packaged within the installer, are populated during the installation.
NOTE:
To access the Seeded data for Dimensions, see the latest version of the Oracle Insurance Data Foundation Application Pack Run Chart.
To execute Policy Dimension SCD batch <INFODOM>_INSURANCE_SCD, see Execute the SCD Batch of the DIM_POLICY table.
To execute an SCD batch <INFODOM>_DATA_FOUNDATION_SCD for any other Dimension table, see Execute the <INFODOM>_DATA_FOUNDATION_SCD batch for the required Dimension table.
NOTE:
For more comprehensive coverage of configuration and execution of a batch, see Oracle Financial Services Advanced Analytical Applications Infrastructure User Guide Release 8.1.1.0.0.
The Dimension Loader functionality in the Data Foundation application enables you to load Dimension tables such as DIM_ORG_UNIT, DIM_GL_ACCOUNT, DIM_COMMON_COA, DIM_PRODUCT, and DIM_ORG_STRUCTURE.
For more information about loading the Dimensions using AMHM, see the Dimension Management section in Oracle Financial Services Advanced Analytical Applications Infrastructure User Guide Release 8.1.1.0.0 and Dimension Load Procedure section in the Oracle Financial Services Analytical Applications Data Model Utilities User Guide.
The following tables are a part of the Staging. The Dimension data is stored in the following set of tables.
Item |
Description |
---|---|
STG _<DIMENSION>_B_INTF |
Stores leaf and node member codes within the dimension. |
STG_<DIMENSION>_TL_INTF |
Stores names of leaf and node and their translations. |
STG_<DIMENSION>_ATTR_INTF |
Stores attribute values for the attributes of the dimension. |
STG_<DIMENSION>_HIER_INTF |
Stores parent-child relationships of members and nodes that are part of hierarchies. |
STG_HIERARCHIES_INTF |
Stores master information related to hierarchies. Data present in the above set of staging dimension tables are loaded into the below set of dimension tables. |
DIM_<DIMENSION>_ B |
Stores leaf and node member codes within the dimension. |
DIM_<DIMENSION>_TL |
Stores names of leaf and node and their translations. |
DIM_<DIMENSION>_ATTR |
Stores attribute values for the attributes of the dimension. |
DIM_<DIMENSION>_HIER |
Stores parent-child relationships of members and nodes that are part of hierarchies. |
REV_HIERARCHIES |
Stores hierarchy related information. |
REV_HIERARCHY_LEVELS |
Stores levels of the hierarchy. |
For more information about viewing the structure of staging tables, see the Oracle Financial Services Analytical Applications (OFSAA) Data Model Document Generation Release 8.1.x.
Staging tables are present for all key dimensions that are configured within the OFSAAI framework. The Dimension IDs used for key dimensions are as follows.
Dimension ID |
Dimension Table |
Staging Tables |
---|---|---|
1 |
DIM_ORG_UNIT |
STG_ORG_UNIT_**_INTF |
2 |
DIM_GL_ACCOUNT |
STG_GL_**_INTF |
3 |
DIM_COMMON_COA |
STG_COMMON_COA_**_INTF |
4 |
DIM_PRODUCT |
STG_PRODUCTS_**_INTF |
5 |
DIM_ORG_STRUC- TURE |
STG_LEGAL_ENTITY_**_INTF |
The batch for Key Dimension Loader can be executed by executing the task (Task for Loading Data from DRM) present in the seeded batch <infodom>_DATA_FOUNDATION_SCD.
Below are the input parameters:
· pDIMENSIONID: This is the dimension ID.
· pSynchFlag: This parameter is used to identify if a complete synchronization of data between staging and fusion table is required. The default value is 'Y'.
For more information, see the Dimension Loaders section in the Data Loaders section in the Oracle Financial Services Analytical Applications Data Model Utilities User Guide.
The batch for Dimension Hierarchy flattening can be executed by executing the task (DT for DRM Dimension Hierarchy Flattening) present in the seeded batch <infodom>_DATA_FOUNDATION_SCD.
Below are the input parameters:
· pDIMENSIONID: This is the dimension ID.
· pHierarchyId: Enter the Hierarchy ID. If all the hierarchies belonging to a dimension are to be processed, then provide NULL as the parameter value. Else, provide the System Identifier of the hierarchy that must be transformed.
For more details, see the Hierarchy Transformation section in the Data Loaders section the Oracle Financial Services Analytical Applications Data Model Utilities User Guide.
To check the SCD batch execution status, see Verify Log Files and Check Error Messages if any.
The following are the supplementary information required for the Dimension loading process.
Topics:
· Handling Multiple GAAP Codes for the Same Account Number for the Same MIS Date in SCD
· Handling Multiple GAAP Codes for the Same Account Number for the Same MIS Date in the Function
SCD performance can be improved by providing hints and session alter statements. This requires the presence of the following four columns in SYS_TBL_MASTER:
· merge_hint
· select_hint
· session_enable_statement
· session_disable_statement
These columns are present in the OFSAAI versions 7.3.2.4.0 and above. If these have to be used in OFSAAI versions 7.3.2.2.0 or 7.3.2.3.0, execute the following SQL queries:
ALTER TABLE SYS_TBL_MASTER ADD MERGE_HINT VARCHAR2(255)
/
ALTER TABLE SYS_TBL_MASTER ADD SELECT_HINT VARCHAR2(255)
/
ALTER TABLE SYS_TBL_MASTER ADD SESSION_ENABLE_STATEMENT VARCHAR2(255)
/
ALTER TABLE SYS_TBL_MASTER ADD SESSION_DISABLE_STATEMENT VARCHAR2(255)
/
During the upgrade to OFSAAI 7.3.2.4.0, ensure to backup SYS_TBL_MASTER table and to drop the preceding four columns, if these scripts are executed in any of the OFSAAI versions prior to 7.3.2.4.0. Otherwise, an upgrade to OFSAAI 7.3.2.4.0 may throw an error, since the columns exist.
For improving performance, hints for the MERGE query, which is generated internally by the SCD, can be provided under MERGE_HINT. The following session alters can be mentioned in the SESSION_ENABLE_STATEMENT and SESSION_DISABLE_STATEMENT columns.
1. SESSION_ENABLE_STATEMENTs are executed before the MERGE in the SCD and SESSION_DISABLE_STATEMENTs are executed after the SCD MERGE.
2. Since all the tasks under the SCD batch for DIM_ACCOUNT works on the same target, the SESSION_DISABLE_STATEMENTs in SYS_TBL_MASTER cannot be provided when tasks are executed. In this case, there can be a separate SQL file to contain all the SESSION_DISABLE_STATEMENTs to be executed once after all the tasks in the SCD are done. The SESSION_DISABLE_STATEMENT will hold a null in the SYS_TBL_MASTER table.
3. SESSION_ENABLE_STATEMENTs are required to be mentioned only for the first task in the batch. Here the target is the same for all the tasks under a batch. In case any of the tasks are to be executed separately, then the SESSION_ENABLE_STATEMENTs should be mentioned for any one of the tasks which is included in the batch for the execution.
Example
MERGE_HINT and SESSION_ENABLE_STATEMENT in SYS_TBL_MASTER
Table Name |
Stage Table Name |
Merge Hint |
Session Enable Statement |
---|---|---|---|
DIM_ACCOUNT |
STG_LOAN_CONTRACTS_V |
/*+parallel |
?alter session enable |
|
|
(DIM_ACCOUNT,10) */ |
parallel dml query?, ?alter table DIM_ACCOUNT |
|
|
|
nologging parallel 10? |
4. Execute all the tasks in parallel. This may cause N_RCV_LEG_ACCT_SKEY to have an incremental value as compared to N_ACCT_SKEY.
5. Execute the SQL file with all the SESSION_DISABLE_STATEMENTs, after the successful completion of the SCD batch.
6. After the DIM_ACCOUNT table is populated using this approach, you cannot use the initial approach (FN_POPDIMACCOUNT) as this will lead to SKey conflict.
7. Ensure that you have set the value of the sequence SEQ_DIM_ACCOUNT_SCD as max (value of SKey in DIM_ACCOUNT) +1, before moving from old to a new approach.
8. The F_LATEST_RECORD_INDICATOR for an existing DIM_ACCOUNT data already loaded by the function must be updated to 'Y' before running the SCD, failing which a new SKey may get generated for the same account number.
9. SCD execution occurs based on the GAAP code, which is configured in the SETUP_MASTER table. These are introduced to tackle the scenario of multiple GAAP codes. Whether or not there exist multiple GAAP codes, SETUP_MASTER must be manually configured as follows:
V_COMPONENT_CODE |
V_COMPONENT_DESC |
V_COMPONENT_VALUE |
---|---|---|
DEFAULT_GAAP |
DEFAULT_GAAP |
USGAAP |
Where V_COMPONENT_VALUE must be manually populated with the required GAAP code. For all other GAAP codes, ensure to update SETUP_MASTER manually before running DIM_ACCOUNT SCD.
NOTE:
For illustration, Account Dimension is considered.
When multiple GAAP codes exist for the same account number for the same MIS date, configure the SETUP_MASTER table manually as mentioned in the preceding section:
· V_COMPONENT_VALUE will hold the GAAP code for which the SCD must be executed.
· If there are different GAAP codes for two distinct account numbers for the same MIS date, then the SCD has to be executed for each GAAP code by changing the V_COMPONENT_VALUE manually in SETUP_MASTER table. The SETUP_MASTER table should have only one record WHERE V_COMPONENT_DESC = 'DEFAULT_GAAP'.
NOTE:
For illustration, Account Dimension is considered.
For the FN_POPDIMACCOUNT function, you have to create views and use these views instead of the tables in the FSI_DIM_ACCOUNT_SETUP_DETAILS table. For Product Processors having GAAP code as part of the Primary Key, create a view on the table with a filter on the GAAP code as:
· where V_GAAP_CODE = (SELECT V_COMPONENT_VALUE FROM SETUP_MASTER WHERE V_COMPONENT_DESC = 'DEFAULT_GAAP')
· Use this view under TABLE_NAME in the FSI_DIM_ACCOUNT_SETUP_DETAILS table. If there are different GAAP codes for two distinct account numbers for the same MIS date, then the function has to be executed for each GAAP code by changing the V_COMPONENT_VALUE manually in SETUP_MASTER table. The SETUP_MASTER table should have only one record WHERE V_COMPONENT_DESC ='DEFAULT_GAAP'.
NOTE:
If STG_OPTION_CONTRACTS is loaded for two MIS dates, and FN_POPDIMACCOUNT is executed, then all records irrespective of the MIS date will get loaded. To resolve this, remove the expression under SQL_TEXT in FSI_DIM_ACCOUNT_SETUP_DETAILS for STG_OPTION_CONTRACTS and use the same expression to create a view and use this view as the TABLE_NAME in FSI_DIM_ACCOUNT_SETUP_DETAILS.