This chapter provides information about Dimension Loading Process in the Oracle Financial Services Data Foundation application.
Topics:
· Overview of Seeded Dimensions
· Tables Used by the SCD Component
· Guidelines for Configuring User Defined (Custom) SCDs
· Checking the Execution Status
· Different Methods of Loading or Executing the Dimensions
Dimension Tables in Data Foundation Solutions are of two types:
· Seeded Dimensions
· SCD Dimensions
Data Foundation solutions use the SCD component to handle dimensional data changes.
The Seeded Dimensions are those Dimension tables which are pre-populated with data (which are standard and OFSAA specific codes). The Banks are supposed to refer these dimension tables and its values to configure the ETL Layer of Staging area wherever there is a corresponding reference.
A Slowly Changing Dimension (SCD) is a dimension that stores and manages both current and historical data over time in a data warehouse. SCDs are dimensions that have data that changes slowly, rather than changing on a time-based, regular schedule. It is considered and implemented as one of the most critical ETL tasks in tracking the history of dimension records. There are three types of SCDs and you can use Warehouse Builder to define, deploy, and load all three types of SCDs.
The Type 1 methodology overwrites old data with new data, and therefore does not track historical data. This is useful for making changes to dimension data.
Example:
In this example, N_PRODUCT_SKEY is the surrogate key column which is a unique key for each record in the dimension table. V_PRODUCT_NAME is the product name. D_START_DATE indicates the date from which this product record is valid.
N_PRODUCT_S KEY |
V_PRODUCT_NAME |
D_START_DATE |
D_END_DATE |
F_LATEST_RECORD_ INDICATOR |
---|---|---|---|---|
1 |
PL |
5/31/2010 |
12/31/9999 |
Y |
D_END_DATE indicates the date till which this product record is valid.
F_LATEST_RECORD_INDICATOR with value 'Y', which indicates this is the latest record in the dimension table for this product and 'N' indicates it is not. If the V_PRODUCT_NAME column is set as a Type 1 SCD column and if there is a change in the product name to 'Personal Loan' from 'PL' in the above example, in the next processing period, then when SCD is executed for the new processing period the record in the above example changes to the following values represented by an example.
Record Change Example:
N_PRODUCT_SK EY |
V_PRODUCT_NAME |
D_START_DATE |
D_END_DATE |
F_LATEST_RECORD_INDICATOR |
---|---|---|---|---|
1 |
Personal Loan |
6/30/2010 |
12/31/9999 |
Y |
The Type 2 method tracks historical data by creating multiple records for a given natural key in the dimensional tables with separate surrogate keys. With Type 2, the historical changes in dimensional data are preserved. In the above example for the change in product name from 'PL' to 'Personal Loan' if history has to be preserved, then the V_PRODUCT_NAME column has to be set as Type 2 when SCD is processed for the processing period and the change inserts a new record as shown in the following example:
N_PRODUCT_SKEY |
V_PRODUCT_NAME |
D_START_DATE |
D_END_DATE |
F_LATEST_RECORD_INDICATOR |
---|---|---|---|---|
1 |
PL |
5/31/2010 |
12/31/9999 |
N |
2 |
Personal Loan |
6/30/2010 |
12/31/9999 |
Y |
A new record is inserted to the product dimension table with the new product name. The latest record indicator for this is set as 'Y', indicating this is the latest record for the personal loan product. The same flag for the earlier record was set to 'N'.
A Type 3 SCD stores two versions of values for certain selected level attributes. Each record stores the previous value and the current value of the selected attribute.
When the value of any of the selected attributes changes, the current value is stored as the old value and the new value becomes the current value.
For more information on SCDs, see:
· Oracle Data Integrator Best Practices for a Data Warehouse at
· Oracle® Warehouse Builder Data Modeling, ETL, and Data Quality Guide at
§ http://docs.oracle.com/cd/E11882_01/owb.112/e10935.pdf
Additional online sources include:
· http://en.wikipedia.org/wiki/Slowly_changing_dimension
An excellent published resource that covers SCD in detail is "The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling" by Ralph Kimball and Margy Ross.
The SCDs used in Data Foundation solutions are listed in the Oracle Financial Services Data Foundation - SCD Technical Metadata file.
· The SCD executable should be present under <installation home>ficdb/bin. The file name is scd.
· The user executing the SCD component should have execute rights on the file mentioned as prerequisite in point 2.
· The setup tables accessed by SCD component are SYS_TBL_MASTER and SYS_STG_JOIN_MASTER.
SYS_TBL_MASTER stores the information like which is the source stage table and the target dimension tables. The source sometimes can be the database views which could be simple or a complex view.
SYS_STG_JOIN_MASTER stores the information like which source column is mapped to which column of a target dimension table. It makes use of data base sequence to populate into surrogate key columns of dimension tables.
The database tables used by the SCD component are:
· SYS_TBL_MASTER
· SYS_STG_JOIN_MASTER
The solution installer will populate one row per dimension for the seeded dimensions in this table. SYS_TBL_MASTER:
Column Name |
Data Type |
Column Description |
---|---|---|
MAP_REF_NUM |
NUMBER(3) NOT NULL |
The Mapping Reference Number for this unique mapping of a Source to a Dimension Table. |
TBL_NM |
VARCHAR2(30) NOT NULL |
Dimension Table Name |
STG_TBL_NM |
VARCHAR2(30) NOT NULL |
Staging Table Name |
SRC_PRTY |
SRC_PRTY NUMBER(2) NULL |
Priority of the Source when multiple sources are mapped to the same target. |
SRC_PROC_SEQ |
NUMBER(2) NOT NULL |
The sequence in which the various sources for the DIMENSION will be taken up for processing. |
SRC_TYP |
VARCHAR2(30) NULL |
The type of the Source for a Dimension, that is, Transaction Or Master Source. |
DT_OFFSET |
NUMBER(2) NULL |
The offset for calculating the Start Date based on the Functional Requirements Document (FRD). |
SRC_KEY |
NUMBER(3) NULL |
|
Sample Data: This is the row put in by the solution installer for the Line of Business dimension.
Parameter |
Value |
---|---|
MAP_REF_NUM |
6 |
TBL_NM |
DIM_LOB |
STG_TBL_NM |
STG_LOB_MASTER |
SRC_PRTY |
|
SRC_PROC_SEQ |
23 |
SRC_TYP |
MASTER |
DT_OFFSET |
0 |
SRC_KEY |
|
NOTE:
For any new dimension added, a row will have to be inserted to this table manually.
The solution installer will populate this table for the seeded dimensions. Table for Seeded Dimension:
Column Name |
Data Type |
Column Description |
---|---|---|
MAP_REF_NUM |
NUMBER(3) NOT NULL |
The Mapping Reference Number for this unique mapping of a Source to a Dimension Table. |
COL_NM |
VARCHAR2(30) NOT NULL |
Name of the column in the Dimension Table. |
COL_TYP |
VARCHAR2(30) NOT NULL |
Type of column. The possible values are given in the following section. |
STG_COL_NM |
VARCHAR2(60) NULL |
Name of the column in the Staging Table. |
SCD_TYP_ID |
NUMBER(3) NULL |
SCD type for the column. |
PRTY_LOOK? UP_REQD_FLG |
CHAR(1) NULL |
Column to determine whether Lookup is required for Priority of Source against the Source Key Column or not. |
COL_DATATYPE |
VARCHAR2(15) NULL |
The list of possible values are VARCHAR, DATE, NUMBER based on the underlying column datatype. |
COL_FORMAT |
VARCHAR2(15) NULL |
|
The possible values for column type (the COL_TYPE column) in SYS_STG_JOIN_MASTER are:
· PK – Primary Dimension Value (may be multiple for a given ?Mapping Reference Number")
· SK – Surrogate Key
· DA – Dimensional Attribute (may be multiple for a given ?Mapping Reference Number")
· SD – Start Date
· ED – End Date
· LRI – Latest Record Indicator (Current Flag)
· CSK – Current Surrogate Key
· PSK – Previous Surrogate Key
· SS – Source Key
· LUD – Last Updated Date / Time
· LUB – Last Updated By
· NN – Not Null
Sample Data: This is the row put in by the solution installer for the Line of Business dimension. Sample Data:
Parameter |
Value |
---|---|
MAP_REF_NUM |
6 |
COL_NM |
V_LOB_CODE |
COL_TYP |
PK |
STG_COL_NM |
V_LOB_CODE |
SCD_TYP_ID |
|
PRTY_LOOKUP_REQD_FLG |
N |
COL_DATATYPE |
VARCHAR |
COL_FORMAT |
61 |
NOTE:
For any new dimension added, the column details will have to be inserted to this table manually.
DIM_<dimensionname>_V – The database view which SCD uses as the source. Example: Dim_Bands_V
These views come as part of install for the dimensions seeded with the application.
NOTE:
For any new dimension added, a view will have to be created similar to DIM_BANDS_V.
DIM_<dimensionname> – Output table to which SCD writes the dimension data. A sequence should be added for every user-defined dimension.
NOTE:
Example:
Create sequence SEQ_DIM_<DIM> minvalue 1 maxvalue 999999999999999999999999999 increment by 1.
All positive numbering series of Map Reference Numbers ranging from 1 to 999 are reserved for OFSAA Out Of Box Application usage. For custom SCDs, you must use negative numbering series of Map Reference Numbers ranging from -2 to -999.
To execute the SCD component from Operations module of OFSAAI, create a batch according to the following steps:
NOTE:
For a more comprehensive coverage of configuration and execution of a batch, see Oracle Financial Services Advanced Analytical Applications Infrastructure User Guide Release 8.1.1.0.0
1. From the Home menu, select Operations, then select Batch Maintenance.
2. Click New Batch (?+? symbol in Batch Name container) and enter the Batch Name and Description.
3. Click Save.
4. Select the Batch you created in the earlier step by clicking the check box in the Batch Name container.
5. Click New Task (?+? symbol in Task Details container).
6. Enter the Task ID and Description.
7. Select Run Executable, from the Component ID list.
8. Click Parameters. Select the following from the Dynamic Parameters List and then click Save:
§ Datastore Type ? Select the appropriate datastore from the list
§ Datastore Name ? Select the appropriate name from the list
§ IP address ? Select the IP address from the list
§ Executable ? scd,<map ref num>
NOTE:
Example:
scd, 61 (Refer the following sections for details).
Wait: When the file is being executed you have the choice to either wait till the execution is complete or proceed with the next task. Click the list box of the field provided for Wait in the Value field to select 'Yes' or 'No'. Clicking Yes confirms that you wish to wait for the execution to be complete. Clicking No indicates that you wish to proceed.
Batch Parameter: Clicking Yes would mean that the batch parameters are also passed to the executable being started; else the batch parameters will not be passed to the executable.
ATTENTION:
Always select Y in Batch Parameter.
For the Parameter Executable earlier mentioned, the map ref num values are
· ?1 (if you want to process all the dimensions). The Executable parameter mentioned earlier would be scd,-1
· If you want to process for a single dimension, query the database table SYS_TBL_MASTER and give the number in the map_ref_num column for the dimension you want to process. These are the ones which come seeded with the install.
· Execute the batch from Batch Execution by choosing the batch created following the steps mentioned in the preceding sections for a date.
NOTE:
A seeded batch FSDFINFO_DATA_FOUNDATION_SCD is provided which consists of all the required dimensions as different tasks that are part of SCD.
The status of execution can be monitored using the Batch Monitor screen. You can access this from the Left Hand Side (LHS) menu as follows:
From the Home menu, select Operations, then select Batch Monitor.
NOTE:
For a more comprehensive coverage, see Oracle Financial Services Advanced Analytical Applications Infrastructure User Guide Release 8.1.1.0.0.
The status messages in Batch Monitor are:
· N - Not Started
· O - On Going
· F - Failure
· S – Success
The ICC execution log can be accessed on the application server in the following directory ftpshare/logs/<Run_Date>/FSDFINFO/RUN EXECUTABLE. The file name will have the batch execution id.
Sample
/dbfiles/home/oracle/OFSAAI/ficdb/log/ficgen
The detailed SCD component log can be accessed on the application server in the directory
$FIC_HOME, go one folder up from there and then accessing the following
ftpshare/logs/<Run_Date>/FSDFINFO/RUN EXECUTABLE
The file name will have the batch execution id.
Check the .profile file in the installation home if you are not able to find the paths mentioned earlier.
This section provides information about different methods of loading or executing the Dimension tables.
Batch FSDFINFO_DATA_FOUNDATION_SCD has been introduced with 177 tasks under it. These 177 tasks represent the 177 SCD processes where different staging tables would be the source and Dimension Tables would be the targets. The required SCDs have been introduced into SYS_TBL_MASTER table, and subsequently into SYS_STG_JOIN_MASTER.
Depending on the requirement by an application, a task can be excluded or included from the batch execution.
The Dimension Loader functionality in FSDF, enables you to load dimension tables such as DIM_ORG_UNIT,DIM_GL_ACCOUNT,DIM_COMMON_COA, DIM_PRODUCT, DIM_ORG_STRUCTURE.
The following tables are part of staging. Dimension data is stored in the following set of tables:
· STG _<DIMENSION>_B_INTF - Stores leaf and node member codes within the dimension.
· STG_<DIMENSION>_ TL_INTF - Stores names of leaf and node and their translations.
· STG_<DIMENSION>_ ATTR_INTF - Stores attribute values for the attributes of the dimension.
· STG_<DIMENSION>_ HIER_INTF - Stores parent-child relationship of members and nodes that are part of hierarchies.
· STG_HIERARCHIES_INTF - Stores master information related to hierarchies. Data present in the above set of staging dimension tables are loaded into the below set of dimension tables.
· DIM_<DIMENSION>_ B - Stores leaf and node member codes within the dimension.
· DIM_<DIMENSION>_TL - Stores names of leaf and node and their translations.
· DIM_<DIMENSION>_ATTR - Stores attribute values for the attributes of the dimension.
· DIM_<DIMENSION>_HIER - Stores parent-child relationship of members and nodes that are part of hierarchies.
· REV_HIERARCHIES - Stores hierarchy related information.
· REV_HIERARCHY_LEVELS - Stores levels of the hierarchy.
For more information on viewing the structure of staging tables, refer to the Oracle Financial Services Analytical Applications Data Model Data Dictionary.
Staging tables are present for all key dimensions that are configured within the OFSAAI framework. The Dimension IDs used for key dimensions are:
Dimension ID |
Dimension Table |
Staging Tables |
---|---|---|
1 |
DIM_ORG_UNIT |
STG_ORG_UNIT_**_INTF |
2 |
DIM_GL_ACCOUNT |
STG_GL_**_INTF |
3 |
DIM_COMMON_COA |
STG_COMMON_COA_**_INTF |
4 |
DIM_PRODUCT |
STG_PRODUCTS_**_INTF |
5 |
DIM_ORG_STRUC- TURE |
STG_LEGAL_ENTITY_**_INTF |
The batch for Key Dimension Loader can be executed by executing the task (Task for Loading Data from DRM) present in the seeded batch FSDFINFO_DATA_FOUNDATION_SCD.
Below are the input parameters:
· pDIMENSIONID: This is the dimension ID.
· pSynchFlag: This parameter is used to identify if a complete synchronization of data between staging and fusion table is required. The default value is 'Y'.
For more details refer to Dimension Loaders Section, in Chapter 4 Data Loaders, of the Oracle Financial Services Analytical Applications Data Model Utilities User Guide.
The batch for Dimension Hierarchy flattening can be executed by executing the task (DT for DRM Dimension Hierarchy Flattening) present in the seeded batch FSDFINFO_DATA_FOUNDATION_SCD.
Below are the input parameters:
· pDIMENSIONID: This is the dimension ID.
· pHierarchyId: Enter the Hierarchy ID. If all the hierarchies belonging to a dimension are to be processed then, provide NULL as the parameter value. Else, provide the System Identifier of the hierarchy that needs to be transformed.
For more details refer to Hierarchy Transformation Section, in Chapter 4 Data Loaders, of the Oracle Financial Services Analytical Applications Data Model Utilities User Guide.
The batch for Key Dimension Loading into final Dimension tables can be executed by executing the seeded batch <Infodom>_POP_KEY_DIMENSION_SCD.
SCD performance can be improved by providing hints and session alter statements. This requires the presence of the following four columns in SYS_TBL_MASTER:
· merge_hint
· select_hint
· session_enable_statement
· session_disable_statement
These columns are present in the OFSAAI versions 7.3.2.4.0 and higher. If these have to be used in OFSAAI versions 7.3.2.2.0 or 7.3.2.3.0 and higher, execute the following SQL queries:
ALTER TABLE SYS_TBL_MASTER ADD MERGE_HINT VARCHAR2(255)
/
ALTER TABLE SYS_TBL_MASTER ADD SELECT_HINT VARCHAR2(255)
/
ALTER TABLE SYS_TBL_MASTER ADD SESSION_ENABLE_STATEMENT VARCHAR2(255)
/
ALTER TABLE SYS_TBL_MASTER ADD SESSION_DISABLE_STATEMENT VARCHAR2(255)
/
During upgrade to OFSAAI 7.3.2.4.0, ensure to backup SYS_TBL_MASTER table and to drop the preceding four columns, if these scripts are executed in any of the OFSAAI versions prior to 7.3.2.4.0. Otherwise, an upgrade to OFSAAI 7.3.2.4.0 may throw an error, since the columns are existing.
· For improving performance, hints for the MERGE query which is generated internally by the SCD can be provided under MERGE_HINT. Session alters could be mentioned under SESSION_ENABLE_STATEMENT and SESSION_DISABLE_STATEMENT columns.
· SESSION_ENABLE_STATEMENTs will be executed before the MERGE in the SCD and SESSION_DISABLE_STATEMENTs will be executed after the SCD MERGE.
· Since all the tasks under the SCD batch for DIM_ACCOUNT works on the same target, the SESSION_DISABLE_STATEMENTs in SYS_TBL_MASTER cannot be provided when tasks are executed. In this case, there can be a separate SQL file to contain all the SESSION_DISABLE_STATEMENTs to be executed once after all the tasks in the SCD are done. The SESSION_DISABLE_STATEMENT will hold a null in SYS_TBL_MASTER table.
· SESSION_ENABLE_STATEMENTs are required to be mentioned only for the first task in the batch. Here the target is the same for all the tasks under a batch. In case any of the tasks are to be executed separately, then the SESSION_ENABLE_STATEMENTs should be mentioned for any one of the tasks which is included in the batch for the execution.
Example
MERGE_HINT and SESSION_ENABLE_STATEMENT in SYS_TBL_MASTER
Table Name |
Stage Table Name |
Merge Hint |
Session Enable Statement |
---|---|---|---|
DIM_ACCOUNT |
STG_LOAN_CONTRACTS_V |
/*+parallel |
?alter session enable |
|
|
(DIM_ACCOUNT,10) */ |
parallel dml query?, ?alter table DIM_ACCOUNT |
|
|
|
nologging parallel 10? |
· Execute the SQL file with all the SESSION_DISABLE_STATEMENTs, after the successful completion of the SCD batch.