This chapter explains the background of OFSDF, its functional architecture along with the differences from traditional warehouse architecture. OFSDF product package consists of the Analytical Data Warehouse model. The individual components of Physical Data Model is explained in detail.
Topics:
· Differences from Traditional Warehouse Architecture
It is important to note that OFSDF architecture differs from 'traditional' data warehouse architecture in key ways.
OFSDF was built to specifically address the key challenges of building a scalable, practical data management platform to support Financial Services Analytics needs. The differences from traditional BI architecture are highlighted and explained in the following sections.
The following diagram depicts the OFSDF functional architecture.
Figure 1: Data Foundation Functional Architecture diagram
The OFSDF architecture can be thought of as two distinct, interacting components. The dark green colored cylindrical portions denote the data repository and the red colored portion denotes the management toolkit.
· A data repository whose physical structure is given by the OFSDF physical data model (Here Physical Data Model = OFSDF Staging + OFSDF Reporting). This user guide only details the structure and organization of the data repository that is covered by the data models in the OFSDF.
· A management toolkit provided by OFSAAI that is used to manage the data repository, by providing a collection of tools and frameworks based on a common metadata foundation. This user guide DOES NOT cover the details of the Analytical Application Infrastructure, as that is a separate product with its own documentation..
The architecture illustrates the following key concepts:
· A unified data-sourcing area for analytics: The OFSDF Staging Data Model provides the basis for central, unified data-sourcing layer for a wide variety of analytical needs. The staging layer faces the operational (OLTP) and front office systems of a bank. It consists of tables to capture key business and operational data from these systems, which is then processed by one or more analytical engines.
· A unified reporting/consumption layer: Analytical results can be simple to complex, quantitative and qualitative measures of a bank's Risk, Compliance, Customer and Financial Performance. The OFSDF Reporting data model is a dimensional data model spanning these key analytical functions. It forms the foundation of OFSAA Business Intelligence applications, but can clearly be used as the result data store for any equivalent engines and processes provided by other vendors, or custom-built solutions. By providing a single data repository for reporting needs, this layer provides a foundation for departmental as well as cross-departmental and cross- functional reporting.
Additionally, the schema of the reporting area is specifically built for Financial Services analytics. As an illustration, the reporting area has a 'Run dimension' that is shared across all BI/reporting use cases. Similarly, the aggregation of shared measures/reporting into a common set of output structures (Account Summary) allows for cross-functional reporting, which is increasingly the norm in Financial Institutions.
· Single point of control and operation: The Oracle Financial Services Analytical Applications Infrastructure is a separate Oracle product that offers a set of tools that are built on a common metadata foundation. These tools are used to control and manage the lifecycle of data from sourcing to reporting. There is a collection of frameworks to manage the following lifecycle steps of data within OFSDF:
§ Metadata Management
§ Data Quality
§ Data Movement
§ Scheduling and runtime operations
§ Security/User management
§ Analytical Process Definition and Execution
· Processing Area: As explained earlier, the primary purpose of the OFSDF is to serve as a container for analytical processing from sourcing to consumption. Such processing is usually delivered in the form of discrete units called analytical applications spanning different analytical use cases ranging from Finance to Risk to Compliance.
These applications consist of custom-built computational engines and numerical libraries, and may execute processes on the data that range from simple aggregations to complex, multi-step stochastic processes such as Monte-Carlo simulation.
Hence, analytical applications place varying demands on the data infrastructure in terms of volumes and speed, and hence place different demands on data architecture. In practice, the normalized (3NF) design favored for Enterprise Data Warehouses often fails to be efficient or performant when it comes to analytical processing across a wide range of use cases.
Therefore, the OFSDF recognizes the need for distinct application-specific working stores, separate from the staging and reporting area. For example, the OFSAA Asset/Liability Management application (ALM) has a distinct set of ALM-specific tables, as does the Market Risk solution.
NOTE:
The structure of these processing area stores is decided by the actual analytical application and engine used. The OFSAA suite of applications is organized this way, with each application managing a specific set of tables/schemas within the processing area.
The processing area tables/schemas are NOT part of the OFSDF. This is because the OFSDF is intended to be an open platform. Other analytical applications and engines can equally provision data out of OFSDF by mapping their input requirements appropriately to the OFSDF staging area model.
This table summarizes the differences of the OFSDF from a traditional Data Warehouse architecture.
Component |
FSDF |
Traditional BI Architecture |
---|---|---|
Staging Layer |
Common staging Area (CSA) where data from source systems are staged and is implemented as database schema. |
Usually a file system-based area where file-based extracts (operational images) from source systems are staged prior to loading into a target schema. |
3rd Normal Form Operational Data Store/Enterprise Data Warehouse |
Does not provide a physicalized model for a 3NF store. Operational/fine-grained reporting will be fulfilled from the reporting area. |
Notification sent to all users captured as owners and user who submitted the definition. |
Data Marts/Reporting Model |
Set of star schemas with conformed dimensions (Ralph Kimball approach) |
Set of star schemas |
With the preceding understanding in mind, the following sections describe the data models in the OFSDF in greater detail.
· Readily Deployable: The Physical Data Model is a readily deployable physical schema. It is provided as an erwin Data Model file (for details on erwin, seehttps://erwin.com/products/erwin-data-modeler/) and consists of tables grouped into distinct subject areas depending on function. The tables are either used to gather source data (Staging Area), or as containers of outputs/results from analytical processing and engines for reporting purposes (Reporting Area).
· Use-case Driven: The OFSDF Physical Data model is driven by a set of clearly identified analytical use cases spanning Risk, Performance, Customer Insight, and Compliance.
· Extensible: While the OFSDF Physical Data Model satisfies a very large number of analytical use cases across Risk, Finance, Marketing, and Compliance subject areas, customers may find the need to customize the model for a specific installation.
These customizations may be done in accordance with guidelines published in Using OFSDF section of this manual.
The OFSDF Physical Data Model is divided into two primary areas:
· Staging Data Model
· Reporting Data Model
The Common Staging Area Model (CSA) represents the point of entry of data into the OFSDF. The CSA provides a simplified, unified data sourcing area for inputs required by analytical applications and engines. It consists of over 900 tables and nearly 9000 columns organized into distinct subjects. The salient features of the CSA are as follows:
· Mapping to Analytical Use Cases: Since the primary purpose of the OFSDF is to be a data repository supporting analytics, each database object in the OFSDF physical data model is necessarily mapped to a corresponding analytical use case.
These mappings are captured in the data model, in the form of additional metadata called User- defined Properties (UDPs), and can be leveraged to reduce the scope of data gathering efforts by focusing on clearly-defined end use cases such as BASEL II, Market Risk Analytics, ALM and others.
These mappings can readily be extracted into a Download Specification, which lists the data demands for a specific analytical use case. An example is shown below:
Figure 2: Download Specifications sample
The Mappings can be generated from the OFSDF erwin file using erwin Data Modeler application.
· Schema Design: The data model of the Common Staging Area is designed to facilitate loading efficiency. This means that it is denormalized.
The primary focus of the staging area is to efficiently deliver data from operational systems such as Core Banking, Trading and Wealth Management to multiple analytical applications that process this data.
Typically, this data is extracted from source systems and loaded into OFSDF directly, or alternatively into a pre-defined file-based operational image area from which it is subsequently loaded into the OFSDF schema. In a large bank, it is not unusual to have 10s to 100s of millions of accounts and transactions.
Standard ETL (Extract, Transform, Load) approaches can fail to address this requirement adequately because of the complexity imposed by the target schema. If the target schema is increasingly normalized, then the ETL into this schema is correspondingly more complex, requiring careful load ordering to prevent integrity-related load failures, as well as integrity validation prior to loading. Such complex ETL processing is time-consuming and is prone to failure.
To address this, the CSA is designed to support a simplified loading process. De-normalized tables allow efficient data loading, and subsequent transformations can be done to verify data integrity through a series of data quality checks. This represents an 'ELT (Extract Load Transform)' approach to data sourcing, which is far more suited for an analytical data repository.
· Application-managed Referential Integrity (RI): In conjunction with the database design of the staging schema, a key feature is the management of Referential Integrity primarily in the application tier, rather than within the database. Rather than imposing foreign key relationships in the database, which could cause complex loading order dependencies, relationships between staging tables are managed by the Data Quality (DQ) framework, a toolkit within the Analytical Application Infrastructure that captures the relevant data quality checks for a specific table in the staging model. These checks include:
§ Value checks (Nulls, Value ranges, business checks on numeric fields)
§ Referential Integrity checks (which are otherwise implemented as foreign key constraints)
ATTENTION:
This is also why the ER model of the staging area in erwin does not contain any relationships – the staging area is a physical data model, which is deployed using the Analytical Application Infrastructure, which manages it.
In summary – the design of the staging area data model is to allow efficient data loading for analytics. It thus has crucial differences from a general-purpose repository of operational/transactional data across a bank.
The CSA model consists of over 900 tables and nearly 9000 attributes. These tables are organized into multiple 'subjects', currently by analytical use case/solution need.
However, it makes sense to first understand the staging area tables in terms of content before understanding how they map to analytical use cases.
There are two broad categories of staging data, regardless of the use case or analytical application that requires it:
· Business Data: This set of tables captures the actual business events and the resulting state of a bank from those business events. The OLTP systems (or Transactional Systems) capture this information resulting from the execution of the bank's different business processes. Broadly, this information can be categorized as:
§ Events: Business transactions, whether financial or non-financial, represent business happenings (events) that are relevant for analytical purposes. For example – a financial transaction by a customer on a current account is a specific event. Events happen at a specific point in time, and are recorded by OLTP systems. In the staging area model, there are several transaction tables that capture this detail – for both financial and non-financial transactions.
§ State: The net effect of business transactions is to change the bank's overall financial/non- financial state. An example of this – when a customer opens a checking/current account and deposits money into it, the account tracks the net effect of all withdrawals and deposits using a numeric quantity called a 'balance'. The account will also contain a list of all events (Withdrawals, deposits, fees, etc) that resulted in the balance. This state information is typically captured by product-specific systems in a bank or FSI. For example, there is a lending system that captures details of loans, and a current account system that captures details of checking and savings accounts, which are distinct products. In both cases, the accounts are governed by contracts, which refer to the terms and conditions governing business on that account.
· Reference/Master Data: Events and state refer to 'business activities' of a bank or FSI. To provide more detail on these, banks need to capture additional data that provides context for these activities. This data may be variously called as 'reference data' or 'master data', and covers various business dimensions of a given transaction or account. For example – a bank has a master list of products that it sells to customers (Product Master). Similarly, it has a list of customers (Customer master). A trading firm may hold a list of securities it transacts in (Securities master). These and other lists provide context for each business transaction or account. Banks typically maintain 'Master' data for this purpose.
With this background, the following lists the key categories of business data and reference data in the staging data model.
To view the current set of business data tables in the staging area, open the OFSDF Staging Area model in the erwin Data Modeler application, and click on Subject Areas as shown below. Right click and switch to the 'Staging-Data Tables' subject area.
Figure 3: OFSDF Staging Area Data Tables
This provides an ordered, visually grouped list of the business data tables in the current staging area model. Above each group in the diagram is the group name as a label, to categorize the tables, as follows:
The key groups of business data tables in the model are as follows.
Figure 4: The key groups of business data tables in the data model
Group Name |
Purpose |
---|---|
Product Processors |
Tables for Financial Instruments and Contracts. Note that these tables can accommodate nearly 80+ types of instruments and derivatives across Banking and Trading books. |
Transactions |
Tables to hold Transaction/Event level data from the bank's systems. These tables are available by both con- tract and channel, and there is additionally a set of Transaction Summary tables that contain transaction data in a summarized form required by the Profitability application. |
Exposures |
Exposures are basically contracts on the asset side of the balance sheet. The tables in this category are primarily used by Risk applications. |
Mitigants |
Mitigants are used to address specific types of risk posed by exposures. The tables in this category are primarily used by Risk applications. |
GL |
Contains tables holding information pertaining to the General Ledger. |
Customer |
Consists of Marketing-related Customer activity and plan data relevant to CRM analytics. Note that core customer information is in the Product Processors, and reference data about customers is elsewhere. |
Rates |
Economic indicators, Interest Rates and other information relevant to analytical processing for Risk and Asset Liability Management applications. |
Product Processors are classified into four categories as depicted in the following table:
Product Category |
Entity Name |
---|---|
Asset |
Payment Settlement Account Stage Cards Stage Correspondent Accounts Stage Foreign Exchange Contracts Stage Investments Stage Leases Contracts Stage Loan Contracts Stage Managed Investment Account Other Services Stage Merchant Banking Stage Money Market Contracts Stage Over Draft Accounts Stage Repo Contracts Stage Credit Facility Details |
Liabilities |
Payment Settlement Account Stage Annuity Contracts Stage Borrowings Stage Casa Accounts Stage Correspondent Accounts Stage Custodial Accounts Stage Foreign Exchange Contracts Stage Merchant Cards Stage Money Market Contracts Stage Mutual Funds Stage Prepaid Cards Stage Repo Contracts Stage Retirement Accounts Stage Term Deposit Contracts Stage Trading Account Stage Trusts |
Off Balance Sheet |
Stage Bill Contracts Stage Borrowing Commitment Contract Stage Commitment Contracts Stage Credit Derivatives Stage Futures Contract Stage Guarantees Stage Letter Of Credit Contracts Stage Option Contracts Stage Repo Contracts Stage Swaps Contracts Stage Forwards Contract |
Derivatives |
Stage Credit Derivatives Stage Futures Contract Stage Option Contracts Stage Swaps Contracts Stage Forwards Contract |
For detailed mapping information, see the Banking Product to Product Processor Mapping document at My Oracle Support. This document provides details regarding which banking product must be sourced to which Product Processor table based on the product functionality.
To view the current set of Master/Reference data tables in the staging area, open the OFSDF Staging Area model in the erwin Data Modeler application, and open up the subject areas menu on the left hand side, as shown below.
Right click and switch to the 'Staging-Master Tables' to get the list of the master tables currently in staging.
Figure 5: OFSDF Staging Area Master Tables
Similar to the Business Data tables, the ʹStaging - Master Tablesʹ subject area provides a single folder view of all of the reference/master information currently required by the staging area.
The Reporting Data Model is the point where outputs of analytical processing are aggregated for reporting and Business Intelligence (BI) tools.
Similar to the Common Staging Area being the foundation for data provisioning to analytical applications, engines and processes, the Reporting Data Model is the common data store for the outputs of these processes. Outputs are computed quantitative measures and Key Performance Indicators that involve simple to complex, mathematical and statistical processing using the raw data, which is performed by specialized engines and computational models. In the OFSDF design, the Reporting Data Model design ensures that the historical data is maintained.
The key features of the design of the Reporting Area model are as follows:
· Design:
The Reporting Area data model is a dimensional data model. This means that it consists primarily of central fact tables (de-normalized), related to multiple dimension tables, also called a Star Schema. Additionally, the dimension tables are shared across the star schemas in the reporting mode, meaning they are Conformed Dimensions. This means that Drill-across reporting is naturally supported by the OFSDF design.
Additionally – in keeping with the key principle of the OFSDF, the Reporting Model is organized by use cases to facilitate reporting and BI in a wide variety of areas.
· Support for multiple scenarios of analysis:
Increasingly as a result of the 2008 crisis, the Financial Services industry is moving towards scenario-based, forward-looking risk analysis instead of retroactive analysis. The reporting data model has been designed to support scenario analysis of the sort required by financial institutions that need to measure and report risk and performance under a variety of economic scenarios.
To facilitate this, the Oracle Financial Services Advanced Analytical Infrastructure (OFSAAI) provides a Stress Testing framework, allowing risk analysis to be performed under a variety of known scenarios corresponding to different input parameter values to risk models.
The reporting model provides support for this kind of analysis via a Run Dimension – it allows analytical engines to load multiple result sets identified by scenarios, and hence permits reporting related to baseline and stress conditions in economic terms.
· Support for Cross Functional Reporting:
The third critical feature of the Reporting area design is the support for cross-functional reporting.
Typically, Business Intelligence and Reporting Solutions work off a dedicated, purpose-specific data store called a data mart. Data marts are function-specific data stores typically star schemas (eg. Marketing data marts, Risk Data Marts, Customer Data mart), that provide the necessary reporting and analytics relevant to a particular business function in the FSI.
Increasingly, the trend is for cross-functional analytics and reporting. Majority of emerging needs relate to the analytical problems at the intersection of the distinct areas of Risk, Performance, Customer Insight, and Compliance.
Similar to the staging data model, the reporting model is divided into Subject Areas.
These subject areas are visible by clicking on the Subject Areas view in the erwin Data Modeler application as show below.
Figure 6: Reporting Data Model Subject Areas
As can be seen above, the Subject Areas are organized to support detailed level analysis related to a set of higher-level analytical reporting solutions. The list of solutions supported by the OFSDF are:
Name |
Definition |
---|---|
ALM |
Subject Areas corresponding to Asset Liability Management (ALM). |
BASEL II, Pillar I and III, Pillar II |
Subject Areas corresponding to the BASEL II Regulatory framework, and its reporting requirements as specified in the framework. |
Capital Planning |
These Subject areas provide support for reporting related to Capital Planning. |
Channel Management |
Support for analytics related to Channel Management, which is part of the overall Customer Insight solution set. |
Common Account Summary |
A critical subject area that collects account-level results from multiple analytical processes. The Common Account Summary allows for cross- functional Analytics such as Risk Adjusted Performance Measurement by combining outputs from Profitability and Risk solutions. |
Corporate Credit Risk Analytics |
Support for detailed analytics and reporting on Corporate Credit Risk - including Commercial Lending, Credit Facilities, Limits and Collateral |
Economic Capital |
Support for Aggregated Economic Capital Reporting based on risk assessments for Market, Credit and Operational Risk. |
GL Reconciliation |
Support for reporting specific to the Oracle GL reconciliation module that is part of the OFSAA product line. This allows a reporting view of the reconciliation processes and outputs |
ICAAP |
Supports reporting related to the Internal Capital Adequacy Assessment process. |
LRM |
Supports reporting related to Liquidity Risk Management processes within an organization |
CRM |
Part of the Corporate Credit Risk Solution, allows reporting on Credit Limits. |
CI |
Part of the Customer Insight solution set providing support for Retail, Institutional, and Channel Analytics. |
Market Risk |
Support for Market Risk analytics |
PFT BI |
Supports reporting related to Profitability analysis, part of the Enterprise Performance Management solution area. |
Reputational Risk |
Support for Reputational Risk measurement and analysis |
Retail Credit Risk |
The set of tables/subject areas supporting Retail Credit Risk Analysis. |
The list of solutions supported by OFSDF is as follows.
Name |
Definition |
---|---|
Retail Pooling |
Reporting support for Pooling for Retail Exposures – required by ALM and Retail Credit Risk analysis. |
Strategic Risk |
Support for reporting related to Strategic Risk – measurement of these risks is a qualitative process, and part of the ICAAP framework. |
Within the above higher-level areas, there are over 165 subject areas consisting of star schemas supporting detailed analysis. Each Subject Area typically corresponds to one or possibly more star schemas, depending on the reporting need. There are currently 500+fact and dimension tables in the reporting area. The details of the various naming conventions used in OFSDF Data Model are explained in Appendix A. For column-level details, see the Oracle Financial Services Analytical Applications (OFSAA) Data Model Document Generation Release 8.1.x, which details how to extract the data dictionary from erwin section.