There is often much discussion regarding the 'best' modeling approach to take for any given data warehouse with each style, classic 3NF and dimensional having their own strengths and weaknesses. It is likely that data warehouses must do more to embrace the benefits of each model type rather than rely on just one - this is the approach that was adopted in designing the Oracle Communications Data Model.
The foundation layer of the Oracle Communications Data Model is a 3NF model. The default Oracle Communications Data Model also provides a dimensional model of the data. This dimensional model of the data is a perspective that summarizes and aggregates data, rather than preserving detailed transaction information.
Familiarize yourself with dimensional modeling by reading the following topics before you begin to customize the dimensional model of the default Oracle Communications Data Model:
The simplicity of a dimensional model is inherent because it defines objects that represent real-world business entities. Analysts know which business measures they are interested in examining, which dimensions and attributes make the data meaningful, and how the dimensions of their business are organized into levels and hierarchies.
In the simplest terms, a dimensional model identifies the following objects:
Measures. Measures store quantifiable business data (such as sales, expenses, and inventory). Measures are also called "facts". Measures are organized by one or more dimensions and may be stored or calculated at query time:
Stored Measures. Stored measures are loaded and stored at the leaf level. Commonly, there is also a percentage of summary data that is stored. Summary data that is not stored is dynamically aggregated when queried.
Calculated Measures. Calculated measures are measures whose values are calculated dynamically at query time. Only the calculation rules are stored in the database. Common calculations include measures such as ratios, differences, totals and moving averages. Calculations do not require disk storage space, and they do not extend the processing time required for data maintenance.
Dimensions. A dimension is a structure that categorizes data to enable users to answer business questions. Commonly used dimensions are Customers, Products, and Time. A dimension's structure is organized hierarchically based on parent-child relationships. These relationships enable:
Navigation between levels.
Hierarchies on dimensions enable drilling down to lower levels or navigation to higher levels (rolling up). Drilling down on the Time dimension member 2012 typically navigates you to the quarters Q1 2012 through Q4 2012. In a calendar year hierarchy for 2012, drilling down on Q1 2012 would navigate you to the months, January 12 through March 12. These kinds of relationships make it easy for users to navigate through large volumes of multidimensional data.
Aggregation from child values to parent values.
The parent represents the aggregation of its children. Data values at lower levels aggregate into data values at higher levels. Dimensions are structured hierarchically so that data at different levels of aggregation are manipulated efficiently for analysis and display.
Allocation from parent values to child values.
The reverse of aggregation is allocation and is heavily used by planning budgeting, and similar applications. Here, the role of the hierarchy is to identify the children and descendants of particular dimension members of "top-down" allocation of budgets (among other uses).
Grouping of members for calculations.
Share and index calculations take advantage of hierarchical relationships (for example, the percentage of total profit contributed by each product, or the percentage share of product revenue for a certain category, or costs as a percentage of the geographical region for a retail location).
A dimension object helps to organize and group dimensional information into hierarchies. This represents natural 1:n relationships between columns or column groups (the levels of a hierarchy) that cannot be represented with constraint conditions. Going up a level in the hierarchy is called rolling up the data and going down a level in the hierarchy is called drilling down the data.
There are two ways that you can implement a dimensional model:
Relational tables in a star schema configuration. This traditional method of implementing a dimensional model is discussed in "Characteristics of Relational Star and Snowflake Tables".
Oracle OLAP Cubes. The physical model provided with Oracle Communications Data Model provides a dimensional perspective of the data using Oracle OLAP cubes. This dimensional model is discussed in "Characteristics of the OLAP Dimensional Model".
In the case of relational tables, the dimensional model has historically been implemented as a star or snowflake schema.
Dimension tables (which contain information about hierarchies, levels, and attributes) join to one or more fact tables. Fact tables are the large tables that store quantifiable business measurements (such as sales, expenses, and inventory) and typically have foreign keys to the dimension tables. Dimension tables, also known as lookup or reference tables, contain the relatively static or descriptive data in the data warehouse.
A star schema is a relational schema whose design represents a multidimensional data model. The star schema consists of one or more fact tables and one or more dimension tables that are related through foreign keys. This allows drill paths, hierarchy and query profile to be embedded in the data model itself rather than the data. This in part at least, is what makes navigation of the model so straightforward for end users. Star schemas usually have a large fact table surrounded by smaller dimension tables. Dimension tables do not change very much. Most of the information that the users need are in the fact tables. Therefore, star schemas have fewer table joins than do 3NF models.
A star schema is so called because the diagram resembles a star, with points radiating from a center. The center of the star consists of one or more fact tables and the points of the star are the dimension tables.
Figure 3-1 Star Schema Diagram
Snowflake schemas are slight variants of a simple star schema where the dimension tables are further normalized and broken down into multiple tables. The snowflake aspect only affects the dimensions and not the fact table and is therefore considered conceptually equivalent to star schemas. Snowflake dimensions are useful and indeed necessary when there are fact tables of differing granularity. A month-level derived or aggregate table (or materialized view) must be associated with a month level snowflake dimension table rather than the default (lower) Day level star dimension table.
When a relational table acts as a dimension to a fact table, it is recommended that you declare that table as a dimension (even though it is not necessary).
Defined dimensions can yield significant performance benefits, and support the use of more complex types of rewrite.
To define and declare the structure of the dimension use the
CREATE DIMENSION command. Use the
LEVEL clause to identify the names of the dimension levels.
To improve the data quality of the dimension data in the data warehouse, it is recommended that you validate the declarative information about the relationships between the dimension members after any modification to the dimension data.
To perform this validation, use the
VALIDATE_DIMENSION procedure of the
DBMS_DIMENSION package. When the
VALIDATE_DIMENSION procedure encounters any errors, the procedure places the errors into the
DIMENSION_EXCEPTIONS table. To find the exceptions identified by the
VALIDATE_DIMENSION procedure, query the
You can schedule a call to the
VALIDATE_DIMENSION procedure as a post-process step to the regular Incremental Dimension load script. This can be done before the call to refresh the derived or aggregate tables of the data model through materialized view refresh, intra-ETL package calls.
Oracle OLAP Cubes logically represent data similar to relational star tables, although the data is actually stored in multidimensional arrays. Like dimension tables, cube dimensions organize members into hierarchies, levels, and attributes. The cube stores the measure (fact) data. The dimensions form the edges of the cube.
Oracle OLAP is an OLAP server embedded in the Oracle Database. Oracle OLAP provides native multidimensional storage and speed-of-thought response times when analyzing data across multiple dimensions. The database provides rich support for analytics such as time series calculations, forecasting, advanced aggregation with additive and nonadditive operators, and allocation operations.
By integrating multidimensional objects and analytics into the database, Oracle Database provides the best of both worlds: the power of multidimensional analysis along with the reliability, availability, security, and scalability of the Oracle database.
Oracle OLAP is fully integrated into Oracle Database. At a technical level, this means:
The OLAP engine runs within the kernel of Oracle Database.
Dimensional objects are stored in Oracle Database in their native multidimensional format.
Cubes and other dimensional objects are first class data objects represented in the Oracle data dictionary.
Data security is administered in the standard way, by granting and revoking privileges to Oracle Database users and roles.
OLAP cubes, dimensions, and hierarchies are exposed to applications as relational views. Consequently, applications can query OLAP objects using SQL as described in "Oracle OLAP Cube Views" and Report and Query Customization.
Oracle OLAP cubes can be enhanced so that they are materialized views as described in "Cube Materialized Views".
Benefits of Using Oracle OLAP
The benefits of using Oracle OLAP are significant; Oracle OLAP offers the power of simplicity and provides: One database, standard administration and security, standard interfaces and development tools.
The Oracle OLAP dimensional data model is highly structured. Structure implies rules that govern the relationships among the data and control how the data can be queried. Cubes are the physical implementation of the dimensional model, and thus are highly optimized for dimensional queries. The OLAP engine leverages this innate dimensionality in performing highly efficient cross-cube joins for inter-row calculations, outer joins for time series analysis, and indexing. Dimensions are pre-joined to the measures. The technology that underlies cubes is based on an indexed multidimensional array model, which provides direct cell access.
The OLAP engine manipulates dimensional objects in the same way that the SQL engine manipulates relational objects. However, because the OLAP engine is optimized to calculate analytic functions, and dimensional objects are optimized for analysis, analytic and row functions can be calculated much faster in OLAP than in SQL.
The dimensional model enables Oracle OLAP to support high-end business intelligence tools and applications such as OracleBI Discoverer Plus OLAP, OracleBI Spreadsheet Add-In, Oracle Business Intelligence Suite Enterprise Edition, BusinessObjects Enterprise, and Cognos ReportNet.
Oracle OLAP Dimensional Objects
Oracle OLAP dimensional objects include cubes, measures, dimensions, hierarchies, levels and attributes. The OLAP dimensional objects are described in detail in Oracle OLAP User's Guide.Figure 3-2 shows the general relationships among the objects.
Figure 3-2 Diagram of the OLAP Dimensional Model
When you define an OLAP cube, Oracle OLAP automatically generates a set of relational views on the cube and its dimensions and hierarchies.
Cube view. Each cube has a cube view that presents the data for all the measures and calculated measures in the cube. You can use a cube view like a fact table in a star or snowflake schema. However, the cube view contains all the summary data in addition to the detail level data. The default name of a cube view is
Dimension and hierarchy views. Each dimension has one dimension view plus a hierarchy view for each hierarchy associated with the dimension. The default name for a dimension view is
_VIEW. For a hierarchy view, the default name is
These views are related in the same way as fact and dimension tables in a star schema. Cube views serve the same function as fact tables, and hierarchy views and dimension views serve the same function as dimension tables. Typical queries join a cube view with either a hierarchy view or a dimension view.
SQL applications query these views to display the information-rich contents of these objects to analysts and decision makers. You can also create custom views that follow the structure expected by your applications, using the system-generated views like base tables.
Oracle OLAP cubes can be enhanced so that they are materialized views. A cube that has been enhanced in this way is called a cube materialized view and has a
CB$ prefix. Cube materialized views can be incrementally refreshed through the Oracle Database materialized view subsystem, and they can serve as targets for transparent rewrite of queries against the source tables.
The OLAP dimensions associated with a cube materialized view are also defined with materialized view capabilities.
Necessary Cube Characteristics for Cube Materialized Views
A cube must conform to the following requirements, before it can be designated as a cube materialized view:
All dimensions of the cube have at least one level and one level-based hierarchy. Ragged and skip-level hierarchies are not supported. The dimensions must be mapped.
All dimensions of the cube use the same aggregation operator, which is either
The cube has one or more dimensions and one or more measures.
The cube is fully defined and mapped. For example, if the cube has five measures, then all five are mapped to the source tables.
The data type of the cube is
The source detail tables support dimension and rely constraints. If they have not been defined, then use the Relational Schema Advisor to generate a script that defines them on the detail tables.
The cube is compressed.
The cube can be enriched with calculated measures, but it cannot support more advanced analytics in a cube script.
Adding Materialized View Capabilities
In the Analytic Workspace Manager, connect to the
From the cube list, select the cube which you want to enable.
In the right pane, select the Materialized Views tab.
Select Enable Materialized View Refresh of the Cube. then click Apply.
You cannot enable the cube materialized view for a forecast cube.
For more information on working with OLAP cubes, see the following OBE tutorials:
"Querying OLAP 11g Cubes"
"Using Oracle OLAP 11g With Oracle BI Enterprise Edition"
To access the tutorials, open the Oracle Learning Library in your browser by following the instructions in "Oracle Technology Network"; and, then, search for the tutorials by name.
There are OLAP cubes defined in the default
ocdm_sys schema. These cubes have the general characteristics described in "Characteristics of the OLAP Dimensional Model". Specifically, OLAP cubes in the Oracle Communications Data Model have the following characteristics:
All of the default OLAP cubes are loaded with data from
DWA_ tables that are materialized views.
The cubes were defined and built using the Analytical Workspace Manager (AWM) client tool.
A relational view (with a
_VIEW suffix) is defined over each of the OLAP cubes.
All of the OLAP cubes in the Oracle Communications Data Model are cube materialized views (that is,
Immediately after installation, all materialized views underlying the OLAP cubes are disabled by default. To enable the cube materialized views, you must follow the steps outlined in "Adding Materialized View Capabilities".
For information on the using OLAP cubes in your customized version of Oracle Communications Data Model, see Oracle OLAP User's Guide and the following topics:
You can add new OLAP cubes to the
ocdm_sys schema. For consistency's sake, design and define these new cubes as described in "Characteristics of the OLAP Cubes".
Take the following steps to define new cubes:
Use the information provided in "Characteristics of the OLAP Dimensional Model". and the Oracle OLAP User's Guide to guide you when you design and define new OLAP cubes. Also, if you are familiar with a relational star schema design as outlined in "Characteristics of Relational Star and Snowflake Tables", then you can use this understanding to help you design an OLAP Cube:
Fact tables correspond to cubes.
Data columns in the fact tables correspond to measures.
Foreign key constraints in the fact tables identify the dimension tables.
Dimension tables identify the dimensions.
Primary keys in the dimension tables identify the base-level dimension members.
Parent columns in the dimension tables identify the higher level dimension members.
Columns in the dimension tables containing descriptions and characteristics of the dimension members identify the attributes.
You can also get insights into the dimensional model by looking at the sample reports included with Oracle Communications Data Model.
Oracle Communications Data Model Installation Guide for more information on installing the sample reports and deploying the Oracle Communications Data Model RPD and webcat on the Business Intelligence Suite Enterprise Edition instance.
While investigating your source data, you may decide to create relational views that more closely match the dimensional model that you plan to create.
For more information on creating OLAP cubes, see the "Building OLAP 11g Cubes" OBE tutorial.
To access the tutorial, open the Oracle Learning Library in your browser by following the instructions in "Oracle Technology Network"; and, then, search for the tutorial by name.
Common customizations to Oracle Communications Data Model cubes are changing the dimensions or the measures of the cube.
Since all Oracle Communications Data Model cubes load data from tables with the
DWA_ prefix, to change the measures or dimensions of one cube, you must take the following steps:
DWA_table identified in Step 1.
Improves scalability by keeping data structures small. Each partition functions like a smaller measure.
Keeps the working set of data smaller both for queries and maintenance, since the relevant data is stored together.
Enables parallel aggregation during data maintenance. Each partition can be aggregated by a separate process.
Simplifies removal of old data from storage. Old partitions can be dropped, and new partitions can be added.
The number of partitions affects the database resources that can be allocated to loading and aggregating the data in a cube. Partitions can be aggregated simultaneously when sufficient resources have been allocated.
The Cube Partitioning Advisor analyzes the source tables and develops a partitioning strategy. You can accept the recommendations of the Cube Partitioning Advisor, or you can make your own decisions about partitioning.
If your partitioning strategy is driven primarily by life-cycle management considerations, then you should partition the cube on the Time dimension. Old time periods can then be dropped as a unit, and new time periods added as a new partition. The Cube Partitioning Advisor has a Time option, which recommends a hierarchy and a level in the Time dimension for partitioning.
The level on which to partition a cube is determined based on a trade off between load performance and query performance.
Typically, you do not want to partition on too low a level (for example, on the DAY level of a TIME dimension) because if you do then too many partitions must be defined at load time which slows down an initial or historical load. Also, a large number of partitions can result in unusually long Analytic Workspace attach times and slows down the Time Series-based calculations. Also, a Quarterly Cumulative measure (Quarter to Date Measure) needs to access 90 or 91 partitions to calculate a value for one Customer and Organization. All dimension members above the partition level of partition dimension (including those belonging to nondefault hierarchies) would be present in a single default template. Day level partitioning makes this very heavy since all higher level members are stored in default template. However, the advantage of partitioning
DAY if the OLAP Cube load frequency is daily then there you must only load from a new partition in fact table into a single partition in the OLAP cube every day. This greatly improves the load performance since percentage-based refresh can be enabled if the cube is materialized-view enabled and has materialized-view logs.
Recommendations: Cube Partitioning Strategy
Usually a good compromise between the differing load and query performance requirements is to use an intermediate level like
MONTH as the partition level. Time series calculations within a month (week to date, month to date, and so on) are fast and higher level calculations like year to date needs to refer to 12 partitions at most. Also this way the monthly partition is defined and created only one time (that is during the initial load on first of each month) and is then reused for each subsequent load that month. The aggregation process may be triggered off at the month level (instead of specific day level) and some redundant aggregations (of previously loaded dates of current month) may occur each time but it should result in satisfactory load and query performance.
"The discussion on choosing a partition strategy in Oracle OLAP User's Guide, "Indexes and Partitioned Indexes in the Oracle Communications Data Model", and "Options for Partitioning and Materialized Views".
While developing a dimensional model of your data, it is a good idea to map and load each object immediately after you create it so that you can immediately detect and correct any errors that you made to the object definition or the mapping.
However, in a production environment, you want to perform routine maintenance as quickly and easily as possible. For this stage, you can choose among data maintenance methods. You can refresh all cubes using the Maintenance Wizard. This wizard enables you to refresh a cube immediately, or submit the refresh as a job to the Oracle job queue, or generate a PL/SQL script. You can run the script manually or using a scheduling utility, such as Oracle Enterprise Manager Scheduler or the
DBMS_SCHEDULER PL/SQL package. The generated script calls the
BUILD procedure of the
DBMS_CUBE PL/SQL package. You can modify this script or develop one from the start using this package.
The data for a partitioned cube is loaded and aggregated in parallel when multiple processes have been allocated to the build. You are able to see this in the build log.
In addition, each cube can support these data maintenance methods:
Custom cube scripts
Cube materialized views
If you are defining cubes to replace existing materialized views, then you use the materialized views as an integral part of data maintenance. Note, however, that materialized view capabilities restrict the types of analytics that can be performed by a custom cube script.
See the following OBE tutorial for an example of how Oracle uses cube materialized views for transparent access to a relational star schema.:
"Querying OLAP 11g Cubes"
To access the tutorial, open the Oracle Learning Library in your browser by following the instructions in "Oracle Technology Network"; and, then, search for the tutorial by name.