In This Section:
Some information in this chapter applies only to block storage databases and is not relevant to aggregate storage databases. Also see Comparison of Aggregate and Block Storage.
Online analytical processing (OLAP) is a multidimensional, multiuser, client-server computing environment for users who need to analyze enterprise data. Finance departments use OLAP for applications such as budgeting, activity-based costing (allocations), financial performance analysis, and financial modeling. Sales departments use OLAP for sales analysis and forecasting. Marketing departments use OLAP for market research analysis, sales forecasting, promotions analysis, customer analysis, and market/customer segmentation. Typical manufacturing OLAP applications include production planning and defect analysis.
Important to all of these applications is the ability to provide managers the information that they need to make effective decisions about an organization's strategic directions. A successful OLAP application provides information as needed; that is, it provides “just-in-time” information for effective decision-making.
Providing such information requires more than a base level of detailed data. Just-in-time information is computed data that usually reflects complex relationships and is often calculated on the fly. Analyzing and modeling complex relationships are practical only if response times are consistently short. In addition, because the nature of data relationships may not be known in advance, the data model must be flexible. A truly flexible data model ensures that OLAP systems can respond to changing business requirements as needed for effective decision making.
Key to OLAP systems are multidimensional databases, which not only consolidate and calculate data; but also provide retrieval and calculation of a variety of data subsets. A multidimensional database supports multiple views of data sets for users who need to analyze the relationships between data categories. For example, a marketing analyst might ask following questions:
In multidimensional databases, the number of data views is limited only by the database outline, the structure that defines all elements of the database. Users can pivot the data to see information from a different viewpoint, drill down to find more detailed information, or drill up to see an overview.
This section introduces the concepts of outlines, dimensions, and members within a multidimensional database. If you understand dimensions and members, you are well on your way to understanding the power of a multidimensional database.
A dimension represents the highest consolidation level in the database outline. The database outline presents dimensions and members in a tree structure to indicate a consolidation relationship. For example, in Figure 1, Hierarchical Structure, Year is a dimension (of type Time) and Qtr1 is a member.
Standard dimensions represent the core components of a business plan and often relate to departmental functions. Typical standard dimensions: Time, Accounts, Product Line, Market, and Division. Dimensions change less frequently than members.
Attribute dimensions are associated with standard dimensions. Through attribute dimensions, you group and analyze members of standard dimensions based on the member attributes (characteristics). For example, you can compare the profitability of noncaffeinated products that are packaged in glass to the profitability of noncaffeinated products packaged in cans.
Members are the individual components of a dimension. For example, Product A, Product B, and Product C might be members of the Product dimension. Each member has a unique name. Essbase can store the data associated with a member (referred to as a stored member in this chapter), or it can dynamically calculate the data when a user retrieves it.
Essbase uses the concept of members to represent data hierarchies. Each dimension consists of one or more members. The members, in turn, may consist of other members. When you create a dimension, you tell Essbase how to consolidate the values of its individual members. Within the tree structure of the database outline, a consolidation is a group of members in a branch of the tree.
For example, many businesses summarize their data monthly, rolling up monthly data to obtain quarterly figures and rolling up quarterly data to obtain annual figures. Businesses may also summarize data by zip code, city, state, and country. Any dimension can be used to consolidate data for reporting purposes.
In the Sample.Basic database included with Essbase Server, for example, the Year dimension comprises five members: Qtr1, Qtr2, Qtr3, and Qtr4, each storing data for an individual quarter, plus Year, storing summary data for the year. Qtr1 comprises four members: Jan, Feb, and Mar, each storing data for a month, plus Qtr1, storing summary data for the quarter. Similarly, Qtr2, Qtr3, and Qtr4 comprise the members that represent the individual months plus the member that stores the quarterly totals.
The database outline in Figure 1, Hierarchical Structure uses a hierarchical structure to represent the data consolidations and relationships in Qtr, as described in the previous paragraph.
Some dimensions consist of relatively few members, while others may have hundreds or even thousands of members. Essbase does not limit the number of members within a dimension and enables the addition of new members as needed.
Essbase uses hierarchical (generations and level; and roots and leaves) and family history (parents, children, and siblings; and descendants and ancestors) terms to describe the roles and relationships of the members in a database outline. The subtopics in this section reference the outline show in Figure 2, Member Generation and Level Numbers in describing the position of the members.
Figure 2, Member Generation and Level Numbers illustrates the following parent, child, and sibling relationships:
Siblings are child members of the same immediate parent, at the same generation. For example, Sales and Cost of Goods Sold are siblings (they both have the parent Margin), but Marketing (at the same branch level) is not a sibling, because its parent is Total Expenses.
Figure 2, Member Generation and Level Numbers illustrates the following descendant and ancestral relationships:
Descendants are members in branches below a parent. For example, Profit, Inventory, and Ratios are descendants of Measures. The children of Profit, Inventory, and Ratios are also descendants of Measures.
Figure 2, Member Generation and Level Numbers illustrates the following root and leaf member relationships:
Figure 2, Member Generation and Level Numbers illustrates the following generations levels:
Generation refers to a consolidation level within a dimension. A root branch of the tree is generation 1. Generation numbers increase as you count from the root toward the leaf member. In Figure 2, Member Generation and Level Numbers, Measures is generation 1, Profit is generation 2, and Margin is generation 3. All siblings of each level belong to the same generation; for example, both Inventory and Ratios are generation 2.
Figure 3, Generations shows part of the Product dimension with its generations numbered. Product is generation 1, 100 is generation 2, 100-10 is generation 3, and 100-10-12 and 100-10-16 are generation 4.
Level also refers to a branch within a dimension; levels reverse the numerical ordering used for generations. Levels count up from the leaf member toward the root. The root level number varies depending on the depth of the branch. In Figure 2, Member Generation and Level Numbers, Sales and Cost of Goods Sold are level 0. All other leaf members are also level 0. Margin is level 1, and Profit is level 2. Notice that the level number of Measures varies depending on the branch. For the Ratios branch, Measures is level 2. For the Total Expenses branch, Measures is level 3.
Figure 4, Levels shows part of the Product dimension with its levels numbered. 100 is level 2, 100-10 is level 1, and 100-10-12 and 100-10-16 are level 0.
To ease report maintenance, you can assign a name to a generation or level and then use the name as a shorthand for all members in that generation or level. Because changes to an outline are automatically reflected in a report, when you use generation and level names, you do not need to change the report if a member name is changed or deleted from the database outline.
Essbase has standard dimensions and attribute dimensions. This chapter focuses on standard dimensions, because Essbase does not allocate storage for attribute dimension members. Instead, it dynamically calculates the members when the user requests data associated with them.
An attribute dimension is a special type of dimension that is associated with a standard dimension. See Working with Attributes.
Essbase maximizes performance by dividing the standard dimensions of an application into two types: dense dimensions and sparse dimensions. This division allows Essbase to cope with data that is not smoothly distributed, without losing the advantages of matrix-style access to the data. Essbase speeds data retrieval while minimizing memory and disk requirements.
Most multidimensional databases are inherently sparse; they lack data values for the majority of member combinations. A sparse dimension is one with a low percentage of available data positions filled.
For example, the outline of the Sample.Basic database in Figure 5, Sample.Basic Database Outline includes the Year, Product, Market, Measures, and Scenario dimensions. Product represents the product units, Market represents the geographical regions in which the products are sold, and Measures represents the accounts data. Because not every product is sold in every market, Market and Product are chosen as sparse dimensions.
Most multidimensional databases also contain dense dimensions. A dense dimension has a high probability that one or more cells is occupied in every combination of dimensions. For example, in the Sample.Basic database, accounts data exists for almost all products in all markets, so Measures is chosen as a dense dimension. Year and Scenario are also chosen as dense dimensions. Year represents time in months, and Scenario represents whether the accounts values are budget or actual values.
Caffeinated, Intro Date, Ounces, Pkg Type and Population are attribute dimensions. See Working with Attributes.
In most data sets, existing data tends to follow predictable patterns of density and sparsity. If you match patterns correctly, you can store the existing data in a reasonable number of fairly dense data blocks, rather than in many highly sparse data blocks.
You can apply a recommended configuration, or you can turn off automatic configuration and manually set the sparse or dense property for each dimension. Attribute dimensions are always sparse dimensions. Keep in mind that you can associate attribute dimensions only with sparse standard dimensions.
The automatic configuration of dense and sparse dimensions provides only an estimate. It cannot take into account the nature of the data that you will load into your database or multiple user considerations.
Because TBC does not sell every product in every market, the data set is reasonably sparse. Data values do not exist for many combinations of members in the Product and Market dimensions. For example, if Caffeine Free Cola is not sold in Florida, data values do not exist for the combination Caffeine Free Cola (100-30) -> Florida, so Product and Market are sparse dimensions. Therefore, if no data values exist for a specific combination of members in these dimensions, Essbase does not create a data block for the combination.
However, consider combinations of members in the Year, Measures, and Scenario dimensions. Data values almost always exist for some member combinations on these dimensions. For example, data values exist for the member combination Sales -> January -> Actual, because at least some products are sold in January. Thus, Year and, similarly, Measures and Scenario, are dense dimensions.
Essbase creates a data block for each unique combination of members in the Product and Market dimensions (see Data Storage). Each data block represents data from the dense dimensions. The data blocks likely will have few empty cells.
For example, consider the sparse member combination Caffeine Free Cola (100-30), New York, in Figure 6, Dense Data Block for Sample.Basic Database:
The following scenarios show how a database is affected when you select different standard dimensions. Assume that these scenarios are based on typical databases with at least seven dimensions and several hundred members.
If you make all dimensions sparse, Essbase creates data blocks that consist of single data cells that contain single data values. An index entry is created for each data block and, therefore, in this scenario, for each existing data value.
If you make all dimensions dense, as shown in Figure 8, Database with All Dense Standard Dimensions, Essbase creates one index entry and one large, sparse block. In most applications, this configuration requires thousands of times more storage than other configurations. Essbase must load the entire memory when it searches for a data value, which requires enormous memory.
Based on your knowledge of your company’s data, you have identified all your sparse and dense standard dimensions. Ideally, you have approximately equal numbers of sparse and dense standard dimensions. If not, you are probably working with a nontypical data set, and you must do more tuning to define the dimensions.
Essbase creates dense blocks that can fit into memory easily and creates a relatively small index, as shown in Figure 9, An Ideal Configuration with Combination of Dense and Sparse Dimensions. Your database runs efficiently using minimal resources.
The two-dimensional data blocks shown in Figure 10, Two-dimensional Data Block for Time and Accounts represent data values from the dense dimensions: Time and Accounts. The members in the Time dimension are J, F, M, and Q1. The members in the Accounts dimension are Rev, Exp, and Net.
Essbase creates data blocks for combinations of members in the sparse standard dimensions (providing that at least one data value exists for the member combination). The sparse dimensions are Region and Product. The members of the Region dimension are East, West, South, and Total US. The members in the Product dimension are Product A, Product B, Product C, and Total Product.
Figure 11, Data Blocks Created for Sparse Members on Region and Product shows 11 data blocks. No data values exist for Product A in the West and South, for Product B in the East and West, or for Product C in the East. Therefore, Essbase has not created data blocks for these member combinations. The data blocks that Essbase has created have few empty cells. This example effectively concentrates all sparseness into the index and concentrates all data into fully utilized blocks. This configuration provides efficient data storage and retrieval.
Next, consider a reversal of the dense and sparse dimension selections. In the following example, Region and Product are dense dimensions, and Time and Accounts are sparse dimensions.
In Figure 12, Two-Dimensional Data Block for Region and Product, the two-dimensional data blocks represent data values from the dense dimensions: Region and Product. In the West region, data is not available for Product A and Product B. Data is also not available for Total Product in US.
Essbase creates data blocks for combinations of members in the sparse standard dimensions (providing that at least one data value exists for the member combination). The sparse standard dimensions are Time and Accounts.
Figure 13, Data Blocks Created for Sparse Members on Time and Accounts shows 12 data blocks. Data values exist for all combinations of members in the Time and Accounts dimensions; therefore, Essbase creates data blocks for all member combinations. Because data values do not exist for all products in all regions, the data blocks have many empty cells. Data blocks with many empty cells store data inefficiently.
Essbase does not store data for attribute dimensions. Essbase dynamically calculates attribute dimension data when a user retrieves the data.
Consider the simplified database shown in Figure 14, A Multidimensional Database Outline. This database has three dimensions: Accounts, Time, and Scenario:
The intersection of one member from one dimension with one member from each of the other dimensions represents a data value. The example in Figure 15, Three-Dimensional Database has three dimensions (Accounts, Time, and Scenario); therefore, the dimensions and data values in the database can be represented as a cube.
As illustrated in Figure 16, Sales Slice of the Database, when you specify Sales, you are specifying the slice of the database that contains eight Sales values, where Sales intersect with Actual and Budget.
As illustrated in Figure 17, Actual, Sales Slice of the Database, when you specify Actual Sales, you are specifying the slice of the database that contains four Sales values, where Actual and Sales intersect.
A data value is stored in one cell in the database. To refer to a specific data value in a multidimensional database, you specify its member on each dimension. In Figure 18, Sales->Jan->Actual Slice of the Database, the cell containing the data value for Sales, Jan, Actual is shaded. The data value can also be expressed using the cross-dimensional operator (->) as Sales -> Actual -> Jan.
Essbase creates a data block for each unique combination of sparse standard dimension members (providing that at least one data value exists for the sparse dimension member combination). The data block represents all the dense dimension members for its combination of sparse dimension members.
Essbase creates an index entry for each data block. The index represents the combinations of sparse standard dimension members. It contains an entry for each unique combination of sparse standard dimension members for which at least one data value exists.
For example, in the Sample.Basic database outline shown in Figure 19, Product and Market Dimensions from the Sample.Basic Database, Product and Market are sparse dimensions.
If data exists for Caffeine Free Cola in New York, Essbase creates a data block and an index entry for the sparse member combination of Caffeine Free Cola (100-30) -> New York. If Caffeine Free Cola is not sold in Florida, Essbase does not create a data block or an index entry for the sparse member combination: Caffeine Free Cola (100-30) -> Florida.
Each unique data value can be considered to exist in a cell in a data block. When Essbase searches for a data value, it uses the index to locate the appropriate data block. Then, within the data block, it locates the cell containing the data value. The index entry provides a pointer to the data block. The index handles sparse data efficiently because it includes only pointers to existing data blocks.
Figure 20, Part of a Data Block for the Sample.Basic Database shows part of a data block for the Sample.Basic database. Each dimension of the block represents a dense dimension in the Sample.Basic database: Time, Measures, and Scenario. A data block exists for each unique combination of members of the Product and Market sparse dimensions (providing that at least one data value exists for the combination).
Each data block is a multidimensional array that contains a fixed, ordered location for each possible combination of dense dimension members. Accessing a cell in the block does not involve sequential or index searches. The search is almost instantaneous, resulting in optimal retrieval and calculation speed.
A (Dense) a1 a2 B (Dense) b1 b11 b12 b2 b21 b22 C (Dense) c1 c2 c3 D (Sparse) d1 d2 d21 d22 E (Sparse) e1 e2 e3
The block in Figure 21, Data Block Representing Dense Dimensions for d22 -> e3 represents the three dense dimensions from within the combination of the sparse members d22 and e3, from the preceding database outline. In Essbase, member combinations are denoted by the cross-dimensional operator. The symbol for the cross-dimensional operator is ->, so d22 -> e3 denotes the block for d22 and e3. The intersection of A, b21, and c3 is written as A -> b21 -> c3.
Data blocks, such as the one in Figure 21, Data Block Representing Dense Dimensions for d22 -> e3, may include cells that do not contain data values. A data block is created if at least one data value exists in the block. Essbase compresses data blocks with missing values on disk, expanding each block fully as it brings the block into memory. Data compression is optional but is enabled by default. See Data Compression.
By carefully selecting dense and sparse standard dimensions, you can ensure that data blocks do not contain many empty cells, minimizing disk storage requirements and improving performance. In Essbase, empty cells are known as #MISSING data.
A multidimensional database supports multiple views of data sets for users who need to analyze the relationships between data categories. Slicing the database in different ways gives you different perspectives of the data. For example, in Figure 22, Data for January, the slice for Jan examines all data values for which the Year dimension is fixed at Jan.
The slice in Figure 23, Data for February shows data for the month of Feb:
The slice in Figure 24, Data for Profit Margin shows data for profit margin:
How does your company use the data?
How will you build and order the dimensions?
Which data compression scheme will you use?
How will you create and order calculations?
See these topics:
Planning the development of your multidimensional database, see Case Study: Designing a Single-Server, Multidimensional Database.
Selecting dense and sparse dimensions, see Selection of Dense and Sparse Dimensions.
Loading data, see Understanding Data Loading and Dimension Building.
Compressing data and optimizing your database, see Data Compression.
Calculating your database, see Calculating Essbase Databases.