Aggregating an Aggregate Storage Cube

Essbase aggregate storage (ASO) cubes require no separate calculation step after data values are loaded into level 0 cells. Users can retrieve and view values that are aggregated dynamically at retrieval time. Aggregate storage cubes are designed for quick retrieval of data across large, sparse dimensions.

As Essbase cubes grow, retrievals must process more data values to satisfy the calculation needs of the queries. For faster retrieval, Essbase enables you to precalculate data values and store those values in aggregations. If cube size nears one million aggregate cells, you should strongly consider performing an aggregation. Depending on the usage environment, you can achieve performance benefits by precalculating smaller cubes as well. Use MaxL, or Jobs in the Essbase web interface, to calculate an ASO cube.

Aggregation-Related Essbase Terms

To understand how Essbase aggregate storage (ASO) cubes are calculated, it helps to learn the terminology about aggregation, including: aggregate cells, aggregate views, aggregations, and aggregation scripts.

Aggregate Cells

Cells for level 0 intersections across dimensions, without formulas, are called input cells, meaning that data values can be loaded to them. Higher-level cells involving members of the accounts dimension or dynamic hierarchies are always calculated at retrieval time.

All other higher-level intersections across dimensions are aggregate cells. Values for aggregate cells must be rolled up from lower-level values.

For example, in the ASOSamp.Basic outline, Price Paid > Curr Year > 1st Half > Portable Audio > CO is an aggregate cell; Original Price > Curr Year > Jan > Camcorders > CO is another aggregate cell.

Figure 38-7 Sample Aggregate Storage Outline


This image shows the outline of the ASOSamp.Sample database.

Aggregate cell values are calculated for each request, or they can be precalculated and stored on disk.

Aggregate Views

When Essbase defines which aggregate cells to precalculate and store, it does so by creating aggregate views. An aggregate view is a collection of aggregate cells. The collection is based on the levels of the members within each dimension.

For example, consider one aggregate view for the outline. This aggregate view includes aggregate cells for the following dimension levels:

  • Measures dimension, level 0

  • Years dimension, level 0

  • Time dimension, level 1 of hierarchy 0

  • Product dimension, level 2 of hierarchy 0

  • Geography dimensions, level 0

The example aggregate view is shown as 0, 0, 1/0, 2/0, 0.

Each dimension is shown, left to right, in its sequence in the outline. If a dimension contains hierarchies, the notation specifies the member level within its hierarchy. Hierarchies within a dimension are numbered top-down, starting with hierarchy 0.

The 0, 0, 1/0, 2/0, 0 aggregate view contains aggregate cells that include the following member intersections:

Original Price, Curr Year, Qtr1, Personal Electronics, CO 
Original Price, Curr Year, Qtr1, Personal Electronics, KS 
Original Price, Curr Year, Qtr1, Home Entertainment,   CO 
Original Price, Curr Year, Qtr1, Home Entertainment,   KS 
Original Price, Curr Year, Qtr2, Personal Electronics, CO 
Original Price, Curr Year, Qtr2, Personal Electronics, KS 
Original Price, Curr Year, Qtr2, Home Entertainment,   CO 
Original Price, Curr Year, Qtr2, Home Entertainment,   KS 
Original Price, Curr Year, Qtr3, Personal Electronics, CO 
Original Price, Curr Year, Qtr3, Personal Electronics, KS 
Original Price, Curr Year, Qtr3, Home Entertainment,   CO 
Original Price, Curr Year, Qtr3, Home Entertainment,   KS 
Original Price, Curr Year, Qtr4, Personal Electronics, CO 
Original Price, Curr Year, Qtr4, Personal Electronics, KS 
Original Price, Curr Year, Qtr4, Home Entertainment,   CO 
Original Price, Curr Year, Qtr4, Home Entertainment,   KS 
Original Price, Prev Year, Qtr1, Personal Electronics, CO 
Original Price, Prev Year, Qtr1, Personal Electronics, KS 
Original Price, Prev Year, Qtr1, Home Entertainment,   CO 
Original Price, Prev Year, Qtr1, Home Entertainment,   KS 
and so on...

Aggregations

Aggregations are consolidations, based on outline hierarchy, of level 0 data values. An aggregation contains one or more aggregate views. Essbase provides an intelligent aggregation process that selects aggregate views to be rolled up, aggregates them, and then stores the values for the cells in the selected views. If an aggregation includes aggregate cells dependent on level 0 values that are changed through a data load, the higher-level values are automatically updated at the end of the data load process.

The term aggregation is used for the aggregation process and the set of values stored as a result of the process.

Aggregation Scripts

Each aggregation script is a file that defines a particular selection of aggregate views to be materialized. See Working with Aggregation Scripts.

Performing ASO Aggregations

You can optionally store precalculated values as aggregations on an Essbase aggregate storage cube. Use MaxL or Jobs to perform aggregations. An aggregation involves view selection and materialization. Aggregations that prove useful can be saved using scripts, to bypass view selection next time they are needed.

The aggregation process has two phases:

  • Aggregate view selection.

  • Calculation and storage of values for the selected aggregate views. This phase is also called the materialization of the aggregation.

During the aggregate view selection phase, Essbase analyzes how calculating and storing various combinations of aggregate views might affect average query response time. As input to the analysis, you can define physical storage and performance requirements. You can also track data usage and provide the information to the analysis process. See Selecting Views Based on Usage.

Based on their usefulness and resource requirements, Essbase creates a list of aggregate views. Included with each view in the list is storage and performance information that applies when that aggregate view plus all other aggregate views listed above it are stored. You can choose to aggregate the listed views, select and aggregate a subset of the listed views, or rerun the selection process with different input criteria. You can also add to an aggregation the materialization of new views that are based on new selection criteria. See Fine-Tuning Aggregate View Selection.

Whether or not you materialize the selection, you can save the selection of aggregate views as an aggregation script. Aggregation scripts provide flexibility and can save time because they enable you to bypass the selection process if the same selection is needed again. See Working with Aggregation Scripts.

After the selection process is finished, the selected aggregate views are calculated when you materialize the selected aggregate views into an aggregation.

The following process is recommended for defining and materializing aggregations:

  1. After the outline is created or changed, load data values.

  2. Perform the default aggregation.

    Optional: Specify a storage stopping point.

  3. Materialize the suggested aggregate views and save the default selection in an aggregation script.

  4. Run the types of queries for which the aggregation is designed.

  5. If query time or aggregation time is too long, consider fine-tuning the aggregation.

  6. (Optional) Save the aggregation selection as an aggregation script.

To perform an ASO cube aggregation selection or materialization, you can use the Essbase web interface or these MaxL statements:

You can also configure Essbase to generate aggregate views automatically when needed.

Fine-Tuning Aggregate View Selection

Essbase's default view selection for aggregate storage (ASO) cubes provides excellent performance. However, accepting all aggregate views in the selection list does not guarantee optimum performance. You can optimize the view selection for your cube's environment and retrieval patterns.

For its default selection of aggregate views, Essbase analyzes stored hierarchies and assumes an equal chance that any aggregate cell will be retrieved. Essbase cannot account for external factors such as the amount of available memory at the time of a query. Available memory can be affected by such factors as the cache memory definition at retrieval time, or the memory other concurrent processes require.

You may want to track which data is most queried and include the results and alternate views in the aggregate view selection process. See Selecting Views Based on Usage.

As you tune and test aggregations, consider the following points:

  • Improving retrieval performance can increase disk storage costs and the time it takes to materialize the aggregation.

  • Tracking queries may result in a set of proposed aggregate views that provide better performance for some queries than for others. Selecting proposed aggregate views can considerably improve performance time of some queries with others experiencing little improvement—but never worse—as long as query type and frequency are close to the type and frequency of queries performed during the tracking period.

  • Optimizing aggregations may require an iterative, fine-tuning process.

To estimate the size of aggregate views, you can use the ASOSAMPLESIZEPERCENT configuration setting to specify the number of sample cells Essbase uses. The sample size is specified as a percentage of input-level data. The default, and minimum, sample size is 1 million (1,000,000) cells.

Essbase provides information to help you select and store the right balance of aggregate views for your ASO cube. Weigh this information against what you know about your retrieval requirements and environment. Use the following information to help you select aggregate views for an aggregation:

  • The maximum storage requirement

    You can specify a storage limit for selecting aggregate views in two ways:

    • When the aggregation selection is initiated, you specify a maximum storage stopping value. Aggregate views are selected until the specified storage limit is reached or there are no more views to select.

      When using the execute aggregate process MaxL statement with the stopping when total_size exceeds grammar, you can specify the maximum disk space of the resulting data files, as a ratio of the current cube size. For example, if the size of a cube is 1 GB, specifying the total size as 1.2 means that the size of the resulting data cannot exceed 20% of 1 GB, for a total of 1.2 GB.

    • After each analysis of the cube, Essbase displays information about the level 0 input cell view followed by a list of suggested aggregate views. Displayed by each aggregate view is a storage number that includes that aggregate view and all other aggregate views it depends on. You can consider this storage number as you select the aggregate views to be included in the aggregation.

  • The relative “Query Cost” performance improvement

    The Query Cost number that is displayed by each aggregate view in the list projects an average retrieval time for retrieving values from the associated aggregate view. The default view selection estimates the cost as the average of all possible queries. When using query tracking, the estimated cost is the average for all tracked queries. The cost number for a specific aggregate view can be different in different selection lists; for example, aggregate view 0, 0, 1/0, 2/0, 0 can show a different query cost in the default selection list than it would show in a selection that includes tracked queries in the analysis.

    To compute the percentage improvement, divide the query cost value for the aggregate view into the query cost value shown for storing only level 0 input cells.

  • Tracked usage

    Before running an aggregate view selection, you can turn on query tracking to determine which data is retrieved most often. After some period of cube activity, you can have Essbase include the usage statistics in the aggregation analysis process.

  • Aggregation time

    The time it takes to perform an aggregation after the selection process completes increases for each aggregate view materialized. To determine actual aggregation time, you must perform the aggregation.

The following process is recommended for fine-tuning aggregations:

  1. Perform the default aggregations described in Performing ASO Aggregations.

  2. Save the default selection in an aggregation script. See Working with Aggregation Scripts.

  3. Turn on query tracking. See Selecting Views Based on Usage.

  4. Have users perform their usual queries against the cube or perform the batch query operations for which the aggregation is being designed. Queries from all query tools are tracked.

  5. After sufficient time to capture data retrieval requirements, perform another aggregation including tracked data.

  6. Analyze the proposed list of aggregate views to be stored, and select the aggregate views that you determine provide the best balance of system resources and retrieval performance.

  7. Materialize the selected aggregate views and, if desired, save the selection in an aggregation script.

  8. Working with aggregation scripts and various selection criteria, repeat the process until you think you have the optimum selection of aggregate views for your situation.

Note:

To optimize aggregations for different retrieval situations, such as for generating reports or user queries, you may need to repeat the tuning process, creating an aggregation script for each situation.

Selecting Views Based on Usage

You can capture retrieval statistics for your Essbase aggregate storage (ASO) cube, and use these statistics to build aggregations tailored to retrieval patterns in your company. Essbase includes alternate hierarchies in its analysis of the cube when you enable query tracking to inform the aggregate view selection process.

Database usage for periodic reporting may be different than for ongoing user retrievals. To optimize different retrieval situations, consider tracking situational usage patterns and creating aggregation scripts for each situation.

Before you begin the aggregation selection process, ensure that query tracking is on and that it has been on long enough to capture representative usage. To enable it, use the MaxL statement alter database with the enable query_tracking grammar.

Query tracking holds query usage information in memory. Performing any of the following operations clears query usage information.

  • Loading or clearing data

  • Materializing or clearing an aggregation

  • Turning off query tracking

Query tracking remains on until you turn it off, stop the application, or change the outline.

Note:

Query tracking and query tracing are different. Query tracking captures user retrieval statistics against an aggregate storage cube, so that Essbase can make view-based optimizations to improve the performance of aggregations. It is on by default. Query tracing helps you monitor Essbase query performance metrics for block storage and hybrid cubes. It is off by default. If you enable it, the information you want to trace is written to the application log and/or a separate tracing log. Related configuration parameters: TRACE_REPORT, QUERYTRACE.

Understanding User-Defined View Selection

If you manage an aggregate storage (ASO) cube, you can apply view selection properties to stored hierarchies to restrict Essbase from choosing certain levels for aggregation. In the Essbase web interface, you apply these properties using the outline editor, in the options group labeled Aggregate level usage.

By default, Essbase uses internal mechanisms to decide how to create aggregations. User-defined view selection provides a way for you to influence default view selection and view selection based on query data.

Database managers may apply view selection properties to stored hierarchies to restrict Essbase from choosing certain levels for aggregation.

In the Essbase web interface, you apply these properties using the outline editor, in the options group labeled Aggregate level usage.


Aggregate level usage options in the outline editor for an aggregate storage cube include "Default aggregation," "Consider all levels," "Do not aggregate," "Consider top level only," and "Never aggregate to intermediate levels."

Note:

Secondary (alternate) hierarchies are either shared or attribute hierarchies.

Table 38-5 View Selection Properties

Property Effect

Default

On primary hierarchies, Essbase considers all levels. It does not aggregate on secondary hierarchies unless alternative roll-ups are enabled.

Consider all levels

Considers all levels of the hierarchy as potential candidates for aggregation. This is the default for primary hierarchies, but not for secondary hierarchies.

Do not aggregate

Does not aggregate along this hierarchy. All views selected by Essbase are at the input level.

Consider bottom level only

Applies only to secondary hierarchies. Essbase considers only the bottom level of this hierarchy for aggregation.

Consider top level only

Applies only to primary hierarchies. Considers only top level of this hierarchy for aggregation.

Never aggregate to intermediate levels

Applies to primary hierarchies. Selects top and bottom levels only.

Note:

The bottom level of an attribute dimension consists of the zero-level attribute members. When a secondary hierarchy is formed using shared members, the bottom level comprises the immediate parents of the shared members.

Essbase considers only views that satisfy the selected view selection properties.

You should be familiar with the dominant query patterns of cubes before changing default properties; preventing selection of certain views will make queries to those views slower while improving the speed of other queries. Similarly, enabling Consider All Levels on a secondary hierarchy may speed queries to that hierarchy while making other queries slower.

To define view selection properties, you can use these MaxL statements:

Working with Aggregation Scripts

Aggregation scripts represent specific aggregate view selections on an Essbase aggregate storage (ASO) cube. They enable you the flexibility to skip the aggregate view selection process that might otherwise be required after a data load.

About Aggregation Scripts

Each aggregation script represents a specific aggregate view selection against the cube.

Aggregation scripts can save you time. For example, after loading new data values you need not perform another aggregate view selection. You can speed the aggregation process by using the selection stored in an aggregation script to materialize the aggregation.

Aggregation scripts also give you flexibility. You can use them to save aggregate view selections optimized for different retrieval situations; for example, you can use one script to optimize retrievals in month-end reporting and another for daily retrieval requirements.

Aggregation scripts for an ASO cube become invalid when the selection it contains is invalid for the cube. Create aggregation scripts when you create aggregations. Do not manually modify aggregation script files, which may cause unpredictable results.

Creating Aggregation Scripts

Saved aggregation scripts enable you to split up the total aggregation process. You can materialize an aggregation at a different time than when the aggregate views for the aggregation are selected. The aggregation script contains information derived during the aggregate view selection phase.

To create an aggregation script, you can use these MaxL statements:

Aggregation scripts are stored in the cube directory as text files with the .csc extension, and are valid as long as the dimension level structure in the outline has not changed.

To avoid the potential clutter of invalid aggregation script files, manually delete aggregation scripts when they are no longer useful.

Executing Aggregation Scripts

Executing an aggregation script materializes the aggregate views specified within it. Although you can create multiple aggregation scripts, only one aggregation can be materialized at a time.

To execute an aggregation script, you can use the execute aggregate build MaxL statement.

Clearing Aggregations

Clearing aggregations from your Essbase aggregate storage (ASO) cube removes non level-0 data. After aggregations are cleared, user queries calculate retrieved values dynamically from the level 0 values.

At times you might want to manually clear aggregations from the disk; for example, to make the disk space available for disk-intensive operations. Clearing aggregations clears all data, except level 0 values, from the cube, releasing the disk area for other use.

To clear aggregations, you can use the alter database MaxL statement.

Replacing Aggregations

You can replace an aggregation by clearing the existing aggregation and materializing a different selection of aggregate views for your Essbase ASO cube. You can perform a new aggregate view selection and materialization process, or you can run an aggregation script.

Consider replacing the aggregation in the following situations:

  • To fine-tune the selection of aggregate views to improve performance.

  • To create aggregations optimized for different database retrieval situations, such as for generating reports or user queries.

  • To optimize an aggregation after significant growth in database size. Gradually, as the size of a cube increases, an aggregation can become less efficient. Consider replacing an aggregation when performance degradation is noticeable or when the cube size increases to about 150% of its original size.

  • To optimize aggregations for new or different operational environments, such as memory and disk resource availability changes.

You must replace an aggregation and associated aggregation scripts after the number of levels in a dimension has been changed or one or more dimensions have been added or removed from the outline.