Oracle9i OLAP Services Developer's Guide to the OLAP DML
Release 1 (9.0.1)

Part Number A86720-01
Go To Documentation Library
Home
Go To Product List
Book List
Go To Table Of Contents
Contents
Go To Index
Index

Master Index

Feedback

Go to previous page Go to beginning of chapter Go to next page

Defining Data Objects, 6 of 9


Defining Variables That Handle Sparse Data Efficiently

Definition: Sparse data

A variable with sparse data is one in which a relatively high percentage of the variable's cells do not contain actual data. Such "empty," or NA, values take up storage space in the file.

There are two types of sparsity:

Definition: Composite

A composite is an internal object that is used to compactly store a variable with sparse data. A composite is a list of dimension-value combinations in which one value is taken from each of the dimensions on which the composite is based.

Composites can be named or unnamed:

Because the values in composites are automatically maintained, using composites is the recommended way of handling sparsity in your analytic workspace.

In general, why you should use composites

Using composites is one of the most important steps you can take to manage sparsity, which contributes to keeping analytic workspace size to a minimum and promoting good performance.

Specifically, why you should use named composites

Using a named composite in the variable's dimension list tells OLAP Services that those dimensions in the named composite are sparse dimensions on this variable, and that this composite is shared only with other variables that use the same named composite.

In general, using named composites is a good practice. This is because any variables that are defined with an unnamed composite and that have exactly the same dimensions in the same order will automatically share that unnamed composite. If these variables have different sparsity patterns, performance will suffer. Using named composites makes it easier to track which variables share the same composite.

Note: You can also manage sparsity by using a conjoint dimension to hold dimension-value combinations for which a given variable has data. However, because the values in composites are automatically maintained, using composites is the recommended way of handling sparsity in your analytic workspace.

How to use composites

To ensure that a variable uses a minimum of disk storage space, when you define a multidimensional variable, you can specify that a composite is used to store the data for one or more of the variable's dimensions.

First, define a named composite as an OLAP DML object by using the DEFINE COMPOSITE command. Then, define the variables by using the following syntax to include a named composite in each variable's dimension list.

composite-name <dims>

For example, suppose you define a composite named PRODDIST, whose dimensions include PRODUCT and DISTRICT, as shown in the following command.

DEFINE proddist COMPOSITE <product district>

Now, suppose you want to define a SALES variable, in which TIME will be the fastest-varying dimension and the PRODDIST composite will be the slowest-varying dimension, as shown in the following command.

DEFINE sales <time proddist<product district>>

Note that you should never use the SPARSE keyword with a composite. Essentially, you use the name of the composite instead of the SPARSE keyword.

Naming, renaming, and unnaming composites

You can use the RENAME command to:

What happens when you add data to a variable that uses a composite

When you define a multidimensional variable, you can specify that a composite is used to store the data for one or more of the variable's dimensions. Later, as you add data to the variable's dimensions for which you defined a composite, the following actions are taken:

  1. The composite is filled with dimension-value combinations.

  2. The data for the variable is stored using the composite's structure rather than the structure of the base dimensions.

For a variable that uses a composite, cells are created for only those dimension values that are used in the composite's dimension-value combinations; it does not create a variable cell for every value in the base dimensions. Data for a variable is stored in order, cell by cell, for each combination of dimension values. From the perspective of data storage, each combination of base dimension values in a composite is treated like the value of a regular dimension. This means that if you define a variable with one regular dimension and one composite, then it is stored like a two-dimensional variable.

Example: Defining a variable that uses a named composite

If your company does promotional marketing for certain products in some but not all districts, then your variable data will be sparse along the PRODUCT and DISTRICT dimensions. Therefore, suppose you define a composite named PRODDIST, whose base dimensions are PRODUCT and DISTRICT. There are dimension-value combinations in the composite only for those values that have data. For example, if you ran a promotion for tents but not canoes, then the composite will include the tents and city combinations, but not the canoes and city combinations.

The following command creates a variable called PROMO that is dimensioned by MONTH and a composite named PRODDIST, whose base dimensions are PRODUCT and DISTRICT.

define promo integer <month proddist<product district>>

The conceptual figure below illustrates the PROMO variable that is created by this command, the MONTH, PRODUCT and DISTRICT base dimensions, a named composite (PRODDIST) created from the PRODUCT and DISTRICT base dimensions, and the internal relation that is created between the PRODUCT and DISTRICT base dimensions and the PRODDIST composite.


The content of this graphic is described in surrounding text

The following is an example of the sequence in which the data for the PROMO variable might be stored.

Sequence in which data for the PROMO variable might be stored 

TENTS, BOSTON

JAN95 

TENTS, BOSTON

FEB95 

TENTS, BOSTON

MAR95 

. . . 

RACQUETS, CHICAGO

JAN95 

RACQUETS, CHICAGO

FEB95 

. . . 

257 

379 

428 

. . . 

635 

192 

. . . 

Defining a variable with a single-dimension composite

When you specify a composite for just one dimension in a variable definition, a single-dimension composite is created. The values of this composite will be a subset of the values in its base dimension.

It is a good idea to use single dimension composites when a variable will share the same dimensions as some other variables, but for a particular single dimension, the variable will only have data for some of that dimension's values.

Example: Defining a variable with a single-dimension composite

Suppose you have already defined a variable called ACTUAL with the dimensions LINE, DIVISION, and MONTH. The ACTUAL variable does not contain any NA values. You need to define a variable called BUDGET, which requires much less detail than ACTUAL. For example, BUDGET only needs 10 percent of the LINE dimension values, while ACTUAL needs all of them.

If you define BUDGET without setting sparsity, then all of the LINE dimension values are present for every MONTH and ORG, but 90 percent of the LINE dimension cells will have NA values.

To handle sparse data in this case, when you define BUDGET, specify a composite for only the LINE dimension as shown below.

define budget decimal <sparse <line> division month>

Related information

For more information, see the following table.

IF you want documentation about . . .  THEN see . . . 

working with sparse data, 

"Working with NA Values" 

using composites in expressions, 

"Using composites in expressions" 

defining composites, 

the DEFINE COMPOSITE command in the OLAP DML Reference 

defining conjoint dimensions, 

the DEFINE DIMENSION command in the OLAP DML Reference 

defining variables that use composites, 

the DEFINE VARIABLE command in the OLAP DML Reference 


Go to previous page Go to beginning of chapter Go to next page
Oracle
Copyright © 1996-2001, Oracle Corporation.

All Rights Reserved.
Go To Documentation Library
Home
Go To Product List
Book List
Go To Table Of Contents
Contents
Go To Index
Index

Master Index

Feedback