Understanding Data Cubes
This topic provides an overview of data cubes in cube collections.
A data cube is a container for one kind of data that you use in cube collections.
You can place the same data cube in more than one cube collection. For example, you can place the EMPLOYEE_EXPENSE data cube in both an EMPLOYEE_ANALYSIS cube collection and an INCOME_STATEMENT cube collection. To populate the data cubes with data from the database, you map fields to the data cubes within the cube collection's properties.
See Mapping Data Cubes and Dimensions to Fields.
Within PeopleSoft Pure Internet Architecture pages with analytic grids, end users view cube collections and drag and drop data cubes to view their relationships to other data cubes.
You create four different types of data cubes that you use within an analytic model:
Input data cubes.
Calculation data cubes.
Association data cubes.
Virtual data cubes.
The four types of data cubes are not mutually exclusive, but certain combinational restrictions apply. For example, consider that all calculation data cubes contain formulas, and association data cubes may or may not contain formulas. When an association data cube does contain a formula, it is considered to be a type of calculation data cube. Similarly, when an input data cube contains a formula, it is also considered to be a type of calculation data cube. Any of these data cubes may also be considered virtual data cubes if their values are not stored in the database.
The following table lists each type of data cube and specifies whether the data cube can contain a formula, whether the data cube can lack a formula, whether the data cube can be virtual, and whether the data cube can be nonvirtual.
Data Cube Type |
Formula Allowed? |
No Formula Allowed? |
Can Be Virtual? |
Can Be Nonvirtual? |
---|---|---|---|---|
Input |
Yes Note: When input data cubes contain formulas, they must use the INPUT built-in function. |
Yes |
No |
Yes |
Calculation |
Yes |
No |
Yes |
Yes |
Association |
Yes |
Yes |
Yes |
Yes |
Virtual |
Yes |
No |
Yes |
No |
Example: Working with Data Cubes and Dimensions
To be useful, a data cube must work with one or more dimensions. For example, suppose that you want to track the sales of multiple products in multiple regions. First, create an input data cube called SALES and dimensions called PRODUCTS and REGIONS. Next, attach the PRODUCTS dimension and REGIONS dimension to the SALES data cube.
Note: When a cube collection is mapped to either a Writable-only record or a record with the Readable and Writable attributes, all data cubes in the cube collection should share the same set of dimensions.
The combined string of all the attached dimension names should not exceed 256 characters.
Input data cubes receive their data from either the end user in the application or tables and views in the database. Input data cubes can exist in all types of cube collections, although they do not serve a purpose in intermediate/calculation cube collections. Use the INPUT built-in function to work with input cube data.
Note: Even though an input cube that uses either the INPUT built-in function is considered to be a type of calculation data cube, it would not serve a purpose in an intermediate/calculation cube collection.
See INPUT.
Calculation data cubes contain formulas that calculate data based on the data of other cubes. Calculation data cubes can exist in all types of cube collections.
Note: Even though an input cube that uses the INPUT built-in function is considered to be a type of calculation data cube, it would not serve a purpose in an intermediate/calculation cube collection.
An association data cube is a data cube that is formatted as a member of a dimension and has one or more attached dimensions. An association data cube associates two dimensions, enabling the end user to group members of one dimension into categories that are defined by the members of a different dimension. When an association data cube receives its values from dimension members, it can be considered to be a type of input data cube. When an association data cube receives its values from a calculation formula, it can be considered to be a type of calculation data cube.
Association data cubes can exist in all types of cube collections.
Example: Creating the DEPT_TO_REGION Association Data Cube
This example associates the DEPTID dimension with the REGION dimension. The following table lists the members that are included in each dimension.
DEPTID Dimension Members |
REGION Dimension Members |
---|---|
Note: In the application, the end users group or categorize these members by categories that are defined by the members of the REGION dimension. |
Note: In the application, the end users select members from this dimension to group members of the DEPTID dimension. |
AUS01 |
APAC |
AUS02 |
LATAM |
BRA01 |
NAMER |
CAN01 |
EUROP |
EUR01 |
N/A |
GBR01 |
N/A |
JAP01 |
N/A |
JAP02 |
N/A |
MEX01 |
N/A |
USA01 |
N/A |
USA02 |
N/A |
This association enables the end user to group the members of the DEPTID dimension into categories that are defined by the members of the REGION dimension.
To create the DEPT_TO_REGION association data cube:
Create a new data cube named DEPT_TO_REGION.
Format the data cube as a member of the REGION dimension.
This dimension contains the categories that the end user will use to group the members of the DEPTID dimension. These members appear in the right-hand column of the data cube's data. The end user can select these members from a drop-down list box.
Attach the DEPTID dimension to the DEPT_TO_REGION association data cube.
This dimension contains the members that the end user will group or categorize. These members appear in the left-hand column of the data cube's data.
A virtual data cube is a type of calculation data cube whose values are not saved to the database. Virtual data cubes can exist in intermediate/calculation and presentation cube collections.
The following table describes the characteristics of virtual data cubes and the resulting benefits to the analytic model.
Characteristic |
Benefit |
---|---|
Value data of virtual data cube is not stored in the database. |
Reduces:
|
The analytic calculation engine does not recalculate the virtual data cube unless the virtual data cube has nonvirtual dependents. |
Reduces recalculation time. |
The analytic calculation engine neither allocates memory nor calculates virtual data cubes until it receives a request for recalculation of the virtual data cube. |
Reduces memory consumption and recalculation time. |
See Defining General Data Cube Properties.
When an end user loads an analytic instance, the underlying analytic model's virtual data cubes do not contain data. However, as soon as the analytic calculation engine receives a request for a virtual cube's data, the analytic calculation engine calculates the entire cube and places the totals and all nonzero values in a temporary storage area. After this point, if the application requires the data, the analytic calculation engine retrieves the data from the temporary storage area.
Virtual cube data is recalculated for these circumstances:
The virtual data cube's data is displayed in an analytic grid.
The virtual data cube is used during a step of a recalculation.
The virtual data cube is accessed by a user function, even if the cube's data does not appear in the application.
An application uses a PeopleCode program to request data from the virtual data cube.
Note: Whenever a circumstance requires a recalculation of all the data in an analytic model (for example, when the application adds a member to a dimension), the temporary storage for all virtual data cubes is discarded. This storage is created again as needed.
Virtual data cubes have the following two restrictions. Otherwise, you can use virtual data cubes in the way you use nonvirtual data cubes.
Because a virtual data cube does not permanently store data, it must contain a formula to generate its data.
Note: Deleting the formula for a virtual data cube results in an invalid analytic model.
A virtual data cube cannot participate in recursive or circular systems because a virtual data cube's formula cannot refer to itself, either directly or indirectly.
This restriction applies because the first time a virtual cube's data is requested, the analytic calculation engine calculates and stores the data for the entire virtual data cube. In recursive or circular systems, the analytic calculation engine cannot calculate all of the data at the same time for any given data cube.
Note: If a virtual cube is part of a recursive or circular system, the analytic calculation engine generates an error value for all of the cube's values. Use the Recalculate function in the Analytic Calculation Engine classes to determine whether you violated this restriction. The Recalculate function returns a VIRTUAL error for the data cube cells that are affected.
PeopleSoft recommends that you create virtual data cubes when you expect the cubes to be large, sparse, and output-only, especially when a relatively small slice of the ordinary cubes is used in any given analytic instance ID. The analytic calculation engine takes a long time to recalculate nonvirtual cubes that are large, sparse, and output-only. When you make these cubes into virtual cubes, you eliminate them from the recalculation process and drastically reduce memory requirements. If an analytic instance uses only a small slice of the cube, the cube calculates on demand quickly and requires less memory because of the sparsity compression.
Virtual cubes are also useful for intermediate calculations that do not require permanent storage permanently, especially if these cubes would normally be large and sparse.
Note: You cannot use virtual cubes for intermediate calculations that are part of a recursive or circular system.
Do not create virtual cubes out of large, dense cubes that are displayed frequently and take a long time to recalculate. Such virtual cubes cause delays when an application requests data. To be certain of recalculation time, PeopleSoft recommends that you test whether using a virtual cube causes a significant delay in the generation of data.
Intermediate virtual cubes can count as output-only cubes, as long as they do not have nonvirtual dependents. For example, you can create formulas such as the following for output-only virtual cubes:
This formula is for the SALARY_BY_EMPLOYEE data cube:
GROUPSUM(RCD JOB, SALARY, BUDGET_PERIOD, BUS_UNIT, EMPID, LEDGER, VERSION)
This formula is for the BENEFITS_BY_EMPLOYEE data cube:
GROUPSUM(RCD JOB, BENEFITS, BUDGET_PERIOD, BUS_UNIT, EMPID, LEDGER, VERSION)
This formula is for the SALARY_AND_BENEFITS_BY_EMPLOYEE data cube:
SALARY_BY_EMPLOYEE + BENEFITS_BY_EMPLOYEE
Even though SALARY_BY_EMPLOYEE and BENEFITS_BY_EMPLOYEE are used by another virtual cube, they are not recalculated by the analytic calculation engine if there are no nonvirtual dependents. For this reason, you must write the final formula for the SALARY_AND_BENEFITS_BY_EMPLOYEE data cube in this way:
GROUPSUM(RCD_JOB, SALARY, BUDGET_PERIOD, BUS_UNIT, EMPID, LEDGER, VERSION) + GROUPSUM(RCD JOB, Benefits, Budget Period, Bus Unit, EmpID, Ledger, Version)