Transform Tabular Data to Cubes

You can create a cube from tabular data by extracting fact tables from a relational database into an Excel file and then deploying the cube.

Patterns in the relationships between column headers and data are detected to deploy a multidimensional cube. The process for transforming tabular data into a structure that can be used in a multidimensional cube include these concepts:

  • Correlations between columns

  • Correlations between column types (such as date, number, and text)

  • Header text analysis for common prefixes and business intelligence-related terms (such as cost, price, account)

  • Report structure (such as merged cells and empty cells)

  • (Optional) Forced-designation headers that are used to explicitly define the shape of a cube and can include formulas to create measures dimensions.

  • Measures hierarchies (which can also be generated in Transform Data in Cube Designer).

Sample tabular data Excel files are provided to demonstrate the concepts of intrinsic and forced-designation headers.

When working with tabular data, you should analyze the data before you create a cube from it. Then, after the cube is created, you should determine if the cube outline is the way you want it.

You can create a cube from tabular data in the Essbase instance or in Cube Designer. See Create and Update a Cube from Tabular Data.

Use Intrinsic Headers to Transform Tabular Data to Cubes

Intrinsic headers use table.column format, which is demonstrated in the Sample_Table.xlsx file. In this sample file, the column headers have names such as Units, Discounts, Time.Month, Regions.Region, and Product.Brand.

The transformation process creates this hierarchy:

Units
Discounts
Fixed Costs
Variable Costs
Revenue
Time
   Month
   Quarter
Years
Regions
   Region
   Area
   Country
Channel
Product
   Brand
...

Use Forced Designation Headers to Transform Tabular Data Into Cubes

With forced-designation headers (hints), you can specify how tabular data should be handled during the transformation process.

For example, you can force a column to be treated as a measures or an attributes dimension. Most forced-designation headers require a keyword in brackets [ ]. Forced-designation headers are demonstrated in the templates Unstr_Hints.xlsx and Sample_Table.xlsx templates (available in the gallery).

Supported forced-designation header formats:

Table 6-1 Forced-designation Header Formats

Designation Header Format Example
Dimension generation

ParentGeneration.CurrentGeneration

Category.Product

Alias

ReferenceGeneration.Generation[alias]

Year.ShortYearForm[alias]

Attribute

ReferenceGeneration.AttributeDimName[attr]

Product.Discounted[attr]

Measures

MeasureName[measure]

Price[measure]

Measure generation

Parent.child[measure]

Top-most parent, if unique, is the account dimension name. If not unique, this member is auto–generated in the account dimension.

Measures.profit[measure]

profit.cost[measure]

cost.price[measure]

Measures formula

MeasureName[=formula_syntax;]

profit[="price"-"cost";]

profit[="D1"-"E1";]

price[=IF ("S1" == #MISSING) "R1"; ELSE "S1"; ENDIF;]

Measures consolidation

MeasureName[+] : add to parent

MeasureName[-] : subtract from parent

MeasureName[~] : no consolidation (equivalent to [measure])

The default is no consolidation.

price.shipment[+]

Consolidation can be defined only for measure dim

Formula consolidation

FormulaName[+=<formula>] : add to parent

FormulaName[-=<formula>] : subtract from parent

profit[+=price-cost]

cost.external[+=ExternalWork+ExternalParts]

UDA

ReferenceGeneration[uda]

Product[uda]

Skip

The column is not read.

ColumnName[skip]

column[skip]

Recur

The last column cell value is used for empty cells

Recur can be combined with other forced designations; include a comma separated list of forced designations within a bracket, ColumnName[designationA,recur].

ColumnName[recur]

Product[recur]

Product[uda,recur]

You can specify columns to be measures dimensions and you can use formulas to create measures dimensions with calculated data during the transformation process. The measures and measures formula forced-designation headers are specified with the name for the measures dimension, followed by a keyword or formula that is enclosed in square brackets and appended to the measures dimension name.

You can also consolidate measures and formulas by adding them to, or subtracting from, the parent.

To specify a column to be a measures dimension, in the column header, you enter the name of the measures dimension and then append the keyword [measure]. For example, you can specify the Units and Fixed Costs columns as measures dimensions by using this syntax: Units[measure] and Fixed Costs[measure].

The transformation process creates this hierarchy, with Units, Discounts, Fixed Costs, Variable Costs, and Revenue as measures:

Time
   Year
      Quarter
         Month
Regions
   Region
      Area
         Country
...
Product
   Brand
...
Units
Discounts
Fixed Costs
Variable Costs
Revenue

You can create a measure generation hierarchy ( parent.child[measure] hierarchy), in a similar way that you create regular dimension generations.

For example, to create a measure hierarchy, you enter Measures.profit[measure], profit.cost[measure] and cost.price[measure], which produces the following hierarchy:

Measures
      profit
            cost
                price

To create measures dimensions from formulas, in the column header, you enter the name of the measures dimension and then append the formula syntax in brackets [ ]. Within the brackets, start the formula with an equal sign (=) and end the formula with a semicolon (;). The arguments in the formula correspond to column names or cell coordinates, which must be enclosed in quotes. You can use Essbase calculation functions and commands in the formula.

Assume that you have an Excel file named Spend_Formulas.xlsx with tabular data on the SpendHistory worksheet, which has many columns. For example, there are dimensions named Year (column A) and Quarter (column B), and measures dimensions named Spend (column J) and Addressable Spend (column K). These columns have data. Then there are column headers that use formulas to create a measures dimensions. These columns do not have data. For example, to create the Total Spend dimension, the header in column O uses this Essbase formula: Measure.Total Spend[="Addressable Spend" + "Non-Addressable Spend";]. To create the AddSpendPercent dimension, the header in column P uses this Essbase formula: Measure.AddSpendPercent[="Addressable Spend"/"Total Spend";].

The transformation process creates this hierarchy:


Image of tabular data transformation preview.

The transformation process can also identify measures dimensions when a dimension name is duplicated. Assume that you have a column header that uses this formula, Meas.profit[="a1"-"b1";], which creates the Meas dimension. If, in another column header, you use the Meas dimension name as the top parent, such as Meas.Sales, the Sales dimension is also considered a measures dimension.