Writing the Create Aggregates Specification Manually

You can write the script file manually, instead of using the Aggregate Persistence Wizard to create the script file. Oracle recommends that you use the Aggregate Persistence Wizard.

If you do not want the Oracle BI Server to modify your databases during aggregate creation, then you can specify this in the Aggregate Persistence Wizard by selecting the option Generate target DDL in a separate file. The Aggregate Persistence Wizard creates a DDL file, the prepare aggregates script, that you can use to create the empty aggregate tables. After this, you need to run the create aggregates script to populate the aggregate tables. This option provides some flexibility in case the database access to create tables is restricted. Note that you must run the prepare aggregates script before you run the create aggregates script.

This section contains the following topics:

What Constraints Are Imposed During the Create Process?

You can learn about constraints are imposed during the create process.

The following constraints are imposed during the create process:

  • Valid measures. A valid measure must have a valid aggregation rule. The following constraints apply to level-based measures:

    • If the level is grand total alias, then that dimension must not be present in the list of levels for that aggregate specification.

    • Any other level defined for this measure must be present in the list of levels for that aggregate specification.

    If the above constraints are not met, then the entire aggregate specification is discarded. In addition, a measure is ignored by the create process if any of the following conditions are true:

    • Measure is mapped to a session or repository variable.

    • Measure is a derived measure.

    • Measure has a default aggregation rule of None.

    Measures that are ignored do not necessarily affect the aggregate specification. The remaining measures are used to create the aggregate.

  • Valid levels. A valid level must have a valid primary key. If a level is invalid, the aggregate specification is be discarded. Also, attributes of a level or its primary key are ignored if any of the following conditions are true:

    • Attribute is mapped to session or repository variables.

    • Attributes are not from the same logical table.

  • Valid aggregate specification. A valid aggregate specification has the following properties:

    • Name length is between 1 and 18 characters (inclusive).

    • At least one valid level must be specified.

    • At least one valid measure must be specified.

    • Must have a valid connection pool.

    • Must have a valid output container (database/catalog/schema).

    • Connection pool and container must belong to the same database.

    • Only one level per dimension can be specified.

    • Measures can only be from the same fact table.

    • All logical components of the specification must be from the same subject area.

    An aggregate specification is ignored if the name already exists in the output container because level aggregates are scoped by the entire database. However, if different catalogs or schemas are specified for the same fact aggregate name, it is allowed to have multiple facts with the same name but different scope in the same database.

    Note that the aggregate specification is discarded if any dimension is not joined to a fact.

Writing the Create Aggregates Specification

All metadata names, except logical fact columns, are fully qualified.

There are two modes of operation: Create and Delete. It is strongly recommended that you place all aggregate specifications under a single Create Aggregates statement.

See Adding Surrogate Keys to Dimension Aggregate Tables.

Delete Statement for Aggregate Specification

Begin the script file with a Delete statement. It is essential to delete system-generated aggregates before creating new ones.

This ensures that data is consistent and removes invalid or incomplete aggregates before you run the Create operation. The following statement is the syntax for deleting aggregates:

Delete aggregates [list of fully qualified physical table names];

For example:

Delete aggregates "src".."INCR"."fact_1", "src".."INCR"."fact_2";

You can optionally include a comma-separated list of physical tables to delete. The tables you include must be system-generated, by a previous run of the aggregate creation script. Any dimension tables joined to listed fact tables are also deleted.

If a dimension table is joined to more than one fact table, it will not be deleted unless the other joined table is also listed.

In addition to fact tables, you can also use the Delete statement to delete orphan dimension tables, these are dimension tables that are not joined to any other fact table. Orphan dimension tables sometimes occur when aggregate creation fails.

The Delete statement also removes the logical key and logical column that were added to the time dimension's logical table when chronological keys were added for aggregate persistence.

Create Statement for Aggregate Specification

The Create statement should follow the Delete statement.

The following is the syntax for creating aggregates:

Create|Prepare aggregates 
aggr_name_1
for  logical_fact_table_1 [(logical_fact_column_1, logical_fact_column_2, count_distinct_logical_fact_column_1 as_raw_values, ...)]   
at levels (level_1, level_2, …)
using connection pool connection_pool_name_1
in schema_name_1
[ ,aggr_name_2
for logical_fact_table_3 [(logical_fact_column_5, logical_fact_column_2,…)]   
at levels (level_3, level_2, …)
using connection pool connection_pool_name_2
in schema_name_2] ;

The as_raw_values must accompany a count-distinct measure with a simple aggregate rule. A simple aggregate rule has only one rule, which is not dimension-based.

Multiple Aggregates in Aggregate Specification

Use these guideline to specify multiple aggregates in a single Create Aggregates statement.

  • Ensure that each of the multiple aggregate specifications are separated by a comma, and the entire aggregate creation script is terminated with a semicolon.

  • In this file, only one Delete Aggregates statement should be specified at the beginning. Make sure that only one delete is issued per ETL run, unless a reset is needed.

    Caution:

    Any aggregate scripts that are run after the first one should not have a Delete Aggregates statement, or all previously created aggregates are removed.

Where Clause for Aggregate Specification

You can add an optional Where clause to the Create statement.

The Where clause filters the data that you want to aggregate and creates fragmented aggregates, or aggregates for only a fragment of data in the base fact table. The Where clause also sets the Fragmentation content field located on the Logical Table Source dialog. In most cases, the creation of fragmented aggregates maximizes query acceleration while minimizing the cost of creating and maintaining the aggregate.

The following examples show when you would use fragmented aggregates:

  • If you are working with the time dimension and want your aggregates to include data only from the last three years.

  • If your company reports primarily on revenue in the United States and wants the aggregates to include only United States data.

The following is an example of a valid Create statement with the Where clause:

Create Aggregates
Revenue_By_Year
for  "sales"."sales" at levels("sales".timedim.year)   
where("sales"."time"."calendar year"=2007)
using connection pool aggrtarget.cp1
in aggrtarget..schema1

Logical Column Requirements

The logical column that you specify in the Where clause must meet the following requirements:

  • The logical column must belong to a dimension.

  • If the logical column belongs to a dimension included in the aggregate specification, then it must be at or above the level of the aggregate.

  • If you use operational_oper, then the logical column's data type must match the constant's data type in inlist.

    If you use inclusion_oper, then the logical column's data type must match all the constants' data type in inlist.

Where Clause Grammar

The grammar for the Where clause in the create aggregate specification is a subset of the Where grammar for a Logical SQL filter. The grammar the Oracle BI Server supports for the create aggregate specification differs slightly from the Logical SQL filter.

Review the following Create statement and its Where clause:

create aggregate aggr1 for fact1 at levels(11,12...) where (filter_list) using ....

The following are the acceptable grammar rules:

filter_list ::= filter logical_oper filter_list

    | filter

    | '(' filter_list ')'

filter ::= logical_column relational_oper constant

    | logical_column inclusion_oper '(' inlist ')'

relational_oper ::= '=' | '!=' | '<' | '>' | '<=' | '>=' | 'like'

inlist ::= constant ',' inlist

    | constant

logical_oper ::= 'and' | 'or'

inclusion_oper ::= 'in' | 'not in'

Adding Surrogate Keys to Dimension Aggregate Tables

The join option default between fact and level aggregate tables uses primary keys from the level aggregate.

If the primary key of the level is large and complex (composite of many columns), then the join to the fact table is expensive. A surrogate key is an artificially generated key, usually a number. A surrogate key, in the level aggregate table, simplifies this join and removes unnecessary columns (level primary key) from the fact table, resulting in a smaller-sized fact table. Adding surrogate keys to the dimension (level) aggregate tables can simplify joins to the fact tables and might improve query performance. Additionally, a surrogate key makes sure that each aggregate table has a unique identifier.

There might be cases in which a level is shared among multiple fact tables. One fact might use surrogate keys, and another might use primary keys from the dimension aggregate. The following are some options for resolving this issue:

  • Set a metadata property for levels that indicates whether to use surrogate keys or primary keys.

  • Always create a surrogate key for a level aggregate (relatively low cost operation). Then, decide later if the fact aggregate should join to it using a surrogate or primary key.

An alternative to specifying the join type for each dimension is to specify if surrogate keys should be used for the entire star. This would result in simpler syntax, but would also restrict the available user options and slow the aggregate creation process.

About the Create/Prepare Aggregates Syntax

The syntax for create/prepare aggregates contains the change for [Using_Surrogate_Key].

The surrogate key option can be specified for each level. If unspecified, the fact and dimension tables are joined using the primary key from the level aggregate.

Create|Prepare aggregates 
aggr_name_1
[file output_file_name]
for  logical_fact_table_1 [(logical_fact_column_1, logical_fact_column_2,…)]   
at levels (level_1 [Using_Surrogate_Key], level_2, …)
using connection pool connection_pool_name_1
in schema_name_1
[ ,aggr_name_2
for logical_fact_table_3 [(logical_fact_column_5, logical_fact_column_2,…)]   
at levels (level_3, level_2, …)
using connection pool connection_pool_name_2
in schema_name_2] ;

About Surrogate Key Output from Create/Prepare Aggregates

The changes to the current process are restricted to the physical metadata layer in the repository and the database.

When you use the Using_Surrogate_Key join option, the following describes the results:

  • For a level aggregate, the following occurs:

    • In the physical metadata, the following occurs:

      • The level aggregate table has a new column called levelName_upgradeIDSK, check for collisions. This is the surrogate key column for the dimension aggregate. Note that levelName is truncated if the total number of characters exceeds 18.

    • In the database, the following occurs:

      • The level aggregate table also has a corresponding column called levelName_upgradeIDSK. Again, levelName is truncated if the total number of characters exceeds 18.

      • It can be populated using RCOUNT().

  • For a fact aggregate, the following occurs:

    • In the physical metadata, the following occurs:

      • The fact aggregate table no longer contains columns from the level's primary keys.

      • Instead, a new column that corresponds to the level aggregate's surrogate key is added to the table.

      • The type of this column is identical to the level's surrogate key.

      • The column has the same name as that in the level aggregate (check for collisions).

      • The fact table and the level table are joined using this surrogate key only.

    • In the database, the following occurs:

      • The fact aggregate table also has the corresponding surrogate key.

      • It is populated using new capabilities to be available through Populate.