Oracle® Business Intelligence Server Administration Guide > Oracle BI Server SQL Reference > SQL Syntax and Semantics >
Rules for Queries with Aggregate Functions
The Oracle BI Server simplifies the SQL needed to craft aggregate queries. This section outlines the rules that the Oracle BI Server follows with respect to whether or not a query contains a GROUP BY clause and, if a GROUP BY clause is specified, what results you should expect from the query. The rules outlined in this section apply to all aggregates used in SQL statements (SUM, AVG, MIN, MAX, COUNT(*), and COUNT). Computing Aggregates of Baseline Columns
A baseline column is a column that has no aggregation rule defined in the Aggregation tab of the Logical Column dialog in the repository. Baseline columns map to nonaggregated data at the level of granularity of the logical table to which they belong. If you perform aggregation (SUM, AVG, MIN, MAX, or COUNT) on a baseline column through a SQL request, the Oracle BI Server calculates the aggregation at the level based on the following rules: - If there is no GROUP BY clause specified, the level of aggregation is grouped by all of the nonaggregate columns in the SELECT list.
- If there is a GROUP BY clause specified, the level of aggregation is based on the columns specified in the GROUP BY clause.
For example, consider the following query, where the column revenue is defined in the repository as a baseline column (no aggregation rules specified in the Logical Column > Aggregation tab): select year, product, sum(revenue)
from time, products, facts
YEAR PRODUCT SUM(REVENUE)
1998 Coke 500
1998 Pepsi 600
1999 Coke 600
1999 Pepsi 550
2000 Coke 800
2000 Pepsi 600
This query returns results grouped by year and product; that is, it returns one row for each product and year combination. The sum calculated for each row is the sum of all the sales for that product in that year. It is logically the same query as the following: select year, product, sum(revenue)
from time, products, facts
group by year, product
If you change the GROUP BY clause to only group by year, then the sum calculated is the sum of all products for the year, as follows: select year, product, sum(revenue)
from time, products, facts
group by year
YEAR PRODUCT SUM(REVENUE) 1998 Coke 1100 1998 Pepsi 1100 1999 Coke 1150 1999 Pepsi 1150 2000 Coke 1400 2000 Pepsi 1400
In this query result set, the sum of revenue is the same for each row corresponding to a given year, and that sum represents the total sales for that year. In this case, it is the sales of Coke plus the sales of Pepsi. If you add a column to the query requesting the COUNT of revenue, the Oracle BI Server calculates the number of records used to calculate the results for each group. In this case, it is a year, as shown in the following example: select year, product, sum(revenue), count(revenue) from time, products, facts group by year
YEAR PRODUCT SUM(REVENUE) COUNT(REVENUE) 1998 Coke 1100 6000 1998 Pepsi 1100 6000 1999 Coke 1150 6500 1999 Pepsi 1150 6500 2000 Coke 1400 8000 2000 Pepsi 1400 8000
Computing Aggregates of Measure Columns
A measure column is a column that has a default aggregation rule defined in the Aggregation tab of the Logical Column dialog in the repository. Measure columns always calculate the aggregation with which they are defined. If you perform explicit aggregation (SUM, AVG, MIN, MAX, or COUNT) on a measure column through a SQL request, you are actually asking for an aggregate of an aggregate. For these nested aggregates, the Oracle BI Server calculates the aggregation based on the following rules: - A request for a measure column without an aggregate function defined in a SQL statement is always grouped at the level of the nonaggregate columns in the SELECT list, regardless of whether the query specifies a GROUP BY clause.
- If there is no GROUP BY clause specified, the nested aggregate is a grand total of each group determined by all of the nonaggregate columns in the SELECT list.
- If there is a GROUP BY clause specified, the nested aggregation calculates the total for each group as specified in the GROUP BY clause.
For example, consider the following query, where the column SumOfRevenue is defined in the repository as a measure column with a default aggregation rule of SUM (SUM aggregation rule specified in the Aggregation tab of the Logical Column dialog): select year, product, SumOfRevenue, sum(SumOfRevenue) from time, products, facts
YEAR PRODUCT SUMofREVENUE SUM(SUMofREVENUE) 1998 Coke 500 3650 1998 Pepsi 600 3650 1999 Coke 600 3650 1999 Pepsi 550 3650 2000 Coke 800 3650 2000 Pepsi 600 3650
This query returns results grouped by year and product; that is, it returns one row for each product and year combination. The sum calculated for each row in the SumOfRevenue column is the sum of all the sales for that product in that year because the measure column is always at the level defined by the nonaggregation columns in the query. It is logically the same query as the following: select year, product, SumOfRevenue, sum(SumOfRevenue) from time, products, facts group by year, product
If you change the GROUP BY clause to only group by year, then the sum calculated in the SumOfRevenue column is the sum of each product for the year, and the sum calculated in the SUM(SumOfRevenue) column is total sales of all products for the given year, as follows: select year, product, SumOfRevenue, sum(SumOfRevenue) from time, products, facts group by year
YEAR PRODUCT SUMofREVENUE SUM(SUMofREVENUE) 1998 Coke 500 1100 1998 Pepsi 600 1100 1999 Coke 600 1150 1999 Pepsi 550 1150 2000 Coke 800 1400 2000 Pepsi 600 1400
In this result set, the sum calculated for each row in the SumOfRevenue column is the sum of all the sales for that product in that year because the measure column is always at the level defined by the nonaggregation columns in the query. The SUM(SumOfRevenue) is the same for each row corresponding to a given year, and that sum represents the total sales for that year. In this case, it is the sales of Coke plus the sales of Pepsi. Display Function Reset Behavior
A display function is a function that operates on the result set of a query. The display functions the Oracle BI Server supports (RANK, TOPn, BOTTOMn, PERCENTILE, NTILE, MAVG, MEDIAN, and varieties of standard deviation) are specified in the SELECT list of a SQL query. Queries that use display functions conform to the following rules: - If no GROUP BY clause is specified, the display function operates across the entire result set; that is, the grouping level for the display function is the same as for the query.
- If there is a GROUP BY clause specified, the display function resets its values for each group as specified in the GROUP BY clause.
For example, in the following query, SumOfRevenue is defined as a measure column with the default aggregation rule of SUM: select year, product, SumOfRevenue, rank(SumOfRevenue) from time, products, facts
YEAR PRODUCT SUMOFREVENUE RANK(SUMOFREVENUE) 1998 Coke 500 6 1998 Pepsi 600 2 1999 Coke 600 2 1999 Pepsi 550 5 2000 Coke 800 1 2000 Pepsi 600 2
In this query result set, there is no GROUP BY clause specified, so the rank is calculated across the entire result set. The query is logically the same query as the following: select year, product, SumOfRevenue, rank(SumOfRevenue)) from time, products, facts group by year, product
If you change the GROUP BY clause to only group by year, then the rank is reset for each year, as follows: select year, product, sum(revenue), rank(sum(revenue)) from time, products, facts group by year
YEAR PRODUCT SUM(REVENUE) RANK(SUM(REVENUE)) 1998 Coke 500 2 1998 Pepsi 600 1 1999 Coke 600 1 1999 Pepsi 550 2 2000 Coke 800 1 2000 Pepsi 600 2
In this result set, the rank is reset each time the year changes, and because there are two rows for each year, the value of the rank is always either 1 or 2. Alternative Syntax
When using an aggregate function, you can calculate a specified level of aggregation using BY within the aggregate function. If you do this, you do not need a GROUP BY clause. For example, the query: select year, product, revenue, sum(revenue by year) as year_revenue from softdrinks
will return the column year_revenue that displays revenue aggregated by year. The same syntax can be used with display functions. The query: select year, product, revenue, rank(revenue), rank(revenue by year) from softdrinks order by 1, 5
will calculate overall rank of revenue for each product for each year (each row in the entire result set) and also the rank of each product's revenue within each year. Using FILTER to Compute a Conditional Aggregate
In SQL query language, traditional aggregates, such as SUM, COUNT, MIN, and MAX are evaluated on a group of tuples (an ordered list of objects, each of a specified type), determined by the GROUP BY clause. All the aggregates specified in the SELECT clause of a query are evaluated over the same subset of tuples. Conditional aggregates extend SQL by restricting their input using a predicate. FILTER is an operator that restricts the set of rows used to compute its aggregate argument to rows that satisfy the USING condition. The FILTER operator is a logical SQL construct. It may be used in logical queries referring to the metadata, or in logical columns that use existing logical columns as the source. Syntax
Conditional aggregates are only notational concepts and they do not represent executable operators. Conditional aggregates are expressed in the form of a function as shown in the following statement: FILTER(<measure_expression> USING <boolean_expression>)
Where: - <measure_expression> is an expression that contains at least one measure. The following is a list of examples:
- The expression Sales + 1 is allowed if Sales is a measure.
- The expression productid is not allowed if productid is a scalar attribute.
- <boolean_expression> is a boolean expression (evaluates to TRUE or FALSE) that does not contain any measures. This expression may not contain any nested queries.
Example of the FILTER Function
The following is a simple example of the FILTER function: SELECT year,
FILTER(sales USING product = 'coke'),
FILTER(sales USING product = 'pepsi')
FROM logBeverages
After navigation, this query is executed as follows: SELECT year,
SUM(CASE WHEN product = 'coke' THEN sales),
SUM(CASE WHEN product = 'pepsi' THEN sales)
FROM physBeverages
WHERE product = 'coke' OR product = 'pepsi'
GROUP BY year
Error handling
In the example FILTER(X USING Y), error messages will be returned in the following situations: - The Y expression is not a boolean expression.
- The Y expression contains measures.
- FILTER is used in outer query block.
- Explicit aggregates are used in the X (measure) expression. For example, FILTER(COUNT(product), C).
|