GROUP BY Clause

Specify the GROUP BY clause if you want the database to group the selected rows based on the value of expressions for each row and return a single row of summary information for each group. If the GROUP BY clause is omitted, the entire query result is treated as one group. If this clause contains CUBE or ROLLUP, the results contain superaggregate groupings in addition to the regular groupings.

The expressions in the GROUP BY clause can do the following:

  • Designate single or multiple columns.

  • Include arithmetic operations, the ROWID pseudocolumn, or NULL.

  • Include a date, a constant, or a dynamic parameter.

  • Include ROLLUP or CUBE clauses, where the results produce superaggregate groupings in addition to the regular groupings. Superaggregate groupings are calculated subtotals and totals returned with the regular groupings in the GROUP BY clause.

  • Include GROUPING SETS clause to distinguish which superaggregate groupings to produce.

When you use the GROUP BY clause, the select list can contain only aggregate functions and columns referenced in the GROUP BY clause. If the select list contains the construct *, TableName.*, or Owner.TableName.*, the GROUP BY clause must contain all columns that the * includes. NULL values are considered equivalent in grouping rows. If all other columns are equal, all NULL values in a column are placed in a single group.

Note:

To identify and potentially eliminate NULL groupings from the superaggregate groupings, use the GROUPING function. See "GROUPING" for information.

SQL Syntax

The general syntax for the GROUP BY clause is the following:

GROUP BY
 {Expression | RollupCubeClause | GroupingSetsClause }[,...]
 
GroupingSetsClause::=  GROUPING SETS 
 GroupingExpressionList | RollupCubeClause [,...]

RollupCubeClause
{ ROLLUP | CUBE } ( GroupingExpressionList ) }

GroupingExpressionList::=   
{ Expression | ExpressionList [, { Expression | ExpressionList } ] ...}
 
ExpressionList :: = ( Expression [, Expression ] ...)

Parameters

Parameter Description

Expression

Valid expression syntax. See Expressions for more information.

RollupCubeClause

The GROUP BY clause may include one or more ROLLUP or CUBE clauses.

GroupingSetsClause

The GROUP BY clause may include one or more GROUPING SETS clauses. The GROUPING SETS clause enables you to explicitly specify which groupings of data that the database returns. See "GROUPING SETS" for more information.

GroupingExpressionList

The GROUP BY clause can contain multiple expressions or expression lists.

ROLLUP GroupingExpressionList

The ROLLUP clause is used to generate super aggregate rows from groups. See "ROLLUP" for more information.

CUBE GroupingExpressionList

The CUBE clause groups selected rows based on the values of all possible combinations of the grouping columns in the CUBE clause. See "CUBE" for more information.

ExpressionList

A list of one or more expressions, each separated by a comma.

Examples

The following GROUP BY example sums the salaries for employees in the employees table and uses the SUBSTR expression to group the data by job function.

Command> SELECT SUBSTR (job_id, 4,10), SUM (salary) FROM employees
         GROUP BY SUBSTR (job_id,4,10);
< PRES, 24000 >
< VP, 34000 >
< PROG, 28800 >
< MGR, 24000 >
< ACCOUNT, 47900 >
< MAN, 121400 >
< CLERK, 133900 >
< REP, 273000 >
< ASST, 4400 >
9 rows found.

Query emp_details_view to select the first 10 departments and managers within the department and count the number of employees in the department with the same manager. Use the GROUP BY clause to group the result by department and manager.

Command> columnlabels on;
Command> SELECT first 10 department_id AS DEPT, manager_id AS MGR,
         COUNT(employee_id) AS NUM_EMP
         FROM emp_details_view
         GROUP BY (department_id, manager_id)
         ORDER BY department_id, manager_id;

DEPT, MGR, NUM_EMP
< 10, 101, 1 >
< 20, 100, 1 >
< 20, 201, 1 >
< 30, 100, 1 >
< 30, 114, 5 >
< 40, 101, 1 >
< 50, 100, 5 >
< 50, 120, 8 >
< 50, 121, 8 >
< 50, 122, 8 >
10 rows found.

ROLLUP, CUBE and GROUPING SETS Clauses

The following definitions describe how columns can be grouped within the ROLLUP, CUBE, and GROUPING SETS clauses:

  • Grouping column: A single column used in a GROUP BY clause. For example, in the following GROUP BY clause, X, Y, and Z are group columns.

    GROUP BY X, GROUPING SETS(Y, Z)
    
  • Composite Column: A list of grouping columns inside parentheses. For example, in the following clause, (C1, C2) and (C3, C4) are composite columns.

    GROUP BY ROLLUP( (C1,C2), (C3,C4), C5);
    
  • Grouping: Grouping is a single level of aggregation from within a grouping set. For example, in the following statement, (C1) and (C2, C3) are individual groupings.

    GROUP BY GROUPING SETS(C1, (C2,C3));
    
  • Grouping Set: A collection of groupings inside parentheses. For example, in the following statement, (C1, (C2, C3)) and (C2, (C4, C5)) are two individual grouping sets.

    GROUP BY GROUPING SETS(C1, (C2,C3)), GROUPING SETS(C2, (C4, C5));
    
  • Concatenated grouping sets: Separate multiple grouping sets with commas. The result is a cross-product of groupings from each grouping set.

  • Grand Total or Empty set column: A grand total or empty set grouping computes aggregation by considering all rows as one group. Grand totals are automatically provided in the results for ROLLUP and CUBE clauses; however, you request the grand total in the GROUPING SETS clause by providing empty parentheses, ( ).

Duplicate grouping columns can be used in ROLLUP, CUBE or GROUPING SETS. However, it does result in duplicated result rows.

The ROLLUP, CUBE and GROUPING SETS clauses are not supported in a materialized view definition.

The following sections describe the GROUPING SETS, ROLLUP, and CUBE clauses:

GROUPING SETS

The GROUPING SETS clause enables you to explicitly specify which groupings of data that the database returns. You specify only the desired groups by enclosing them within parentheses, so the database only generates the superaggregate summaries in which you are interested.

The following statement produces three groups: one group returns results for each gender and year columns, a second for a summary superaggregate for each of the months and the last result for the grand total.

SELECT GENDER, YEAR, MONTH,
   SUM (NUM_OF_STUDENTS) AS TOTAL
   FROM INSTRUCTOR_SUMMARY
   GROUP BY GROUPING SETS ((GENDER, YEAR),  -- 1ST GROUP
                           (MONTH), -- 2ND GROUP
                           ()); -- 3RD GROUP
 

You can combine multiple GROUPING SETS to generate specific combinations between the multiple GROUPING SETS. The following statement contains two GROUPING SETS clauses:

GROUP BY GROUPING SETS (YEAR, MONTH),
         GROUPING SETS (WEEK, DAY);

This is equivalent to the following GROUPING SETS statement:

GROUP BY GROUPING SETS (YEAR, WEEK), 
 (YEAR, DAY), 
 (MONTH, WEEK), 
 (MONTH, DAY);

When a GROUP BY clause has both regular grouping columns and a GROUPING SETS clause, the results are grouped by the regular grouping column as follows:

GROUP BY a, b GROUPING SETS(c, d);

This is equivalent to the following:

GROUP BY GROUPING SETS((a, b, c), (a, b, d));

The following example specifies the grouping sets of (region_name, country_name), state_province, and grand totals.

Command> SELECT region_name AS Region,
           country_name AS Country,
           state_province AS State,
            COUNT(employee_id) AS "Total Emp"
         FROM regions r, countries c, locations l, departments d, employees e
         WHERE r.region_id = c.region_id AND
           l.country_id = c.country_id AND
           d.location_id = l.location_id AND
           d.department_id = e.department_id
         GROUP BY grouping sets((region_name, country_name), state_province, ())
         ORDER BY region_name, state_province;
 
REGION, COUNTRY, STATE, TOTAL EMP
< Americas, Canada, <NULL>, 2 >
< Americas, United States of America, <NULL>, 68 >
< Europe, Germany, <NULL>, 1 >
< Europe, United Kingdom, <NULL>, 35 >
< <NULL>, <NULL>, Bavaria, 1 >
< <NULL>, <NULL>, California, 45 >
< <NULL>, <NULL>, Ontario, 2 >
< <NULL>, <NULL>, Oxford, 34 >
< <NULL>, <NULL>, Texas, 5 >
< <NULL>, <NULL>, Washington, 18 >
< <NULL>, <NULL>, <NULL>, 106 >
< <NULL>, <NULL>, <NULL>, 1 >
12 rows found.

ROLLUP

ROLLUP is used within the GROUP BY clause. When used with SUM, ROLLUP generates subtotals from most detailed level (all columns specified in the ROLLUP clause) to the grand total level, by removing one column at each level. These are known as superaggregate rows.

The ROLLUP clause returns the following:

  • Regular aggregate rows that would be produced by GROUP BY without using ROLLUP.

  • Subtotals following the grouping list specified in the ROLLUP clause. ROLLUP takes as its argument an ordered list of grouping columns. Each subtotal is created for the ordered list of grouping columns dropping the right-most grouping column until it reaches the grand total. For instance, if you specify GROUP BY ROLLUP(x, y, z), the returned superaggregate groups would be as follows: (x,y,z), (x,y), (x), ( ).

    The number of subtotals created is n+1 aggregate levels, where n is the number of grouping columns. For example, if there are three expressions (n=3) in the ROLLUP clause, then n+1 = 3+1, resulting in four groupings.

  • Grand total row.

You can group columns using composite columns inside parentheses. For example, in the following statement:

GROUP BY ROLLUP( (a, b), (c, d), e);

The (a, b) and (c, d) composite columns are treated as a unit when the database produces the ROLLUP results. In this example, the grouping sets returned are as follows: ((a, b), (c, d), e ), ((a, b), (c, d)), (a, b) and ().

You can execute several ROLLUP clauses within your SELECT statement, as follows:

SELECT C1, COUNT(*)
FROM T
GROUP BY ROLLUP(a, b), ROLLUP(c, d);

This is equivalent to the following statement:

SELECT C1, COUNT(*)
FROM T
GROUP BY GROUPING SETS((a, b),(a),()), 
 GROUPING SETS((c, d),(c), ());

This example queries the employees table to select the first 10 departments and return the number of employees under each manager in each department. Use ROLLUP to subtotal the number of employees in each department and return a grand total of all employees in the company.

Command> SELECT first 10 department_id AS Dept,
          manager_id AS Mgr,
         COUNT(employee_id) AS "Total emp"
         FROM employees
         GROUP BY ROLLUP(department_id, manager_id)
         ORDER BY department_id, manager_id;
 
DEPT, MGR, TOTAL EMP
< 10, 101, 1 >
< 10, <NULL>, 1 >
< 20, 100, 1 >
< 20, 201, 1 >
< 20, <NULL>, 2 >
< 30, 100, 1 >
< 30, 114, 5 >
< 30, <NULL>, 6 >
< 40, 101, 1 >
< 40, <NULL>, 1 >
10 rows found.

The following query returns the number of employees in each region, country and state or province. The rollup returns superaggregate rows for subtotals of all employees in each state or province and in each country and a grand total for all employees in the company. By combining the region and country as its own unit (within parentheses), the rollup does not return all employees for each region.

Command> SELECT region_name AS Region,
          country_name AS Country,
          state_province AS State,
          COUNT(employee_id) AS "Total Emp"
         FROM regions r, countries c, locations l, departments d, employees e
         WHERE r.region_id = c.region_id
         AND l.country_id = c.country_id
         AND d.location_id = l.location_id
         AND d.department_id = e.department_id
         GROUP BY rollup((region_name, country_name), state_province)
         ORDER BY region_name;

REGION, COUNTRY, STATE, TOTAL EMP
< Americas, Canada, Ontario, 2 > 
< Americas, United States of America, Texas, 5 > 
< Americas, United States of America, California, 45 > 
< Americas, United States of America, Washington, 18 > 
< Americas, Canada, <NULL>, 2 > 
< Americas, United States of America, <NULL>, 68 > 
< Europe, Germany, Bavaria, 1 > 
< Europe, United Kingdom, <NULL>, 1 > 
< Europe, United Kingdom, Oxford, 34 > 
< Europe, Germany, <NULL>, 1 > 
< Europe, United Kingdom, <NULL>, 35 > 
< <NULL>, <NULL>, <NULL>, 106 > 
12 rows found. 

CUBE

The CUBE clause groups the selected rows based on the values of all possible combinations of the grouping columns in the CUBE clause. It returns a single row of summary information for each group. For example, if there are three expressions (n=3) in the CUBE clause, then 2n = 23, resulting in eight groupings. Rows grouped on the values of n expressions are called regular rows; all others are called superaggregate rows. You can group using composite columns. For example, a commonly requested CUBE operation is for state sales subtotals on all combinations of month, state, and product sold.

If you specify GROUP BY CUBE(a, b, c), the resulting aggregate groupings generated are as follows: (a,b,c), (a,b), (a,c), (b,c), a, b, c, ( ).

This example returns the number of employees for each region and country, issue the following query.

Command> SELECT region_name AS Region,
          country_name AS Country,
          COUNT(employee_id) AS "Total Emp"
       FROM regions r, countries c, locations l, departments d, employees e
       WHERE r.region_id = c.region_id
       AND l.country_id = c.country_id
       AND d.location_id = l.location_id
       AND d.department_id = e.department_id
       GROUP BY CUBE(region_name, country_name)
       ORDER BY region_name;
 
REGION, COUNTRY, TOTAL EMP
< Americas, Canada, 2 >
< Americas, United States of America, 68 >
< Americas, <NULL>, 70 >
< Europe, Germany, 1 >
< Europe, United Kingdom, 35 >
< Europe, <NULL>, 36 >
< <NULL>, Canada, 2 >
< <NULL>, Germany, 1 >
< <NULL>, United Kingdom, 35 >
< <NULL>, United States of America, 68 >
< <NULL>, <NULL>, 106 >
11 rows found.