Using Aggregate Functions

You can use built in aggregate functions to find information such as a count, a sum, an average, a minimum, or a maximum.

The following functions are called SQL aggregate functions, because their semantics are similar to those in standard SQL: they work in conjunction with grouping and they aggregate values across the rows of a group.

  • long count(*)
  • long count(any*)
  • number sum(any*)
  • number avg(any*)
  • any_atomic min(any*)
  • any_atomic max(any*)

Note:

All SQL aggregate function names are case sensitive.

long count(*)

The count star function returns the number of rows in a group.

long count(any*)

The count function computes its input expression on each row in a group and counts all the non-NULL values returned by these evaluations of the input expression.

number sum(any*)

The sum function computes its input expression on each row in a group and sums up all the numeric values returned by these evaluations of the input expression. Any non-numeric values are skipped. However, if it can be determined at compile time that the input expression does not return any numeric values, an error is thrown. The resulting value has type long, double, or number, depending on the type of the input items: if there is at least one input item of type number, the result will be a number, otherwise if there is at least one item of type double or float, the result will be double, otherwise the result will be a long. Finally, if no numeric values are returned by sum’s input, the result is NULL.

number avg(any*)

The avg (average) function computes its input expression on each row in a group and sums up as well as counts all the numeric values returned by these evaluations of the input expression. Any nonnumeric values are skipped. However, if it can be determined at compile time that the input expression does not return any numeric values, an error is thrown. The resulting value is the division of the sum by the count. This value has type double, or number, depending on the type of the input items: if there is at least one input item of type number, the result will be a number, otherwise the result will be double. Finally, if no numeric values are returned by avg’s input, the result is NULL.

any_atomic min(any*)

The min function returns the minimum value among all the values returned by the evaluations of the input expression on each row in a group. More specifically: If it can be determined at compile time that the values returned by the input expression belong to a type for which an order comparison is not defined (i.e., RECORD, MAP, BINARY, or FIXED_BINARY), an error is thrown. Otherwise, the min value for each group is initialized to NULL. Next, let M be the current minimum value and N be the next input value. If N is a record, map, array, binary or fixed binary value, NULL, or the json null, it is skipped. If M is NULL, M is set to N. Else, if N is less than M, M is set to N. In comparing the two values M and N, the rules of the value comparison operator are used when the values are comparable according to the same rules. See Value Comparison Operators. When the values are not comparable, the following order is used: numeric values < timestamps < strings and enums < booleans. Notice that according to these rules, the min function will return NULL if and only if all the input values in a group are records, maps, arrays, binary or fixed binary values, NULL, or the json null.

any_atomic max(any*)

The max function returns the maximum value in all the sequences returned by the evaluations of the input expression on each row in a group. The specific rules are the same as for the min function, except that the current max value M will be replaced by the next input value N if N is not skipped and is greater than M.

Example 6-5 Aggregate Function

CREATE INDEX idx11 ON users (age);

SELECT 
age, count(*) AS count, avg(income) AS income 
FROM users 
GROUP BY age;

The above query groups users by their age, and for each age, return the number of users having that age and their average income.