ORDER BY Clause

The ORDER BY clause reorders the sequence of rows it receives as input. The relative order between any two input rows is determined by evaluating, for each row, the expressions listed in the ORDER BY clause and comparing the resulting values, taking into account the sort_spec associated with each ORDER BY expression.

Syntax

orderby_clause ::= ORDER BY 
   expression sort_spec 
   ("," expression sort_spec)*

sort_spec ::= [ASC|DESC] [NULLS (FIRST|LAST)]

Semantics

Each ordering expression must return at most one atomic value. If an ordering expression returns an empty sequence, the special value EMPTY is used as the returned value. If the SELECT expression includes GROUP BY as well, then the expressions in the ORDER BY must be the grouping expressions (in the GROUP BY clause, if any), or aggregate functions, or expressions that are built on top of grouping expression and/or aggregate functions.

sort_spec : A sort_spec specifies the "direction" of the sort (ascending or descending) and how to compare the special values NULL, JNULL, and EMPTY with the non-special values.
  • If NULLS LAST is specified, the special values will appear after all the non-special values.
  • If NULLS FIRST is specified, the special values will appear before all the non-special values.
The relative ordering among the 3 special values themselves is fixed:
  • if the direction is ASC, the ordering is EMPTY < JNULL < NULL;
  • otherwise the ordering is reversed.
Notice that in the grammar, sort_specs are optional.
  • If no sort_spec is given, the default is ASC order and NULLS LAST.
  • If only the sort order is specified, then NULLS LAST is used if the order is ASC, otherwise NULLS FIRST.
  • If the sort order is not specified, ASC is used.
Taking into account the above rules, the relative order between any two input rows is determined as follows. Let N be the number of order-by expressions and let Vi1, Vi2, … ViN be the atomic values (including EMPTY) returned by evaluating these expressions, from left to right, on a row Ri. Two rows Ri, Rj are considered equal if Vik is equal to Vjk for each k in 1, 2, …, N. In this context, NULLs are considered to be equal only to themselves. Otherwise, Ri is considered less than Rj if there is a pair Vim, Vjm such that:
  • m is 1, or Vik is equal to Vjk for each k in 1, 2, …, (m-1), and
  • Vim is not equal to Vjm, and
  • the m-th sort_spec specifies ascending order and Vim is less than Vjm, or
  • the m-th sort_spec specifies descending order and Vim is greater than Vjm

In the above rules, comparison of any two values Vik and Vjk, when neither of them is special and they are comparable to each other, is done according to the rules of the value-comparison operators defined in the Value Comparison Operators section.

If Vik and Vjk do not have comparable types (which, for example, - can arise when sorting by json fields), the following rule applies:
  • If the direction is ASC, the ordering is numeric items < timestamps < strings and enums < booleans.
  • Otherwise the ordering is reversed.

As with grouping, sorting can be index-based or generic. Index-based sorting is possible only if there is an index that sorts the rows in the desired order. More precisely, let e1, e2, …, eN by the ORDER BY expressions as they appear in the ORDER BY clause (from left to right). Then, there must exist an index (which may be the primary-key index or one of the existing secondary indexes) such that for each i in 1,2,...,N, ei matches the definition of the i-th index field. Furthermore, all the sort_specs must specify the same ordering direction and for each sort_spec, the desired ordering with respect to the special values must match the way these values are sorted by the index. In the current implementation, the special values are always sorted last in an index. So, if the sort order is ASC, all sort_specs must specify NULL LAST, and if the sort order is DESC, all sort_specs must specify NULLS FIRST.

Note:

If no appropriate index exists or is not selected by the query optimizer, the sorting will be generic. This implies that all query results must be fetched into the driver memory and cached there before they can be sorted. So, as with grouping, generic sorting can consume a lot of driver memory, and is therefore best avoided.
For both generic ORDER BY and GROUP BY, applications can programmatically specify how much memory such operations are allowed to consume at the client driver. We have specific methods for this functionality in each of the available language drivers as given below.

Table 6-1 APIS for Memory Consumption

Language Driver Get maximum memory consumption Set maximum memory consumption
Java getMaxMemoryConsumption() setMaxMemoryConsumption​(long v)
Python get_max_memory_consumption() set_max_memory_consumption (memory_consumption)
Node.js maxMemoryMB maxMemoryMB
Go GetMaxMemoryConsumption() MaxMemoryConsumption

Example 6-7 ORDER BY Clause

This example selects the id and the last name for users whose age is greater than 30, returning the results sorted by id. Sorting is possible in this case because id is the primary key of the users table.

SELECT id, lastName 
FROM users 
WHERE age > 30 
ORDER BY id;

Example 6-8 ORDER BY Clause

This example selects the id and the last name for users whose age is greater than 30, returning the results sorted by age. Sorting is possible only if there is a secondary index on the age column (or more generally, a multi-column index whose first column is the age column).

SELECT id, lastName 
FROM users 
WHERE age > 30 
ORDER BY age;

Example 6-9 ORDER BY Clause

The following example returns all the rows sorted by the first name.

SELECT id, firstName, lastName
    FROM users
    ORDER BY firstName;
 +----+-----------+----------+
 | id | firstName | lastName |
 +----+-----------+----------+
 | 10 | John      | Smith    |
 | 20 | Mary      | Ann      |
 | 30 | Peter     | Paul     |
 +----+-----------+----------+
3 rows returned

Example 6-10 ORDER BY Clause

The following example returns the firstName, lastName and income sorted by the income from highest to lowest.

SELECT firstName, lastName, income
    from users
    ORDER BY income DESC;
 +-----------+----------+--------+
 | firstName | lastName | income |
 +-----------+----------+--------+
 | Mary      | Ann      |  90000 |
 | Peter     | Paul     |  53000 |
 | John      | Smith    |  45000 |
 +-----------+----------+--------+
3 rows returned

Example 6-11 ORDER BY Clause

The following example groups the data by age and returns the number of users having that age and their average income ordered by their average income.

SELECT
    age, count(*), avg(income)
    FROM users
    GROUP BY age
    ORDER BY avg(income);
 +-----+----------+----------+
 | age | Column_2 | Column_3 |
 +-----+----------+----------+
 |  22 |        1 |  45000.0 |
 |  25 |        1 |  53000.0 |
 |  43 |        1 |  90000.0 |
 +-----+----------+----------+
3 rows returned

Example 6-12 ORDER BY Clause

In the following example, Query 1 returns the state and income sorted by income. However, if we want to group Query 1 by state, then we can use the GROUP BY clause. However, when a SELECT expression includes grouping, expressions in the SELECT and ORDER BY clauses must reference grouping expressions, aggregate functions or external variable only. So, to get the desired result, we need to rewrite Query 1 as given in Query 2.

Query 1:
 
SELECT
    u.address.state, u.income
    FROM users u
    ORDER BY u.income;
 +---------------+--------+
 |     state     | income |
 +---------------+--------+
 | NV            |  45000 |
 +---------------+--------+
 | CA            |  53000 |
 +---------------+--------+
 | CA            |  90000 |
 +---------------+--------+
3 rows returned
 
Query 2:
 
SELECT
    u.address.state, max(u.income)
    FROM users u
    GROUP BY u.address.state
    ORDER BY max(u.income);
 +---------------+----------+
 |     state     | Column_2 |
 +---------------+----------+
 | NV            |    45000 |
 +---------------+----------+
 | CA            |    90000 |
 +---------------+----------+
2 rows returned

Example 6-13 ORDER BY Clause

In the following example, the Query 1 returns the income and state of all the rows in the users table. The Query 2 gets the average income for each state.

Query 1:
 
SELECT
    u.address.state, u.income
    FROM users u;
 +---------------+--------+
 |     state     | income |
 +---------------+--------+
 | CA            |  53000 |
 +---------------+--------+
 | NV            |  45000 |
 +---------------+--------+
 | CA            |  90000 |
 +---------------+--------+
3 rows returned
 
Query 2:
 
SELECT
    u.address.state, avg(u.income)
    FROM users u
    GROUP BY u.address.state
    ORDER BY avg(u.income);
 +---------------+----------+
 |     state     | Column_2 |
 +---------------+----------+
 | NV            |  45000.0 |
 +---------------+----------+
 | CA            |  71500.0 |
 +---------------+----------+
2 rows returned