Sorting, Grouping & Limiting results

If you want to follow along with the examples, download the script baggageschema_loaddata.sql and execute it as shown below. This script creates the table used in the example and loads data into the table.

Start your KVSTORE or KVLite and open the SQL.shell.
java -jar lib/kvstore.jar kvlite -secure-config disable
java -jar lib/sql.jar -helper-hosts localhost:5000 -store kvstore
Using the load command, execute the script.
load -file baggageschema_loaddata.sql

Ordering results

Use the ORDER BY clause to order the results by any column, primary key or non-primary key.

Example 1: Sort the ticket number of all passengers by their full name.
SELECT bag.ticketNo, bag.fullName 
FROM BaggageInfo bag 
ORDER BY bag.fullName

Explanation: You are sorting the ticket number of passengers in the BaggageInfo schema based on the full name of the passengers in ascending order.

Output:
{"ticketNo":1762344493810,"fullName":"Adam Phillips"}
{"ticketNo":1762392135540,"fullName":"Adelaide Willard"}
{"ticketNo":1762376407826,"fullName":"Dierdre Amador"}
{"ticketNo":1762355527825,"fullName":"Doris Martin"}
{"ticketNo":1762324912391,"fullName":"Elane Lemons"}
{"ticketNo":1762350390409,"fullName":"Fallon Clements"}
{"ticketNo":1762341772625,"fullName":"Gerard Greene"}
{"ticketNo":176234463813,"fullName":"Henry Jenkins"}
{"ticketNo":1762383911861,"fullName":"Joanne Diaz"}
{"ticketNo":1762377974281,"fullName":"Kendal Biddle"}
{"ticketNo":1762355854464,"fullName":"Lisbeth Wampler"}
{"ticketNo":1762320369957,"fullName":"Lorenzo Phil"}
{"ticketNo":1762320569757,"fullName":"Lucinda Beckman"}
{"ticketNo":1762340683564,"fullName":"Mary Watson"}
{"ticketNo":1762330498104,"fullName":"Michelle Payne"}
{"ticketNo":1762348904343,"fullName":"Omar Harvey"}
{"ticketNo":1762399766476,"fullName":"Raymond Griffin"}
{"ticketNo":1762311547917,"fullName":"Rosalia Triplett"}
{"ticketNo":1762357254392,"fullName":"Teena Colley"}
{"ticketNo":1762390789239,"fullName":"Zina Christenson"}
{"ticketNo":1762340579411,"fullName":"Zulema Martindale"}
Example 2: Fetch the passenger details( full name, tag number) by the last seen time ( latest first) for passengers (sorted by their name) whose last seen station is MEL.
SELECT bag.fullName, bag.bagInfo[].tagNum, 
bag.bagInfo[].lastSeenTimeGmt
FROM BaggageInfo bag 
WHERE bag.bagInfo[].lastSeenStation=any "MEL"
ORDER BY bag.bagInfo[].lastSeenTimeGmt DESC
Explanation: You first filter the data in the BaggageInfo table based on the last seen station and you sort the filtered results based on the last seen time and the full name of the passengers in descending order. You do this using the ORDER BY clause.

Note:

You can use more than one column to sort the output of the query.
Output:
{"fullName":"Adam Phillips","tagNum":"17657806255240","lastSeenTimeGmt":"2019-02-01T16:13:00Z"}
{"fullName":"Zina Christenson","tagNum":"17657806228676","lastSeenTimeGmt":"2019-02-04T10:08:00Z"}
{"fullName":"Joanne Diaz","tagNum":"17657806292518","lastSeenTimeGmt":"2019-02-16T16:13:00Z"}
{"fullName":"Zulema Martindale","tagNum":"17657806288937","lastSeenTimeGmt":"2019-02-25T20:15:00Z"}

Limit and offset results

Use the LIMIT clause to limit the number of results returned from a SELECT statement. For example, if there are 1000 rows in a table, limit the number of rows to return by specifying a LIMIT value. It is recommended to use LIMIT and OFFSET with an ORDER BY clause. Otherwise, the results are returned in a random order, producing unpredictable results.

A good use-case/example of using LIMIT and OFFSET is the application paging of results. Say for example your application wants to show 4 results per page. You can use limit and offset to implement stateless paging in the application. If you are showing n ( say 4 ) results per page, then the results for page m (say 2) are being displayed, then offset would be (n*m-1) which is 4 in this example and the limit would be n(which is 4 here).

Example 1: Your application can show 4 results on a page. Fetch the details fetched by your application in the first page for passengers whose last seen station is JTR.
SELECT $bag.fullName, $bag.bagInfo.tagNum, $flt_time
FROM BaggageInfo $bag, 
$bag.bagInfo[].lastSeenTimeGmt $flt_time
WHERE $bag.bagInfo[].lastSeenStation=any "JTR" 
ORDER BY $flt_time LIMIT 4

Explanation: You filter the data in the BaggageInfo table based on the last seen station and you sort the result based on the last seen time. You use an unnest array to flatten your data. That is the bagInfo array is flattened and the last seen time is fetched. You need to just display the first 4 rows from the result set.

Output:
{"fullName":"Michelle Payne","tagNum":"17657806247861","flt_time":"2019-02-02T23:59:00Z"}
{"fullName":"Gerard Greene","tagNum":"1765780626568","flt_time":"2019-03-07T16:01:00Z"}
{"fullName":"Lorenzo Phil","tagNum":["17657806240001","17657806340001"],"flt_time":"2019-03-12T15:05:00Z"}
{"fullName":"Lucinda Beckman","tagNum":"17657806240001","flt_time":"2019-03-12T15:05:00Z"}
Example 2: Your application can show 4 results on a page. Fetch the details fetched by your application in the second page for passengers whose last seen station is JTR.
SELECT $bag.fullName, $bag.bagInfo.tagNum, $flt_time 
FROM BaggageInfo $bag, 
$bag.bagInfo[].lastSeenTimeGmt $flt_time 
WHERE $bag.bagInfo[].lastSeenStation=any "JTR" 
ORDER BY $flt_time LIMIT 4 OFFSET 4

Explanation: You filter the data in the BaggageInfo table based on the last seen station and you sort the result based on the last seen time. You use an unnest array to flatten your data. You need to display the contents of the second page, so you set an OFFSET 4. Though you LIMIT to 4 rows, only one row is displayed as the total result set is only 5. The first few are skipped and the fifth one is displayed.

Output:
{"fullName":"Lorenzo Phil","tagNum":["17657806240001","17657806340001"],
"flt_time":"2019-03-12T16:05:00Z"}

Grouping results

Use the GROUP BY clause to group the results by one or more table columns. Typically, a GROUP BY clause is used in conjunction with an aggregate expression such as COUNT, SUM, and AVG.

Example 1: Display the number of bags for each reservation made.
SELECT bag.confNo, 
count(bag.bagInfo) AS TOTAL_BAGS 
FROM BaggageInfo bag 
GROUP BY bag.confNo

Explanation: Every passenger has one reservation code (confNo). A passenger can have more than one baggage. Here you group the data based on the reservation code and you get the count of the bagInfo array which gives the number of bags per reservation.

Output:
{"confNo":"FH7G1W","TOTAL_BAGS":1}
{"confNo":"PQ1M8N","TOTAL_BAGS":1}
{"confNo":"XT6K7M","TOTAL_BAGS":1}
{"confNo":"DN3I4Q","TOTAL_BAGS":1}
{"confNo":"QB1O0J","TOTAL_BAGS":1}
{"confNo":"TX1P7E","TOTAL_BAGS":1}
{"confNo":"CG6O1M","TOTAL_BAGS":1}
{"confNo":"OH2F8U","TOTAL_BAGS":1}
{"confNo":"BO5G3H","TOTAL_BAGS":1}
{"confNo":"ZG8Z5N","TOTAL_BAGS":1}
{"confNo":"LE6J4Z","TOTAL_BAGS":1}
{"confNo":"XT1O7T","TOTAL_BAGS":1}
{"confNo":"QI3V6Q","TOTAL_BAGS":2}
{"confNo":"RL3J4Q","TOTAL_BAGS":1}
{"confNo":"HJ4J4P","TOTAL_BAGS":1}
{"confNo":"CR2C8MY","TOTAL_BAGS":1}
{"confNo":"LN0C8R","TOTAL_BAGS":1}
{"confNo":"MZ2S5R","TOTAL_BAGS":1}
{"confNo":"KN4D1L","TOTAL_BAGS":1}
{"confNo":"MC0E7R","TOTAL_BAGS":1}
Example 2: Select the total baggage originating from each airport (excluding the transit baggage).
SELECT $flt_src as SOURCE, 
count(*) as COUNT 
FROM BaggageInfo $bag, 
$bag.bagInfo.flightLegs[0].fltRouteSrc $flt_src 
GROUP BY $flt_src

Explanation: You want to get the total count of baggage originating from each airport. However, you don't want to consider the airports that are part of the transit. So you group the data with the flight source values of the first record of the flightLegs array( as the first record is the source). You then determine the count of baggage.

Output:
{"SOURCE":"SFO","COUNT":6}
{"SOURCE":"BZN","COUNT":1}
{"SOURCE":"GRU","COUNT":1}
{"SOURCE":"LAX","COUNT":1}
{"SOURCE":"YYZ","COUNT":1}
{"SOURCE":"MEL","COUNT":1}
{"SOURCE":"MIA","COUNT":4}
{"SOURCE":"MSQ","COUNT":2}
{"SOURCE":"MXP","COUNT":2}
{"SOURCE":"JFK","COUNT":3}

Aggregating results

Use the built in aggregate and sequence aggregate functions to find information such as a count, a sum, an average, a minimum, or a maximum.

Example 1: Find the total number of checked bags that are estimated to arrive at the LAX airport at a particular time.
SELECT $estdate as ARRIVALDATE, 
count($flight) AS COUNT
FROM BaggageInfo $bag, 
$bag.bagInfo.flightLegs.estimatedArrival $estdate,
$bag.bagInfo.flightLegs.flightNo $flight,  
$bag.bagInfo.flightLegs.fltRouteDest $flt_dest 
WHERE $estdate =any "2019-02-01T11:00:00Z" AND $flt_dest =any "LAX"
GROUP BY $estdate

Explanation: In an airline baggage tracking application, you can get the total count of checked bags that are estimated to arrive at a particular airport and time. For each flight leg, the estimatedArrival field in the flightLegs array of the BaggageInfo table contains the arrival time of the checked bags and the fltRouteDest field contains the destination airport code. In the above query, to determine the total number of checked bags arriving at the LAX airport at a given time, you first group the data with the estimated arrival time value using the GROUP BY clause. From the group, you select only the rows that have the destination airport as LAX. You then determine the bag count for the resultant rows using the count function.

Here, you can compare the string-formatted dates in ISO-8601 format due to the natural sorting order of strings without having to cast them into timestamp data types. The $bag.bagInfo.flightLegs.estimatedArrival and $bag.bagInfo.flightLegs.fltRouteDest are sequences. Since the comparison expression '=' cannot operate on sequences of more than one item, the sequence comparison operator '=any' is used instead to compare the estimatedArrival and fltRouteDest fields.

Output:
{"ARRIVALDATE":"2019-02-01T11:00:00Z","COUNT":2}
Example 2: Display an automated message regarding the number of checked bags, travel route, and flight count to a passenger in the airline baggage tracking application.
SELECT fullName,
b.baginfo[0].routing,
size(baginfo) AS BAGS,
    CASE
        WHEN seq_count(b.bagInfo[0].flightLegs.flightNo) = 1
        THEN "You have one flight to catch"
        WHEN seq_count(b.bagInfo[0].flightLegs.flightNo) = 2
        THEN "You have two flights to catch"
        WHEN seq_count(b.bagInfo[0].flightLegs.flightNo) = 3
        THEN "You have three flights to catch"
        ELSE "You do not have any travel listed today"
    END AS FlightInfo
FROM BaggageInfo b
WHERE ticketNo = 1762320369957

Explanation: In the airline baggage tracking application, it is helpful to display a quick look-up message regarding the flight count, number of checked bags, and routing details of an upcoming travel for a passenger. The bagInfo array holds the checked bag details of the passenger. The size of the bagInfo array determines the number of checked bags per passenger. The flightLegs array in the bagInfo includes the flight details corresponding to each travel leg. The routing field includes the airport codes of all the travel fragments. You can determine the number of flights by counting the flightNo fields in the flightLegs array. If a passenger has more than one checked bag, there will be more than one element in the bagInfo array, one for each bag. In such cases, the flightLegs array in all the elements of the bagInfo field of a passenger data will contain the same values. This is because the destination of all the checked bags for a passenger will be the same. While counting the flightNo fields, you must consider only one element of the bagInfo array to avoid duplication of results. In this query, you consider only the first element, that is, bagInfo[0]. As the flightLegs array has a flightNo field for each travel fragment, it is a sequence and you determine the count of the flightNo fields per passenger using the seq_count function.

You use the CASE statement to introduce different messages based on the flight count. For ease of use, only three transits are considered in the query.

Output:
{"fullName":"Lorenzo Phil","routing":"SFO/IST/ATH/JTR","BAGS":2,"FlightInfo":"You have three flights to catch"}