Sorting, Grouping & Limiting results
If you want to follow along with the examples, download the script baggageschema_loaddata.sql and execute it as shown below. This script creates the table used in the example and loads data into the table.
java -jar lib/kvstore.jar kvlite -secure-config disable
java -jar lib/sql.jar -helper-hosts localhost:5000 -store kvstore
load
command, execute the
script.load -file baggageschema_loaddata.sql
Ordering results
Use the ORDER BY clause to order the results by any column, primary key or non-primary key.
SELECT bag.ticketNo, bag.fullName
FROM BaggageInfo bag
ORDER BY bag.fullName
Explanation: You are sorting the ticket number of passengers in the
BaggageInfo
schema based on the full name of the passengers in
ascending order.
{"ticketNo":1762344493810,"fullName":"Adam Phillips"}
{"ticketNo":1762392135540,"fullName":"Adelaide Willard"}
{"ticketNo":1762376407826,"fullName":"Dierdre Amador"}
{"ticketNo":1762355527825,"fullName":"Doris Martin"}
{"ticketNo":1762324912391,"fullName":"Elane Lemons"}
{"ticketNo":1762350390409,"fullName":"Fallon Clements"}
{"ticketNo":1762341772625,"fullName":"Gerard Greene"}
{"ticketNo":176234463813,"fullName":"Henry Jenkins"}
{"ticketNo":1762383911861,"fullName":"Joanne Diaz"}
{"ticketNo":1762377974281,"fullName":"Kendal Biddle"}
{"ticketNo":1762355854464,"fullName":"Lisbeth Wampler"}
{"ticketNo":1762320369957,"fullName":"Lorenzo Phil"}
{"ticketNo":1762320569757,"fullName":"Lucinda Beckman"}
{"ticketNo":1762340683564,"fullName":"Mary Watson"}
{"ticketNo":1762330498104,"fullName":"Michelle Payne"}
{"ticketNo":1762348904343,"fullName":"Omar Harvey"}
{"ticketNo":1762399766476,"fullName":"Raymond Griffin"}
{"ticketNo":1762311547917,"fullName":"Rosalia Triplett"}
{"ticketNo":1762357254392,"fullName":"Teena Colley"}
{"ticketNo":1762390789239,"fullName":"Zina Christenson"}
{"ticketNo":1762340579411,"fullName":"Zulema Martindale"}
SELECT bag.fullName, bag.bagInfo[].tagNum,
bag.bagInfo[].lastSeenTimeGmt
FROM BaggageInfo bag
WHERE bag.bagInfo[].lastSeenStation=any "MEL"
ORDER BY bag.bagInfo[].lastSeenTimeGmt DESC
BaggageInfo
table based on the last seen station and you sort the
filtered results based on the last seen time and the full name of the passengers in
descending order. You do this using the ORDER BY clause.
Note:
You can use more than one column to sort the output of the query.{"fullName":"Adam Phillips","tagNum":"17657806255240","lastSeenTimeGmt":"2019-02-01T16:13:00Z"}
{"fullName":"Zina Christenson","tagNum":"17657806228676","lastSeenTimeGmt":"2019-02-04T10:08:00Z"}
{"fullName":"Joanne Diaz","tagNum":"17657806292518","lastSeenTimeGmt":"2019-02-16T16:13:00Z"}
{"fullName":"Zulema Martindale","tagNum":"17657806288937","lastSeenTimeGmt":"2019-02-25T20:15:00Z"}
Limit and offset results
Use the LIMIT
clause to limit the number of results returned from a
SELECT statement. For example, if there are 1000 rows in a table, limit the number of
rows to return by specifying a LIMIT value. It is recommended to use LIMIT and OFFSET
with an ORDER BY clause. Otherwise, the results are returned in a random order,
producing unpredictable results.
A good use-case/example of using LIMIT and OFFSET is the application paging of results. Say for example your application wants to show 4 results per page. You can use limit and offset to implement stateless paging in the application. If you are showing n ( say 4 ) results per page, then the results for page m (say 2) are being displayed, then offset would be (n*m-1) which is 4 in this example and the limit would be n(which is 4 here).
SELECT $bag.fullName, $bag.bagInfo.tagNum, $flt_time
FROM BaggageInfo $bag,
$bag.bagInfo[].lastSeenTimeGmt $flt_time
WHERE $bag.bagInfo[].lastSeenStation=any "JTR"
ORDER BY $flt_time LIMIT 4
Explanation: You filter the data in the BaggageInfo
table based on the last seen station and you sort the result based on the last seen
time. You use an unnest array to flatten your data. That is the bagInfo
array is flattened and the last seen time is fetched. You need to just display the first
4 rows from the result set.
{"fullName":"Michelle Payne","tagNum":"17657806247861","flt_time":"2019-02-02T23:59:00Z"}
{"fullName":"Gerard Greene","tagNum":"1765780626568","flt_time":"2019-03-07T16:01:00Z"}
{"fullName":"Lorenzo Phil","tagNum":["17657806240001","17657806340001"],"flt_time":"2019-03-12T15:05:00Z"}
{"fullName":"Lucinda Beckman","tagNum":"17657806240001","flt_time":"2019-03-12T15:05:00Z"}
SELECT $bag.fullName, $bag.bagInfo.tagNum, $flt_time
FROM BaggageInfo $bag,
$bag.bagInfo[].lastSeenTimeGmt $flt_time
WHERE $bag.bagInfo[].lastSeenStation=any "JTR"
ORDER BY $flt_time LIMIT 4 OFFSET 4
Explanation: You filter the data in the BaggageInfo
table based on the last seen station and you sort the result based on the last seen
time. You use an unnest array to flatten your data. You need to display the contents of
the second page, so you set an OFFSET 4. Though you LIMIT to 4 rows, only one row is
displayed as the total result set is only 5. The first few are skipped and the fifth one
is displayed.
{"fullName":"Lorenzo Phil","tagNum":["17657806240001","17657806340001"],
"flt_time":"2019-03-12T16:05:00Z"}
Grouping results
Use the GROUP BY clause to group the results by one or more table columns. Typically, a GROUP BY clause is used in conjunction with an aggregate expression such as COUNT, SUM, and AVG.
SELECT bag.confNo,
count(bag.bagInfo) AS TOTAL_BAGS
FROM BaggageInfo bag
GROUP BY bag.confNo
Explanation: Every passenger has one reservation code
(confNo
). A passenger can have more than one baggage. Here you
group the data based on the reservation code and you get the count of the
bagInfo
array which gives the number of bags per reservation.
{"confNo":"FH7G1W","TOTAL_BAGS":1}
{"confNo":"PQ1M8N","TOTAL_BAGS":1}
{"confNo":"XT6K7M","TOTAL_BAGS":1}
{"confNo":"DN3I4Q","TOTAL_BAGS":1}
{"confNo":"QB1O0J","TOTAL_BAGS":1}
{"confNo":"TX1P7E","TOTAL_BAGS":1}
{"confNo":"CG6O1M","TOTAL_BAGS":1}
{"confNo":"OH2F8U","TOTAL_BAGS":1}
{"confNo":"BO5G3H","TOTAL_BAGS":1}
{"confNo":"ZG8Z5N","TOTAL_BAGS":1}
{"confNo":"LE6J4Z","TOTAL_BAGS":1}
{"confNo":"XT1O7T","TOTAL_BAGS":1}
{"confNo":"QI3V6Q","TOTAL_BAGS":2}
{"confNo":"RL3J4Q","TOTAL_BAGS":1}
{"confNo":"HJ4J4P","TOTAL_BAGS":1}
{"confNo":"CR2C8MY","TOTAL_BAGS":1}
{"confNo":"LN0C8R","TOTAL_BAGS":1}
{"confNo":"MZ2S5R","TOTAL_BAGS":1}
{"confNo":"KN4D1L","TOTAL_BAGS":1}
{"confNo":"MC0E7R","TOTAL_BAGS":1}
SELECT $flt_src as SOURCE,
count(*) as COUNT
FROM BaggageInfo $bag,
$bag.bagInfo.flightLegs[0].fltRouteSrc $flt_src
GROUP BY $flt_src
Explanation: You want to get the total count of baggage originating
from each airport. However, you don't want to consider the airports that are part of the
transit. So you group the data with the flight source values of the first record of the
flightLegs
array( as the first record is the source). You then
determine the count of baggage.
{"SOURCE":"SFO","COUNT":6}
{"SOURCE":"BZN","COUNT":1}
{"SOURCE":"GRU","COUNT":1}
{"SOURCE":"LAX","COUNT":1}
{"SOURCE":"YYZ","COUNT":1}
{"SOURCE":"MEL","COUNT":1}
{"SOURCE":"MIA","COUNT":4}
{"SOURCE":"MSQ","COUNT":2}
{"SOURCE":"MXP","COUNT":2}
{"SOURCE":"JFK","COUNT":3}