Primary Expressions in SQL
If you want to follow along with the examples, download the script baggageschema_loaddata.sql and execute it as shown below. This script creates the table used in the example and loads data into the table.
java -jar lib/kvstore.jar kvlite -secure-config disable
java -jar lib/sql.jar -helper-hosts localhost:5000 -store kvstore
load
command, execute the
script.load -file baggageschema_loaddata.sql
Parenthesized Expressions
Parenthesized expressions are used primarily to alter the default precedence among operators. They are also used as a syntactic aid to mix expressions in ways that would otherwise cause syntactic ambiguities.
SELECT fullName, bag.bagInfo.tagNum,
bag.bagInfo.routing,
bag.bagInfo[].flightLegs[].fltRouteDest
FROM BaggageInfo bag
WHERE bag.bagInfo.flightLegs[].fltRouteSrc=any "JFK" AND
(bag.bagInfo[].flightLegs[].fltRouteDest=any "MAD" OR
bag.bagInfo[].flightLegs[].fltRouteDest=any "VIE" )
Explanation: You want to fetch the full name, tag number, and routing details of passengers. The first filter condition is that the boarding point/transit is JFK. Once this is satisfied the second filter condition is that destination is either MAD or VIE. You use an OR condition to filter the destination value.
{"fullName":"Dierdre Amador","tagNum":"17657806240229","routing":"JFK/MAD","fltRouteDest":"MAD"}
{"fullName":"Rosalia Triplett","tagNum":"17657806215913","routing":"JFK/IST/VIE","fltRouteDest":["IST","VIE"]}
{"fullName":"Kendal Biddle","tagNum":"17657806296887","routing":"JFK/IST/VIE","fltRouteDest":["IST","VIE"]}
Case Expressions
The searched CASE expression is similar to the if-then-else statements of traditional programming languages. It consists of a number of WHEN-THEN pairs, followed by an optional ELSE clause at the end. Each WHEN expression is a condition, i.e., it must return BOOLEAN. The THEN expressions as well as the ELSE expression may return any sequence of items. The CASE expression is evaluated by first evaluating the WHEN expressions from top to bottom until the first one that returns true. If it is the i-th WHEN expression that returns true, then the i-th THEN expression is evaluated and its result is the result of the whole CASE expression. If no WHEN expression returns true, then if there is an ELSE, its expression is evaluated and its result is the result of the whole CASE expression; Otherwise, the result of the CASE expression is the empty sequence.
SELECT
fullName,
CASE
WHEN NOT exists bag.bagInfo.flightLegs[0]
THEN "you have no bag info"
WHEN NOT exists bag.bagInfo.flightLegs[1]
THEN "you have one hop"
WHEN NOT exists bag.bagInfo.flightLegs[2]
THEN "you have two hops."
ELSE "you have three hops."
END AS NUMBER_HOPS
FROM BaggageInfo bag WHERE ticketNo=1762340683564
Explanation: You want to determine how many transits are there for the
passenger bagInfo
using a CASE statement. If the
flightLegs
array has no elements, then the passenger has no bag
data. When the flightLegs
array has only one element, then there is
only one transit point. Similarly, if the flightLegs
array has two
elements, then there is two hops. Else there is three transit points. Here you assume
that a bag can have at the most three transit points/hops.
{"fullName":"Mary Watson","NUMBER_HOPS":"you have two hops."}
tagNum
of passengers if the existing value is not a
string.SELECT bag.bagInfo[].tagNum,
CASE
WHEN bag.bagInfo[0].tagNum is of type (NUMBER)
THEN "Tagnumber is not a STRING. Update the data"
ELSE "Tagnumber has correct datatype"
END AS tag_NUM_TYPE
FROM BaggageInfo bag
Explanation: The tagNum
of passengers in the
bagInfo
schema is a STRING data type. But the application could
take in a NUMBER value as the value of tagNum
by mistake. The query
uses "is of type" operator to capture this and prompts the system to update the
tagNum
if the existing value is not a string.
{"tagNum":"17657806240001","tag_NUM_TYPE":"Tagnumber has correct datatype"}
{"tagNum":"17657806224224","tag_NUM_TYPE":"Tagnumber has correct datatype"}
{"tagNum":17657806243578,"tag_NUM_TYPE":"Tagnumber is not a STRING. Update the data"}
{"tagNum":"1765780623244","tag_NUM_TYPE":"Tagnumber has correct datatype"}
Cast Expression
The cast expression creates, if possible, new items of a given target type from the items of its input sequence. For example, a STRING can be converted to TIMESTAMP(0) using CAST expression.
- If the type of the input item is equal to the target item type, the cast is a no-op: the input item itself is returned.
- If the target type is a wildcard type other than JSON and the type of the input item is a subtype of the wild card type, the cast is a no-op.
- If the target type is JSON, then an error is raised if the input item is a non-json atomic type.
- If the target type is an array type, an error is raised if the input item is not an array.
- If the target type is string, the input item may be of any type. That means
every item can be cast to a string. For timestamps, their string value is in UTC
and has the format
uuuu-MM-dd['T'HH:mm:ss]
. - If the target type is an atomic type other than string, the input item must also
be atomic.
-
- Integers and longs can be cast to timestamps. The input value is interpreted as the number of milliseconds since January 1, 1970, 00:00:00 GMT.
- String items may be castable to all other atomic types. Whether the cast succeeds or not depends on whether the actual string value can be parsed into a value that belongs to the domain of the target type.
- Timestamp items are castable to all the timestamp types. If the target type has a smaller precision that the input item, the resulting timestamp is the one closest to the input timestamp in the target precision.
-
- To cast a STRING to TIMESTAMP, if the input has STRING values in
ISO-8601 format, then it will be automatically converted by the SQL runtime into
TIMESTAMP data type.
Note:
ISO8601 describes an internationally accepted way to represent dates, times, and durations.Syntax: Date with time: YYYY-MM-DDThh:mm:ss[.s[s[s[s[s[s]]]]][Z|(+|-)hh:mm]
where- YYYY specifies the year, as four decimal digits
- MM specifies the month, as two decimal digits, 00 to 12
- DD specifies the day, as two decimal digits, 00 to 31
- hh specifies the hour, as two decimal digits, 00 to 23
- mm specifies the minutes, as two decimal digits, 00 to 59
- ss[.s[s[s[s[s]]]]] specifies the seconds, as two decimal digits, 00 to 59, optionally followed by a decimal point and 1 to 6 decimal digits (representing the fractional part of a second).
- Z specifies UTC time (time zone 0). (It can also be specified by +00:00, but not by –00:00.)
- (+|-)hh:mm specifies the time-zone as difference from UTC. (One of + or – is required.)
SELECT CAST (bag.bagInfo.bagArrivalDate AS Timestamp(3))
AS BAG_ARRIVING_DATE
FROM BaggageInfo bag WHERE bag.confNo=DN3I4Q
Explanation: The bagArrivalDate
is a STRING. Using CAST you are
converting this field into a TIMESTAMP format.
{"BAG_ARRIVING_DATE":"2019-02-15T21:21:00.000Z"}
SELECT fullName, bag.ticketNo,
bag.bagInfo[].bagArrivalDate
FROM BaggageInfo bag WHERE
exists bag.bagInfo[$element.bagArrivalDate >="2019-01-01T00:00:00"]
Explanation: You want to filter and display details of the baggage
that are shipped after 2019. The bag arrival date for every element in the
flightLegs
array is compared with the given timestamp
(2019-01-01T00:00:00
). Here the casting is implicit as
bagArrivalDate
is a STRING and is directly compared with a static
Timestamp value. An explicit CAST function is not needed when an implicit casting can be
done. However, your data should be in the format YYYY-MM-DDTHH:MI:SS
.
You then use the EXISTS condition to check if the bagInfo is present for this timestamp
condition.
{"fullName":"Kendal Biddle","ticketNo":1762377974281,"bagArrivalDate":"2019-03-05T12:00:00Z"}
{"fullName":"Lucinda Beckman","ticketNo":1762320569757,"bagArrivalDate":"2019-03-12T15:05:00Z"}
{"fullName":"Adelaide Willard","ticketNo":1762392135540,"bagArrivalDate":"2019-02-15T21:21:00Z"}
{"fullName":"Raymond Griffin","ticketNo":1762399766476,"bagArrivalDate":"2019-02-03T08:09:00Z"}
{"fullName":"Elane Lemons","ticketNo":1762324912391,"bagArrivalDate":"2019-03-15T10:13:00Z"}
{"fullName":"Zina Christenson","ticketNo":1762390789239,"bagArrivalDate":"2019-02-04T10:08:00Z"}
{"fullName":"Zulema Martindale","ticketNo":1762340579411,"bagArrivalDate":"2019-02-25T20:15:00Z"}
{"fullName":"Dierdre Amador","ticketNo":1762376407826,"bagArrivalDate":"2019-03-07T13:51:00Z"}
{"fullName":"Henry Jenkins","ticketNo":176234463813,"bagArrivalDate":"2019-03-02T13:18:00Z"}
{"fullName":"Rosalia Triplett","ticketNo":1762311547917,"bagArrivalDate":"2019-02-12T07:04:00Z"}
{"fullName":"Lorenzo Phil","ticketNo":1762320369957,"bagArrivalDate":["2019-03-12T15:05:00Z","2019-03-12T16:25:00Z"]}
{"fullName":"Gerard Greene","ticketNo":1762341772625,"bagArrivalDate":"2019-03-07T16:01:00Z"}
{"fullName":"Adam Phillips","ticketNo":1762344493810,"bagArrivalDate":"2019-02-01T16:13:00Z"}
{"fullName":"Doris Martin","ticketNo":1762355527825,"bagArrivalDate":"2019-03-22T10:17:00Z"}
{"fullName":"Joanne Diaz","ticketNo":1762383911861,"bagArrivalDate":"2019-02-16T16:13:00Z"}
{"fullName":"Teena Colley","ticketNo":1762357254392,"bagArrivalDate":"2019-02-13T11:15:00Z"}
{"fullName":"Michelle Payne","ticketNo":1762330498104,"bagArrivalDate":"2019-02-02T23:59:00Z"}
{"fullName":"Mary Watson","ticketNo":1762340683564,"bagArrivalDate":"2019-03-14T06:22:00Z"}
{"fullName":"Omar Harvey","ticketNo":1762348904343,"bagArrivalDate":"2019-03-02T16:09:00Z"}
{"fullName":"Fallon Clements","ticketNo":1762350390409,"bagArrivalDate":"2019-02-21T14:08:00Z"}
{"fullName":"Lisbeth Wampler","ticketNo":1762355854464,"bagArrivalDate":"2019-02-10T10:01:00Z"}
Sequence Transform Expressions
A sequence transform expression transforms a sequence into another sequence.
Syntactically it looks like a function whose name is seq_transform
. The
first argument is an expression that generates the sequence to be transformed (the input
sequence) and the second argument is a "mapper" expression that is computed for each
item of the input sequence. The result of the seq_transform
expression
is the concatenation of sequences produced by each evaluation of the mapper expression.
The mapper expression can access the current input item via the $ variable.
ticketNo
, fetch a flat array
containing all the actions performed on the luggage of that
ticketNo
.SELECT seq_transform(l.bagInfo[],
seq_transform(
$sq1.flightLegs[],
seq_transform(
$sq2.actions[],
{
"at" : $sq3.actionAt,
"action" : $sq3.actionCode,
"flightNo" : $sq2.flightNo,
"tagNum" : $sq1.tagNum
}
)
)
) AS actions
FROM baggageInfo l WHERE ticketNo=1762340683564
Explanation: You can use the sequence transform expression for
transforming JSON documents stored in table rows. In such cases, you often use multiple
sequence transform expressions nested inside each other. Here the mapper expression of
an inner sequence transform may need to access the current item of an outer sequence
transform. To allow this, each sequence transform expression 'S' declares a variable
with name $sqN
, where N is the level of nesting of the expression
S
within the outer sequence transform expressions.
$sqN
is basically a synonym for $
, that is, it is
bound to the items returned by the input expression S
. However,
$sqN
can be accessed by other sequence transform expressions that
may be nested inside the expression S
.
{
"actions":[
{"action":"ONLOAD to HKG","at":"YYZ","flightNo":"BM267","tagNum":"17657806299833"},
{"action":"BagTag Scan at YYZ","at":"YYZ","flightNo":"BM267","tagNum":"17657806299833"},
{"action":"Checkin at YYZ","at":"YYZ","flightNo":"BM267","tagNum":"17657806299833"},
{"action":"Offload to Carousel at BLR","at":"BLR","flightNo":"BM115","tagNum":"17657806299833"},
{"action":"ONLOAD to BLR","at":"HKG","flightNo":"BM115","tagNum":"17657806299833"},
{"action":"OFFLOAD from HKG","at":"HKG","flightNo":"BM115","tagNum":"17657806299833"}
]
}