Unnest Arrays & Maps
Use unnest_syntax to flatten rows that include arrays or maps.
Syntax of unnest
unnest_expression::=expression AS VARNAME
unnest_clause ::= UNNEST((unnest_expression)*)
unnest_syntax ::=(unnest_expression | unnest_clause)*
Semantics
The FROM clause of a SELECT statement may contain a list of expressions (with associated variables). Normally, you will use expressions that unnest arrays or maps and the FROM clause will create a new set of rows out of the values of the unnested arrays/maps. Some or all of these expressions may be enclosed in an UNNEST clause. Semantically, the unnest operator is a no-op, that is, whether an UNNEST clause is used or not does not change the result of the FROM clause, that is unnest(expr) is the same as expr. The purpose of the UNNEST clause is to act as an optimization hint. Specifically, when there is an index on the arrays/maps that are being unnested, the index may be used by the query only if the query uses the UNNEST clause. Furthermore, to help the query processor in using such an index, the UNNEST clause places some restrictions on the expressions that can appear inside it.
You normally use unnesting when you want to group by a field that is inside an array/map. However the two examples below illustrate the semantics of unnesting expressions. So no GROUP BY clause is used in the queries below.
Using a single unnest expression in the FROM clause (to unnest a single map):
Note:
The JSON column used in the example is a map.create table sample_unnest ( samp_id INTEGER, samp_data JSON, PRIMARY KEY(samp_id))
INSERT INTO sample_unnest VALUES(1,
{ "episodeID" : 20, "lengthMin" : 40, "minWatched" : 40 })
INSERT INTO sample_unnest VALUES(2,
{ "episodeID" : 25, "lengthMin" : 20, "minWatched" : 18 })
select * from sample_unnest
$s1,unnest($s1.samp_data.values() as $s2) where $s1.samp_id=1
The above FROM clause references the table sample_unnest
by the
variable $s1
and the values of the map samp_data
by the variable $s2
. Conceptually, the result of the FROM clause
is a table temp_tbl
with two columns. Column 1 stores all rows of
table sample_unnest
and Column 2 stores all values of the elements
of the samp_data
map.
Table 6-4 Records in the temp_tbl table
$s1 | $s2 |
---|---|
"samp_id":1,"samp_data":{"episodeID":20,"lengthMin":40,"minWatched":40}
|
20 |
"samp_id":1,"samp_data":{"episodeID":20,"lengthMin":40,"minWatched":40} |
40 |
"samp_id":1,"samp_data":{"episodeID":20,"lengthMin":40,"minWatched":40} |
40 |
"samp_id":2,"samp_data":{"episodeID":25,"lengthMin":20,"minWatched":18} |
25 |
"samp_id":2,"samp_data":{"episodeID":25,"lengthMin":20,"minWatched":18} |
20 |
"samp_id":2,"samp_data":{"episodeID":25,"lengthMin":20,"minWatched":18} |
18 |
$s1.samp_id=1
), the corresponding row is included in the query
result. The condition evaluates to TRUE for the first three rows and to FALSE for
the last three rows.
Table 6-5 Query output
$s1 | $s2 |
---|---|
{"samp_id":1,
"samp_data":{"episodeID":20,"lengthMin":40,"minWatched":40}} |
20 |
{"samp_id":1,
"samp_data":{"episodeID":20,"lengthMin":40,"minWatched":40}} |
40 |
{"samp_id":1,
"samp_data":{"episodeID":20,"lengthMin":40,"minWatched":40}}
|
40 |
Using two unnest expressions in the FROM clause (to unnest an array and the maps nested under the array)
Note:
The JSON column used in the example is a array of maps.create table sample_unnest (
samp_id INTEGER, samp_data JSON, PRIMARY KEY(samp_id))
INSERT INTO sample_unnest VALUES(1,
{
"episodes":[
{
"episodeID" : 10,
"lengthMin" : 40,
"minWatched" : 25
},
{
"episodeID" : 20,
"lengthMin" : 35,
"minWatched" : 30
}
]
}
)
INSERT INTO sample_unnest VALUES(2,
{
"episodes":[
{
"episodeID" : 30,
"lengthMin" : 40,
"minWatched" : 25
},
{
"episodeID" : 40,
"lengthMin" : 35,
"minWatched" : 30
}
]
}
)
INSERT INTO sample_unnest VALUES(3,
{
"episodes":[
{
"episodeID" : 10,
"lengthMin" : 40,
"minWatched" : 25
},
{
"episodeID" : 20,
"lengthMin" : 35,
"minWatched" : 30
}
]
}
)
SELECT n.samp_id as customer,
$epi.episodeID,$epi.minWatched AS length,$epiVal AS episode_details FROM sample_unnest n,
unnest(n.samp_data.episodes[] AS $epi, $epi.values() AS $epiVal)
temp_tbl
1 which is the
result of first two expressions in the FROM clause (sample_unnest n,
unnest(n.samp_data.episodes[] AS $epi)
Table 6-6 Records in temp_tbl1
n | $epi |
---|---|
"s1":{"samp_id":1,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} |
{"episodeID":10,"lengthMin":40,"minWatched":25}, |
"s1":{"samp_id":1,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} |
{"episodeID":20,"lengthMin":35,"minWatched":30} |
"s1":{"samp_id":2,"samp_data":{"episodes":[{"episodeID":30,"lengthMin":40,"minWatched":25},{"episodeID":40,"lengthMin":35,"minWatched":30}]} |
{"episodeID":30,"lengthMin":40,"minWatched":25}, |
"s1":{"samp_id":2,"samp_data":{"episodes":[{"episodeID":30,"lengthMin":40,"minWatched":25},{"episodeID":40,"lengthMin":35,"minWatched":30}]} |
{"episodeID":40,"lengthMin":35,"minWatched":30} |
"s1":{"samp_id":3,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} |
{"episodeID":10,"lengthMin":40,"minWatched":25}, |
"s1":{"samp_id":3,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} |
{"episodeID":20,"lengthMin":35,"minWatched":30} |
$epi.values()as $epiVal
)
Table 6-7 Records in temp_tbl2
n | $epi | $epival |
---|---|---|
{"s1":{"samp_id":1,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} |
{"episodeID":10,"lengthMin":40,"minWatched":25}, |
10 |
{"s1":{"samp_id":1,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} |
{"episodeID":10,"lengthMin":40,"minWatched":25} |
40 |
{"s1":{"samp_id":1,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} |
{"episodeID":10,"lengthMin":40,"minWatched":25} |
25 |
{"s1":{"samp_id":1,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]}}
|
{"episodeID":20,"lengthMin":35,"minWatched":30} |
20 |
{"s1":{"samp_id":1,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]}} |
{"episodeID":20,"lengthMin":35,"minWatched":30} |
35 |
{"s1":{"samp_id":1,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]}} |
{"episodeID":20,"lengthMin":35,"minWatched":30} |
30 |
{"s1":{"samp_id":2,"samp_data":{"episodes":[{"episodeID":30,"lengthMin":40,"minWatched":25},{"episodeID":40,"lengthMin":35,"minWatched":30}]}} |
{"episodeID":30,"lengthMin":40,"minWatched":25} |
30 |
{"s1":{"samp_id":2,"samp_data":{"episodes":[{"episodeID":30,"lengthMin":40,"minWatched":25},{"episodeID":40,"lengthMin":35,"minWatched":30}]}} |
{"episodeID":30,"lengthMin":40,"minWatched":25} |
40 |
{"s1":{"samp_id":2,"samp_data":{"episodes":[{"episodeID":30,"lengthMin":40,"minWatched":25},{"episodeID":40,"lengthMin":35,"minWatched":30}]}} |
{"episodeID":30,"lengthMin":40,"minWatched":25} |
25 |
{"s1":{"samp_id":2,"samp_data":{"episodes":[{"episodeID":30,"lengthMin":40,"minWatched":25},{"episodeID":40,"lengthMin":35,"minWatched":30}]}} |
{"episodeID":40,"lengthMin":35,"minWatched":30} |
40 |
{"s1":{"samp_id":2,"samp_data":{"episodes":[{"episodeID":30,"lengthMin":40,"minWatched":25},{"episodeID":40,"lengthMin":35,"minWatched":30}]}} |
{"episodeID":40,"lengthMin":35,"minWatched":30} |
35 |
{"s1":{"samp_id":2,"samp_data":{"episodes":[{"episodeID":30,"lengthMin":40,"minWatched":25},{"episodeID":40,"lengthMin":35,"minWatched":30}]}} |
{"episodeID":40,"lengthMin":35,"minWatched":30} |
30 |
{"s1":{"samp_id":3,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} |
{"episodeID":10,"lengthMin":40,"minWatched":25} |
10 |
{"s1":{"samp_id":3,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} |
{"episodeID":10,"lengthMin":40,"minWatched":25} |
40 |
{"s1":{"samp_id":3,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} |
{"episodeID":10,"lengthMin":40,"minWatched":25} |
25 |
{"s1":{"samp_id":3,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} |
{"episodeID":20,"lengthMin":35,"minWatched":30} |
20 |
{"s1":{"samp_id":3,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} |
{"episodeID":20,"lengthMin":35,"minWatched":30} |
35 |
{"s1":{"samp_id":3,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} |
{"episodeID":20,"lengthMin":35,"minWatched":30} |
30 |
Table 6-8 Query result
customer | episode_id | length | episode_details |
---|---|---|---|
2 | 30 | 25 | 30 |
2 | 30 | 25 | 40 |
2 | 30 | 25 | 25 |
2 | 40 | 30 | 40 |
2 | 40 | 30 | 35 |
2 | 40 | 30 | 30 |
1 | 10 | 25 | 10 |
1 | 10 | 25 | 40 |
1 | 10 | 25 | 25 |
1 | 20 | 30 | 20 |
1 | 20 | 30 | 35 |
1 | 20 | 30 | 30 |
3 | 10 | 25 | 10 |
3 | 10 | 25 | 40 |
3 | 10 | 25 | 25 |
3 | 20 | 30 | 20 |
3 | 20 | 30 | 35 |
3 | 20 | 30 | 30 |
Limitation for expression usage in the UNNEST clause
- All expressions used in the UNNEST clause must be path expressions.
- No predicates are allowed in any array-filter or map-filter steps within the path expressions.
- The expression must start with a variable. If the expression is a first expression in a list of expressions , then it should be a table alias associated with the target table.
- When you unnest nested arrays, each path expression unnests one level deeper, and it operates on the values produced by the previous level of unnesting. These values are represented by the variable associated with the previous path expression. So the starting variable of each expression must be the variable associated with the previous expression. This is not applicable if this is a first expression in a list of expressions in the UNNEST clause.
- The expression must finish with a [] or .values() step.
- The variables defined inside the UNNEST clause cannot be referenced in the rest of the FROM clause. They can be referenced within the UNNEST clause and outside the FROM clause.