Unnest Arrays & Maps

Use unnest_syntax to flatten rows that include arrays or maps.

Syntax of unnest

Syntactically, unnesting is specified as a list of expressions (with associated variables) in the FROM clause of a SELECT statement. Some or all of these expressions may be enclosed in an UNNEST clause.
unnest_expression::=expression AS VARNAME
unnest_clause ::= UNNEST((unnest_expression)*)
unnest_syntax ::=(unnest_expression | unnest_clause)*

Semantics

The FROM clause of a SELECT statement may contain a list of expressions (with associated variables). Normally, you will use expressions that unnest arrays or maps and the FROM clause will create a new set of rows out of the values of the unnested arrays/maps. Some or all of these expressions may be enclosed in an UNNEST clause. Semantically, the unnest operator is a no-op, that is, whether an UNNEST clause is used or not does not change the result of the FROM clause, that is unnest(expr) is the same as expr. The purpose of the UNNEST clause is to act as an optimization hint. Specifically, when there is an index on the arrays/maps that are being unnested, the index may be used by the query only if the query uses the UNNEST clause. Furthermore, to help the query processor in using such an index, the UNNEST clause places some restrictions on the expressions that can appear inside it.

You normally use unnesting when you want to group by a field that is inside an array/map. However the two examples below illustrate the semantics of unnesting expressions. So no GROUP BY clause is used in the queries below.

Using a single unnest expression in the FROM clause (to unnest a single map):

Create a table with two columns , one of data type INTEGER, which is the primary key column and second of JSON data type.

Note:

The JSON column used in the example is a map.
create table sample_unnest ( samp_id INTEGER, samp_data JSON, PRIMARY KEY(samp_id))
INSERT INTO sample_unnest VALUES(1,
{ "episodeID" : 20, "lengthMin" : 40, "minWatched" : 40 })
INSERT INTO sample_unnest VALUES(2,
{ "episodeID" : 25, "lengthMin" : 20, "minWatched" : 18 })
select * from sample_unnest
        $s1,unnest($s1.samp_data.values() as $s2) where $s1.samp_id=1

The above FROM clause references the table sample_unnest by the variable $s1 and the values of the map samp_data by the variable $s2 . Conceptually, the result of the FROM clause is a table temp_tbl with two columns. Column 1 stores all rows of table sample_unnest and Column 2 stores all values of the elements of the samp_data map.

Table 6-4 Records in the temp_tbl table

$s1 $s2
"samp_id":1,"samp_data":{"episodeID":20,"lengthMin":40,"minWatched":40} 20
"samp_id":1,"samp_data":{"episodeID":20,"lengthMin":40,"minWatched":40} 40
"samp_id":1,"samp_data":{"episodeID":20,"lengthMin":40,"minWatched":40} 40
"samp_id":2,"samp_data":{"episodeID":25,"lengthMin":20,"minWatched":18} 25
"samp_id":2,"samp_data":{"episodeID":25,"lengthMin":20,"minWatched":18} 20
"samp_id":2,"samp_data":{"episodeID":25,"lengthMin":20,"minWatched":18} 18
To see how the rest of the query is executed, take every row from the above table and apply the WHERE condition. For those rows where the condition evaluates to TRUE ($s1.samp_id=1), the corresponding row is included in the query result. The condition evaluates to TRUE for the first three rows and to FALSE for the last three rows.

Table 6-5 Query output

$s1 $s2
{"samp_id":1, "samp_data":{"episodeID":20,"lengthMin":40,"minWatched":40}} 20
{"samp_id":1, "samp_data":{"episodeID":20,"lengthMin":40,"minWatched":40}} 40
{"samp_id":1, "samp_data":{"episodeID":20,"lengthMin":40,"minWatched":40}} 40

Using two unnest expressions in the FROM clause (to unnest an array and the maps nested under the array)

You can use more than one path expression in an UNNEST clause. Let "M" be the number of from-expressions, then the result table of the FROM clause is computed in M steps, each producing an intermediate table.

Note:

The JSON column used in the example is a array of maps.
create table sample_unnest (
        samp_id INTEGER, samp_data JSON, PRIMARY KEY(samp_id))

INSERT INTO sample_unnest VALUES(1,
{
   "episodes":[
     {
       "episodeID" : 10,
       "lengthMin" : 40,
       "minWatched" : 25
     },
     {
        "episodeID" : 20,
        "lengthMin" : 35,
        "minWatched" : 30
      }
    ]
  }
)

INSERT INTO sample_unnest VALUES(2,
{
   "episodes":[
    {
      "episodeID" : 30,
      "lengthMin" : 40,
      "minWatched" : 25
   },
   {
       "episodeID" : 40,
       "lengthMin" : 35,
       "minWatched" : 30
    }
   ]
  }
)

INSERT INTO sample_unnest VALUES(3,
{
   "episodes":[
     {
       "episodeID" : 10,
       "lengthMin" : 40,
       "minWatched" : 25
     },
     {
       "episodeID" : 20,
       "lengthMin" : 35,
       "minWatched" : 30
     }
   ]
 }
)
Query using two path expressions in the UNNEST clause:
SELECT n.samp_id as customer,
        $epi.episodeID,$epi.minWatched AS length,$epiVal AS episode_details FROM  sample_unnest n,
        unnest(n.samp_data.episodes[] AS $epi, $epi.values()  AS $epiVal)
Step 1 : Records in the first table temp_tbl1 which is the result of first two expressions in the FROM clause (sample_unnest n, unnest(n.samp_data.episodes[] AS $epi)

Table 6-6 Records in temp_tbl1

n $epi
"s1":{"samp_id":1,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} {"episodeID":10,"lengthMin":40,"minWatched":25},
"s1":{"samp_id":1,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} {"episodeID":20,"lengthMin":35,"minWatched":30}
"s1":{"samp_id":2,"samp_data":{"episodes":[{"episodeID":30,"lengthMin":40,"minWatched":25},{"episodeID":40,"lengthMin":35,"minWatched":30}]} {"episodeID":30,"lengthMin":40,"minWatched":25},
"s1":{"samp_id":2,"samp_data":{"episodes":[{"episodeID":30,"lengthMin":40,"minWatched":25},{"episodeID":40,"lengthMin":35,"minWatched":30}]} {"episodeID":40,"lengthMin":35,"minWatched":30}
"s1":{"samp_id":3,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} {"episodeID":10,"lengthMin":40,"minWatched":25},
"s1":{"samp_id":3,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} {"episodeID":20,"lengthMin":35,"minWatched":30}
Step 2 : Records in the second table temp_tbl2 - result of the full FROM clause ( Above table joined with the result of the second path expression in the UNNEST clause ($epi.values()as $epiVal )

Table 6-7 Records in temp_tbl2

n $epi $epival
{"s1":{"samp_id":1,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} {"episodeID":10,"lengthMin":40,"minWatched":25}, 10
{"s1":{"samp_id":1,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} {"episodeID":10,"lengthMin":40,"minWatched":25} 40
{"s1":{"samp_id":1,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} {"episodeID":10,"lengthMin":40,"minWatched":25} 25
{"s1":{"samp_id":1,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]}} {"episodeID":20,"lengthMin":35,"minWatched":30} 20
{"s1":{"samp_id":1,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]}} {"episodeID":20,"lengthMin":35,"minWatched":30} 35
{"s1":{"samp_id":1,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]}} {"episodeID":20,"lengthMin":35,"minWatched":30} 30
{"s1":{"samp_id":2,"samp_data":{"episodes":[{"episodeID":30,"lengthMin":40,"minWatched":25},{"episodeID":40,"lengthMin":35,"minWatched":30}]}} {"episodeID":30,"lengthMin":40,"minWatched":25} 30
{"s1":{"samp_id":2,"samp_data":{"episodes":[{"episodeID":30,"lengthMin":40,"minWatched":25},{"episodeID":40,"lengthMin":35,"minWatched":30}]}} {"episodeID":30,"lengthMin":40,"minWatched":25} 40
{"s1":{"samp_id":2,"samp_data":{"episodes":[{"episodeID":30,"lengthMin":40,"minWatched":25},{"episodeID":40,"lengthMin":35,"minWatched":30}]}} {"episodeID":30,"lengthMin":40,"minWatched":25} 25
{"s1":{"samp_id":2,"samp_data":{"episodes":[{"episodeID":30,"lengthMin":40,"minWatched":25},{"episodeID":40,"lengthMin":35,"minWatched":30}]}} {"episodeID":40,"lengthMin":35,"minWatched":30} 40
{"s1":{"samp_id":2,"samp_data":{"episodes":[{"episodeID":30,"lengthMin":40,"minWatched":25},{"episodeID":40,"lengthMin":35,"minWatched":30}]}} {"episodeID":40,"lengthMin":35,"minWatched":30} 35
{"s1":{"samp_id":2,"samp_data":{"episodes":[{"episodeID":30,"lengthMin":40,"minWatched":25},{"episodeID":40,"lengthMin":35,"minWatched":30}]}} {"episodeID":40,"lengthMin":35,"minWatched":30} 30
{"s1":{"samp_id":3,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} {"episodeID":10,"lengthMin":40,"minWatched":25} 10
{"s1":{"samp_id":3,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} {"episodeID":10,"lengthMin":40,"minWatched":25} 40
{"s1":{"samp_id":3,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} {"episodeID":10,"lengthMin":40,"minWatched":25} 25
{"s1":{"samp_id":3,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} {"episodeID":20,"lengthMin":35,"minWatched":30} 20
{"s1":{"samp_id":3,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} {"episodeID":20,"lengthMin":35,"minWatched":30} 35
{"s1":{"samp_id":3,"samp_data":{"episodes":[{"episodeID":10,"lengthMin":40,"minWatched":25},{"episodeID":20,"lengthMin":35,"minWatched":30}]} {"episodeID":20,"lengthMin":35,"minWatched":30} 30
Step 3 : There is no filter in the query and the fields in the SELECT clause are fetched. The final result of the query is :

Table 6-8 Query result

customer episode_id length episode_details
2 30 25 30
2 30 25 40
2 30 25 25
2 40 30 40
2 40 30 35
2 40 30 30
1 10 25 10
1 10 25 40
1 10 25 25
1 20 30 20
1 20 30 35
1 20 30 30
3 10 25 10
3 10 25 40
3 10 25 25
3 20 30 20
3 20 30 35
3 20 30 30

Limitation for expression usage in the UNNEST clause

There are some restrictions on the expressions used in the UNNEST clause.
  • All expressions used in the UNNEST clause must be path expressions.
  • No predicates are allowed in any array-filter or map-filter steps within the path expressions.
  • The expression must start with a variable. If the expression is a first expression in a list of expressions , then it should be a table alias associated with the target table.
  • When you unnest nested arrays, each path expression unnests one level deeper, and it operates on the values produced by the previous level of unnesting. These values are represented by the variable associated with the previous path expression. So the starting variable of each expression must be the variable associated with the previous expression. This is not applicable if this is a first expression in a list of expressions in the UNNEST clause.
  • The expression must finish with a [] or .values() step.
  • The variables defined inside the UNNEST clause cannot be referenced in the rest of the FROM clause. They can be referenced within the UNNEST clause and outside the FROM clause.
For more details on path expression and to understand how these expressions are evaluated for an array or a map, see Path Expressions.