Example: Using unnesting with a GROUP BY clause

Consider a TV streaming application. It streams various shows that are watched by customers across the globe. Every show has number of seasons and every season has multiple episodes. You need a persistent meta-data store which keeps track of the current activity of the customers using the TV streaming application. A customer is interested to know about the episodes they watched, the watch time per episode, the total number of seasons of the show they watched etc. The customer also wants the streaming application to start streaming from where they left off watching. The streaming application needs reports on which show is most popular among customers, how many minutes a show is being watched etc. These reports can be generated using UNNEST clause in queries.

Create table and Load data for the TV streaming application

Download the script acctstream_loaddata.sql and run it as shown below. This script creates the table used in the example and loads data into the table.

Start your KVSTORE or KVLite and open the SQL.shell.
java -jar lib/kvstore.jar kvlite -secure-config disable
java -jar lib/sql.jar -helper-hosts localhost:5000 -store kvstore
Using the load command, run the script.
load -file acctstream_loaddata.sql
Example 1: Fetch the different shows watched by people in the US alone and the number of people watching them
SELECT $show.showId, count(*) as cnt FROM stream_acct $s,
unnest($s.acct_data.contentStreamed[] as $show)
WHERE $s.acct_data.country = "USA" GROUP BY $show.showId 
ORDER BY count(*) DESC ;

{"showId":15,"cnt":2}
{"showId":16,"cnt":2}
Example 2: For every show aired by the application, fetch the total watch time by all users:
SELECT $show.showId, sum($show.seriesInfo.episodes.minWatched) AS total_time
FROM stream_acct $s, unnest($s.acct_data.contentStreamed[] AS $show) 
GROUP BY $show.showId ORDER BY sum($show.seriesInfo.episodes.minWatched);

{"showId":26,"total_time":225}
{"showId":16,"total_time":440}
{"showId":15,"total_time":642}

Note:

The unnest operator ( that is the keyword unnest) can be omitted as it is a no-op operator. The use of the UNNEST clause is recommended when there is an index on the array(s) or map(s) that are being unnested. See Examples: Using Indexes for Query Optimization for more information.
The below query without the UNNEST clause is equivalent to the above query ( with the UNNEST clause) and gives the same result.
SELECT $show.showId, sum($show.seriesInfo.episodes.minWatched) AS total_time
FROM stream_acct $s, $s.acct_data.contentStreamed[] AS $show 
GROUP BY $show.showId ORDER BY sum($show.seriesInfo.episodes.minWatched);

{"showId":26,"total_time":225}
{"showId":16,"total_time":440}
{"showId":15,"total_time":642}
Example 3: Fetch the total watch time of users per show and season
SELECT $show.showId, $seriesInfo.seasonNum, 
sum($seriesInfo.episodes.minWatched) AS length
FROM stream_acct n,
unnest(n.acct_data.contentStreamed[] AS $show, 
$show.seriesInfo[] as $seriesInfo)
GROUP BY $show.showId, $seriesInfo.seasonNum
ORDER BY sum($seriesInfo.episodes.minWatched);

{"showId":26,"seasonNum":2,"length":80}
{"showId":26,"seasonNum":1,"length":145}
{"showId":16,"seasonNum":2,"length":190}
{"showId":16,"seasonNum":1,"length":250}
{"showId":15,"seasonNum":2,"length":295}
{"showId":15,"seasonNum":1,"length":347}

Example 4: Using unnest with a non-path expression

This is an example where the unnesting expression is not a path expression, and as a result, the UNNEST clause cannot be used. For example, a user may have multiple phone numbers in the same area code. To determine the number of users having phone numbers in different area codes, a particular user should be counted only once (even if the user has more than one phone number with the same area code). The below query returns, for each area code, the number of users having phone numbers in that area code. The query uses the seq_distinct() function, which eliminates duplicate values.
SELECT $area, count(*) AS cnt
FROM Users u, seq_distinct(u.address.phones.area) AS $area
GROUP BY $area;

{"area":408,"cnt":1}
{"area":831,"cnt":1}

To know more about seq_disctinct function, see Functions on Sequences.