Using NESTED TABLES clause to query multiple tables in the same hierarchy

To query multiple tables in the same hierarchy, the NESTED TABLES clause must be used inside the FROM clause.

Syntax

nested_tables ::=
   NESTED TABLES "(" single_from_table
   [ANCESTORS "(" ancestor_tables ")"]
   [DESCENDANTS "(" descendant_tables ")"]
")"

ancestor_tables ::= nested_from_table ("," nested_from_table)*

descendant_tables ::= nested_from_table ("," nested_from_table)*

nested_from_table ::= aliased_table_name [ON or_expression]

The NESTED TABLES clause specifies the participating tables and separates them in 3 groups. First the target table is specified. Then the ANCESTORS clause, if present, specifies a number of tables that must be ancestors of the target table in the table hierarchy. Finally, the DESCENDANTS clause, if present, specifies a number of tables that must be descendants of the target table in the table hierarchy. For each table an alias may be specified (and if not, one is created internally as described in the CREATE TABLE Statement section). The aliases must be unique.

Semantically, a NESTED TABLES clause is equivalent to a number of left-outer-join operations "centered" around the target table. The left-outer-join is an operation defined by standard SQL and supported by all major RDBMSs. For those not familiar with it already, we give a brief description in the last section of this chapter.

Our implementation of left outer join diverges slightly from the standard definition. The difference is in the "shape" of the results. Specifically, the result of a NESTED TABLES clause is a set of records, all having the same type, where (a) the number of fields is equal to the number of participating tables, (b) each field corresponds to one of the tables and stores either a row from that table or NULL, (c) the name of each field is the alias used by the associated table, and (d) the fields are ordered in the order that the participating tables would be encountered in a depth-first traversal of the table hierarchy.

So, in a NESTED TABLES result the columns of each table are grouped inside a subrecord. In contrast, the standard left-outer-join produces a "flat" result, where each result is a record/tuple whose number of fields is the sum of all the columns in the participating tables.

The mapping of a NESTED TABLES to a number of left-outer-joins is best explained with a few examples. For brevity, we will use the keyword LOJ in place of LEFT OUTER JOIN. Let’s start with the following create table statements:

create table A (ida integer, a1 string, primary key(ida));
create table A.B (idb integer, b1 string, primary key(idb));
create table A.B.C (idc integer, c1 integer, primary key(idc));
create table A.B.C.D (idd integer, d1 double, primary key(idd));
create table A.B.E (ide integer, e1 integer, primary key(ide));
create table A.G (idg integer, g1 string, primary key(idg));
create table A.G.J (idj integer, j1 integer, primary key(idj));
create table A.G.H (idh integer, h1 integer, primary key(idh));
create table A.G.J.K (idk integer, k1 integer, primary key(idk));

The above statements create the following table hierarchy:

The NESTED TABLES clause specifies the join tree as a "projection" of the table hierarchy that includes the tables in the NESTED TABLES. For example, the join tree for NESTED TABLES(A.B) ancestors(A) descendants (A.B.C.D, A.B.E) is shown below. The arrows indicate the direction of the LOJs (from the left table to the right table).

Now, let’s look at the following NESTED TABLES cases and their equivalent LOJ operations

```
NESTED TABLES (A.B.C c ancestors(A a, A.B b));
```
is equivalent to
```
A.B.C c LOJ A a ON c.ida = a.ida LOJ A.B b ON c.ida = b.ida AND
        c.idb = b.idb
```
We can see that the join predicates are implicit in the NESTED TABLES clause, and they are always on the primary key columns of the participating tables.

Because for each A.B.C row there is at most one matching A and A.B row, the number of records in the result is the same as the number of A.B.C rows. This is always true when the NESTED TABLES clause includes ancestors only. In this case, the effect of the operation is to decorate the target table rows with the columns from the matching ancestor rows (if any), without eliminating or adding any other rows.
```
NESTED TABLES (A a descendants(A.B b, A.B.C c))
```
is equivalent to
```
A a LOJ A.B b ON a.ida = b.ida
 LOJ A.B.C c ON b.ida = c.ida AND b.idb = c.idb
```
Another way to explain the semantics of the DESCENDANTS clause is to use the contains relationship defined in Table Hierarchies section, but restricted to the descendant tables in the join tree only. Let R be a target table row, and S(R) be the set containing all the descendant rows that are reachable from R via the contains relationship (i.e., S(R) is the transitive closure of contains applied on R). If S(R) is empty, a single record is returned for R, that contains R and a NULL value for each of the descendant tables in the join tree. Otherwise, let B(R) be the boundary subset of S(R), i.e., all rows in S(R) that do not have any descendant rows in the join tree. Then, a result is generated for each row in B(R) as follows: Let RR be a row in B(R) and T be its containing table. The result associated with RR is a record containing all the rows in the path from R to RR and a NULL value for every table that is not in the path from the target table to T.
```
NESTED TABLES (A a descendants(A.B b, A.G g))
```
is equivalent to
```
A a LOJ
(A.B b UNION A.G g)
ON (a.ida = b.ida or b.ida IS NULL) and (a.ida = g.ida or g.ida IS NULL)
```
As in case 2, target table A is joined with 2 descendant tables. However, because the descendant tables come from 2 different branches of the join tree, we have to use a UNION operation in the SQL expression above. This UNION unions both the rows and the columns of tables A.B and A.G. So, if table A.B has N rows with n columns each and table A.G has M rows with m columns, the result of the UNION has N + M rows, with n + m columns each. The first N rows contain the rows of A.B with NULL values for the m columns, and the following M rows contain the rows of A.G with NULL values for the n columns. When matching A rows with the UNION rows, we distinguish whether a UNION row comes from table A.B or A.g by checking whether g.ida is NULL or b.ida is NULL, respectively.

Notice that the contains-base definition given in case 2 applies here as well.

NESTED TABLES (A a descendants(A.B b, A.B.C c, A.B.E e, A.G.J.K k))

is equivalent to

A a LOJ
(
A.B b LOJ
(A.B.C c UNION A.B.E e)
ON (b.ida = c.ida and b.idb = c.idb or c.ida IS NULL) and
(b.ida = e.ida and b.idb = e.idb or e.ida IS NULL)
UNION
A.G.J.K k
)
ON (a.ida = b.ida or b.ida IS NULL) and (a.ida = k.ida or k.ida IS NULL)

This example is just a more complex version of case 3.

```
NESTED TABLES (A.B b ancestors(A a ON a.a1 = “abc”)
descendants(A.B.C c ON c.c1 > 10,
A.B.C.D d,
A.B.E e))
```
is equivalent to
```
(A.B b LOJ A a ON b.ida = a.ida and a.a1 = “abc”) LOJ
(
A.B.C c LOJ A.B.C.D
ON c.ida = d.ida and c.idb = d.idb and c.idc = d.idc
UNION
E
)
ON (b.ida = c.ida and b.idb = c.idb or c.ida IS NULL) and
(b.ida = e.ida and b.idb = e.idb or e.ida IS NULL)
```
This is an example that just puts everything together. It contains both ANCESTOR and DESCENDANT clauses, as well as ON predicates. The mapping from NESTED TABLES to LOJs uses the same patterns as in the previous cases. The ON predicates are just and-ed with the join predicates. In most cases, the ON predicates inside a NESTED TABLES will be local predicates on the right table of an LOJ, but more generally, they can also reference any columns from any table that is an ancestor of the table the ON appears next to.