Query execution plan

A query execution plan is the sequence of operations Oracle NoSQL Database performs to run a query.

Overview of query plan

A query execution plan is internally structured as a tree of plan iterators.

Each kind of iterator evaluates a different kind of expression that may appear in a query. In general, the choice of index and the kind of associated index predicates can have a drastic effect on query performance. As a result, you as a developer often want to see what index is used by a query and what predicates have been pushed down to it. Based on this information, you may want to force the use of a different index via index hints. This information is contained in the query execution plan. All Oracle NoSQL drivers provide APIs to display the execution plan of a query. All Oracle NoSQL graphical UIs including the IntelliJ, VSCode, and Eclipse plugins along with the Oracle Cloud Infrastructure Console include controls for displaying the query execution plan.

Some of the most common and important iterators used in queries are :

TABLE iterator

A table iterator is responsible for
  • Scanning the index used by the query (which may be the primary index).
  • Applying any filtering predicates pushed to the index
  • Retrieve the rows pointed to by the qualifying index entries if necessary. If the index is covering, the result set of the TABLE iterator is a set of index entries, otherwise, it is a set of table rows.

Note:

An index is called a covering index with respect to a query if the query can be evaluated using only the entries of that index, that is, without the need to retrieve the associated rows.
A TABLE iterator will always have the following properties:
  • target table: The name of the target table in the query.
  • index used: The name of the index used by the query. If the primary index were used, “primary index” would appear as the value of this property.
  • covering index: Whether the index is covering or not.
  • row variable: The name of a variable ranging over the table rows produced by the TABLE iterator. If the index is covering, no table rows are produced and this variable is not used.
  • index scans: Contains the start and stop conditions that define the index scans to be performed.
A TABLE iterator has 2 more optional properties:
  • index row variable: The name of a variable ranging over the index entries produced by the TABLE iterator. Every time a new index entry is produced by the index scan, the index variable will be bound to that entry.
  • index filtering predicate: A predicate evaluated on every index entry produced by the index scan. If the result of this evaluation is true, the index variable is bound to this entry and the entry or its associated table row is returned as the result of the next() call on the TABLE iterator. Otherwise, the entry is skipped, the next entry from the index scan is produced, the predicate is evaluated again on this entry and it continues until a qualifying entry is found.

SELECT iterator

It is responsible for executing the SELECT expression.

RECEIVE iterator

It is a special internal iterator that separates the query plan into 2 parts:
  1. The RECEIVE iterator itself and all iterators that are above it in the iterator tree are executed at the driver.
  2. All iterators below the RECEIVE iterator are executed at the replication nodes (RNs); these iterators form a subtree rooted at the unique child of the RECEIVE iterator.

In general, the RECEIVE iterator acts as a query coordinator. It sends its subplan to appropriate RNs for execution and collects the results. It may perform additional operations such as sorting and duplicate elimination and propagates the results to its ancestor iterators (if any) for further processing.

Distribution kinds

A distribution kind specifies how the query will be distributed for execution across the RNs participating in an Oracle NoSQL database (a store). The distribution kind is a property of the RECEIVE iterator.

Different choices of Distribution kinds are:
  • SINGLE_PARTITION: A SINGLE_PARTITION query specifies a complete shard key in its WHERE clause. As a result, its full result set is contained in a single partition, and the RECEIVE iterator will send its subplan to a single RN that stores that partition. A SINGLE_PARTITION query may use either the primary-key index or a secondary index.
  • ALL_PARTITIONS: Queries use the primary-key index here and they don’t specify a complete shard key. As a result, if the store has M partitions, the RECEIVE iterator will send M copies of its subplan to be executed over one of the M partitions each.
  • ALL_SHARDS: Queries use a secondary index here and they don’t specify a complete shard key. As a result, if the store has N shards, the RECEIVE iterator will send N copies of its subplan to be executed over one of the N shards each.

Populating the tables to view the query execution plan :

If you want to follow along with the examples, download the script baggageschema_loaddata.sql and run it as shown below. This script creates the table used in the example and loads data into the table.

Start your KVSTORE or KVLite and open the SQL.shell.
java -jar lib/kvstore.jar kvlite -secure-config disable
java -jar lib/sql.jar -helper-hosts localhost:5000 -store kvstore
Using the load command, run the script.
load -file baggageschema_loaddata.sql

Creating indexes:

Create the following indexes in the baggageInfo table as shown below.
  1. Create an index on passengers reservation code.
    CREATE INDEX fixedschema_conf ON baggageInfo confNo)
  2. Create an index on the full name and phone number of passengers
    CREATE INDEX compindex_namephone ON baggageInfo (fullName,contactPhone)
  3. Create an index on three fields, when the bag was last seen, the last seen station, and the arrival date and time.
    CREATE INDEX simpleindex_arrival ON 
    baggageInfo(bagInfo[].lastSeenTimeGmt as ANYATOMIC,
    bagInfo[].bagArrivalDate as ANYATOMIC, 
    bagInfo[].lastSeenTimeStation as ANYATOMIC)

Query 1: Using primary key index with an index range scan

Fetch the bag details of passengers for ticket numbers in a range.
SELECT fullname, ticketNo,
bag.bagInfo[].tagNum,bag.bagInfo[].routing 
FROM BaggageInfo bag WHERE
1762340000000 < ticketNo AND ticketNo < 1762352000000
Plan:
{
  "iterator kind" : "RECEIVE",
  "distribution kind" : "ALL_PARTITIONS",
  "input iterator" :
  {
    "iterator kind" : "SELECT",
    "FROM" :
    {
      "iterator kind" : "TABLE",
      "target table" : "BaggageInfo",
      "row variable" : "$$bag",
      "index used" : "primary index",
      "covering index" : false,
      "index scans" : [
        {
          "equality conditions" : {},
          "range conditions" : { "ticketNo" : { "start value" : 1762340000000, "start inclusive" : false, "end value" : 1762352000000, "end inclusive" : false } }
        }
      ]
    },
    "FROM variable" : "$$bag",
    "SELECT expressions" : [
      {
        "field name" : "fullname",
        "field expression" :
        {
          "iterator kind" : "FIELD_STEP",
          "field name" : "fullname",
          "input iterator" :
          {
            "iterator kind" : "VAR_REF",
            "variable" : "$$bag"
          }
        }
      },
      {
        "field name" : "ticketNo",
        "field expression" :
        {
          "iterator kind" : "FIELD_STEP",
          "field name" : "ticketNo",
          "input iterator" :
          {
            "iterator kind" : "VAR_REF",
            "variable" : "$$bag"
          }
        }
      },
      {
        "field name" : "tagNum",
        "field expression" :
        {
          "iterator kind" : "ARRAY_CONSTRUCTOR",
          "conditional" : true,
          "input iterators" : [
            {
              "iterator kind" : "FIELD_STEP",
              "field name" : "tagNum",
              "input iterator" :
              {
                "iterator kind" : "ARRAY_FILTER",
                "input iterator" :
                {
                  "iterator kind" : "FIELD_STEP",
                  "field name" : "bagInfo",
                  "input iterator" :
                  {
                    "iterator kind" : "VAR_REF",
                    "variable" : "$$bag"
                  }
                }
              }
            }
          ]
        }
      },
      {
        "field name" : "routing",
        "field expression" :
        {
          "iterator kind" : "ARRAY_CONSTRUCTOR",
          "conditional" : true,
          "input iterators" : [
            {
              "iterator kind" : "FIELD_STEP",
              "field name" : "routing",
              "input iterator" :
              {
                "iterator kind" : "ARRAY_FILTER",
                "input iterator" :
                {
                  "iterator kind" : "FIELD_STEP",
                  "field name" : "bagInfo",
                  "input iterator" :
                  {
                    "iterator kind" : "VAR_REF",
                    "variable" : "$$bag"
                  }
                }
              }
            }
          ]
        }
      }
    ]
  }
}
Explanation:
  • The root iterator of this query plan is a RECEIVE iterator with a single child (input iterator) that is a SELECT iterator.
  • The value of the FROM field is an iterator; in this case, it is a TABLE iterator.
  • The primary key index is used and the index is not covering (as you need to scan the table to fetch columns other than the index entries).
  • The index scan property contains the start and stop conditions that define the index scans to be performed.
  • The FROM variable is the name of a variable ranging over the records produced by the FROM iterator. Here the FROM iterator is a TABLE iterator, and the FROM variable ($$bag)is the same as the row variable of the TABLE iterator, as the index used is not covering.
  • In the SELECT expression four fields (fullname,ticketNo,bag.bagInfo[].tagNum,bag.bagInfo[].routing) are fetched. These correspond to four field names and field expressions in the SELECT expression clause. For the first two fields, the field expression is computed using FIELD_STEP iterator. For the last 2 fields, an ARRAY_CONSTRUCTOR iterator is used which iterates over the corresponding arrays to fetch the field value.

Query 2: Using primary key index with an index predicate

Fetch the bag details of passengers for ticket numbers satisfying one of the two ranges of values.
SELECT fullname, ticketNo, bag.bagInfo[].tagNum,bag.bagInfo[].routing 
FROM BaggageInfo bag WHERE ticketNo > 1762340000000 OR ticketNo < 1762352000000;
Plan:
{
  "iterator kind" : "RECEIVE",
  "distribution kind" : "ALL_PARTITIONS",
  "input iterator" :
  {
    "iterator kind" : "SELECT",
    "FROM" :
    {
      "iterator kind" : "TABLE",
      "target table" : "BaggageInfo",
      "row variable" : "$$bag",
      "index used" : "primary index",
      "covering index" : false,
      "index scans" : [
        {
          "equality conditions" : {},
          "range conditions" : {}
        }
      ],
      "index filtering predicate" :
      {
        "iterator kind" : "OR",
        "input iterators" : [
          {
            "iterator kind" : "GREATER_THAN",
            "left operand" :
            {
              "iterator kind" : "FIELD_STEP",
              "field name" : "ticketNo",
              "input iterator" :
              {
                "iterator kind" : "VAR_REF",
                "variable" : "$$bag"
              }
            },
            "right operand" :
            {
              "iterator kind" : "CONST",
              "value" : 1762340000000
            }
          },
          {
            "iterator kind" : "LESS_THAN",
            "left operand" :
            {
              "iterator kind" : "FIELD_STEP",
              "field name" : "ticketNo",
              "input iterator" :
              {
                "iterator kind" : "VAR_REF",
                "variable" : "$$bag"
              }
            },
            "right operand" :
            {
              "iterator kind" : "CONST",
              "value" : 1762352000000
            }
          }
        ]
      }
    },
    "FROM variable" : "$$bag",
    "SELECT expressions" : [
      {
        "field name" : "fullname",
        "field expression" :
        {
          "iterator kind" : "FIELD_STEP",
          "field name" : "fullname",
          "input iterator" :
          {
            "iterator kind" : "VAR_REF",
            "variable" : "$$bag"
          }
        }
      },
      {
        "field name" : "ticketNo",
        "field expression" :
        {
          "iterator kind" : "FIELD_STEP",
          "field name" : "ticketNo",
          "input iterator" :
          {
            "iterator kind" : "VAR_REF",
            "variable" : "$$bag"
          }
        }
      },
      {
        "field name" : "tagNum",
        "field expression" :
        {
          "iterator kind" : "ARRAY_CONSTRUCTOR",
          "conditional" : true,
          "input iterators" : [
            {
              "iterator kind" : "FIELD_STEP",
              "field name" : "tagNum",
              "input iterator" :
              {
                "iterator kind" : "ARRAY_FILTER",
                "input iterator" :
                {
                  "iterator kind" : "FIELD_STEP",
                  "field name" : "bagInfo",
                  "input iterator" :
                  {
                    "iterator kind" : "VAR_REF",
                    "variable" : "$$bag"
                  }
                }
              }
            }
          ]
        }
      },
      {
        "field name" : "routing",
        "field expression" :
        {
          "iterator kind" : "ARRAY_CONSTRUCTOR",
          "conditional" : true,
          "input iterators" : [
            {
              "iterator kind" : "FIELD_STEP",
              "field name" : "routing",
              "input iterator" :
              {
                "iterator kind" : "ARRAY_FILTER",
                "input iterator" :
                {
                  "iterator kind" : "FIELD_STEP",
                  "field name" : "bagInfo",
                  "input iterator" :
                  {
                    "iterator kind" : "VAR_REF",
                    "variable" : "$$bag"
                  }
                }
              }
            }
          ]
        }
      }
    ]
  }
}
Explanation:
  • The root iterator of this query plan is a RECEIVE iterator with a single child (input iterator) that is a SELECT iterator.
  • The value of the FROM field is an iterator; in this case, it is a TABLE iterator.
  • The primary key index is used and the index is not covering (as you need to scan the table to fetch columns other than the index entries).
  • The index filtering predicate evaluates the filter criteria on the ticketNo field. Using the greater than and less than operators the filter condition is evaluated.
  • The FROM variable is the name of a variable ranging over the records produced by the FROM iterator. Here the FROM iterator is a TABLE iterator, and the FROM variable ($$bag) is the same as the row variable of the TABLE iterator, as the index used is not covering.
  • In the SELECT expression four fields (fullname,ticketNo,bag.bagInfo[].tagNum, bag.bagInfo[].routing) are fetched. These correspond to four field names and field expressions in the SELECT expression clause. For the first two fields, the field expression is computed using FIELD_STEP iterator. For the last 2 fields, an ARRAY_CONSTRUCTOR iterator is used which iterates over the corresponding arrays to fetch the field value.

Query 3: Using a secondary index with an index range scan

Fetch the bag details for a particular reservation code.
SELECT fullName,bag.ticketNo, bag.confNo, bag.bagInfo[].tagNum,
bag.bagInfo[].routing FROM BaggageInfo bag WHERE bag.confNo="FH7G1W"
Plan:
{
  "iterator kind" : "RECEIVE",
  "distribution kind" : "ALL_SHARDS",
  "input iterator" :
  {
    "iterator kind" : "SELECT",
    "FROM" :
    {
      "iterator kind" : "TABLE",
      "target table" : "BaggageInfo",
      "row variable" : "$$bag",
      "index used" : "fixedschema_conf",
      "covering index" : false,
      "index scans" : [
        {
          "equality conditions" : {"confNo":"FH7G1W"},
          "range conditions" : {}
        }
      ]
    },
    "FROM variable" : "$$bag",
    "SELECT expressions" : [
      {
        "field name" : "fullName",
        "field expression" :
        {
          "iterator kind" : "FIELD_STEP",
          "field name" : "fullName",
          "input iterator" :
          {
            "iterator kind" : "VAR_REF",
            "variable" : "$$bag"
          }
        }
      },
      {
        "field name" : "ticketNo",
        "field expression" :
        {
          "iterator kind" : "FIELD_STEP",
          "field name" : "ticketNo",
          "input iterator" :
          {
            "iterator kind" : "VAR_REF",
            "variable" : "$$bag"
          }
        }
      },
      {
        "field name" : "confNo",
        "field expression" :
        {
          "iterator kind" : "FIELD_STEP",
          "field name" : "confNo",
          "input iterator" :
          {
            "iterator kind" : "VAR_REF",
            "variable" : "$$bag"
          }
        }
      },
      {
        "field name" : "tagNum",
        "field expression" :
        {
          "iterator kind" : "ARRAY_CONSTRUCTOR",
          "conditional" : true,
          "input iterators" : [
            {
              "iterator kind" : "FIELD_STEP",
              "field name" : "tagNum",
              "input iterator" :
              {
                "iterator kind" : "ARRAY_FILTER",
                "input iterator" :
                {
                  "iterator kind" : "FIELD_STEP",
                  "field name" : "bagInfo",
                  "input iterator" :
                  {
                    "iterator kind" : "VAR_REF",
                    "variable" : "$$bag"
                  }
                }
              }
            }
          ]
        }
      },
      {
        "field name" : "routing",
        "field expression" :
        {
          "iterator kind" : "ARRAY_CONSTRUCTOR",
          "conditional" : true,
          "input iterators" : [
            {
              "iterator kind" : "FIELD_STEP",
              "field name" : "routing",
              "input iterator" :
              {
                "iterator kind" : "ARRAY_FILTER",
                "input iterator" :
                {
                  "iterator kind" : "FIELD_STEP",
                  "field name" : "bagInfo",
                  "input iterator" :
                  {
                    "iterator kind" : "VAR_REF",
                    "variable" : "$$bag"
                  }
                }
              }
            }
          ]
        }
      }
    ]
  }
}
Explanation:
  • The root iterator of this query plan is a RECEIVE iterator with a single child (input iterator) that is a SELECT iterator.
  • The value of the FROM field is an iterator; in this case, it is a TABLE iterator.
  • The index fixedschema_conf is used and the index is not covering (as you need to scan the table to fetch columns other than the index entries).
  • The index scan property contains the start and stop conditions that define the index scans to be performed.
  • The FROM variable is the name of a variable ranging over the records produced by the FROM iterator. Here the FROM iterator is a TABLE iterator, and the FROM variable ($$bag) is the same as the row variable of the TABLE iterator, as the index used is not covering.
  • In the SELECT expression four fields (fullname,ticketNo,confNo,bag.bagInfo[].tagNum,bag.bagInfo[].routing) are fetched. These correspond to five field names and field expressions in the SELECT expression clause. For the first three fields, the field expression is computed using FIELD_STEP iterator. For the last two fields, an ARRAY_CONSTRUCTOR iterator is used which iterates over the corresponding arrays to fetch the field value.

Query 4: Using the primary index

Fetch the name and routing details of all male passengers.
SELECT fullname,bag.bagInfo[].routing FROM BaggageInfo bag 
WHERE gender!="F"
Plan:
{
  "iterator kind" : "RECEIVE",
  "distribution kind" : "ALL_PARTITIONS",
  "input iterator" :
  {
    "iterator kind" : "SELECT",
    "FROM" :
    {
      "iterator kind" : "TABLE",
      "target table" : "BaggageInfo",
      "row variable" : "$$bag",
      "index used" : "primary index",
      "covering index" : false,
      "index scans" : [
        {
          "equality conditions" : {},
          "range conditions" : {}
        }
      ]
    },
    "FROM variable" : "$$bag",
    "WHERE" :
    {
      "iterator kind" : "NOT_EQUAL",
      "left operand" :
      {
        "iterator kind" : "FIELD_STEP",
        "field name" : "gender",
        "input iterator" :
        {
          "iterator kind" : "VAR_REF",
          "variable" : "$$bag"
        }
      },
      "right operand" :
      {
        "iterator kind" : "CONST",
        "value" : "F"
      }
    },
    "SELECT expressions" : [
      {
        "field name" : "fullname",
        "field expression" :
        {
          "iterator kind" : "FIELD_STEP",
          "field name" : "fullname",
          "input iterator" :
          {
            "iterator kind" : "VAR_REF",
            "variable" : "$$bag"
          }
        }
      },
      {
        "field name" : "routing",
        "field expression" :
        {
          "iterator kind" : "ARRAY_CONSTRUCTOR",
          "conditional" : true,
          "input iterators" : [
            {
              "iterator kind" : "FIELD_STEP",
              "field name" : "routing",
              "input iterator" :
              {
                "iterator kind" : "ARRAY_FILTER",
                "input iterator" :
                {
                  "iterator kind" : "FIELD_STEP",
                  "field name" : "bagInfo",
                  "input iterator" :
                  {
                    "iterator kind" : "VAR_REF",
                    "variable" : "$$bag"
                  }
                }
              }
            }
          ]
        }
      }
    ]
  }
}
Explanation:
  • The root iterator of this query plan is a RECEIVE iterator with a single child (input iterator) that is a SELECT iterator.
  • The value of the FROM field is an iterator; in this case, it is a TABLE iterator.
  • The primary key index is used and the index is not covering ( as you need to scan the table to fetch columns other than the index entries).
  • The FROM variable is the name of a variable ranging over the records produced by the FROM iterator. Here the FROM iterator is a TABLE iterator, and the FROM variable ($$bag) is the same as the row variable of the TABLE iterator, as the index used is not covering.
  • In the SELECT expression two fields (fullname,bag.bagInfo[].routing) are fetched. These correspond to two field names and field expressions in the SELECT expression clause. For the first field, the field expression is computed using FIELD_STEP iterator. For the second field, an ARRAY_CONSTRUCTOR iterator is used which iterates over the corresponding array to fetch the field value.

Query 5: Sort the data using a Covering index

Fetch the name and phone number of all passengers.
SELECT bag.contactPhone, bag.fullName FROM BaggageInfo bag 
ORDER BY bag.fullName
Plan:
{
  "iterator kind" : "RECEIVE",
  "distribution kind" : "ALL_SHARDS",
  "order by fields at positions" : [ 1 ],
  "input iterator" :
  {
    "iterator kind" : "SELECT",
    "FROM" :
    {
      "iterator kind" : "TABLE",
      "target table" : "BaggageInfo",
      "row variable" : "$$bag",
      "index used" : "compindex_namephone",
      "covering index" : true,
      "index row variable" : "$$bag_idx",
      "index scans" : [
        {
          "equality conditions" : {},
          "range conditions" : {}
        }
      ]
    },
    "FROM variable" : "$$bag_idx",
    "SELECT expressions" : [
      {
        "field name" : "contactPhone",
        "field expression" :
        {
          "iterator kind" : "FIELD_STEP",
          "field name" : "contactPhone",
          "input iterator" :
          {
            "iterator kind" : "VAR_REF",
            "variable" : "$$bag_idx"
          }
        }
      },
      {
        "field name" : "fullName",
        "field expression" :
        {
          "iterator kind" : "FIELD_STEP",
          "field name" : "fullName",
          "input iterator" :
          {
            "iterator kind" : "VAR_REF",
            "variable" : "$$bag_idx"
          }
        }
      }
    ]
  }
}
Explanation:
  • The root iterator of this query plan is a RECEIVE iterator with a single child (input iterator) that is a SELECT iterator. The only property of the RECEIVE iterator in this example is the distribution kind whose value is ALL_SHARDS.
  • The results need to be sorted by fullName. The fullName is part of the compindex_namephone index. So in this example, you don't need a separate SORT operator. The sorting is done by the RECEIVE operator using its property order by fields at positions, which is an array. The value of this array depends on the position of the field which is sorted in the SELECT expression.
    "order by fields at positions" : [ 1 ]
  • In this example, the order by is done using the fullName which is the second field in the SELECT expression. That is why you see 1 in the order by fields at position property of the iterator.
  • The index compindex_namephone is used here and in this example, it is a covering index as the query can be evaluated using only the entries of the index.
  • The index row variable is $$bag_idx which is the name of a variable ranging over the index entries produced by the TABLE iterator. Every time a new index entry is produced by the index scan, the $$bag_idx variable will be bound to that entry.
  • When the FROM iterator is a TABLE iterator, the FROM variable is the same as either the index row variable or the row variable of the TABLE iterator, depending on whether the used index is covering or not. In this example, the FROM variable is the same as the index row variable ($$bag_idx) as the index is covering.
  • This index row variable ($$bag_idx) will be referenced by iterators implementing the other clauses of the SELECT expression.
  • In the SELECT expression two fields ( contactPhone,fullName) are fetched. These correspond to two field names and field expressions in the SELECT expression clause. For both fields, the field expression is computed using FIELD_STEP iterator.

Query 6: Using a secondary index with an index predicate

Fetch the name, ticket number, and arrival date of passengers whose arrival date is greater than a given value.
SELECT fullName, bag.ticketNo, bag.bagInfo[].bagArrivalDate 
FROM BaggageInfo bag WHERE EXISTS 
bag.bagInfo[$element.bagArrivalDate >="2019-01-01T00:00:00"]
Plan:
{
  "iterator kind" : "RECEIVE",
  "distribution kind" : "ALL_SHARDS",
  "distinct by fields at positions" : [ 1 ],
  "input iterator" :
  {
    "iterator kind" : "SELECT",
    "FROM" :
    {
      "iterator kind" : "TABLE",
      "target table" : "BaggageInfo",
      "row variable" : "$$bag",
      "index used" : "simpleindex_arrival",
      "covering index" : false,
      "index row variable" : "$$bag_idx",
      "index scans" : [
        {
          "equality conditions" : {},
          "range conditions" : {}
        }
      ],
      "index filtering predicate" :
      {
        "iterator kind" : "GREATER_OR_EQUAL",
        "left operand" :
        {
          "iterator kind" : "FIELD_STEP",
          "field name" : "bagInfo[].bagArrivalDate",
          "input iterator" :
          {
            "iterator kind" : "VAR_REF",
            "variable" : "$$bag_idx"
          }
        },
        "right operand" :
        {
          "iterator kind" : "CONST",
          "value" : "2019-01-01T00:00:00"
        }
      }
    },
    "FROM variable" : "$$bag",
    "SELECT expressions" : [
      {
        "field name" : "fullName",
        "field expression" :
        {
          "iterator kind" : "FIELD_STEP",
          "field name" : "fullName",
          "input iterator" :
          {
            "iterator kind" : "VAR_REF",
            "variable" : "$$bag"
          }
        }
      },
      {
        "field name" : "ticketNo",
        "field expression" :
        {
          "iterator kind" : "FIELD_STEP",
          "field name" : "ticketNo",
          "input iterator" :
          {
            "iterator kind" : "VAR_REF",
            "variable" : "$$bag"
          }
        }
      },
      {
        "field name" : "bagArrivalDate",
        "field expression" :
        {
          "iterator kind" : "ARRAY_CONSTRUCTOR",
          "conditional" : true,
          "input iterators" : [
            {
              "iterator kind" : "FIELD_STEP",
              "field name" : "bagArrivalDate",
              "input iterator" :
              {
                "iterator kind" : "ARRAY_FILTER",
                "input iterator" :
                {
                  "iterator kind" : "FIELD_STEP",
                  "field name" : "bagInfo",
                  "input iterator" :
                  {
                    "iterator kind" : "VAR_REF",
                    "variable" : "$$bag"
                  }
                }
              }
            }
          ]
        }
      }
    ]
  }
}
Explanation:
  • The root iterator of this query plan is a RECEIVE iterator with a single child (input iterator) that is a SELECT iterator.
  • The value of the FROM field is an iterator; in this case, it is a TABLE iterator.
  • The EXISTS condition is actually converted to a filtering predicate. There is one filtering predicate which is the whole WHERE expression. The index simpleindex_arrival is the only one applicable here and is used.
  • The index filtering predicate evaluates the filter criteria on the bagArrivalDate field. Using the greater than and less than operators the filter condition is evaluated.
  • The FROM variable is the name of a variable ranging over the records produced by the FROM iterator. Here the FROM iterator is a TABLE iterator, and the FROM variable ($$bag)is the same as the row variable of the TABLE iterator, as the index used is not covering.
  • In the SELECT expression three fields (fullname, ticketNo,bag.bagInfo[].bagArrivalDat) are fetched. These correspond to three field names and field expressions in the SELECT expression clause. For the first two fields, the field expression is computed using FIELD_STEP iterator. For the last field, an ARRAY_CONSTRUCTOR iterator is used which iterates over the corresponding arrays to fetch the field value.

Query 7: Group data with fields as part of the index

Fetch the reservation code and count of bags for all passengers.
SELECT bag.confNo, count(bag.bagInfo) AS TOTAL_BAGS 
FROM BaggageInfo bag GROUP BY bag.confNo;
Plan:
{
  "iterator kind" : "SELECT",
  "FROM" :
  {
    "iterator kind" : "RECEIVE",
    "distribution kind" : "ALL_SHARDS",
    "order by fields at positions" : [ 0 ],
    "input iterator" :
    {
      "iterator kind" : "SELECT",
      "FROM" :
      {
        "iterator kind" : "TABLE",
        "target table" : "BaggageInfo",
        "row variable" : "$$bag",
        "index used" : "fixedschema_conf",
        "covering index" : false,
        "index scans" : [
          {
            "equality conditions" : {},
            "range conditions" : {}
          }
        ]
      },
      "FROM variable" : "$$bag",
      "GROUP BY" : "Grouping by the first expression in the SELECT list",
      "SELECT expressions" : [
        {
          "field name" : "confNo",
          "field expression" :
          {
            "iterator kind" : "FIELD_STEP",
            "field name" : "confNo",
            "input iterator" :
            {
              "iterator kind" : "VAR_REF",
              "variable" : "$$bag"
            }
          }
        },
        {
          "field name" : "TOTAL_BAGS",
          "field expression" :
          {
            "iterator kind" : "FN_COUNT",
            "input iterator" :
            {
              "iterator kind" : "FIELD_STEP",
              "field name" : "bagInfo",
              "input iterator" :
              {
                "iterator kind" : "VAR_REF",
                "variable" : "$$bag"
              }
            }
          }
        }
      ]
    }
  },
  "FROM variable" : "$from-1",
  "GROUP BY" : "Grouping by the first expression in the SELECT list",
  "SELECT expressions" : [
    {
      "field name" : "confNo",
      "field expression" :
      {
        "iterator kind" : "FIELD_STEP",
        "field name" : "confNo",
        "input iterator" :
        {
          "iterator kind" : "VAR_REF",
          "variable" : "$from-1"
        }
      }
    },
    {
      "field name" : "TOTAL_BAGS",
      "field expression" :
      {
        "iterator kind" : "FUNC_SUM",
        "input iterator" :
        {
          "iterator kind" : "FIELD_STEP",
          "field name" : "TOTAL_BAGS",
          "input iterator" :
          {
            "iterator kind" : "VAR_REF",
            "variable" : "$from-1"
          }
        }
      }
    }
  ]
}
Explanation:
  • In this query, you group all bags based on the confNo of the users and determine the total count of bags belonging to each confNo.
  • The group-by is index-based, that is the group-by field (confNo) is also part of the index used. This is indicated by the lack of any GROUP iterators. Instead, the grouping is done by the SELECT iterators.
  • There are two SELECT iterators, the inner one has a GROUP BY property that specifies which of the SELECT-clause expressions are also grouping expressions. Here the group by fields is the first expression in the SELECT list (bag.confNo).
    "GROUP BY" : "Grouping by the first expression in the SELECT list"
  • The index fixedschema_conf is used here and in this example, it is a non-covering index as the query also needs to fetch count(bag.bagInfo)which is outside of the entries of the index.
  • When the FROM iterator is a TABLE iterator, the FROM variable is the same as either the index row variable or the row variable of the TABLE iterator, depending on whether the used index is covering or not. In this example, the inner FROM variable is the same as the row variable ($$bag) as the index is not covering.
  • In the SELECT expression two fields are fetched: bag.confNo,count(bag.bagInfo). These correspond to two field names and field expressions in the SELECT expression clause.
  • The results returned by the inner SELECT iterators from the various RNs are partial groups, because rows with the same bag.confNo may exist at multiple RNs. So, regrouping and re-aggregation have to be performed by the driver. This is done by the outer SELECT iterator (above the RECEIVE iterator).
  • The result is also sorted by confNo. The order by fields at positions property specifies the field used for sorting. The value of this array depends on the position of the field which is sorted in the SELECT expression. In this example bag.confNo is the first field in the SELECT expression. So order by fields at positions takes an array index of 0.
    "order by fields at positions" : [ 0 ]
  • In the outer SELECT expression, two fields are fetched: bag.confNo,count(bag.bagInfo). The $from-1 FROM variable will be referenced by iterators implementing the other clauses of the outer SELECT expression. These correspond to two field names and field expressions in the outer SELECT expression clause. For the first field, the field expression uses FIELD_STEP iterator. The second field is the aggregate function count. The iterator FUNC_SUMis used to iterate the result produced by its parent iterator and determine the total number of bags.

Query 8: Using the secondary index with multiple index scans

Fetch the full name and tag number of passengers who are in the given list of names.
SELECT bagdet.fullName, bagdet.bagInfo[].tagNum 
FROM BaggageInfo bagdet WHERE bagdet.fullName IN 
("Lucinda Beckman", "Adam Phillips",
"Zina Christenson","Fallon Clements");
Plan:
{
  "iterator kind" : "SELECT",
  "FROM" :
  {
     "iterator kind" : "RECEIVE",
     "distribution kind" : "ALL_SHARDS",
     "order by fields at positions" : [ 0 ],
     "input iterator" :
     {
       "iterator kind" : "SELECT",
       "FROM" :
       {
         "iterator kind" : "TABLE",
         "target table" : "BaggageInfo",
         "row variable" : "$$bag",
         "index used" : "fixedschema_conf",
         "covering index" : false,
         "index scans" : [
           {
             "equality conditions" : {},
             "range conditions" : {}
           }
         ]
      },
      "FROM variable" : "$$bag",
      "GROUP BY" : "Grouping by the first expression in the SELECT list",
      "SELECT expressions" : [
         {
           "field name" : "confNo",
           "field expression" :
           {
             "iterator kind" : "FIELD_STEP",
             "field name" : "confNo",
             "input iterator" :
             {
               "iterator kind" : "VAR_REF",
               "variable" : "$$bag"
             }
           }
         },
         {
           "field name" : "TOTAL_BAGS",
           "field expression" :
           {
             "iterator kind" : "FN_COUNT",
             "input iterator" :
             {
               "iterator kind" : "FIELD_STEP",
               "field name" : "bagInfo",
               "input iterator" :
               {
                 "iterator kind" : "VAR_REF",
                 "variable" : "$$bag"
               }
             }
           }
         }
       ]
     }
   },
   "FROM variable" : "$from-1",
   "GROUP BY" : "Grouping by the first expression in the SELECT list",
   "SELECT expressions" : [
     {
       "field name" : "confNo",
       "field expression" :
       {
         "iterator kind" : "FIELD_STEP",
         "field name" : "confNo",
         "input iterator" :
         {
           "iterator kind" : "VAR_REF",
           "variable" : "$from-1"
         }
       }
     },
     {
       "field name" : "TOTAL_BAGS",
       "field expression" :
       {
         "iterator kind" : "FUNC_SUM",
         "input iterator" :
         {
           "iterator kind" : "FIELD_STEP",
           "field name" : "TOTAL_BAGS",
           "input iterator" :
           {
             "iterator kind" : "VAR_REF",
             "variable" : "$from-1"
           }
         }
       }
     }
   ]
}
Explanation:
  • The root iterator of this query plan is a RECEIVE iterator with a single child (input iterator) that is a SELECT iterator.
  • The value of the FROM field is an iterator; in this case, it is a TABLE iterator.
  • The index compindex_namephone is used and the index is not covering ( as you need to scan the table to fetch columns other than the index entries).
  • Every value in the IN clause is evaluated using an index scan with an equality condition. There are four index scans that are performed each evaluating one equality condition.
  • The FROM variable is the name of a variable ranging over the records produced by the FROM iterator. Here the FROM iterator is a TABLE iterator, and the FROM variable ($$bagdet) is the same as the row variable of the TABLE iterator, as the index used is not covering.
  • In the SELECT expression two fields (fullname, bag.bagInfo[].tagNum) are fetched. These correspond to two field names and field expressions in the SELECT expression clause. For the first field, the field expression is computed using FIELD_STEP iterator. For the second field, an ARRAY_CONSTRUCTOR iterator is used which iterates over the corresponding arrays to fetch the field value.

Query 9: A SINGLE PARTITION query using a primary index

Select the ticket details (ticket number, reservation code, tag number, and routing) for a passenger with a specific ticket number and reservation code.
SELECT fullName,bag.ticketNo, bag.confNo, bag.bagInfo[].tagNum,
bag.bagInfo[].routing FROM BaggageInfo bag WHERE
bag.ticketNo=1762311547917 AND bag.confNo="FH7G1W"
Plan:
{
  "iterator kind" : "RECEIVE",
  "distribution kind" : "SINGLE_PARTITION",
  "input iterator" :
  {
    "iterator kind" : "SELECT",
    "FROM" :
    {
      "iterator kind" : "TABLE",
      "target table" : "BaggageInfo",
      "row variable" : "$$bag",
      "index used" : "primary index",
      "covering index" : false,
      "index scans" : [
        {
          "equality conditions" : {"ticketNo":1762311547917},
          "range conditions" : {}
        }
      ]
    },
    "FROM variable" : "$$bag",
    "WHERE" :
    {
      "iterator kind" : "EQUAL",
      "left operand" :
      {
        "iterator kind" : "FIELD_STEP",
        "field name" : "confNo",
        "input iterator" :
        {
          "iterator kind" : "VAR_REF",
          "variable" : "$$bag"
        }
      },
      "right operand" :
      {
        "iterator kind" : "CONST",
        "value" : "FH7G1W"
      }
    },
    "SELECT expressions" : [
      {
        "field name" : "fullName",
        "field expression" :
        {
          "iterator kind" : "FIELD_STEP",
          "field name" : "fullName",
          "input iterator" :
          {
            "iterator kind" : "VAR_REF",
            "variable" : "$$bag"
          }
        }
      },
      {
        "field name" : "ticketNo",
        "field expression" :
        {
          "iterator kind" : "FIELD_STEP",
          "field name" : "ticketNo",
          "input iterator" :
          {
            "iterator kind" : "VAR_REF",
            "variable" : "$$bag"
          }
        }
      },
      {
        "field name" : "confNo",
        "field expression" :
        {
          "iterator kind" : "FIELD_STEP",
          "field name" : "confNo",
          "input iterator" :
          {
            "iterator kind" : "VAR_REF",
            "variable" : "$$bag"
          }
        }
      },
      {
        "field name" : "tagNum",
        "field expression" :
        {
          "iterator kind" : "ARRAY_CONSTRUCTOR",
          "conditional" : true,
          "input iterators" : [
            {
              "iterator kind" : "FIELD_STEP",
              "field name" : "tagNum",
              "input iterator" :
              {
                "iterator kind" : "ARRAY_FILTER",
                "input iterator" :
                {
                  "iterator kind" : "FIELD_STEP",
                  "field name" : "bagInfo",
                  "input iterator" :
                  {
                    "iterator kind" : "VAR_REF",
                    "variable" : "$$bag"
                  }
                }
              }
            }
          ]
        }
      },
      {
        "field name" : "routing",
        "field expression" :
        {
          "iterator kind" : "ARRAY_CONSTRUCTOR",
          "conditional" : true,
          "input iterators" : [
            {
              "iterator kind" : "FIELD_STEP",
              "field name" : "routing",
              "input iterator" :
              {
                "iterator kind" : "ARRAY_FILTER",
                "input iterator" :
                {
                  "iterator kind" : "FIELD_STEP",
                  "field name" : "bagInfo",
                  "input iterator" :
                  {
                    "iterator kind" : "VAR_REF",
                    "variable" : "$$bag"
                  }
                }
              }
            }
          ]
        }
      }
    ]
  }
}
Explanation:
  • The root iterator of this query plan is a RECEIVE iterator with a single child (input iterator) that is a SELECT iterator.
  • This query specifies a complete shard key in its WHERE clause. As a result, its full result set is contained in a single partition, and the RECEIVE iterator will send its subplan to a single RN that stores that partition.
  • The value of the FROM field is an iterator; in this case, it is a TABLE iterator.
  • A SINGLE_PARTITION query can reference a primary index or a secondary index. The primary key index is used in this example. The index is not covering ( as you need to scan the table to fetch columns other than the index entries).
  • The index scan property contains the start and stop conditions that define the index scans to be performed.
  • The FROM variable is the name of a variable ranging over the records produced by the FROM iterator. Here the FROM iterator is a TABLE iterator, and the FROM variable ($$bag) is the same as the row variable of the TABLE iterator, as the index used is not covering.
  • In the SELECT expression five fields (fullname, ticketNo,confNo, bag.bagInfo[].tagNum,bag.bagInfo[].routing) are fetched. These correspond to five field names and field expressions in the SELECT expression clause. For the first three fields, the field expression is computed using FIELD_STEP iterator. For the last 2 fields, an ARRAY_CONSTRUCTOR iterator is used which iterates over the corresponding arrays to fetch the field value.

Query 10: Group data with fields not part of any index

Fetch the source of passenger bags and the count of bags for all passengers and group the data by the source.
SELECT $flt_src as SOURCE, count(*) as COUNT FROM BaggageInfo $bag,
$bag.bagInfo.flightLegs[0].fltRouteSrc $flt_src GROUP BY $flt_src
Plan:
{
  "iterator kind" : "GROUP",
  "input variable" : "$gb-2",
  "input iterator" :
  {
    "iterator kind" : "RECEIVE",
    "distribution kind" : "ALL_PARTITIONS",
    "input iterator" :
    {
      "iterator kind" : "GROUP",
      "input variable" : "$gb-1",
      "input iterator" :
      {
        "iterator kind" : "SELECT",
        "FROM" :
        {
          "iterator kind" : "TABLE",
          "target table" : "BaggageInfo",
          "row variable" : "$bag",
          "index used" : "primary index",
          "covering index" : false,
          "index scans" : [
            {
              "equality conditions" : {},
              "range conditions" : {}
            }
          ]
        },
        "FROM variable" : "$bag",
        "FROM" :
        {
          "iterator kind" : "FIELD_STEP",
          "field name" : "fltRouteSrc",
          "input iterator" :
          {
            "iterator kind" : "ARRAY_SLICE",
            "low bound" : 0,
            "high bound" : 0,
            "input iterator" :
            {
              "iterator kind" : "FIELD_STEP",
              "field name" : "flightLegs",
              "input iterator" :
              {
                "iterator kind" : "FIELD_STEP",
                "field name" : "bagInfo",
                "input iterator" :
                {
                  "iterator kind" : "VAR_REF",
                  "variable" : "$bag"
                }
              }
            }
          }
        },
        "FROM variable" : "$flt_src",
        "SELECT expressions" : [
          {
            "field name" : "SOURCE",
            "field expression" :
            {
              "iterator kind" : "VAR_REF",
              "variable" : "$flt_src"
            }
          },
          {
            "field name" : "COUNT",
            "field expression" :
            {
              "iterator kind" : "CONST",
              "value" : 1
            }
          }
        ]
      },
      "grouping expressions" : [
        {
          "iterator kind" : "FIELD_STEP",
          "field name" : "SOURCE",
          "input iterator" :
          {
            "iterator kind" : "VAR_REF",
            "variable" : "$gb-1"
          }
        }
      ],
      "aggregate functions" : [
        {
          "iterator kind" : "FUNC_COUNT_STAR"
        }
      ]
    }
  },
  "grouping expressions" : [
    {
      "iterator kind" : "FIELD_STEP",
      "field name" : "SOURCE",
      "input iterator" :
      {
        "iterator kind" : "VAR_REF",
        "variable" : "$gb-2"
      }
    }
  ],
  "aggregate functions" : [
    {
      "iterator kind" : "FUNC_SUM",
      "input iterator" :
      {
        "iterator kind" : "FIELD_STEP",
        "field name" : "COUNT",
        "input iterator" :
        {
          "iterator kind" : "VAR_REF",
          "variable" : "$gb-2"
        }
      }
    }
  ]
}
Explanation:
  • In this query, you group passenger bags based on the flight source and determine the total number of bags belonging to one flight source.
  • As the GROUP BY field (bagInfo.flightLegs[0].fltRouteSrc in this example) is not part of any index, you need a separate GROUP operator to do the grouping. This is indicated by the existence of the GROUP iterators in the execution plan. There are two GROUP iterators: one that operates at the driver (above the RECEIVE iterator) and another that operates at the RNs (below the RECEIVE iterator).
  • The lower GROUP iterator has a SELECT iterator as input. The SELECT returns the fltRouteSrc and count of bags. The GROUP iterator will operate until the batch limit is reached. If the batch limit is defined as the max number N of results produced, the GROUP iterator will stop when up to N flight source groups have been created. If the batch limit is defined as the max number of bytes read, it will stop when this max is reached. The GROUP operator has an input variable. For the inner GROUP operator, the input variable is $gb-1 and for the outer GROUP operator it is $gb-2.
    "iterator kind" : "GROUP","input variable" : "$gb-1",
  • The primary key index is used here and in this example, it is not a covering index as the query has fields that are not part of the entries of the primary index.
  • When the FROM iterator is a TABLE iterator, the FROM variable is the same as either the index row variable or the row variable of the TABLE iterator, depending on whether the used index is covering or not. Every time a next() call on the FROM iterator returns true, the variable will be bound to the result produced by that iterator. In this example, the FROM variable is the row variable as the index is not covering.
  • This row variable ($bag) will be referenced by iterators implementing the other clauses of the inner SELECT expression.
  • The GROUP iterator creates an internal variable ($gb-1) that iterates over the records produced by the SELECT expression.
  • The result set produced by the lower GROUP iterator is partial: it may not contain all the fltRouteSrc groups and for the fltRouteSrc groups that it does contain, the count may be a partial sum (because all rows for a given fltRouteSrc may not have been retrieved when query execution stops). The upper GROUP iterator receives the partial results from each RN and performs the final grouping and aggregation. It operates the same way as the lower GROUP iterators and will keep operating until the are no more partial results from the RNs. At that point, the full and final result set is cached at the upper GROUP iterator and is returned to the application.
  • The upper GROUP iterator creates an internal variable ($gb-2) that iterates over the records produced by the outer SELECT. The $gb-2 variable has the fltRouteSrc and count of all bags grouped by fltRouteSrc.
  • In the SELECT expression, two fields are fetched: fltRouteSrc,count(*). These correspond to two field names and field expressions in the SELECT expression clause. For the first field, the field expression uses FIELD_STEP iterator. The second field is the aggregate function count. The iterator FUNC_SUM is used to iterate the result produced by its parent iterator and determine the total number of bags.