Query execution plan
A query execution plan is the sequence of operations Oracle NoSQL Database performs to run a query.
- Overview of query plan
- Query 1: Using primary key index with an index range scan
- Query 2: Using primary key index with an index predicate
- Query 3: Using a secondary index with an index range scan
- Query 4: Using the primary index
- Query 5: Sort the data using a Covering index
- Query 6: Using a secondary index with an index predicate
- Query 7: Group data with fields as part of the index
- Query 8: Using the secondary index with multiple index scans
- Query 9: A SINGLE PARTITION query using a primary index
- Query 10: Group data with fields not part of any index
Overview of query plan
A query execution plan is internally structured as a tree of plan iterators.
Each kind of iterator evaluates a different kind of expression that may appear in a query. In general, the choice of index and the kind of associated index predicates can have a drastic effect on query performance. As a result, you as a developer often want to see what index is used by a query and what predicates have been pushed down to it. Based on this information, you may want to force the use of a different index via index hints. This information is contained in the query execution plan. All Oracle NoSQL drivers provide APIs to display the execution plan of a query. All Oracle NoSQL graphical UIs including the IntelliJ, VSCode, and Eclipse plugins along with the Oracle Cloud Infrastructure Console include controls for displaying the query execution plan.
Some of the most common and important iterators used in queries are :
TABLE iterator
- Scanning the index used by the query (which may be the primary index).
- Applying any filtering predicates pushed to the index
- Retrieve the rows pointed to by the qualifying index entries if necessary. If the index is covering, the result set of the TABLE iterator is a set of index entries, otherwise, it is a set of table rows.
Note:
An index is called a covering index with respect to a query if the query can be evaluated using only the entries of that index, that is, without the need to retrieve the associated rows.- target table: The name of the target table in the query.
- index used: The name of the index used by the query. If the primary index were used, “primary index” would appear as the value of this property.
- covering index: Whether the index is covering or not.
- row variable: The name of a variable ranging over the table rows produced by the TABLE iterator. If the index is covering, no table rows are produced and this variable is not used.
- index scans: Contains the start and stop conditions that define the index scans to be performed.
- index row variable: The name of a variable ranging over the index entries produced by the TABLE iterator. Every time a new index entry is produced by the index scan, the index variable will be bound to that entry.
- index filtering predicate: A predicate evaluated on every index entry produced by the index scan. If the result of this evaluation is true, the index variable is bound to this entry and the entry or its associated table row is returned as the result of the next() call on the TABLE iterator. Otherwise, the entry is skipped, the next entry from the index scan is produced, the predicate is evaluated again on this entry and it continues until a qualifying entry is found.
SELECT iterator
It is responsible for executing the SELECT expression.
RECEIVE iterator
- The RECEIVE iterator itself and all iterators that are above it in the iterator tree are executed at the driver.
- All iterators below the RECEIVE iterator are executed at the replication nodes (RNs); these iterators form a subtree rooted at the unique child of the RECEIVE iterator.
In general, the RECEIVE iterator acts as a query coordinator. It sends its subplan to appropriate RNs for execution and collects the results. It may perform additional operations such as sorting and duplicate elimination and propagates the results to its ancestor iterators (if any) for further processing.
Distribution kinds
A distribution kind specifies how the query will be distributed for execution across the RNs participating in an Oracle NoSQL database (a store). The distribution kind is a property of the RECEIVE iterator.
- SINGLE_PARTITION: A SINGLE_PARTITION query specifies a complete shard key in its WHERE clause. As a result, its full result set is contained in a single partition, and the RECEIVE iterator will send its subplan to a single RN that stores that partition. A SINGLE_PARTITION query may use either the primary-key index or a secondary index.
- ALL_PARTITIONS: Queries use the primary-key index here and they don’t specify a complete shard key. As a result, if the store has M partitions, the RECEIVE iterator will send M copies of its subplan to be executed over one of the M partitions each.
- ALL_SHARDS: Queries use a secondary index here and they don’t specify a complete shard key. As a result, if the store has N shards, the RECEIVE iterator will send N copies of its subplan to be executed over one of the N shards each.
Populating the tables to view the query execution plan :
If you want to follow along with the examples, download the script baggageschema_loaddata.sql and run it as shown below. This script creates the table used in the example and loads data into the table.
java -jar lib/kvstore.jar kvlite -secure-config disable
java -jar lib/sql.jar -helper-hosts localhost:5000 -store kvstore
load
command, run the
script.load -file baggageschema_loaddata.sql
Creating indexes:
baggageInfo
table as shown
below.
- Create an index on passengers reservation
code.
CREATE INDEX fixedschema_conf ON baggageInfo confNo)
- Create an index on the full name and phone number of
passengers
CREATE INDEX compindex_namephone ON baggageInfo (fullName,contactPhone)
- Create an index on three fields, when the bag was last seen, the last seen
station, and the arrival date and
time.
CREATE INDEX simpleindex_arrival ON baggageInfo(bagInfo[].lastSeenTimeGmt as ANYATOMIC, bagInfo[].bagArrivalDate as ANYATOMIC, bagInfo[].lastSeenTimeStation as ANYATOMIC)
Query 1: Using primary key index with an index range scan
SELECT fullname, ticketNo,
bag.bagInfo[].tagNum,bag.bagInfo[].routing
FROM BaggageInfo bag WHERE
1762340000000 < ticketNo AND ticketNo < 1762352000000
{
"iterator kind" : "RECEIVE",
"distribution kind" : "ALL_PARTITIONS",
"input iterator" :
{
"iterator kind" : "SELECT",
"FROM" :
{
"iterator kind" : "TABLE",
"target table" : "BaggageInfo",
"row variable" : "$$bag",
"index used" : "primary index",
"covering index" : false,
"index scans" : [
{
"equality conditions" : {},
"range conditions" : { "ticketNo" : { "start value" : 1762340000000, "start inclusive" : false, "end value" : 1762352000000, "end inclusive" : false } }
}
]
},
"FROM variable" : "$$bag",
"SELECT expressions" : [
{
"field name" : "fullname",
"field expression" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "fullname",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
},
{
"field name" : "ticketNo",
"field expression" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "ticketNo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
},
{
"field name" : "tagNum",
"field expression" :
{
"iterator kind" : "ARRAY_CONSTRUCTOR",
"conditional" : true,
"input iterators" : [
{
"iterator kind" : "FIELD_STEP",
"field name" : "tagNum",
"input iterator" :
{
"iterator kind" : "ARRAY_FILTER",
"input iterator" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "bagInfo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
}
}
]
}
},
{
"field name" : "routing",
"field expression" :
{
"iterator kind" : "ARRAY_CONSTRUCTOR",
"conditional" : true,
"input iterators" : [
{
"iterator kind" : "FIELD_STEP",
"field name" : "routing",
"input iterator" :
{
"iterator kind" : "ARRAY_FILTER",
"input iterator" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "bagInfo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
}
}
]
}
}
]
}
}
- The root iterator of this query plan is a RECEIVE iterator with a single child (input iterator) that is a SELECT iterator.
- The value of the
FROM
field is an iterator; in this case, it is a TABLE iterator. - The primary key index is used and the index is not covering (as you need to scan the table to fetch columns other than the index entries).
- The index scan property contains the start and stop conditions that define the index scans to be performed.
- The FROM variable is the name of a variable ranging over the records
produced by the FROM iterator. Here the FROM iterator is a TABLE iterator, and the FROM
variable (
$$bag
)is the same as the row variable of the TABLE iterator, as the index used is not covering. - In the SELECT expression four fields
(
fullname,ticketNo,bag.bagInfo[].tagNum,bag.bagInfo[].routing
) are fetched. These correspond to four field names and field expressions in the SELECT expression clause. For the first two fields, the field expression is computed usingFIELD_STEP
iterator. For the last 2 fields, anARRAY_CONSTRUCTOR
iterator is used which iterates over the corresponding arrays to fetch the field value.
Query 2: Using primary key index with an index predicate
SELECT fullname, ticketNo, bag.bagInfo[].tagNum,bag.bagInfo[].routing
FROM BaggageInfo bag WHERE ticketNo > 1762340000000 OR ticketNo < 1762352000000;
{
"iterator kind" : "RECEIVE",
"distribution kind" : "ALL_PARTITIONS",
"input iterator" :
{
"iterator kind" : "SELECT",
"FROM" :
{
"iterator kind" : "TABLE",
"target table" : "BaggageInfo",
"row variable" : "$$bag",
"index used" : "primary index",
"covering index" : false,
"index scans" : [
{
"equality conditions" : {},
"range conditions" : {}
}
],
"index filtering predicate" :
{
"iterator kind" : "OR",
"input iterators" : [
{
"iterator kind" : "GREATER_THAN",
"left operand" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "ticketNo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
},
"right operand" :
{
"iterator kind" : "CONST",
"value" : 1762340000000
}
},
{
"iterator kind" : "LESS_THAN",
"left operand" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "ticketNo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
},
"right operand" :
{
"iterator kind" : "CONST",
"value" : 1762352000000
}
}
]
}
},
"FROM variable" : "$$bag",
"SELECT expressions" : [
{
"field name" : "fullname",
"field expression" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "fullname",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
},
{
"field name" : "ticketNo",
"field expression" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "ticketNo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
},
{
"field name" : "tagNum",
"field expression" :
{
"iterator kind" : "ARRAY_CONSTRUCTOR",
"conditional" : true,
"input iterators" : [
{
"iterator kind" : "FIELD_STEP",
"field name" : "tagNum",
"input iterator" :
{
"iterator kind" : "ARRAY_FILTER",
"input iterator" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "bagInfo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
}
}
]
}
},
{
"field name" : "routing",
"field expression" :
{
"iterator kind" : "ARRAY_CONSTRUCTOR",
"conditional" : true,
"input iterators" : [
{
"iterator kind" : "FIELD_STEP",
"field name" : "routing",
"input iterator" :
{
"iterator kind" : "ARRAY_FILTER",
"input iterator" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "bagInfo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
}
}
]
}
}
]
}
}
- The root iterator of this query plan is a RECEIVE iterator with a single child (input iterator) that is a SELECT iterator.
- The value of the
FROM
field is an iterator; in this case, it is a TABLE iterator. - The primary key index is used and the index is not covering (as you need to scan the table to fetch columns other than the index entries).
- The
index filtering predicate
evaluates the filter criteria on theticketNo
field. Using the greater than and less than operators the filter condition is evaluated. - The FROM variable is the name of a variable ranging over the
records produced by the FROM iterator. Here the FROM
iterator is a TABLE iterator, and the FROM variable
(
$$bag
) is the same as the row variable of the TABLE iterator, as the index used is not covering. - In the SELECT expression four fields
(
fullname,ticketNo,bag.bagInfo[].tagNum, bag.bagInfo[].routing
) are fetched. These correspond to four field names and field expressions in the SELECT expression clause. For the first two fields, the field expression is computed usingFIELD_STEP
iterator. For the last 2 fields, anARRAY_CONSTRUCTOR
iterator is used which iterates over the corresponding arrays to fetch the field value.
Query 3: Using a secondary index with an index range scan
SELECT fullName,bag.ticketNo, bag.confNo, bag.bagInfo[].tagNum,
bag.bagInfo[].routing FROM BaggageInfo bag WHERE bag.confNo="FH7G1W"
{
"iterator kind" : "RECEIVE",
"distribution kind" : "ALL_SHARDS",
"input iterator" :
{
"iterator kind" : "SELECT",
"FROM" :
{
"iterator kind" : "TABLE",
"target table" : "BaggageInfo",
"row variable" : "$$bag",
"index used" : "fixedschema_conf",
"covering index" : false,
"index scans" : [
{
"equality conditions" : {"confNo":"FH7G1W"},
"range conditions" : {}
}
]
},
"FROM variable" : "$$bag",
"SELECT expressions" : [
{
"field name" : "fullName",
"field expression" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "fullName",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
},
{
"field name" : "ticketNo",
"field expression" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "ticketNo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
},
{
"field name" : "confNo",
"field expression" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "confNo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
},
{
"field name" : "tagNum",
"field expression" :
{
"iterator kind" : "ARRAY_CONSTRUCTOR",
"conditional" : true,
"input iterators" : [
{
"iterator kind" : "FIELD_STEP",
"field name" : "tagNum",
"input iterator" :
{
"iterator kind" : "ARRAY_FILTER",
"input iterator" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "bagInfo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
}
}
]
}
},
{
"field name" : "routing",
"field expression" :
{
"iterator kind" : "ARRAY_CONSTRUCTOR",
"conditional" : true,
"input iterators" : [
{
"iterator kind" : "FIELD_STEP",
"field name" : "routing",
"input iterator" :
{
"iterator kind" : "ARRAY_FILTER",
"input iterator" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "bagInfo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
}
}
]
}
}
]
}
}
- The root iterator of this query plan is a RECEIVE iterator with a single child (input iterator) that is a SELECT iterator.
- The value of the
FROM
field is an iterator; in this case, it is a TABLE iterator. - The index
fixedschema_conf
is used and the index is not covering (as you need to scan the table to fetch columns other than the index entries). - The index scan property contains the start and stop conditions that define the index scans to be performed.
- The FROM variable is the name of a variable ranging over the
records produced by the FROM iterator. Here the FROM
iterator is a TABLE iterator, and the FROM variable
(
$$bag
) is the same as the row variable of the TABLE iterator, as the index used is not covering. - In the SELECT expression four fields
(
fullname,ticketNo,confNo,bag.bagInfo[].tagNum,bag.bagInfo[].routing
) are fetched. These correspond to five field names and field expressions in the SELECT expression clause. For the first three fields, the field expression is computed usingFIELD_STEP
iterator. For the last two fields, anARRAY_CONSTRUCTOR
iterator is used which iterates over the corresponding arrays to fetch the field value.
Query 4: Using the primary index
SELECT fullname,bag.bagInfo[].routing FROM BaggageInfo bag
WHERE gender!="F"
{
"iterator kind" : "RECEIVE",
"distribution kind" : "ALL_PARTITIONS",
"input iterator" :
{
"iterator kind" : "SELECT",
"FROM" :
{
"iterator kind" : "TABLE",
"target table" : "BaggageInfo",
"row variable" : "$$bag",
"index used" : "primary index",
"covering index" : false,
"index scans" : [
{
"equality conditions" : {},
"range conditions" : {}
}
]
},
"FROM variable" : "$$bag",
"WHERE" :
{
"iterator kind" : "NOT_EQUAL",
"left operand" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "gender",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
},
"right operand" :
{
"iterator kind" : "CONST",
"value" : "F"
}
},
"SELECT expressions" : [
{
"field name" : "fullname",
"field expression" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "fullname",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
},
{
"field name" : "routing",
"field expression" :
{
"iterator kind" : "ARRAY_CONSTRUCTOR",
"conditional" : true,
"input iterators" : [
{
"iterator kind" : "FIELD_STEP",
"field name" : "routing",
"input iterator" :
{
"iterator kind" : "ARRAY_FILTER",
"input iterator" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "bagInfo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
}
}
]
}
}
]
}
}
- The root iterator of this query plan is a RECEIVE iterator with a single child (input iterator) that is a SELECT iterator.
- The value of the
FROM
field is an iterator; in this case, it is a TABLE iterator. - The primary key index is used and the index is not covering ( as you need to scan the table to fetch columns other than the index entries).
- The FROM variable is the name of a variable ranging over the records produced by
the FROM iterator. Here the FROM iterator is a TABLE iterator, and the FROM
variable (
$$bag
) is the same as the row variable of the TABLE iterator, as the index used is not covering. - In the SELECT expression two fields
(
fullname,bag.bagInfo[].routing
) are fetched. These correspond to two field names and field expressions in the SELECT expression clause. For the first field, the field expression is computed usingFIELD_STEP
iterator. For the second field, anARRAY_CONSTRUCTOR
iterator is used which iterates over the corresponding array to fetch the field value.
Query 5: Sort the data using a Covering index
SELECT bag.contactPhone, bag.fullName FROM BaggageInfo bag
ORDER BY bag.fullName
{
"iterator kind" : "RECEIVE",
"distribution kind" : "ALL_SHARDS",
"order by fields at positions" : [ 1 ],
"input iterator" :
{
"iterator kind" : "SELECT",
"FROM" :
{
"iterator kind" : "TABLE",
"target table" : "BaggageInfo",
"row variable" : "$$bag",
"index used" : "compindex_namephone",
"covering index" : true,
"index row variable" : "$$bag_idx",
"index scans" : [
{
"equality conditions" : {},
"range conditions" : {}
}
]
},
"FROM variable" : "$$bag_idx",
"SELECT expressions" : [
{
"field name" : "contactPhone",
"field expression" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "contactPhone",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag_idx"
}
}
},
{
"field name" : "fullName",
"field expression" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "fullName",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag_idx"
}
}
}
]
}
}
- The root iterator of this query plan is a RECEIVE iterator with a single child (input iterator) that is a SELECT iterator. The only property of the RECEIVE iterator in this example is the distribution kind whose value is ALL_SHARDS.
- The results need to be sorted by
fullName
. ThefullName
is part of thecompindex_namephone
index. So in this example, you don't need a separate SORT operator. The sorting is done by the RECEIVE operator using its propertyorder by fields at positions
, which is an array. The value of this array depends on the position of the field which is sorted in the SELECT expression."order by fields at positions" : [ 1 ]
- In this example, the order by is done using the
fullName
which is the second field in the SELECT expression. That is why you see1
in theorder by fields at position
property of the iterator. - The index
compindex_namephone
is used here and in this example, it is a covering index as the query can be evaluated using only the entries of the index. - The index row variable is
$$bag_idx
which is the name of a variable ranging over the index entries produced by the TABLE iterator. Every time a new index entry is produced by the index scan, the$$bag_idx
variable will be bound to that entry. - When the FROM iterator is a TABLE iterator, the FROM variable is the same as
either the index row variable or the row variable of the TABLE iterator,
depending on whether the used index is covering or not. In this example, the
FROM variable is the same as the index row variable
(
$$bag_idx
) as the index is covering. - This index row variable (
$$bag_idx
) will be referenced by iterators implementing the other clauses of the SELECT expression. - In the SELECT expression two fields (
contactPhone,fullName
) are fetched. These correspond to two field names and field expressions in the SELECT expression clause. For both fields, the field expression is computed usingFIELD_STEP
iterator.
Query 6: Using a secondary index with an index predicate
SELECT fullName, bag.ticketNo, bag.bagInfo[].bagArrivalDate
FROM BaggageInfo bag WHERE EXISTS
bag.bagInfo[$element.bagArrivalDate >="2019-01-01T00:00:00"]
{
"iterator kind" : "RECEIVE",
"distribution kind" : "ALL_SHARDS",
"distinct by fields at positions" : [ 1 ],
"input iterator" :
{
"iterator kind" : "SELECT",
"FROM" :
{
"iterator kind" : "TABLE",
"target table" : "BaggageInfo",
"row variable" : "$$bag",
"index used" : "simpleindex_arrival",
"covering index" : false,
"index row variable" : "$$bag_idx",
"index scans" : [
{
"equality conditions" : {},
"range conditions" : {}
}
],
"index filtering predicate" :
{
"iterator kind" : "GREATER_OR_EQUAL",
"left operand" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "bagInfo[].bagArrivalDate",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag_idx"
}
},
"right operand" :
{
"iterator kind" : "CONST",
"value" : "2019-01-01T00:00:00"
}
}
},
"FROM variable" : "$$bag",
"SELECT expressions" : [
{
"field name" : "fullName",
"field expression" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "fullName",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
},
{
"field name" : "ticketNo",
"field expression" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "ticketNo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
},
{
"field name" : "bagArrivalDate",
"field expression" :
{
"iterator kind" : "ARRAY_CONSTRUCTOR",
"conditional" : true,
"input iterators" : [
{
"iterator kind" : "FIELD_STEP",
"field name" : "bagArrivalDate",
"input iterator" :
{
"iterator kind" : "ARRAY_FILTER",
"input iterator" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "bagInfo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
}
}
]
}
}
]
}
}
- The root iterator of this query plan is a RECEIVE iterator with a single child (input iterator) that is a SELECT iterator.
- The value of the
FROM
field is an iterator; in this case, it is a TABLE iterator. - The EXISTS condition is actually converted to a filtering
predicate. There is one filtering predicate which is the
whole WHERE expression. The index
simpleindex_arrival
is the only one applicable here and is used. - The
index filtering predicate
evaluates the filter criteria on thebagArrivalDate
field. Using the greater than and less than operators the filter condition is evaluated. - The FROM variable is the name of a variable ranging
over the records produced by the FROM iterator. Here the
FROM iterator is a TABLE iterator, and the FROM variable
(
$$bag
)is the same as the row variable of the TABLE iterator, as the index used is not covering. - In the SELECT expression three fields
(
fullname, ticketNo,bag.bagInfo[].bagArrivalDat
) are fetched. These correspond to three field names and field expressions in the SELECT expression clause. For the first two fields, the field expression is computed usingFIELD_STEP
iterator. For the last field, anARRAY_CONSTRUCTOR
iterator is used which iterates over the corresponding arrays to fetch the field value.
Query 7: Group data with fields as part of the index
SELECT bag.confNo, count(bag.bagInfo) AS TOTAL_BAGS
FROM BaggageInfo bag GROUP BY bag.confNo;
{
"iterator kind" : "SELECT",
"FROM" :
{
"iterator kind" : "RECEIVE",
"distribution kind" : "ALL_SHARDS",
"order by fields at positions" : [ 0 ],
"input iterator" :
{
"iterator kind" : "SELECT",
"FROM" :
{
"iterator kind" : "TABLE",
"target table" : "BaggageInfo",
"row variable" : "$$bag",
"index used" : "fixedschema_conf",
"covering index" : false,
"index scans" : [
{
"equality conditions" : {},
"range conditions" : {}
}
]
},
"FROM variable" : "$$bag",
"GROUP BY" : "Grouping by the first expression in the SELECT list",
"SELECT expressions" : [
{
"field name" : "confNo",
"field expression" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "confNo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
},
{
"field name" : "TOTAL_BAGS",
"field expression" :
{
"iterator kind" : "FN_COUNT",
"input iterator" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "bagInfo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
}
}
]
}
},
"FROM variable" : "$from-1",
"GROUP BY" : "Grouping by the first expression in the SELECT list",
"SELECT expressions" : [
{
"field name" : "confNo",
"field expression" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "confNo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$from-1"
}
}
},
{
"field name" : "TOTAL_BAGS",
"field expression" :
{
"iterator kind" : "FUNC_SUM",
"input iterator" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "TOTAL_BAGS",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$from-1"
}
}
}
}
]
}
- In this query, you group all bags based on the
confNo
of the users and determine the total count of bags belonging to eachconfNo
. - The group-by is index-based, that is the group-by field
(
confNo
) is also part of the index used. This is indicated by the lack of any GROUP iterators. Instead, the grouping is done by the SELECT iterators. - There are two SELECT iterators, the inner one has a
GROUP BY
property that specifies which of the SELECT-clause expressions are also grouping expressions. Here the group by fields is the first expression in the SELECT list (bag.confNo
)."GROUP BY" : "Grouping by the first expression in the SELECT list"
- The index
fixedschema_conf
is used here and in this example, it is a non-covering index as the query also needs to fetchcount(bag.bagInfo)
which is outside of the entries of the index. - When the FROM iterator is a TABLE iterator, the FROM variable is
the same as either the index row variable or the
row variable of the TABLE iterator, depending
on whether the used index is covering or not. In this
example, the inner FROM variable is the same as the row
variable (
$$bag
) as the index is not covering. - In the SELECT expression two fields are fetched:
bag.confNo,count(bag.bagInfo)
. These correspond to two field names and field expressions in the SELECT expression clause. - The results returned by the inner SELECT iterators from the
various RNs are partial groups, because rows with the same
bag.confNo
may exist at multiple RNs. So, regrouping and re-aggregation have to be performed by the driver. This is done by the outer SELECT iterator (above the RECEIVE iterator). - The result is also sorted by
confNo
. Theorder by fields at positions
property specifies the field used for sorting. The value of this array depends on the position of the field which is sorted in the SELECT expression. In this examplebag.confNo
is the first field in the SELECT expression. Soorder by fields at positions
takes an array index of0
."order by fields at positions" : [ 0 ]
- In the outer SELECT expression, two fields are fetched:
bag.confNo,count(bag.bagInfo)
. The$from-1
FROM variable will be referenced by iterators implementing the other clauses of the outer SELECT expression. These correspond to two field names and field expressions in the outer SELECT expression clause. For the first field, the field expression usesFIELD_STEP
iterator. The second field is the aggregate functioncount
. The iteratorFUNC_SUM
is used to iterate the result produced by its parent iterator and determine the total number of bags.
Query 8: Using the secondary index with multiple index scans
SELECT bagdet.fullName, bagdet.bagInfo[].tagNum
FROM BaggageInfo bagdet WHERE bagdet.fullName IN
("Lucinda Beckman", "Adam Phillips",
"Zina Christenson","Fallon Clements");
{
"iterator kind" : "SELECT",
"FROM" :
{
"iterator kind" : "RECEIVE",
"distribution kind" : "ALL_SHARDS",
"order by fields at positions" : [ 0 ],
"input iterator" :
{
"iterator kind" : "SELECT",
"FROM" :
{
"iterator kind" : "TABLE",
"target table" : "BaggageInfo",
"row variable" : "$$bag",
"index used" : "fixedschema_conf",
"covering index" : false,
"index scans" : [
{
"equality conditions" : {},
"range conditions" : {}
}
]
},
"FROM variable" : "$$bag",
"GROUP BY" : "Grouping by the first expression in the SELECT list",
"SELECT expressions" : [
{
"field name" : "confNo",
"field expression" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "confNo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
},
{
"field name" : "TOTAL_BAGS",
"field expression" :
{
"iterator kind" : "FN_COUNT",
"input iterator" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "bagInfo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
}
}
]
}
},
"FROM variable" : "$from-1",
"GROUP BY" : "Grouping by the first expression in the SELECT list",
"SELECT expressions" : [
{
"field name" : "confNo",
"field expression" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "confNo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$from-1"
}
}
},
{
"field name" : "TOTAL_BAGS",
"field expression" :
{
"iterator kind" : "FUNC_SUM",
"input iterator" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "TOTAL_BAGS",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$from-1"
}
}
}
}
]
}
- The root iterator of this query plan is a RECEIVE iterator with a single child (input iterator) that is a SELECT iterator.
- The value of the
FROM
field is an iterator; in this case, it is a TABLE iterator. - The index
compindex_namephone
is used and the index is not covering ( as you need to scan the table to fetch columns other than the index entries). - Every value in the IN clause is evaluated using an index scan with an equality condition. There are four index scans that are performed each evaluating one equality condition.
- The FROM variable is the name of a variable ranging over the
records produced by the FROM iterator. Here the FROM
iterator is a TABLE iterator, and the FROM variable
(
$$bagdet
) is the same as the row variable of the TABLE iterator, as the index used is not covering. - In the SELECT expression two fields
(
fullname, bag.bagInfo[].tagNum
) are fetched. These correspond to two field names and field expressions in the SELECT expression clause. For the first field, the field expression is computed usingFIELD_STEP
iterator. For the second field, anARRAY_CONSTRUCTOR
iterator is used which iterates over the corresponding arrays to fetch the field value.
Query 9: A SINGLE PARTITION query using a primary index
SELECT fullName,bag.ticketNo, bag.confNo, bag.bagInfo[].tagNum,
bag.bagInfo[].routing FROM BaggageInfo bag WHERE
bag.ticketNo=1762311547917 AND bag.confNo="FH7G1W"
{
"iterator kind" : "RECEIVE",
"distribution kind" : "SINGLE_PARTITION",
"input iterator" :
{
"iterator kind" : "SELECT",
"FROM" :
{
"iterator kind" : "TABLE",
"target table" : "BaggageInfo",
"row variable" : "$$bag",
"index used" : "primary index",
"covering index" : false,
"index scans" : [
{
"equality conditions" : {"ticketNo":1762311547917},
"range conditions" : {}
}
]
},
"FROM variable" : "$$bag",
"WHERE" :
{
"iterator kind" : "EQUAL",
"left operand" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "confNo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
},
"right operand" :
{
"iterator kind" : "CONST",
"value" : "FH7G1W"
}
},
"SELECT expressions" : [
{
"field name" : "fullName",
"field expression" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "fullName",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
},
{
"field name" : "ticketNo",
"field expression" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "ticketNo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
},
{
"field name" : "confNo",
"field expression" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "confNo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
},
{
"field name" : "tagNum",
"field expression" :
{
"iterator kind" : "ARRAY_CONSTRUCTOR",
"conditional" : true,
"input iterators" : [
{
"iterator kind" : "FIELD_STEP",
"field name" : "tagNum",
"input iterator" :
{
"iterator kind" : "ARRAY_FILTER",
"input iterator" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "bagInfo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
}
}
]
}
},
{
"field name" : "routing",
"field expression" :
{
"iterator kind" : "ARRAY_CONSTRUCTOR",
"conditional" : true,
"input iterators" : [
{
"iterator kind" : "FIELD_STEP",
"field name" : "routing",
"input iterator" :
{
"iterator kind" : "ARRAY_FILTER",
"input iterator" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "bagInfo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$$bag"
}
}
}
}
]
}
}
]
}
}
- The root iterator of this query plan is a RECEIVE iterator with a single child (input iterator) that is a SELECT iterator.
- This query specifies a complete shard key in its WHERE clause. As a result, its full result set is contained in a single partition, and the RECEIVE iterator will send its subplan to a single RN that stores that partition.
- The value of the
FROM
field is an iterator; in this case, it is a TABLE iterator. - A SINGLE_PARTITION query can reference a primary index or a secondary index. The primary key index is used in this example. The index is not covering ( as you need to scan the table to fetch columns other than the index entries).
- The index scan property contains the start and stop conditions that define the index scans to be performed.
- The FROM variable is the name of a variable ranging over the
records produced by the FROM iterator. Here the FROM
iterator is a TABLE iterator, and the FROM variable
(
$$bag
) is the same as the row variable of the TABLE iterator, as the index used is not covering. - In the SELECT expression five fields
(
fullname, ticketNo,confNo, bag.bagInfo[].tagNum,bag.bagInfo[].routing
) are fetched. These correspond to five field names and field expressions in the SELECT expression clause. For the first three fields, the field expression is computed usingFIELD_STEP
iterator. For the last 2 fields, anARRAY_CONSTRUCTOR
iterator is used which iterates over the corresponding arrays to fetch the field value.
Query 10: Group data with fields not part of any index
SELECT $flt_src as SOURCE, count(*) as COUNT FROM BaggageInfo $bag,
$bag.bagInfo.flightLegs[0].fltRouteSrc $flt_src GROUP BY $flt_src
{
"iterator kind" : "GROUP",
"input variable" : "$gb-2",
"input iterator" :
{
"iterator kind" : "RECEIVE",
"distribution kind" : "ALL_PARTITIONS",
"input iterator" :
{
"iterator kind" : "GROUP",
"input variable" : "$gb-1",
"input iterator" :
{
"iterator kind" : "SELECT",
"FROM" :
{
"iterator kind" : "TABLE",
"target table" : "BaggageInfo",
"row variable" : "$bag",
"index used" : "primary index",
"covering index" : false,
"index scans" : [
{
"equality conditions" : {},
"range conditions" : {}
}
]
},
"FROM variable" : "$bag",
"FROM" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "fltRouteSrc",
"input iterator" :
{
"iterator kind" : "ARRAY_SLICE",
"low bound" : 0,
"high bound" : 0,
"input iterator" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "flightLegs",
"input iterator" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "bagInfo",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$bag"
}
}
}
}
},
"FROM variable" : "$flt_src",
"SELECT expressions" : [
{
"field name" : "SOURCE",
"field expression" :
{
"iterator kind" : "VAR_REF",
"variable" : "$flt_src"
}
},
{
"field name" : "COUNT",
"field expression" :
{
"iterator kind" : "CONST",
"value" : 1
}
}
]
},
"grouping expressions" : [
{
"iterator kind" : "FIELD_STEP",
"field name" : "SOURCE",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$gb-1"
}
}
],
"aggregate functions" : [
{
"iterator kind" : "FUNC_COUNT_STAR"
}
]
}
},
"grouping expressions" : [
{
"iterator kind" : "FIELD_STEP",
"field name" : "SOURCE",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$gb-2"
}
}
],
"aggregate functions" : [
{
"iterator kind" : "FUNC_SUM",
"input iterator" :
{
"iterator kind" : "FIELD_STEP",
"field name" : "COUNT",
"input iterator" :
{
"iterator kind" : "VAR_REF",
"variable" : "$gb-2"
}
}
}
]
}
- In this query, you group passenger bags based on the flight source and determine the total number of bags belonging to one flight source.
- As the GROUP BY field
(
bagInfo.flightLegs[0].fltRouteSrc
in this example) is not part of any index, you need a separate GROUP operator to do the grouping. This is indicated by the existence of the GROUP iterators in the execution plan. There are two GROUP iterators: one that operates at the driver (above the RECEIVE iterator) and another that operates at the RNs (below the RECEIVE iterator). - The lower GROUP iterator has a SELECT iterator as
input. The SELECT returns the
fltRouteSrc
andcount
of bags. The GROUP iterator will operate until the batch limit is reached. If the batch limit is defined as the max number N of results produced, the GROUP iterator will stop when up to N flight source groups have been created. If the batch limit is defined as the max number of bytes read, it will stop when this max is reached. The GROUP operator has an input variable. For the inner GROUP operator, the input variable is$gb-1
and for the outer GROUP operator it is$gb-2
."iterator kind" : "GROUP","input variable" : "$gb-1",
- The primary key index is used here and in this example, it is not a covering index as the query has fields that are not part of the entries of the primary index.
- When the FROM iterator is a TABLE iterator, the FROM variable is the same as either the index row variable or the row variable of the TABLE iterator, depending on whether the used index is covering or not. Every time a next() call on the FROM iterator returns true, the variable will be bound to the result produced by that iterator. In this example, the FROM variable is the row variable as the index is not covering.
- This row variable (
$bag
) will be referenced by iterators implementing the other clauses of the inner SELECT expression. - The GROUP iterator creates an internal variable (
$gb-1
) that iterates over the records produced by the SELECT expression. - The result set produced by the lower GROUP iterator is partial: it may not contain all
the
fltRouteSrc
groups and for thefltRouteSrc
groups that it does contain, the count may be a partial sum (because all rows for a givenfltRouteSrc
may not have been retrieved when query execution stops). The upper GROUP iterator receives the partial results from each RN and performs the final grouping and aggregation. It operates the same way as the lower GROUP iterators and will keep operating until the are no more partial results from the RNs. At that point, the full and final result set is cached at the upper GROUP iterator and is returned to the application. - The upper GROUP iterator creates an internal variable (
$gb-2
) that iterates over the records produced by the outer SELECT. The$gb-2
variable has thefltRouteSrc
and count of all bags grouped byfltRouteSrc
. - In the SELECT expression, two fields are fetched:
fltRouteSrc,count(*)
. These correspond to two field names and field expressions in the SELECT expression clause. For the first field, the field expression usesFIELD_STEP
iterator. The second field is the aggregate functioncount
. The iteratorFUNC_SUM
is used to iterate the result produced by its parent iterator and determine the total number of bags.