Example 4: Sort the data using a Covering index

Fetch the id and income of users whose state is CA and whose city of residence is Santaclara and whose income is between 1000 and 10000. Sort the result by the income of the users.
SELECT id, income FROM Users u WHERE u.address.state = "CA" AND 
u.address.city= "Santaclara" AND 1000 < income and income < 10000 
ORDER BY income
Query execution plan:
{
  "iterator kind" : "RECEIVE",
  "distribution kind" : "ALL_SHARDS",
  "order by fields at positions" : [ 1 ],
  "input iterator" :
  {
    "iterator kind" : "SELECT",
    "FROM" :
    {
      "iterator kind" : "TABLE",
      "target table" : "users",
      "row variable" : "$$u",
      "index used" : "idx_state_city_income",
      "covering index" : true,
      "index row variable" : "$$u_idx",
      "index scans" : [
        {
          "equality conditions" : {"address.state":"CA","address.city":"Santaclara"},
          "range conditions" : { "income" : { "start value" : 1000, "start inclusive" : false, "end value" : 10000, "end inclusive" : false } }
        }
      ]
    },
    "FROM variable" : "$$u_idx",
    "SELECT expressions" : [
      {
        "field name" : "id",
        "field expression" :
        {
          "iterator kind" : "FIELD_STEP",
          "field name" : "#id",
          "input iterator" :
          {
            "iterator kind" : "VAR_REF",
            "variable" : "$$u_idx"
          }
        }
      },
      {
        "field name" : "income",
        "field expression" :
        {
          "iterator kind" : "FIELD_STEP",
          "field name" : "income",
          "input iterator" :
          {
            "iterator kind" : "VAR_REF",
            "variable" : "$$u_idx"
          }
        }
      }
    ]
  }
}
Explanation of the query execution plan :
  • The root iterator of this query plan is a RECEIVE iterator with a single child (input iterator) that is a SELECT iterator. The only property of the RECEIVE iterator in this example, is the distribution kind whose value is ALL_SHARDS.
  • The results need to be sorted by income. The income is a part of the idx_state_city_income index. So in this example, you don't need a separate SORT operator. The sorting is done by the RECEIVE operator using its propertyorder by fields at positions,which is an array. The value of this array depends on the position of the field that is sorted in the SELECT expression.
    "order by fields at positions" : [ 1 ]
    In this example, the order by is done using the income field, which is the second field in the SELECT expression. That is why you see "1" in the order by fields at position property of the iterator.

    Note:

    If the order of fields in the SELECT expression is different, then the value above changes. For example, if the query is SELECT income, id FROM Users u WHERE u.address.state = "CA" AND u.address.city= "Santaclara"AND 1000 < income and income < 10000 ORDER BY income, the order by fields would be order by fields at positions : [0 ] as theincome field is the first field in the SELECT expression.
  • The index idx_state_city_income is used here and in this example, it is a covering index as the query can be evaluated using only the entries of the index.
  • The index scan property contains the start and stop conditions that define the index scans to be performed.
    "index scans" : [
       {
          "equality conditions" : {"address.state":"CA","address.city":"Santaclara"},
          "range conditions" : { "income" : { "start value" : 1000, "start inclusive" : false, 
                                              "end value" : 10000, "end inclusive" : false } }
       }
    ]
    In this query, only one index scan will be performed. The equality conditions correspond to the predicates u.address.state = "CA" and u.address.city = "Santaclara" from the query. The index scan will start at the first entry must have the value CA on the address.state field and its address.city field is equal to Santaclara. Specifically, the starting index entry must have the value CA on theaddress.state field. All subsequent entries must have CA as the value of their address.state field, and the scan will stop as soon as an entry with a different state value is encountered. There is a range condition to be applied here on the income field.
  • The index row variable is $$u_idx which is the name of a variable ranging over the index entries produced by the TABLE iterator. Every time a new index entry is produced by the index scan, the $$u_idx variable will be bound to that entry.
  • When the FROM iterator is a TABLE iterator, the FROM variable is the same as either the index row variable or the row variable of the TABLE iterator, depending on whether the used index is covering or not. In this example, the FROM variable is same as the index row variable ($$u_idx) as the index is covering because all the fields in the SELECT expression can be evaluated only using the index entries.
  • This index row variable ($$u_idx) will be referenced by iterators implementing the other clauses of the SELECT expression.
  • In the SELECT expression, two fields (id and income) are fetched. These correspond to two field names and field expressions in the SELECT expression clause.
    {
       "field name" : "id",
       "field expression" :
       {
          "iterator kind" : "FIELD_STEP",
          "field name" : "#id",
          "input iterator" :
          {
             "iterator kind" : "VAR_REF",
             "variable" : "$$u_idx"
          }
       }
    }
    For every field to be fetched by the SELECT expression, the field expression is computed by a FIELD_STEP iterator. The FIELD_STEP iterator extracts and returns the value of a field (id as shown above) from the records produced by its input iterator. The same is done for every field to be fetched in the SELECT expression.