Query 7: Group data with fields as part of the index

Fetch the reservation code and count of bags for all passengers.
SELECT bag.confNo, count(bag.bagInfo) AS TOTAL_BAGS 
FROM BaggageInfo bag GROUP BY bag.confNo;
Plan:
{
  "iterator kind" : "SELECT",
  "FROM" :
  {
    "iterator kind" : "RECEIVE",
    "distribution kind" : "ALL_SHARDS",
    "order by fields at positions" : [ 0 ],
    "input iterator" :
    {
      "iterator kind" : "SELECT",
      "FROM" :
      {
        "iterator kind" : "TABLE",
        "target table" : "BaggageInfo",
        "row variable" : "$$bag",
        "index used" : "fixedschema_conf",
        "covering index" : false,
        "index scans" : [
          {
            "equality conditions" : {},
            "range conditions" : {}
          }
        ]
      },
      "FROM variable" : "$$bag",
      "GROUP BY" : "Grouping by the first expression in the SELECT list",
      "SELECT expressions" : [
        {
          "field name" : "confNo",
          "field expression" :
          {
            "iterator kind" : "FIELD_STEP",
            "field name" : "confNo",
            "input iterator" :
            {
              "iterator kind" : "VAR_REF",
              "variable" : "$$bag"
            }
          }
        },
        {
          "field name" : "TOTAL_BAGS",
          "field expression" :
          {
            "iterator kind" : "FN_COUNT",
            "input iterator" :
            {
              "iterator kind" : "FIELD_STEP",
              "field name" : "bagInfo",
              "input iterator" :
              {
                "iterator kind" : "VAR_REF",
                "variable" : "$$bag"
              }
            }
          }
        }
      ]
    }
  },
  "FROM variable" : "$from-1",
  "GROUP BY" : "Grouping by the first expression in the SELECT list",
  "SELECT expressions" : [
    {
      "field name" : "confNo",
      "field expression" :
      {
        "iterator kind" : "FIELD_STEP",
        "field name" : "confNo",
        "input iterator" :
        {
          "iterator kind" : "VAR_REF",
          "variable" : "$from-1"
        }
      }
    },
    {
      "field name" : "TOTAL_BAGS",
      "field expression" :
      {
        "iterator kind" : "FUNC_SUM",
        "input iterator" :
        {
          "iterator kind" : "FIELD_STEP",
          "field name" : "TOTAL_BAGS",
          "input iterator" :
          {
            "iterator kind" : "VAR_REF",
            "variable" : "$from-1"
          }
        }
      }
    }
  ]
}
Explanation:
  • In this query, you group all bags based on the confNo of the users and determine the total count of bags belonging to each confNo.
  • The group-by is index-based, that is the group-by field (confNo) is also part of the index used. This is indicated by the lack of any GROUP iterators. Instead, the grouping is done by the SELECT iterators.
  • There are two SELECT iterators, the inner one has a GROUP BY property that specifies which of the SELECT-clause expressions are also grouping expressions. Here the group by fields is the first expression in the SELECT list (bag.confNo).
    "GROUP BY" : "Grouping by the first expression in the SELECT list"
  • The index fixedschema_conf is used here and in this example, it is a non-covering index as the query also needs to fetch count(bag.bagInfo)which is outside of the entries of the index.
  • When the FROM iterator is a TABLE iterator, the FROM variable is the same as either the index row variable or the row variable of the TABLE iterator, depending on whether the used index is covering or not. In this example, the inner FROM variable is the same as the row variable ($$bag) as the index is not covering.
  • In the SELECT expression two fields are fetched: bag.confNo,count(bag.bagInfo). These correspond to two field names and field expressions in the SELECT expression clause.
  • The results returned by the inner SELECT iterators from the various RNs are partial groups, because rows with the same bag.confNo may exist at multiple RNs. So, regrouping and re-aggregation have to be performed by the driver. This is done by the outer SELECT iterator (above the RECEIVE iterator).
  • The result is also sorted by confNo. The order by fields at positions property specifies the field used for sorting. The value of this array depends on the position of the field which is sorted in the SELECT expression. In this example bag.confNo is the first field in the SELECT expression. So order by fields at positions takes an array index of 0.
    "order by fields at positions" : [ 0 ]
  • In the outer SELECT expression, two fields are fetched: bag.confNo,count(bag.bagInfo). The $from-1 FROM variable will be referenced by iterators implementing the other clauses of the outer SELECT expression. These correspond to two field names and field expressions in the outer SELECT expression clause. For the first field, the field expression uses FIELD_STEP iterator. The second field is the aggregate function count. The iterator FUNC_SUMis used to iterate the result produced by its parent iterator and determine the total number of bags.