MEMBERS extension

MEMBERS is an extension to GROUP BY that allows grouping by the members of a set.

MEMBERS lets you group by multi-assign attributes. Keep in mind that when grouping by a multi-assign attribute, rows with no assignments for the attribute are discarded during grouping.

MEMBERS syntax

MEMBERS appears in the GROUP BY clause, using this syntax:
GROUP BY MEMBERS(<set>) AS <alias> [,MEMBERS(<set2>) AS <alias2>]*
where:
  • set is a set of any set data type (such as mdex:string-set or mdex:long-set) and must be an attribute reference. MEMBERS can only refer to attributes from the source statement(s) or corpus (i.e., cannot be locally defined). For example, set can be a multi-assign string attribute from the corpus.
  • alias is an aliased name, which must be NCName-compliant. In statement results, the aliased name has the same data type as the elements of the set.

As the syntax shows, EQL supports grouping by the members of multiple sets simultaneously. To do this, simply include multiple MEMBERS clauses in a GROUP list.

The MEMBERS form is available in grouping sets, with surface syntax like:
GROUP BY ROLLUP(a, b, MEMBERS(c) AS cValue, d)

Note that grouping by the members of a set is available in any statement, not just those over the corpus (because EQL preserves all values in a set across statement boundaries).

MEMBERS data type error message

If an attempt is made to use a single-assign attribute as an argument to MEMBERS, an error message is returned similar to this example:
Cannot apply MEMBERS to mdex:double. A set type is required 

In this error example, MEMBERS was used with a single-assign double attribute (mdex:double), instead of a multi-assign double attribute (mdex:double-set).

MEMBERS examples

Assume a small data set of 25 records, with each record having zero, one, or two assignments from the Body multi-assign attribute. In this sample query, WineID is a single-assign attribute:
RETURN results AS
  SELECT
    SET(WineID) AS IDs
  GROUP BY MEMBERS(Body) AS bodyType
The result of this statement might be:
IDs                                          bodyType
-------------------------------------------------------
| { 19 }                                    | Fresh   |
| { 22, 25 }                                | Firm    |
| { 14, 15 }                                | Supple  |
| { 11, 19, 22, 23, 24, 25, 4, 6, 8 }       | Robust  |
| { 10, 11, 12, 13, 16, 18, 3, 4, 5, 7, 9 } | Tannins |
| { 10, 12, 13, 16, 18, 3, 5, 7, 9 }        | Silky   |
-------------------------------------------------------

In the results, note that several records contribute to multiple buckets, because they have two Body assignments. Five records do not contribute to the buckets, because they have no assignments for the Body attribute, and thus are discarded during the grouping.

This second example shows how to group by the members of multiple sets simultaneously. The Body and Score multi-assign attributes are used in the query:
RETURN results AS
  SELECT
    SET(WineID) as IDs
  WHERE WineType = 'White'
  GROUP BY MEMBERS(Body) AS bodyType, MEMBERS(Score) AS scoreValue
The result of this query might be:
IDs      bodyType  scoreValue
------------------------
| { 19 } | Fresh  | 88 |
| { 25 } | Firm   | 82 |
| { 25 } | Firm   | 84 |
| { 25 } | Robust | 82 |
| { 25 } | Robust | 84 |
| { 19 } | Robust | 88 |
------------------------

Note that the record with WineID=25 contributes to four buckets, corresponding to the cross product of { Firm, Robust } and { 82, 84 }.

Note on MEMBERS interaction with GROUPING SETS

You should be aware that grouping by set members may interact with GROUPING SETS (including CUBE and ROLLUP) to produce results that at first glance may seem unexpected.

For example, first we make a query that groups only by the ROLLUP extension:
RETURN results AS
  SELECT
    SUM(Price) AS totalPrice
  GROUP BY ROLLUP(WineType)
The result with our data set is:
WineType          totalPrice
----------------------------
| Pinot Noir      |  14.99 |
| White           |  97.97 |
| Blanc de Noirs  |  16.99 |
| Zinfandel       |        |
| Brut            |  46.98 |
| Red             | 142.34 |
| Merlot          |  25.99 |
| Bordeaux        |  21.99 |
| Chardonnay      |  52.90 |
|                 | 420.15 |
----------------------------

We get one row for each WineType, and one summary row at the bottom, which includes records from all of the WineType values. Because SUM is associative, the expected behavior is that the totalPrice summary row will be equal to the sum of the totalPrice values for all other rows, and in fact the 420.15 result meets that expectation. (Note that the total for White wines is 97.97.)

Then we make a similar query, but selecting only the White wines and grouping with MEMBERS and ROLLUP:
RETURN results AS
  SELECT
    SUM(Price) AS totalPrice
  WHERE WineType = 'White'
  GROUP BY ROLLUP(WineType, MEMBERS(Body) AS bodyType)
The result from this second query is:
WineType bodyType totalPrice
--------------------------
| White | Fresh  | 20.99 |
| White | Firm   | 43.99 |
| White | Robust | 64.98 |
| White |        | 97.97 |
|       |        | 97.97 |
--------------------------

The results show that the correspondence between the summary row and the individual rows is not as expected. One might expect the totalPrice for the 'White' summary row (that is, the row where WineType is White and bodyType is null) to be the sum of the total prices for the (White, Firm), (White, Fresh), and (White, Robust) rows above it.

However, if you add the total prices for the first three rows, you get 129.96, rather than the expected value of 97.97. This discrepancy arises because, when you group by the members of a set, a row can contribute to multiple buckets. In particular, Record 19 has two Body assignments (Fresh and Robust) and therefore contributes to both the (White, Fresh) and (White, Robust) rows, and so its price is in effect double-counted. Similarly, Record 20 has no Body assignments and so it does not contribute to any of the buckets in which bodyType is not NULL, because its value of Body is the empty set.

EQL effectively computes the 'White' summary row, however, by grouping by WineType (which is a single-assign attribute), so each input row counts exactly once.