SET function

The SET aggregation function takes a single-assign attribute and constructs a set of all of the (non-NULL) values from that attribute.

Single-assign attributes have non-set data types (such as mdex:long). So the SET function takes a non-set data type attribute and produces a set data type result (for example, mdex:long-set).

The SET function's behavior is as follows:
  • All NULL values are discarded. This means that if there are two non-NULL values for an attribute and one NULL value, then only the two non-NULL values are returned.
  • If an attribute has no non-NULL values, then the empty set is returned.
  • Duplicate values in an attribute are discarded. For example, if three records all have a WineType=Red assignment and two of them have Price=14.95 assignments (the third having Price=21.95), then only two Price values (one 14.95 and one 21.95) will be returned for the Red set.
  • String values are case-sensitive. Therefore, the string value "Merlot" is distinct from the string value "merlot", which means that they are not duplicate values.
  • The order of the values within a set is unspecified and unobservable.

The resulting set will have a set data type (such as mdex:double-set). All subsequent operations on it must follow the rules for sets.

The SET function is available in one-argument and two-argument versions, as described below. This function can be used only in SELECT clauses.

SET one-argument version

The syntax of the one-argument version of the SET function is:
SET(<single-assign_attribute>)
where the data type of the attribute must be a non-set data type (such as mdex:double for a single-assign double attribute).
In this example, Price is a single-assign double attribute:
RETURN results AS
SELECT
  SET(Price) AS prices
FROM WineState
GROUP BY WineType
ORDER BY WineType
The result of this statement might be:
WineType             prices
---------------------------------------------------------------
| Blanc de Noirs | { 16.99 }                                  |
| Bordeaux       | { 21.99 }                                  |
| Brut           | { 22.99, 23.99 }                           |
| Chardonnay     | { 17.95, 34.95 }                           |
| Merlot         | { 25.99 }                                  |
| Pinot Noir     | { 14.99 }                                  |
| Red            | { 12.99, 13.95, 17.5, 18.99, 21.99, 9.99 } |
| White          | { 20.99, 32.99, 43.99 }                    |
| Zinfandel      | { }                                        |
---------------------------------------------------------------

In the results, note that Zinfandel has an empty set because Zinfandel does not have a Price attribute assignment.

SET two-argument version

For situations where the result of the SET aggregator can be extremely large (causing the Dgraph to consume excessive memory), a two-argument form of the aggregator is provided to limit the set size.

The syntax of the two-argument version of the SET function is:
SET(<single-assign_attribute>, <max-size>)
where:
  • single-assign_attribute is an attribute whose data type is a non-set data type (such as mdex:string for a single-assign string attribute).
  • max-size is an integer that specifies the maximum size of the set. If max-size is less than the number of elements in the set, The Dgraph arbitrarily chooses which elements to discard; this choice is stable across multiple executions of the query. If max-size is 0 (zero) or a negative number, SET always returns the empty set.
Note that max-size must be an integer literal:
SET(Price, 3) is valid.

SET(Price, x) is not valid, even if x is an integer.
This sample query is the same as the one-argument example, except that the query limits the sets to a maximum of two elements:
RETURN results AS
SELECT
  SET(Price, 2) AS prices
FROM WineState
GROUP BY WineType
ORDER BY WineType
The result of this statement might be:
WineType             prices
-------------------------------------
| Blanc de Noirs | { 16.99 }        |
| Bordeaux       | { 21.99 }        |
| Brut           | { 22.99, 23.99 } |
| Chardonnay     | { 17.95, 34.95 } |
| Merlot         | { 25.99 }        |
| Pinot Noir     | { 14.99 }        |
| Red            | { 12.99, 9.99 }  |
| White          | { 20.99, 32.99 } |
| Zinfandel      | { }              |
-------------------------------------

In the results, note that Red set now has two elements, while it had six elements with the one-argument SET version. Likewise with the White set, which previously had three elements.

Data type errors

When working with the SET function, keep in mind that its resulting sets are of the set data types, such as a mdex:double-set data type.

For example, assume that Price is a multi-assign double attribute. This incorrect example:
RETURN results AS
SELECT SET(Price) AS prices
FROM WineState
GROUP BY WineType
HAVING prices > 10
will throw this error:
In statement "results": In HAVING clause: Cannot compare mdex:double-set and mdex:long

The reason for the error is that the "prices" set is of type mdex:double-set and it is being compared to the number 10 (which is an mdex:double type).

The query should therefore be corrected to something like this:
RETURN results AS
SELECT SET(Price) AS prices
FROM WineState
GROUP BY WineType
HAVING SOME x IN prices SATISFIES (x > 10)

In this example, the SATISFIES expression allows you to make a numerical comparison.