IN Operator

Syntax

in_expression ::= in1_expression | in2_expression |
    in3_expression | in4_expression

in1_expression ::= "(" concatenate_expression 
    ("," concatenate_expression)* ")" 
    IN "(" expression ("," expression)* ")"
in2_expression ::= concatenate_expression 
    IN "(" expression ("," expression)* ")"
in3_expression ::= concatenate_expression IN path_expression
in4_expression ::= "(" concatenate_expression 
    ("," concatenate_expression)* ")" IN path_expression

Semantics

The IN operator is essentially a compact alternative to a number of OR-ed equality conditions. For example, the query

SELECT * FROM users WHERE age IN (22, 25, 43)
 
is equivalent to
 
SELECT * FROM users WHERE age = 22 OR age = 25 OR age = 43

and the query,

SELECT * FROM users
    WHERE (firstName, lastName) IN
        (("John","Smith"),("Peter","Paul"),("Mary","Ann"))
 
is equivalent to
 
SELECT * FROM users
    WHERE (firstName = "John" AND lastName = "Smith") OR
        (firstName = "Peter" AND lastName = "Paul") OR
        (firstName = "Mary" AND lastName = "Ann")

As shown in the grammar, there are 4 syntactic variants of the IN operator. The in1_expression and in2_expression follow the standard SQL syntax. The in2_expression one is actually a special case of the in1_expression, for the case when there is only one expression in the left-hand-side of the operator. For the in1_expression, if K is the number of expressions in the left-hand-side, then each expression list in the right-hand-side must consist of K expressions. If N is the number of expression lists in the right-hand-side, then the whole IN condition is equivalent to:

(expr1 = expr11 and expr2 = expr12 and exprK = expr1K) or
(expr1 = expr21 and expr2 = expr22 and exprK = expr2K) or
.... or
(expr1 = exprN1 and expr2 = exprN2 and exprK = exprNK)

However, in addition to being more compact, queries using IN operators will be executed more efficiently if appropriate indexes exist. For example, if table users has an index on columns age, firstName and lastName, then both of the above IN queries will use that index to find the qualifying rows, whereas the equivalent OR queries will be executed via full table scans. See also examples in Finding Applicable Indexes.

The in3_expression and in4_expression variants allow a relative large number of search keys to be provided via a single bind variable. For example, if the $keys variable in bound to the array [ "John", "Smith", "Peter", "Paul", "Mary", "Ann"], then the following query is equivalent to the second IN query above.

DECLARE $keys ARRAY(json);
SELECT * FROM users
    WHERE (firstName, lastName) IN $keys[]

In general, with the in3_expression and in4_expression variants, the expression in the right-hand-side is evaluated first. If the number M of items in the resulting sequence is less than the number K of expressions in the left-hand-side, the result of the IN operator is false. If M is not a multiple of K, the last (M mod K) items in the sequence are discarded and M is set to the number of remaining items. Then, the IN expression is equivalent to:

(expr1 = k1 and expr2 = k2 and exprK = kK) or
(expr1 = kK+1 and expr2 = kK+2 and exprK = k2*K) or
 .... or
(expr1 = kM-K and expr2 = kM-K+1 and exprK = kM)

However, an additional type-checking restriction applies in this case: in each of the above equality conditions, the type of the right-hand-side item must be a subtype of the left-hand-side type.