10.1.7.5 Collation of Expressions

In the great majority of statements, it is obvious what collation MySQL uses to resolve a comparison operation. For example, in the following cases, it should be clear that the collation is the collation of column charset_name:

SELECT x FROM T ORDER BY x;
SELECT x FROM T WHERE x = x;
SELECT DISTINCT x FROM T;

However, with multiple operands, there can be ambiguity. For example:

SELECT x FROM T WHERE x = 'Y';

Should the comparison use the collation of the column x, or of the string literal 'Y'? Both x and 'Y' have collations, so which collation takes precedence?

Standard SQL resolves such questions using what used to be called coercibility rules. MySQL assigns coercibility values as follows:

The preceding coercibility values are current as of MySQL 5.0.3. In MySQL 5.0 prior to 5.0.3, there is no system constant or NULL coercibility. Functions such as USER() have a coercibility of 2 rather than 3, and literals have a coercibility of 3 rather than 4.

MySQL uses coercibility values with the following rules to resolve ambiguities:

Although automatic conversion is not in the SQL standard, the SQL standard document does say that every character set is (in terms of supported characters) a subset of Unicode. Because it is a well-known principle that what applies to a superset can apply to a subset, we believe that a collation for Unicode can apply for comparisons with non-Unicode strings.

Examples:

ComparisonCollation Used
column1 = 'A'Use collation of column1
column1 = 'A' COLLATE xUse collation of 'A' COLLATE x
column1 COLLATE x = 'A' COLLATE yError

The COERCIBILITY() function can be used to determine the coercibility of a string expression:

mysql> SELECT COERCIBILITY('A' COLLATE latin1_swedish_ci);
        -> 0
mysql> SELECT COERCIBILITY(VERSION());
        -> 3
mysql> SELECT COERCIBILITY('A');
        -> 4

See Section 12.13, “Information Functions”.

For implicit conversion of a numeric or temporal value to a string, such as occurs for the argument 1 in the expression CONCAT(1, 'abc'), the result is a binary string for which the character set and collation are binary. See Section 12.2, “Type Conversion in Expression Evaluation”.