skip to Main Content

Today I have encountered an extremely counter-intuitive behavior of < operator in MySQL and I have no idea why.

Specifying exactly the same operands for < gives different results depending on the contexts.

Can anyone explain the reason? This is so counter-intuitive that I even suspect it’s a bug of MySQL.

SQL

DROP TABLE IF EXISTS t1;

CREATE TABLE
    t1 (s VARCHAR(10));

INSERT INTO
    t1 (s)
VALUES
    ('0001/a'),
    ('0001/b'),
    ('0002/a'),
    ('0002/b'),
    ('0003/a'),
    ('0003/b');

SELECT ('/' < ':'); -- => `TRUE`

SELECT ('0002/a' < '0002:'); -- => `TRUE`

SELECT ('0002/b' < '0002:'); -- => `TRUE`

SELECT * FROM t1 WHERE (s < '0002:'); -- => What will the result be?

Expected Result

+--------+
| s      |
+--------+
| 0001/a |
| 0001/b |
| 0002/a |
| 0002/b |
+--------+

Actual Result

+--------+
| s      |
+--------+
| 0001/a |
| 0001/b |
+--------+

Environment

I use the container version of MySQL with no special configuration:

$ docker run -d -p 3306:3306 --name mysql 
  -e 'MYSQL_DATABASE=t' 
  -e 'MYSQL_ROOT_PASSWORD=password' 
  --restart always mysql

Edit 1

I could reproduce the behavior in DB Fiddle:

Edit 2

 > show variables like "%collat%"
+-------------------------------+--------------------+
| Variable_name                 | Value              |
+-------------------------------+--------------------+
| collation_connection          | utf8mb3_general_ci |
| collation_database            | utf8mb4_0900_ai_ci |
| collation_server              | utf8mb4_0900_ai_ci |
| default_collation_for_utf8mb4 | utf8mb4_0900_ai_ci |
+-------------------------------+--------------------+

2

Answers


  1. I think you may have misread the return values for true and false. A successful function call usually returns 0, but in logical operator terms, 0 means false. I get

    mysql> SELECT ('/' < ':');
    +-------------+
    | ('/' < ':') |
    +-------------+
    |           0 |
    +-------------+
    1 row in set (0.00 sec)
    
    Login or Signup to reply.
  2. As noted in the comments different collations will result in different ordering of characters, so if your comparisons have two different collations you may get two different results.

    In your case the database default collation and the connection default collation are different, so when you compare string literals to string literals, the connection default is used, but when you compare a column to string literals the column collation is used which, if not explicitly set when the table is created will be the database default. So you are getting two different results.

    This is referenced in the docs for collation_connection (emphasis mine)

    The collation of the connection character set. collation_connection is
    important for comparisons of literal strings.
    For comparisons of
    strings with column values, collation_connection does not matter
    because columns have their own collation, which has a higher collation
    precedence (see Section 12.8.4, “Collation Coercibility in
    Expressions
    ”).

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search