skip to Main Content

I have a posts table with a JSON column of an array with IDs. This column’s data looks exactly like:

[1, 3, 17, 19] These values are not quoted.

To check if a user should see a post, I simply use JSON_CONTAINS(userlist, '(the user ID)', '$'). However, after many thousands of posts, this is starting to get too slow. I’d like to avoid normalizing this to another table with proper relations for now, so I’m wondering what’s the best way to see if a user ID exists in a field like what I have.

Note: this is not exactly a duplicate. My values are straight integers, which is why I seemingly can’t use JSON_SEARCH()?

2

Answers


  1. You really should bite the bullet and normalise, as this operation is only going to get slower. In the meantime, there are a couple of ways you can do this with string operations using LIKE and REGEXP:

    select userlist regexp '\b3\b' AS got_3,
           userlist regexp '\b7\b' AS got_7
    from test
    ;
    select userlist like '[3,%' or userlist like '% 3,%' or userlist like '%,3]' AS got_3,
           userlist like '[7,%' or userlist like '% 7,%' or userlist like '%,7]' AS got_7
    from test
    

    In both cases for your sample data the output is:

    got_3   got_7
    1       0
    

    Using LIKE will probably be faster than JSON_CONTAINS, but using REGEXP probably won’t. You’d need to benchmark on your server.

    If you’re using MySQL 8+, then you can use JSON_TABLE:

    select *
    from test
    join json_table(userlist,
                    '$[*]' columns (user int path '$')
                   ) ul
    where ul.user = 3
    

    Again, performance will be dependent on your server.

    Demo on db-fiddle

    Login or Signup to reply.
  2. Demo of the MySQL 8.0 multi-valued index:

    mysql> create table mytable (id serial primary key, data json);
    
    mysql> insert into mytable set data = '[1, 3, 17, 19]';
    
    mysql> create index i on mytable ((cast(data as unsigned array)));
    
    mysql> explain select * from mytable where 17 member of (data)G
    *************************** 1. row ***************************
               id: 1
      select_type: SIMPLE
            table: mytable
       partitions: NULL
             type: ref
    possible_keys: i
              key: i    <-- see, it is using the index `i`
          key_len: 9
              ref: const
             rows: 1
         filtered: 100.00
            Extra: Using where
    

    It’s true that JSON_SEARCH() doesn’t work with integer values, but the multi-valued index I defined in the example is indexing integer values, so this works.

    Of course the whole task would be simpler if you normalized your table instead of using a JSON array. JSON generally makes queries and optimization harder.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search