skip to Main Content

i’m having a little problem, trying to optimaze quering data from a unstructured json field.

By unstructured i mean, each json field is unique on each row.

by using this query, i want to collect all the keys which are matching the prefix.

select j.key as value
from tasks t, lateral jsonb_each_text(t.tags) j
where j.key ilike '%some_key%'
group by j.key
order by j.key;

but on loaded table these queries take up to 20k millis to execute.

I found an index which can work with ilike statements (trgm_idx)
but how to put it here, still have no clue.

please great postgres masters, help me if you have any ideas

P.S. small update, if you know how to make it even faster if i use

select j.key as value
from tasks t, lateral jsonb_each_text(t.tags) j
where j.key ilike 'some_prefix%'
group by j.key
order by j.key;

i can work around it

2

Answers


  1. This is a poor design, you should probably rethink it. But it is possible. You would need to create a helper function which returns all keys munged into one string, picking a suitable delimiter which is not going to show up in either keys or in patterns:

    create function all_keys(jsonb) returns text language sql immutable as $$
       select string_agg(key,'===') from jsonb_each_text($1);  
    $$;
    
    create index on tasks using gin (all_keys(tags) gin_trgm_ops);
    
    select * from tasks where all_keys(tags) ilike '%some_key%';
    

    This query will return each task at most once, unlike your original query, so may need some tweaking to make it do exactly what you want. Maybe (untested):

    select j.key as value
    from tasks t, lateral jsonb_each_text(t.tags) j
    where all_keys(t.tags) ilike '%some_key%'
    and j.key ilike '%some_key%'
    group by j.key
    order by j.key;
    

    The first ilike should provide performance assuming few tasks have any qualifying keys (nor many false positives). The 2nd one produces just the specific keys which match, including multiplicity of them.

    Login or Signup to reply.
  2. Judging by the use of jsonb_each_text(), you’re actually dealing with jsonb rather than plain json, so jsonpath functions are available to you without having map to text

    In PostgreSQL JSONPath there’s a like_regex filter expression you can use with the match operator @?. Sadly, even though that match operator is supported by jsonb_ops as well as jsonb_path_ops operator classes

    The default GIN operator class for jsonb supports queries with the key-exists operators ?, ?| and ?&, the containment operator @>, and the jsonpath match operators @? and @@.

    searching key names requires .keyvalue().key accessor that won’t work with that. Still, you might gain some performance due to the mere fact you’re not deconstructing your jsonb‘s, casting to text, then matching against that, but rather interact with them directly, through a purpose-built function: demo

    select jsonb_path_query(tags,'$.keyvalue().key?(@ like_regex ".*some_key.*")')
    from tasks t
    where tags @? '$.keyvalue().key?(@ like_regex ".*some_key.*")'
    group by 1 
    order by 1
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search