skip to Main Content

Hi i was reading postgres indexes, trying find a solution to make gin operator work with like on array.

Let me explain.

I have a need to find a range of unique keys. This is not that simple, because i get initial data from another query, by parsing a lot of random jsons to a basically a map of where key is each field from this json.

i have a prefix of a key, and now i want to find all the matching results, but it takes to much time to execute.

the idea of this index is to make a function

create function aggregate_tags(jsonb) returns text language sql immutable as $$
select j.key
from lateral jsonb_each_text($1) j
$$

and to make index based on this function

create index col_text_pattern on tasks(aggregate_tags(tags), text_pattern_ops);

and then i was thinking to use

select j.key
from tags t, lateral jsonb_each_text(t.tags) j
where aggregate_tags(t.tag) like 'prefix%' and j.key like 'prefix%'

the problem is, this index somehow returns only the first value for each prefix and for example, if i use
‘my_awesome_value’ as prefix, and be trying to look for
‘my_awesome_value34’, index works fine. BUT
if the prefix ‘my_awesome_value34’ it will return empty result, because index was trying to match it to ‘my_awesome_value42’.

and for some reason i can’t make indexes work with function which returns setof text. Is it possible, because it looks like it could help me fix my problem

a great thanks in advance, to everyone who can help.

2

Answers


  1. text_pattern_ops operator class uses B-tree indexing, which is not ideal for wildcard searches with the LIKE operator, especially when you’re using prefixes; you may want to consider using the pg_trgm extension, which provides trigram-based indexing and text search capabilities ideal for using the LIKE operator.

    CREATE EXTENSION pg_trgm;
    
    CREATE OR REPLACE FUNCTION array_elements_like(arr jsonb, pattern text)
    RETURNS SETOF text AS $$
    BEGIN
      RETURN QUERY
      SELECT key
      FROM jsonb_each_text(arr)
      WHERE key LIKE pattern;
    END;
    $$ LANGUAGE plpgsql IMMUTABLE;
    
    -- create an index on the array elements using trigram indexing
    CREATE INDEX idx_tags_elements_trgm ON tasks USING gin (array_elements_like(tags, 'prefix%') gin_trgm_ops);
    
    SELECT key
    FROM tags t, LATERAL jsonb_each_text(t.tags) j
    WHERE array_elements_like(t.tags, 'prefix%');
    

    gin_trgm_opsis designed for searching texts using trigrams, can handle wildcard searches and partial matches like the one you need for prefixes.

    Login or Signup to reply.
  2. You could do what you want by creating an extension to create a new operator class using the partial match feature. This would involve rather a lot of tedious C programming. That is a professional engagement, it is unlikely you will find someone to do this for you for free.

    The FTS feature already supports partial matches, using tsquery elements which have ‘:*’ appended. It is possible you could finagle that into working for you here without need a new extension.

    There might be two reasons your current implementation only matches the first key. The first is that the function shown only returns one key, as it returns a scalar and involves no aggregation despite the name. If your function did do an aggregation into one string, like the one I showed in my answer to your previous question, then the problem would be that only one of keys could occur at the beginning of that string, and so only could possibly match the front-anchored prefix search.

    You would instead have to match like this:

    where aggregate_tags(t.tag) like '%prefix%' and j.key like 'prefix%'
    

    But that couldn’t work with your text_pattern_ops index, you would need the trigram index to support that.

    There is no way we can tell you if the trigram index will be fast enough. You will have to try it yourself on realistic data and realistic pattern queries.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search