skip to Main Content

In our DB we have tables, that have 6-8 M records, this data is static and in order of performance optimization we decided to store some data from this tables in separate table, and for predefined filters use the values from this table. For example we want to store distinct states, cities, tags in this table.

The example of data could be represented in such test table:

CREATE TABLE test_data_table (
                                 id    BIGSERIAL PRIMARY KEY,
                                 state  text,
                                 city text
);

INSERT INTO test_data_table (state, city)
    values  ('MD', 'Union Bridge'),
            ('IL', 'Bourbonnais'),
            ('NC', 'Rdelk'),
            ('FL', 'Bonita Springs'),
            ('FL', 'Unit Ave Maria'),
            ('AZ', 'N Placita Chula Vista Tucson'),
            ('TX', 'Sienna'),
            ('LA', 'Lornager'),
            ('CA', 'Empire'),
            ('MA', 'Middleton'),
            ('CO', 'Yampa'),
            ('GA', 'Spr Project Name Unit'),
            ('IN', 'Greencastle'),
            ('NV', 'Flora'),
            ('RI', 'Coventry'),
            ('WA', 'Rice'),
            ('FL', 'Orange City'),
            ('TX', 'Haltom City'),
            ('CA', 'Moraga'),
            ('LA', 'Belle Chasse'),
            ('AZ', 'E Rincon Creek Ranch Rd Tucson'),
            ('CA', 'Acampo'),
            ('AZ', 'S Avra Rd Tucson'),
            ('GA', 'Folkston'),
            ('FL', 'Biscayne Park'),
            ('FL', 'All Units Oxford'),
            ('IL', 'Volo'),
            ('MN', 'Peterson'),
            ('LA', 'Rayville'),
            ('KY', 'South Park View'),
            ('AZ', 'E Camino Emmanuel Sahuarita'),
            ('CA', 'Wrighwood'),
            ('CA', 'Los Angeles'),
            ('AZ', 'N Teal Blue Tr Tucson'),
            ('NY', 'Clifton Park'),
            ('IN', 'Frankl'),
            ('KY', 'Anchorage'),
            ('LA', 'Crown Point'),
            ('CA', 'Los BanosStruct CalcsSpan T'),
            ('IL', 'Chebanse');

And here the example of separate table for filters and the way to fill it with values:

    CREATE TABLE test_filter_values
(
    id    BIGSERIAL PRIMARY KEY,
    type  text,
    value jsonb
);

INSERT INTO test_filter_values (type, value)
VALUES ('CITY', (SELECT json_agg(DISTINCT city)::jsonb FROM test_data_table));

INSERT INTO test_filter_values (type, value)
VALUES ('STATE', (SELECT json_agg(DISTINCT state)::jsonb FROM test_data_table));

So, the main issue happens when we try to query data from this table, here is the example:

SELECT DISTINCT city from test_data_table
WHERE city ILIKE 'b%';

This query returns 4 results;
And this one:

SELECT city FROM (SELECT jsonb_array_elements(value) AS city
                  FROM test_filter_values WHERE type = 'CITY') cities
WHERE city::text ILIKE 'b%';

Returns no results,
LIKE operator without % at the begining or in the end returs zero results from test_filter_values table, but if using LIKE '%b%' with such case the behaviour is the same. The same issue happens with the exact match:

SELECT state FROM (SELECT jsonb_array_elements(value) AS state
                   FROM test_filter_values WHERE type = 'STATE') states
WHERE state::text = 'NC';

It returns no results either.

So how could we deal in this case?

2

Answers


  1. That’s because jsonb_array_elements returns jsonb not text. A JSON array might have other arrays, objects, arrays of objects etc inside it.

    That means the text values will be quoted as JSON and you will be trying to match against "Anchorage" not Anchorage when that gets converted to text.

    Maybe just store an array of text if that’s what you want to use?

    Login or Signup to reply.
  2. If you’re using jsonb, you get JSONPath expressions, and those offer like_regex filter: demo

    SELECT jsonb_path_query( value
                            ,'$[*]?(@ like_regex "^b.*" flag "i")'
                           )#>>'{}' AS city
    FROM test_filter_values 
    WHERE type = 'CITY';
    
    city
    Belle Chasse
    Biscayne Park
    Bonita Springs
    Bourbonnais

    The #>>'{}' is there to "open up" the resulting jsonb strings and take the text-type value out. Plain ::text cast would produce double quotes around the values.

    It’s more flexible than comparing to every element from jsonb_array_elements_text() and you can apply other JSONPath-related logic in one go. If you prefer that for some other reason and all you do is this type of comparison, there’s no need to nest a subquery:

    SELECT state
    FROM test_filter_values
    CROSS JOIN jsonb_array_elements_text(value)as t(state)
    WHERE type = 'STATE'
    AND state = 'NC';
    

    And in that case it should perform comparably.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search