In our DB we have tables, that have 6-8 M records, this data is static and in order of performance optimization we decided to store some data from this tables in separate table, and for predefined filters use the values from this table. For example we want to store distinct states, cities, tags in this table.
The example of data could be represented in such test table:
CREATE TABLE test_data_table (
id BIGSERIAL PRIMARY KEY,
state text,
city text
);
INSERT INTO test_data_table (state, city)
values ('MD', 'Union Bridge'),
('IL', 'Bourbonnais'),
('NC', 'Rdelk'),
('FL', 'Bonita Springs'),
('FL', 'Unit Ave Maria'),
('AZ', 'N Placita Chula Vista Tucson'),
('TX', 'Sienna'),
('LA', 'Lornager'),
('CA', 'Empire'),
('MA', 'Middleton'),
('CO', 'Yampa'),
('GA', 'Spr Project Name Unit'),
('IN', 'Greencastle'),
('NV', 'Flora'),
('RI', 'Coventry'),
('WA', 'Rice'),
('FL', 'Orange City'),
('TX', 'Haltom City'),
('CA', 'Moraga'),
('LA', 'Belle Chasse'),
('AZ', 'E Rincon Creek Ranch Rd Tucson'),
('CA', 'Acampo'),
('AZ', 'S Avra Rd Tucson'),
('GA', 'Folkston'),
('FL', 'Biscayne Park'),
('FL', 'All Units Oxford'),
('IL', 'Volo'),
('MN', 'Peterson'),
('LA', 'Rayville'),
('KY', 'South Park View'),
('AZ', 'E Camino Emmanuel Sahuarita'),
('CA', 'Wrighwood'),
('CA', 'Los Angeles'),
('AZ', 'N Teal Blue Tr Tucson'),
('NY', 'Clifton Park'),
('IN', 'Frankl'),
('KY', 'Anchorage'),
('LA', 'Crown Point'),
('CA', 'Los BanosStruct CalcsSpan T'),
('IL', 'Chebanse');
And here the example of separate table for filters and the way to fill it with values:
CREATE TABLE test_filter_values
(
id BIGSERIAL PRIMARY KEY,
type text,
value jsonb
);
INSERT INTO test_filter_values (type, value)
VALUES ('CITY', (SELECT json_agg(DISTINCT city)::jsonb FROM test_data_table));
INSERT INTO test_filter_values (type, value)
VALUES ('STATE', (SELECT json_agg(DISTINCT state)::jsonb FROM test_data_table));
So, the main issue happens when we try to query data from this table, here is the example:
SELECT DISTINCT city from test_data_table
WHERE city ILIKE 'b%';
This query returns 4 results;
And this one:
SELECT city FROM (SELECT jsonb_array_elements(value) AS city
FROM test_filter_values WHERE type = 'CITY') cities
WHERE city::text ILIKE 'b%';
Returns no results,
LIKE operator without % at the begining or in the end returs zero results from test_filter_values table, but if using LIKE '%b%'
with such case the behaviour is the same. The same issue happens with the exact match:
SELECT state FROM (SELECT jsonb_array_elements(value) AS state
FROM test_filter_values WHERE type = 'STATE') states
WHERE state::text = 'NC';
It returns no results either.
So how could we deal in this case?
2
Answers
That’s because
jsonb_array_elements
returns jsonb not text. A JSON array might have other arrays, objects, arrays of objects etc inside it.That means the text values will be quoted as JSON and you will be trying to match against
"Anchorage"
notAnchorage
when that gets converted to text.Maybe just store an array of text if that’s what you want to use?
If you’re using
jsonb
, you get JSONPath expressions, and those offerlike_regex
filter: demoThe
#>>'{}'
is there to "open up" the resulting jsonb strings and take thetext
-type value out. Plain::text
cast would produce double quotes around the values.It’s more flexible than comparing to every element from
jsonb_array_elements_text()
and you can apply other JSONPath-related logic in one go. If you prefer that for some other reason and all you do is this type of comparison, there’s no need to nest a subquery:And in that case it should perform comparably.