So I work with Postgres SQL, and I have a jsonb column with the following structure:
{
"Store1":[
{
"price":5.99,
"seller":"seller"
},
{
"price":56.43,
"seller":"seller"
}
],
"Store2":[
{
"price":45.65,
"seller":"seller"
},
{
"price":44.66,
"seller":"seller"
}
]
}
I have a jsonb like this for every product in the database. I want to run an SQL query that will answer the following question:
For each product, is one of the prices in this JSON is bigger/equal/smaller than X?
Basically filter the product to include only the ones who have at least one price that satisfies a mathematical condition.
How can I do it efficiently? What’s the best way in Postgres to iterate a JSON like this, with a relatively complex inner structure?
Also, if I could control the way the data is structured (to an extent, I can), what changes can I do to make this query more efficient?
Thanks!
2
Answers
Use a json path expression:
or
If you need to compare to another column or make the query parameterised, you can either build the jsonpath dynamically
or you can use the respective function and pass variables as an object:
I admit I had to check my cheat sheet to figure out the right combination of operator and expression. Takeaways:
ANY
@@
does not work with? (@ …)
filter expressions since they don’t return a boolean,@?
does not work with predicates since they always return a value (even if it’sfalse
)As @jjanes commented on my other answer, the jsonpath match
col @@ '$.*[*].price < $limit'
isn’t going to be fast and needs to do full table scan, at least for<
and>
. To make a useful index, a different approach is required. An index can only have a single value to compare with, not any number. For that, we need to change the condition fromEXISTS(SELECT prices_of(col) WHERE price < $limit)
to(SELECT MIN(prices_of(col))) < $limit
.With this idea it is possible to build an expression index on the result of a custom immutable function:
which you can use as
Looking for rows with a price larger than a certain number requires a separate index on
max_price(data)
. If you want to use the index in aJOIN
with more conditions, consider making it a multi-column index.Looking for row with a price equalling a certain number can be optimised by indexing the jsonb column and using a jsonpath:
Unfortunately you can’t use
jsonb_path_ops
here since that doesn’t support the wildcard.