In my Postgres 13.10 database, I have a table menus
with a JSON column named dishes
. This column contains values in the following format:
{
"desserts": [
{"id": 1, "name": "chocolate cake"},
{"id": 2, "name": "banana split"}
],
"appetizers": [
{"id": 3, "name": "nachos"},
{"id": 4, "name": "mozzarella sticks"},
{"id": 5, "name": "buffalo wings"}
]
}
The data type is json
, but I do have the option of changing it to jsonb
, if that would help with query performance.
Given a list of appetizer IDs (for example, 3 and 5), I need to determine which IDs are referenced by at least one row in this table.
How can I write a query to do this?
(Of course, this is a contrived example.)
2
Answers
Assuming
jsonb
. (Would not scale well for plainjson
).To make it fast, use SQL/JSON path functionality added with Postgres 12:
This searches the key "id" in all arrays nested in top-level objects of
menus.dishes
. Or restrict to the top-level key "appetizers":fiddle
Be sure to have a GIN index on
menus(dishes)
. Ideally ajsonb_path_ops
index:The main feature is to build a
jsonpath
expression off unnested IDs dynamically. This way it works for any number of input IDs, any number of top-level objects in the JSON documents, and any number of nested array items – while still using above index.Related:
Aside
Have you considered making an actual relational DB out of your JSON documents? Keyword "database normalization". Would make queries like this a whole lot simpler and faster.
You didn’t describe how you want the list to be provided. Here I’m providing it as a VALUES construct.
You can use the
@>
containment test, but will need to cast the data to JSONB (or just store it that way in the first place) in order to use that. The "at least one row" suggests use of theexists
construct.