We have a system take takes a JSON array as a parameter to define matching records. To keep things simple, we have a query:
SELECT ai.ID
FROM association_internal ai
WHERE ai.source_object_record_id MEMBER OF('[57928,57927]');
Now the above call takes 380ms, which is ridiculous. Why? Because the below version of the exact same call takes 11ms.
SELECT ai.ID
FROM association_internal ai
WHERE ai.source_object_record_id IN(57928,57927);
Just by using IN vs. MEMBER OF, it works fine. MEMBER OF is so slow, it’s (honestly) unusable in an enterprise setting. The speed degrades the more numbers we have (and we can have 100+ in the above example).
Obviously there’s some crazy slow processing of MEMBER OF (perhaps it’s even avoiding indexing?) but we cannot pass a JSON array as a property of IN. Now I COULD write dynamic SQL and execute that manually, but that’s ugly and sloppy.
So the question is this – is there a way to pass a JSON array for use in a hardcoded query that doesn’t use MEMBER OF? Something that’ll be just as fast as hardcoding the IN values? Or a way to convert a JSON array of values into a useable IN or EXISTS use case?
3
Answers
I can't believe it. Using JSON_TABLE only takes 12ms to convert the array. So I can front-load any array as a CTE and use IN and it's lightning fast.
For clarity, the fix is:
The right way to optimize this query is to use
IN()
with a discrete value for each element in your array. You can’t optimize a comparison against a comma-separated string.Make as many
?
placeholders as you have elements in your array, and bind each element as a scalar to each parameter.This is not ugly or sloppy, this is the proper way to use SQL. When you write functions in Java or other language, do you pass arguments as a comma-separated string and expect the function to split that string? No — you would pass a separate value per argument.
JSON
is a string; it has to be parsed to see what is in it. And, in most cases, the query must look at every row; no shortcuts.An RDBMS is a structure that is optimized to fetch things. This is especially true when
INDEXes
are used.Indexes into JSON are minimal, and you have to work to get them — such as with "generated" columns.
Use
JSON
for stuff that you won’t need inWHERE
,GROUP BY
orORDER BY
.