We are using MariaDB 10.5.13 but the same issue occurs on native MySQL 8.
Getting this error
4003 – Too many WITH elements in WITH clause
We have a web application which allows users to search a database containing approx 290,000 substances. A feature of the application is that substances can appear on different chemical inventories, and the user can find substances by inventory. The inventories themselves are referred to as filters in the database.
There are 3 tables holding the relevant data:
- substances: the list of 290,000 substances. Each has a unique id.
- filters: the list of different chemical inventories. Approx 2,300 rows. Each has a unique id.
- filters_substances: a table which maps substances and filters together. The relationship is such that any individual substance may appear on >= 1 filters (i.e. any given substance can be on 1 or more chemical inventories). Approx 1,662,873 rows. The foreign keys used are filter_id and substance_id to refer to filters and substances respectively.
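For reference, here is roughly what the relevant schema looks like (the column names beyond the IDs, and the exact types and indexes, are illustrative rather than our actual DDL):

create table substances (
    id int unsigned primary key,
    name varchar(255)
    -- other descriptive columns omitted
);

create table filters (
    id int unsigned primary key,
    name varchar(255)
);

create table filters_substances (
    filter_id int unsigned not null,      -- references filters.id
    substance_id int unsigned not null,   -- references substances.id
    primary key (filter_id, substance_id)
);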
When the user applies filters there are 2 possible operations for each filter:
- Find the substances that are on a given inventory.
- Exclude any substances that are on that inventory.
Find and Exclude can be used in combination. For example the user could do:
- Find substances on filters 2 and 300
- Exclude any substances on filter 5
The result would be substances that are on inventories 2 and 300, but with any substances on 5 removed from the results.
Or they can just use Find on its own, e.g.
- Find substances on filters 2 and 300
There is no use-case for Exclude on its own.
When the user applies filter conditions we write the substances found (substances.id) to a table. Every user has their own table to store results, which is suffixed with their user ID, e.g. "filtering_1745" is the table for user ID 1745.
The problem
This has worked without issue, but we recently found that when >60 filters were applied we got this error:
Too many WITH elements in WITH clause
The SQL which was generated by the application looks like this. For brevity I’ve removed some of it but added comments to indicate what the removed portions would be:
create table if not exists filtering_1745
as with
fs196 as (select substance_id from filters_substances where filter_id = 196),
fs197 as (select substance_id from filters_substances where filter_id = 197),
fs198 as (select substance_id from filters_substances where filter_id = 198),
-- Same as above for ~60 other filters
select distinct substance_id as `id` from
(
(
select substance_id from fs196
union
select substance_id from fs197
union
select substance_id from fs198
-- Same as above for ~60 other filters
)
)
sq;
In this case all conditions in the query are "Find" (it doesn’t use any "Exclude" conditions). If we were also using Exclude conditions then the only difference is that an EXCEPT
statement is appended to the query. For instance this would exclude filters 170, 178, 180.
except
(
select substance_id from fs170
union select substance_id from fs178
union select substance_id from fs180
) sq;
The same error message appears in this scenario. In other words it is the number of filters being used that triggers the error, not whether they are Find or Exclude.
I’m looking for advice on how this could potentially be rewritten.
One thing I considered is along the lines of this (rough sketch after the list):
- Running ~60 separate SELECT statements to get the appropriate substance IDs for the "Find" filters.
- (Optional) if "Exclude" is being used, DELETE from the results any substances that are on the filters being excluded (so that only substances NOT IN those filters remain).
- DELETE any non-unique substance IDs in filtering_1745. This is to make sure that when the results are presented to a user later they only include unique substances. If for example a given substance appeared on 5 filters it should only be presented once, because it is a unique substance.
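Sketched out (assuming filtering_1745 is reduced to a single unique id column, and the same Exclude semantics as the EXCEPT example above), it might look something like this:

create table if not exists filtering_1745 (id int unsigned primary key);

-- 1. one insert per "Find" filter (repeated ~60 times by the application);
--    INSERT IGNORE plus the primary key keeps the stored IDs unique
insert ignore into filtering_1745 (id)
select substance_id from filters_substances where filter_id = 196;

insert ignore into filtering_1745 (id)
select substance_id from filters_substances where filter_id = 197;
-- ... and so on for the remaining "Find" filters

-- 2. (optional) remove substances that are on any "Exclude" filter
delete from filtering_1745
where id in (select substance_id from filters_substances where filter_id in (170, 178, 180));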
Any advice would be welcomed.
I looked at some other posts such as "IN" clause limitation in Sql Server and How to put more than 1000 values into an Oracle IN clause but I believe these are different problems. I couldn’t find anything equivalent to this but if there is something please comment with a link and I’ll take a look.
2 Answers
Pattern:
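A sketch of the pattern, where @filter1 and @filter2 stand for filters which must be present and @filter3 for a filter which must be absent:

select substance_id
from filters_substances
group by substance_id
having max(filter_id = @filter1)         -- 1 if the substance is on filter @filter1
   and max(filter_id = @filter2)         -- 1 if the substance is on filter @filter2
   and not max(filter_id = @filter3);    -- no row with filter @filter3 allowed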
Then join the substances table to retrieve their details.
If the mentioned filter is present in the current row then the corresponding filter_id = @filterN expression produces 1; if not, it produces 0. MAX() returns 1 if the filter is present in at least one of the rows for the substance (see GROUP BY), and returns 0 if no row for this substance contains this filter. NOT inverts the output of MAX().
So the substance will be returned only when all filters which should be present return 1 and all filters which should not be present return 0 accordingly.
This query may be used either as a CTE, or the substances table may be added into the FROM clause, with the output list and GROUP BY expressions expanded accordingly.
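For example, joined directly (the output columns shown are only illustrative):

select s.id, s.name
from filters_substances fs
join substances s on s.id = fs.substance_id
group by s.id, s.name
having max(fs.filter_id = @filter1)
   and max(fs.filter_id = @filter2)
   and not max(fs.filter_id = @filter3);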
I would offer a suggestion to have a temporary table, specific to each user, into which you can insert all the filters or exclusions and then join against it. This way you are not limited to 60 or even 1000 entries. Not knowing the language you are using to query the database, you can resolve that part yourself. But let's start with a temp table per user, just as an example, with an applicable index per ID.
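For example (table and column names are placeholders only):

create temporary table if not exists tmp_filters_1745 (
    filter_id int unsigned not null,
    primary key (filter_id)
);

create temporary table if not exists tmp_exclude_filters_1745 (
    filter_id int unsigned not null,
    primary key (filter_id)
);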
Then, instead of building a loop for all the filters or substances, just insert them into the temp table.
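A sketch, using the placeholder table above and made-up filter IDs:

insert into tmp_filters_1745 (filter_id) values (2), (196), (197), (198), (300);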
If there are exclusions, populate the exclusions temp table the same way.
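Again only a sketch:

insert into tmp_exclude_filters_1745 (filter_id) values (170), (178), (180);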
Then query based on these two temp tables.
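A sketch of what that query could look like with the placeholder tables above:

select distinct fs.substance_id as id
from filters_substances fs
join tmp_filters_1745 f on f.filter_id = fs.filter_id
where not exists (
    select 1
    from filters_substances fx
    join tmp_exclude_filters_1745 x on x.filter_id = fx.filter_id
    where fx.substance_id = fs.substance_id
);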
Now, if you have no "EXCEPTION" substances to be excluded, that temp table would have no records in it and thus return everything. But if there WERE records, they would be excluded. The query itself doesn’t change. You just populate the temp filters or temp substances.
Now, you could take this a step further: instead of multiple temp tables, you could expand the tables to include the user's ID as a secondary column (and include that ID with your inserts and in the index as well).
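For instance (again just a sketch of the idea):

create table user_find_filters (
    user_id   int unsigned not null,
    filter_id int unsigned not null,
    primary key (user_id, filter_id)
);
-- plus an equivalent user_exclude_filters table, with the joins above
-- gaining an extra "and user_id = ?" condition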
At least this can open your mind to another possible solution.
You can even take the second option a step further by having an additional table that allows a user to save filter sets, such as "My XYZ Filter set", with an auto-increment ID. Add that ID to the filter/substance tables and include it as part of the inserts. Then, when a user wants to adjust, they don't have to re-pick the same set of possibly hundreds of values again, miss something, and do it all over. I have no idea of your user interface, just a thought. If not working from some "criteria set" ID, make sure you pre-delete the already-populated list for each user before starting a new query.
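Something along these lines (hypothetical names):

create table filter_sets (
    id      int unsigned auto_increment primary key,
    user_id int unsigned not null,
    name    varchar(100) not null       -- e.g. 'My XYZ Filter set'
);
-- the filter/substance selection tables would then carry this set id,
-- so a saved set can be re-applied without re-picking every filter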