Please refer to this db fiddle (SQL also pasted below):
https://www.db-fiddle.com/f/bJ5FfufvP9GHPYtBhkywdP/5
I would like to show rows from the "mpr_metrics" table that have NO associated reviews. But here’s the catch: if there is a metric with an associated review, AND that review also shares the SAME start time as another review, then it should be excluded from the result set.
For example, the result of the fiddle is "3", but it should only show a count of "1", since there is only 1 metric (id=5) which has no review AND no review with the same start_time value as another review.
CREATE TABLE mpr_metrics (
id INT,
start_time TIMESTAMP,
PRIMARY KEY (id)
);
CREATE TABLE mpr_metrics_reviews (
id INT,
metric_id INT,
PRIMARY KEY (id)
);
INSERT INTO mpr_metrics (id, start_time) values (1, '2024-09-16');
INSERT INTO mpr_metrics (id, start_time) values (2, '2024-09-16');
INSERT INTO mpr_metrics (id, start_time) values (3, '2024-09-09');
INSERT INTO mpr_metrics (id, start_time) values (4, '2024-09-09');
INSERT INTO mpr_metrics (id, start_time) values (5, '2024-09-11');
INSERT INTO mpr_metrics_reviews (id, metric_id) values (1, 1);
INSERT INTO mpr_metrics_reviews (id, metric_id) values (2, 3);
SELECT count(DISTINCT mpr_metrics.start_time)
FROM mpr_metrics
LEFT JOIN mpr_metrics_reviews
ON mpr_metrics.id = mpr_metrics_reviews.metric_id
WHERE mpr_metrics_reviews.id is null;
The result of this SELECT is "3", but the result should be "1".
4
Answers
You need another join to exclude rows that have shared start times with another row.
Why that anti join trick? Do you think that MySQL is not capable of running straight-forward
NOT IN
andNOT EXISTS
in a reasonable time? This may have been the case in the very early days of the DBMS, but those days are gone.You want two
NOT EXISTS
clauses, because you want metrics …The simple query:
Demo: https://dbfiddle.uk/kghxWWSh
(The first condition can easily be re-written as
NOT IN
, which may even be slightly more readable. While the second condition could also be converted intoNOT IN
, though, I suggest you stick toNOT EXISTS
, because theIN
clause would be correlated, and we strive to keep[NOT] IN
clauses non-correlated for readability.)You can do something simple like this:
If the data set is large it would make sense to have an index on the start_time.
Updated db fiddle here
NOT EXISTS
: Metric has no associated reviews.NOT IN
:start_time
of metric does not match astart_time
of metrics that do have reviews