skip to Main Content

Please refer to this db fiddle (SQL also pasted below):

https://www.db-fiddle.com/f/bJ5FfufvP9GHPYtBhkywdP/5

I would like to show rows from the "mpr_metrics" table that have NO associated reviews. But here’s the catch: if there is a metric with an associated review, AND that review also shares the SAME start time as another review, then it should be excluded from the result set.

For example, the result of the fiddle is "3", but it should only show a count of "1", since there is only 1 metric (id=5) which has no review AND no review with the same start_time value as another review.

CREATE TABLE mpr_metrics (
  id INT,
  start_time TIMESTAMP,
  PRIMARY KEY (id)
);

CREATE TABLE mpr_metrics_reviews (
  id INT,
  metric_id INT,
  PRIMARY KEY (id)
);

INSERT INTO mpr_metrics (id, start_time) values (1, '2024-09-16');
INSERT INTO mpr_metrics (id, start_time) values (2, '2024-09-16');
INSERT INTO mpr_metrics (id, start_time) values (3, '2024-09-09');
INSERT INTO mpr_metrics (id, start_time) values (4, '2024-09-09');
INSERT INTO mpr_metrics (id, start_time) values (5, '2024-09-11');

INSERT INTO mpr_metrics_reviews (id, metric_id) values (1, 1);
INSERT INTO mpr_metrics_reviews (id, metric_id) values (2, 3);

SELECT count(DISTINCT mpr_metrics.start_time) 
FROM mpr_metrics 
LEFT JOIN mpr_metrics_reviews 
    ON mpr_metrics.id = mpr_metrics_reviews.metric_id 
WHERE mpr_metrics_reviews.id is null;

The result of this SELECT is "3", but the result should be "1".

4

Answers


  1. You need another join to exclude rows that have shared start times with another row.

    SELECT COUNT(distinct m.id) AS count
    FROM mpr_metrics AS m
    LEFT JOIN mpr_metrics_reviews AS r
        ON m.id = r.metric_id
    LEFT JOIN (
      SELECT m.id, m.start_time
      FROM mpr_metrics AS m
      JOIN mpr_metrics_reviews AS r
      ON m.id = r.metric_id
    ) AS m1 
        ON m.start_time = m1.start_time
    WHERE r.id IS NULL and m1.id IS NULL
    
    Login or Signup to reply.
  2. Why that anti join trick? Do you think that MySQL is not capable of running straight-forward NOT IN and NOT EXISTS in a reasonable time? This may have been the case in the very early days of the DBMS, but those days are gone.

    You want two NOT EXISTS clauses, because you want metrics …

    • … that have no reviews
    • … that have no start time siblings

    The simple query:

    SELECT COUNT(*) 
    FROM mpr_metrics m
    WHERE NOT EXISTS (SELECT null FROM mpr_metrics_reviews mr
                      WHERE mr.metric_id = m.id)
      AND NOT EXISTS (SELECT null FROM mpr_metrics m2
                      WHERE m2.id <> m.id AND m2.start_time = m.start_time)
    ;
    

    Demo: https://dbfiddle.uk/kghxWWSh

    (The first condition can easily be re-written as NOT IN, which may even be slightly more readable. While the second condition could also be converted into NOT IN, though, I suggest you stick to NOT EXISTS, because the IN clause would be correlated, and we strive to keep [NOT] IN clauses non-correlated for readability.)

    Login or Signup to reply.
  3. You can do something simple like this:

    select count(id) from mpr_metrics where start_time not in (
         select distinct start_time from mpr_metrics where id in (select metric_id from mpr_metrics_reviews)
    );
    

    If the data set is large it would make sense to have an index on the start_time.

    Updated db fiddle here

    Login or Signup to reply.
  4. SELECT COUNT(*)
    FROM mpr_metrics m
    WHERE NOT EXISTS (
      SELECT 1
      FROM mpr_metrics_reviews r
      WHERE r.metric_id = m.id
    )
    AND m.start_time NOT IN (
      SELECT m2.start_time
      FROM mpr_metrics m2
      INNER JOIN mpr_metrics_reviews r2 ON m2.id = r2.metric_id
    );
    
    • NOT EXISTS: Metric has no associated reviews.
    • NOT IN: start_time of metric does not match a start_time of metrics that do have reviews
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search