skip to Main Content

I have two tables, one which I will call D and one which I will call S, with a 1:N relationship between D and S. I currently have the following query that runs on these two:

SELECT d.*, s.*
FROM d
LEFT JOIN s ON s.d_id = d.id
WHERE coalesce(:status, d.status) = d.status

Now, this works well, but I would like to be able to filter on s.block_start and s.block_end, which are date times. That seems straightforward, but my goal is to filter out any row in D that corresponds to a row in S which doesn’t fall within the date time bounds I set. My first thought was to join the above with the following query:

SELECT id
FROM (
    SELECT d.id, MIN(s.block_start), MAX(s.block_end)
    FROM d
    LEFT JOIN s ON s.d_id = d.id
    GROUP BY d.id
) WHERE block_start >= :start AND block_end <= :end

but this seems like a very inefficient query for what I’m trying to do. Does anyone know of the best-practice way to achieve this result?

2

Answers


  1. If you want get only d.id you can take s.d_id (d.id=s.d_id) without JOIN.

    Try

        SELECT s.d_id as id, MIN(s.block_start), MAX(s.block_end)
        FROM s
        WHERE s.block_start >= :start AND s.block_end <= :end
        GROUP BY s.d_id
    

    Case when d.id is exists in D and not exists s.d_id then you will still filter such condition rows by date range(id,null,null). Therefore, join with d is not needed.

    Case when you want some like d.* (several columns from d)
    try

    select * 
    from d
    inner join(
        SELECT s.d_id, MIN(s.block_start), MAX(s.block_end)
        FROM s
        WHERE s.block_start >= :start AND s.block_end <= :end
        GROUP BY s.d_id
     ) s ON s.d_id = d.id
    
    

    There using LEFT or INNER join – the choice depends on the expected result. inner join matches your current query.

    With the growth of N in the ratio d:s (1:N), the relative efficiency of such a query will be better.
    Applying a date range filter before grouping and before join is also preferable, the more so the narrower this filter is.

    Login or Signup to reply.
  2. Your query reduces to a HAVING clause:
    demo at db<>fiddle

    SELECT d.id
    FROM d
    LEFT JOIN s ON s.d_id = d.id
    GROUP BY d.id
    HAVING MIN(s.block_start) >= :start
       AND MAX(s.block_end)   <= :end
    

    And it’s not really that inefficient, as long as you have this type of index in place:

    create index on s(d_id,block_start,block_end);
    

    Thing is, the description of your desired logic doesn’t really match your code, or its reduced version:

    my goal is to filter out any row in D that corresponds to a row in S which doesn’t fall within the date time bounds I set

    Not falling within could be:

    1. Completely disjoined, where it both starts and ends entirely before or entirely after your target range.
    2. Partially but not fully within, where it starts or ends inside the range, but stretches beyond the range.
    3. It’s wider than the entire range, and contains it whole. It doesn’t fall within the range because it’s the range that falls within it.

    Among others. On top of that, you’re aggregating, so the code accepts all d‘s that have at least one s above the lower bound, and another, not necessarily the same, which is under the upper bound.

    There might be a faster index+query setup, depending on what exactly you’re aiming to catch/skip.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search