skip to Main Content

I have two tables – jobs, and jobEntries

jobEntries has a jobId that maps to jobs, but wasn’t set up as an "official" foreign key relationship so I can’t use CASCADE DELETE.

For each job, there are many jobEntries.

I want to delete from jobEntries so there is at most one jobEntry for each job, leaving only the most recent jobEntry (ORDER BY createdAt DESC)

I’ve tried many iterations of nested queries, COUNT(), HAVING, etc. but to no avail.

2

Answers


  1. Well, just FYI I think you couldn’t use CASCADE DELETE because 1) you aren’t deleting jobs, only JobEntries and 2) You aren’t deleting all the JobEntries, only the ones that aren’t most recent.

    I would encourage you to post your attempts (with Count, Having, etc.) as it would be more instructive for you to learn what mistake you made … However, be that as it may, here’s some solutions. They are written in SQL Server syntax, but should be applicable to MySql with some appropriate fixes to the sql syntax.

    Some sample data

    declare @Job table (JobId int, CreatedAt datetime)
    insert into @Job 
    values
        (1, '2023-01-01'),
        (2, '2023-01-15'),
        (3, '2023-01-30')
    
    declare @JobEntry table (JobEntryId int, JobId int, CreatedAt datetime)
    insert into @JobEntry
    values
        (1, 1, '2023-01-01'),
        (2, 1, '2023-01-02'),
        (3, 2, '2023-01-15'),
        (4, 2, '2023-01-16'),
        (5, 2, '2023-01-17')
    

    Approach 1, with a correlated subquery

    delete je
    from
        @Job j
        inner join @JobEntry je
        on j.JobId = je.JobId
        where je.JobEntryId <> (
        select top (1) je2.JobEntryId 
        from @JobEntry je2 
        where je2.JobId = je.JobId 
        order by je2.CreatedAt desc)
    --select * from @JobEntry
    

    Approach 2, with a CTE

    ;with LastJobEntryCreatedAt as (
        select 
            je.JobId, 
            Max(je.CreatedAt) as CreatedAt
        from @JobEntry je
        group by je.JobId)
    delete je
        from @JobEntry je
        left join LastJobEntryCreatedAt le
        on je.JobId = le.JobId
        and je.CreatedAt = le.CreatedAt
        where le.CreatedAt is null
    --select * from @JobEntry
    
    Login or Signup to reply.
  2. We can ignore the Jobs table, since we’re only concerned with data in the jobEntries table. Thus the many-to-many aspect is not relevant.

    We need to know what the most recent job entry for each job is, which we can get with a query like this:

    select jobID, max(createdAt) as maxCreatedAt from jobEntries group by jobID;
    

    However, this is not enough. We need to select all the rows where the createdAt is not the most recent.

    We can do this by joining the jobEntries table to the output of the query above, and then deleting all the rows where the createdAt column is not the maxCreatedAt value for that jobID:

    delete je from jobEntries as je
      join (select jobID, max(createdAt) as maxCreatedAt from jobEntries group by jobID) as maxJE using(jobId)
      where createdAt <> maxCreatedAt;
    

    DBFiddle

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search