skip to Main Content

I have two tables:

TESTS:

ID YEAR DESCRIPTION
0 2023 some text
..
N 2024 some text

test_prep_materials:

ID PATH TID
0 /home/user/file0.pdf 0
..
M /home/user/fileM.pdf 0

So any test have prep materials. And some test’s have same material.
I want to delete all material’s for 2023 tests and DELETE FILES FROM DISK. But 2024 test’s may have same path (SAME FILE ON DISK).
So i need to pick all prep materials for 2023 tests and exclude from them all path’s which have dependencies in 2024 tests.

I have tried the following, but there are too many results to know if the command was executed correctly.

SELECT `test_prep_materials`.`PATH`, `tests`.`id`, `tests`.`year`, `tests`.`Name_t` 
    from `test_prep_materials` 
    JOIN `tests` ON `test_prep_materials`.`tId` = `tests`.`id` 
    WHERE `tests`.`year` = 2023 
EXCEPT
SELECT `test_prep_materials`.`PATH`, `tests`.`id`, `tests`.`year`, `tests`.`Name_t` 
    from `test_prep_materials` 
    JOIN `tests` ON `test_prep_materials`.`tId` = `tests`.`id` 
    WHERE `tests`.`year` <> 2023 

Can somebody help me understand if my Query right or i am missing something?

2

Answers


  1. It looks like you need paths to delete from past years tests that are not used in any test of this year. If that is the case that means you need paths used in last year and last year only – use Max(YEAR) = Min(YEAR) per PATH and < Max(YEAR) over all:

    WITH    --  S a m p l e    D a t a :
        tests ( ID, YEAR, DESCRIPTION ) AS 
            ( Select 0, 2023, 'some text 0' Union All
              Select 1, 2024, 'some text 1' Union All
              Select 2, 2023, 'some text 2' Union All
              Select 3, 2024, 'some text 3' Union All
              Select 4, 2024, 'some text 4' 
            ),
        test_prep_materials ( ID, PATH, TID ) AS 
            ( Select 100, '/home/user/file0.pdf', 0 Union All 
              Select 101, '/home/user/file0.pdf', 1 Union All 
              Select 102, '/home/user/file3.pdf', 1 Union All 
              Select 103, '/home/user/file3.pdf', 2 Union All 
              Select 104, '/home/user/file4.pdf', 3 Union All 
              Select 105, '/home/user/file4.pdf', 2 Union All 
              Select 106, '/home/user/file4.pdf', 1 Union All 
              Select 107, '/home/user/file5.pdf', 4 Union All 
              Select 108, '/home/user/file6.pdf', 0 
           )
    
    --    S Q L :
    Select     tpm.PATH, Max(t.YEAR) as MAX_YEAR, Min(t.YEAR) as MIN_YEAR
    From       tests t 
    Left Join  test_prep_materials tpm ON(tpm.TID = t.ID)
    Group By   tpm.PATH
    HAVING     Max(t.YEAR) < ( Select Max(YEAR) From tests )
    
    /*    R e s u l t : 
    PATH                    MAX_YEAR  MIN_YEAR
    ----------------------  --------  --------
    /home/user/file6.pdf        2023      2023    */
    

    OR with all the data from tests table:

    --    S Q L :
    SELECT  t.*, tpm.PATH
    FROM    tests t
    INNER JOIN ( Select     tpm.PATH, Max(t.YEAR) as MAX_YEAR, Min(t.YEAR) as MIN_YEAR
                 From       tests t 
                 Left Join  test_prep_materials tpm ON(tpm.TID = t.ID)
                 Group By   tpm.PATH
                 HAVING     Max(t.YEAR) < ( Select Max(YEAR) From tests )
              ) tpm ON( t.YEAR Between tpm.MIN_YEAR And tpm.MAX_YEAR )
    Where   t.YEAR < ( Select Max(YEAR) From tests )
    
    /*    R e s u l t :
    ID  YEAR  DESCRIPTION  PATH
    --  ----  -----------  ------------------------
    0   2023  some text 0  /home/user/file6.pdf
    2   2023  some text 2  /home/user/file6.pd      */
    

    NOTE:
    If you want to reffer to any specific year within a range (like 2021 till 2026) change the subquery selecting Max(YEAR) over all with the specific year – lets say 2025 – and the result should be paths used before 2025 and not in 2025. That was not in your sample data, though.

    ...
    HAVING     Max(t.YEAR) <  2025
    ...
    

    This can be tuned even more with different Where clause conditions.

    Login or Signup to reply.
  2. You may use following query, it’ll eliminate the common path across both the years & gives you the path that you can delete.

    select *from test_prep_materials tpm inner join tests t
    on t.id = tpm.tid
    where t.year = '2023'
    and exists (select path from test_prep_materials tpm2 inner join tests t
                    on t.id=tpm.id and t.year = 2024  -- t.year != 2023
                    where tpm2.path =tpm.path
               )
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search