skip to Main Content

For each row in table line there can be many rows in table line_version. I need to find only N (let’s say 2) "latest" rows for each line row. Latest here means the largest IDs.

For example, that’s how line_version table rows look like (line_id references the line table):

  id   | line_id 
-------+---------
 10    |   1
 20    |   1
 30    |   1
 40    |   1
 50    |   2
 60    |   2
 70    |   2

Looking for the 2 latest line_version row IDs per line row, it should be (30, 40, 60, 70).

Globally, I can get 2 latest ones by simply doing:

SELECT id from line_version ORDER BY id DESC LIMIT 2

But how can I get the latest 2 per line_id?

2

Answers


  1. YOu can use the below with condition rownumber>=count1-(N-1).The sql fiddle here.For last row you can use rownumber>=count1.last two rows rownumber>=count1-1,three rows rownumber>=count1-2

    with data as (select 
    
         10 id   ,  1 line_id union all
         select 20    ,   1 union all
         select 30    ,   1 union all
         select 40    ,   1 union all
         select 50    ,  2 union all
         select 60    ,   2 union all
        select  70    ,   2),
        subset as (select a.*,ROW_NUMBER() over(partition by line_id order by id ) rownumber,count(id) over(partition by line_id) count1 from data a)
        select id,line_id from subset where rownumber>=count1-1;
    

    To improve the performance you can do away with the count1 with the below using rownumber<=N.

            with data as (select 
    
         10 id   ,  1 line_id union all
         select 20    ,   1 union all
         select 30    ,   1 union all
         select 40    ,   1 union all
         select 50    ,  2 union all
         select 60    ,   2 union all
        select  70    ,   2),
        subset as (select a.*,ROW_NUMBER() 
        over(partition by line_id order by id desc) 
        rownumber from data a)
        select id,line_id from subset 
        where rownumber<=2 order by line_id,id;
    
    Login or Signup to reply.
  2. If there are substantially more versions than just the two per line, avoid a sequential scan over line_version. This query using a LATERAL subquery does just that, and should be much more efficient:

    SELECT lv.*
    FROM   line l
    CROSS  JOIN LATERAL (
       SELECT lv.id
       FROM   line_version lv
       WHERE  lv.line_id = l.line_id
       ORDER  BY lv.id DESC NULLS LAST
       LIMIT  2  -- your pick!
       ) lv
    -- ORDER BY ???
    

    fiddle

    I added NULLS LAST mainly to defend against possible null values, which you haven’t ruled out. See:

    Be sure to have an index on line_version with line_id as leading index field. Ideally:

    CREATE INDEX line_version_line_id_id_idx ON line_version(line_id DESC NULLS LAST) INCLUDE (id);
    

    See:

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search