skip to Main Content

I have two tables (see below) in a postgreSQL database.
Table1 tracks all changes made to each ID (i.e. multiple rows per ID):

create table table1(id,record,name,data1,data2,data3)as values
 (1,1,'Dave',  1.2,55,1)
,(1,2,'David', 1.2,55,1)
,(2,1,'Sam',    .8,30,2)
,(3,1,'Jenn',   .9,28,2)
,(4,1,'Arthur',1.1,77,1)
,(5,1,'Jim',    .7,42,2)
,(5,2,'Jimmy',  .7,42,2)
,(5,3,'James',  .7,42,2)
,(6,1,'Sue',   1.3,32,2)
,(6,2,'Susan', 1.3,32,2);

Table2 contains only one row per ID:

create table table2(id,record,name,data1,data2,data3)as values
 (1,1,'Dave',  1.2,55,1)
,(2,1,'Sam',    .8,30,2)
,(3,1,'Jenn',   .9,28,2)
,(4,1,'Arthur',1.1,77,1)
,(5,2,'Jimmy',  .7,42,2)
,(6,1,'Sue',   1.3,32,2);

I need an SQL query to update Table2 based on the highest Record value for each ID in Table1 and also updating all other fields as well (data1, data2, data3 in this example).
These tables are just examples but my actual data has many columns, so I would like to use some sort of wild card to include all fields in the update.

Relatively new to SQL in postgreSQL but here is my attempt at getting the most recent record version returned, but not sure how to update Table2 from there. I have had a few attempts but nothing work so probably not worth posting that code.

  select * from
    Table1 t1
  where
    (ID,Record) in 
    (
      select
        ID,
        MAX(Record)
      from
        Table1 t1
      group by
        ID
     )

ExpectedResult:

id record name data1 data2 data3
1 2 David 1.2 55 1
2 1 Sam 0.8 30 2
3 1 Jenn 0.9 28 2
4 1 Arthur 1.1 77 1
5 3 James 0.7 42 2
6 2 Susan 1.3 32 2

fiddle

2

Answers


  1. We get the maximum record for each ID in the CTE (hgh) and use that in the associated UPDATE statement to update the respective columns:

    WITH hgh AS (
        SELECT  ID, MAX (Record) AS max_record
        FROM    Table1
        GROUP BY ID
    )
    UPDATE   Table2 t2
      SET Data1 = t1.Data1,
          Data2 = t1.Data2,
          Data3 = t1.Data3,
          Record = t1.Record
    FROM hgh h
       JOIN Table1 t1
           ON (   t1.ID = h.ID
              AND t1.record = h.max_record
              )
    WHERE  t2.ID = t1.ID
    
    Login or Signup to reply.
  2. The best way to achieve this is with a cte that makes use of distinct on. Something along the following lines:

    with cte as 
    (SELECT DISTINCT ON (id) id, record, name, data1, data2, data3
    from table1 ORDER BY id, record desc)
    UPDATE table2 t
    SET record = cte.record, data1 = cte.data1, data2 = cte.data2, data3 = cte.data3
    FROM cte 
    WHERE cte.id = t.id;
    

    What distinct on gives you, is the ability to limit the resultset to just one line per column (or columns) that you are interested in. The order by, which must start with the columns in the distinct on, determines from the remaining data, what row is returned. Ordering by record descending is therefore the same as getting that row which has the MAX(record) for that id.

    The update simply joins the cte to the target table and updates the values accordingly.

    I am afraid that I know of no shortcut, which will mean that you do not need to specify each column!

    For future reference, please supply sample data by way of insert statements. It saves us having to do the work to test.

    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search