Postgresql - Updating all rows but one that have same values on certain columns

FrancisDucharme
August 17, 2023
180 views
0 votes
2 Answers

Given the following schema

CREATE TEMP TABLE tmp_table 
(
    id_bucket int2 NOT NULL, 
    id_property int2 NOT NULL,
    v1 varchar(320) NOT NULL,
    v2 varchar(320) NULL
);

And the following data

id_bucket  id_property v1   v2
------------------------------
1          3           OT   A
1          8           RO   C
1          2           OT   C
1          1           RT   D   
1          5           OT   C
1          4           OT   C

I would like that all rows that have the same values for v1 and v2 (in this case v1 = OT, v2 = C) be updated with a value for v2 of NULL except for one, regardless.

I tried grouping by my criterias, but then I lose the details of the rows in question. I know id_property will probably be useful, but when using GROUP BY’s, I don’t have access to that column.

I’ve been pulling my hair out for a while on this one. Any ideas ?

Tags: postgresql sql

Answers

- AsiSalo
- August 17, 2023 at 8:34 pm
- 0 votes
0
"Except for one".
You didn’t actually specify which one, but if you don’t care, you can do something like:
```
UPDATE tmp_table t
SET v2 = NULL
WHERE (id_bucket, v1, v2) IN (
    SELECT id_bucket, v1, v2
    FROM tmp_table
    GROUP BY id_bucket, v1, v2
    HAVING COUNT(*) > 1
)
AND id_property <> (
    SELECT MIN(id_property)
    FROM tmp_table
    WHERE id_bucket = t.id_bucket AND v1 = t.v1 AND v2 = t.v2
);
```
Login or Signup to reply.

- JensSchauder
- August 17, 2023 at 8:40 pm
- 0 votes
0
Your solution are analytic functions aka window functions.
```
UPDATE tmp_table
SET v2 = NULL
WHERE (id_bucket,
       id_property) in -- this assumes these columns form a unique key
    (SELECT id_bucket,
            id_property
     FROM
       (SELECT id_bucket,
               id_property,
               row_number() OVER (
                  PARTITION BY v1, v2) rn -- numbers rows with identical v1 and v2
                                                
        FROM tmp_table
       ) x
     WHERE rn > 1 
    ); -- removes the first row with identical v1, v2
```
Results in

id_bucket id_property v1 v2

1 3 OT A

1 8 RO C

1 2 OT C

1 1 RT D

1 5 OT null

1 4 OT null

See the DB-fiddle for a full example.

The benefit of a solution based on window functions is that the underlying table gets selected only once, while comparable solutions based on group by in general need to select the table at least a second time. This makes window functions much more efficient.
Of course, that is only really relevant when table get large.
Login or Signup to reply.