Postgresql - UPDATE performance when JOIN produces duplicates rows for columns in SET

potashin
August 11, 2023
228 views
0 votes
2 Answers

Consider the following example:
There are 3 tables: clients (1) -> (n) orders (1) -> (n) order_updates.
I want to update clients based on the corresponding order_updates with JOIN (which produces duplicates for clients data):

UPDATE "clients"
SET "clients"."status" = 'active'
FROM "orders"
JOIN "orders_updates" ON "orders.id" = "order_updates"."order_id"
WHERE "orders"."client_id" = "clients"."id" AND 
      "order_updates"."created_at" > (CURRENT_DATE - INTERVAL '1 day')::date

Is it going to perform the same way (under the hood) as the following statement, which does not produce duplicates for the UPDATE statement?

UPDATE "clients"
SET "clients"."status" = 'active'
WHERE "clients"."id" IN (
  SELECT "orders"."client_id"
  JOIN "orders_updates" ON "orders.id" = "order_updates"."order_id"
  WHERE "orders"."client_id" = "clients"."id" AND 
        "order_updates"."created_at" > (CURRENT_DATE - INTERVAL '1 day')::date
)

Which one is preferable?

Answers

- ValNik
- August 10, 2023 at 3:51 pm
- 0 votes
0
By default, the effectiveness will depend on the available indexes on the tables.

I think the 2nd option is preferable.
Provide the request execution plan(s). They can be analyzed and compared.

Consider another request option.
```
UPDATE "clients"
SET "clients"."status" = 'active'
WHERE EXISTS (
  SELECT "orders"."client_id"
  JOIN "orders_updates" ON "orders.id" = "order_updates"."order_id"
  WHERE "orders"."client_id" = "clients"."id" AND 
        "order_updates"."created_at" > (CURRENT_DATE - INTERVAL '1 day')::date
)
```
Login or Signup to reply.

- mahmoudhossam
- August 11, 2023 at 8:48 pm
- 0 votes
0
Between the two approaches, the second one using the IN clause with a subquery is generally preferable in situations like this. This is because it avoids the possibility of duplicate updates for the same client, which can be both inefficient and potentially lead to incorrect results.

Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.

Postgresql – UPDATE performance when JOIN produces duplicates rows for columns in SET

Answers