PostgreSQL Upsert With Multiple Constraints

adonig
February 14, 2023
134 views
1 vote
2 Answers

I’m trying to do an upsert on a table with two constraints. One is that the column a is unique, the other is that the columns b, c, d and e are unique together. What I don’t want is that a, b, c, d and e are unique together, because that would allow two rows having the same value in column a.

The following fails if the second constraint (unique b, c, d, e) is violated:

INSERT INTO my_table (a, b, c, d, e, f, g)
SELECT (a, b, c, d, e, f, g)
FROM my_temp_table temp
ON CONFLICT (a) DO UPDATE SET
    a=EXCLUDED.a,
    b=EXCLUDED.b,
    c=EXCLUDED.c,
    d=EXCLUDED.d,
    e=EXCLUDED.e,
    f=EXCLUDED.f,
    g=EXCLUDED.g;

The following fails if the first constraint (unique a) is violated:

INSERT INTO my_table (a, b, c, d, e, f, g)
SELECT (a, b, c, d, e, f, g)
FROM my_temp_table temp
ON CONFLICT ON CONSTRAINT my_table_unique_together_b_c_d_e DO UPDATE SET
    a=EXCLUDED.a,
    b=EXCLUDED.b,
    c=EXCLUDED.c,
    d=EXCLUDED.d,
    e=EXCLUDED.e,
    f=EXCLUDED.f,
    g=EXCLUDED.g;

How can I bring those two together? I first tried to define a constraint that says "either a is unique or b, c, d and e are unique together" but it looks like that isn’t possible. I then tried two INSERT statements with WHERE clauses making sure that the other constraint doesn’t get violated, but there is a third case where a row might violate both constraints at the same time. To handle the last case I considered dropping one of the constraints and creating it after the INSERT, but isn’t there a better way to do this?

I tried this, but according to the PostgreSQL documentation it can only DO NOTHING:

INSERT INTO my_table (a, b, c, d, e, f, g)
SELECT (a, b, c, d, e, f, g)
FROM my_temp_table temp
ON CONFLICT DO UPDATE SET
    a=EXCLUDED.a,
    b=EXCLUDED.b,
    c=EXCLUDED.c,
    d=EXCLUDED.d,
    e=EXCLUDED.e,
    f=EXCLUDED.f,
    g=EXCLUDED.g;

I read in another question that it might work using MERGE in PostgreSQL 15 but sadly it’s not available on AWS RDS yet. I need to find a way to do this using PostgreSQL 14.

Answers

Chosen as BEST ANSWER

I tried to use triggers to create a constraint that combines the two requirements (unique a and unique-together b, c, d and e) but it didn't work because "constraint in ON CONFLICT clause has no associated index". I then tried to create an index but I couldn't get it to work. So I played around a bit with BEFORE and AFTER INSERT triggers but found no performant way to get it working. I then came up with a solution that relies on a FULL OUTER JOIN of the table with the new data I want to upsert and the table with the old data. For my use case it is performant enough because my tables only have around 150k rows each and I will switch to a MERGE with PostgreSQL 15 anyway when it becomes available. Note that I had to emulate a full outer join using a union of a left and a right join, because "FULL JOIN is only supported with merge-joinable or hash-joinable join conditions".

    -- Create a full outer join of the new and the old table.
    CREATE TEMPORARY TABLE my_merge_table AS
    SELECT old.id AS old_id, new.id AS new_id
    FROM my_temp_table new
    LEFT JOIN my_table old
    ON new.a = old.a OR (
        new.b = old.b AND
        new.c = old.c AND
        new.d = old.d AND
        new.e = old.e
    )
    UNION ALL
    SELECT old.id AS old_id, new.id AS new_id
    FROM my_temp_table new
    RIGHT JOIN my_table old
    ON new.a = old.a OR (
        new.b = old.b AND
        new.c = old.c AND
        new.d = old.d AND
        new.e = old.e
    )
    WHERE new.id IS NULL;

    -- Insert new rows that don't match old rows. 
    WITH my_merge AS (
        SELECT new_id AS id
        FROM my_merge_table
        WHERE old_id IS NULL 
    )
    INSERT INTO my_table (a, b, c, d, e, f, g)
    SELECT (a, b, c, d, e, f, g)
    FROM my_temp_table} temp
    JOIN my_merge
    ON temp.id = my_merge.id;

    -- Update old rows that match new rows. 
    WITH my_merge AS (
        SELECT *
        FROM my_merge_table
        WHERE new_id IS NOT NULL AND old_id IS NOT NULL 
    )
    UPDATE my_table SET
        a = temp.a,
        b = temp.b,
        c = temp.c,
        d = temp.d,
        e = temp.e,
        f = temp.f,
        g = temp.g,
    FROM my_temp_table temp
    JOIN my_merge
    ON temp.id = my_merge.new_id
    WHERE my_merge.old_id = my_table.id;

    -- Delete old rows that don't match new rows. 
    WITH my_merge AS (
        SELECT old_id AS id
        FROM my_merge_table
        WHERE new_id IS NULL
    )
    DELETE FROM my_table
    USING my_merge
    WHERE my_merge.id = my_table.id;

(Edit)

- Tometzky
- February 15, 2023 at 11:28 am
- 0 votes
0
I think what you need is a somewhat different design. I suppose "a" is a surrogate key and b,c,d,e,f,g make up the natural key. And I suppose there are other columns, that are the data.

So force column "a" to be automatically generated, like this:
```
CREATE TEMP TABLE my_table(
  a bigint GENERATED ALWAYS AS IDENTITY,
  b bigint NOT NULL,
  c bigint NOT NULL,
  d bigint NOT NULL,
  e bigint NOT NULL,
  f bigint NOT NULL,
  g bigint NOT NULL,
  data text,
  CONSTRAINT my_table_unique_together_b_c_d_e UNIQUE (b,c,d,e,f,g)
);
```
And then just skip the a column from your insert:
```
INSERT INTO my_table (b, c, d, e, f, g)
SELECT (b, c, d, e, f, g)
FROM my_temp_table temp
ON CONFLICT ON CONSTRAINT my_table_unique_together_b_c_d_e DO UPDATE SET
  data=EXCLUDED.data;
```
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.