skip to Main Content

I am using BULK UPSERT to insert record into new table while data of column is being encrypting by python lib CryptoDome.
here is what I done

def bulk_upsert_query(data, table_name=None):
    values_list = []
    add = values_list.append

    for i in data:
        encrypt_user_id = encryption_fnc(str(i.user_id))

        add(f"({i.id},'{encrypt_user_id}','{i.name}')")

    upsert_sql = r"""insert into upsert_test_table 
                    (id,encrypt_user_id,name)
                   values
                   """ + "n, ".join(values_list) + """ ON CONFLICT ON CONSTRAINT 
                  "upsert_test_table_pkey"
                   DO UPDATE SET
                   name=excluded.name,
                   encrypt_user_id=excluded.user_id"""
    return upsert_sql

data = data from my old table that have user_id as int(ex: 124,345,786)
in upsert_test_table table id column is primary key column.
this bulk_upsert_query fnc will encrypt int user_id and append to values_list then create a upsert query.

As per insertion it works as I expected, but when it comes for update if conflict with id column then as you can see I set encrypt_user_id=excluded.user_id for user_id column,
it update existing encrypted user_id with int user_id(from old table), because I did not put any encryption fnc here,
Update: user_id column is changeable

So what I want is I want to call my encryption_fnc in DO UPDATE SET section,
while it update I also want to do encryption.

Can any one tell me who can I achieve this?

Note: I can do encryption in database site by using pgcrypto but my requirement is do encryption in python side not database side.

2

Answers


  1. Chosen as BEST ANSWER

    Solved: Sorry I was stupid, I thought excluded table is original(input payload) table, but when I replaced excluded.user_id with excluded.encrypt_user_id and it works, although have no idea how excluded table works.

    upsert_sql = r"""insert into upsert_test_table 
                        (id,encrypt_user_id,name)
                       values
                       """ + "n, ".join(values_list) + """ ON CONFLICT ON CONSTRAINT 
                      "upsert_test_table_pkey"
                       DO UPDATE SET
                       name=excluded.name,
                       encrypt_user_id=excluded.encrypt_user_id"""
    

  2. As per the comment:
    there is no user_id column in that insert payload, so excluded.user_id is null.

    ON CONFLICT...DO UPDATE SET encrypt_user_id=excluded.user_id is inserting a null rather than the old, unencrypted user_id: demo

    CREATE TABLE upsert_test_table 
      (id SMALLSERIAL,
      encrypt_user_id TEXT,
      name TEXT,
      user_id INT,
      CONSTRAINT "upsert_test_table_pkey" PRIMARY KEY (id));
    INSERT INTO upsert_test_table VALUES 
      (1,'enc1','Bob',11);
    
    INSERT INTO upsert_test_table (id,encrypt_user_id,name) VALUES
      (1,'enc101','Bob2'),--conflict here
      (2,'enc2','Ted')
    ON CONFLICT ON CONSTRAINT "upsert_test_table_pkey"
    DO UPDATE SET 
      name = excluded.name,              --name will be overwritten by incoming data
      encrypt_user_id = excluded.user_id;--null because user_id not in incoming data
    
    SELECT * FROM upsert_test_table;
    -- id | encrypt_user_id | name | user_id
    ------+-----------------+------+---------
    --  1 |                 | Bob2 |      11
    --  2 | enc2            | Ted  |
    

    From the documentation:

    The SET and WHERE clauses in ON CONFLICT DO UPDATE have access to the existing row using the table’s name (or an alias), and to the row proposed for insertion using the special excluded table.

    The issue here was that excluded was misunderstood and instead used as the preexisting data found already in the table upon conflict, while it’s actually the data that was coming in, as the payload of that insert.

    It should be enough to correct that one misinterpretation and refer to new/incoming/inserted value as excluded, old/existing by the target table name and column:

    upsert_sql = r"""insert into upsert_test_table 
                    (id,encrypt_user_id,name)
                   values
                   """ + "n, ".join(values_list) + """ ON CONFLICT ON CONSTRAINT 
                  "upsert_test_table_pkey"
                   DO UPDATE SET
                   name = upsert_test_table.name,             --keep the old name
                   encrypt_user_id = excluded.encrypt_user_id --overwrite with new
                  """
    
    Login or Signup to reply.
Please signup or login to give your own answer.
Back To Top
Search