Idea: need to run a daily cron job that upsert+delete a bunch of data to/from a target table.
The data is in a txt file; it contains the newest version of the data that the target table should eventually have. Therefore size of data in the txt file is similar to the size of the table, because the txt file is a mirror of the table but with some changes.
Let’s say the target table has this following data:
And the txt file has the following data:
In this case we should delete row 0 from target table because it doesn’t exist in txt file (newest data) anymore, and update row 1 because his job has changed, and insert row 3 because it’s new data that doesn’t exist in table but exist in txt file.
Essentially I wanna merge (upsert+delete) data from txt file into target table.
Assuming the target table has 1 million rows of data, the txt file also has approximate number of rows. How to efficiently merge data from txt file to target table?
Currently I am reading data from txt file in a pandas dataframe, and use insert on conflict update
to do upsert, and use a delete
statement to remove rows from table that doesn’t exist in dataframe. But it’s doing a million row update, which isn’t ideal. How to efficient update the table so that only the updated/new data get updated/inserted in the table? So if only 10 rows are updated, we should only update those 10 rows.
2
Answers
One way to do this is to use the
WHERE
clause ofDO UPDATE
:The other way is to use the predefined
suppress_redundant_updates_trigger()
trigger function.The first solution will probably perform better, since it avoids calling a trigger function, but the second solution will be simpler.
Your description basically describes replacing everything in the
target_table
with the contents of thetxt file
. The quickest way may be to literally do that. Something like:You will need to translate the above to your specific script/language and probably use an absolute path to the file. See example here. Since I cannot issue
copy
in a fiddle I substitute text_table fortxt file
. ?