I need to update each row with the CSV filename and the time the data was inserted into the database.
I can use the below code to insert data from CSV into the database:
    with open('my.csv', 'r') as f:
        next(f)  # skip the header row
        cur.copy_from(f, 'csv_import', sep=',')
But for my requirement, along with the CSV data, there are 2 more columns which need to be populated:
- filename of the CSV file
- timestamp when the data was loaded
How can we achieve this?
3 Answers
The timestamp values can be filled in with a trigger, and the CSV filename would need to be supplied from the operating system. On Linux, that step could be added to the batch file or cron schedule that runs the load.
Trigger :
and command line to be executed on the file location :
If the file is not on the database server, you can install the psql client for your current PostgreSQL version and run the command line from whatever operating system is being used.
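A minimal sketch of the trigger idea, driven from Python rather than from a psql command line. It is an illustration under stated assumptions, not a definitive implementation: the columns `loaded_at` and `filename`, the trigger and function names, and the helpers `install_trigger` and `load_and_tag` are all hypothetical, and `cur` is assumed to be an open psycopg2 cursor.

```python
import os

# Hypothetical trigger function: stamps each incoming row with the load time.
CREATE_FUNCTION_SQL = """
CREATE OR REPLACE FUNCTION set_loaded_at() RETURNS trigger AS $$
BEGIN
    NEW.loaded_at := now();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql
"""

# EXECUTE FUNCTION needs PostgreSQL 11+; older versions use EXECUTE PROCEDURE.
CREATE_TRIGGER_SQL = """
CREATE TRIGGER csv_import_set_loaded_at
    BEFORE INSERT ON csv_import
    FOR EACH ROW EXECUTE FUNCTION set_loaded_at()
"""


def install_trigger(cur):
    """One-time setup: create the trigger function and attach the trigger."""
    cur.execute(CREATE_FUNCTION_SQL)
    cur.execute(CREATE_TRIGGER_SQL)


def load_and_tag(cur, path, data_columns):
    """Copy the CSV rows, then tag the rows that have no filename yet."""
    with open(path, "r") as f:
        next(f)  # skip the header row
        # Name the CSV columns explicitly so the two extra columns are left out.
        cur.copy_from(f, "csv_import", sep=",", columns=data_columns)
    # Assumes rows from earlier loads already have a filename set.
    cur.execute(
        "UPDATE csv_import SET filename = %s WHERE filename IS NULL",
        (os.path.basename(path),),
    )
```

Row-level `BEFORE INSERT` triggers also fire for rows loaded through `COPY`, so the timestamp is set without any change to the copy call itself.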
Setup:
Code to add file name and timestamp:
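One way this could look in Python is sketched below, as an illustration rather than a definitive implementation. It appends the filename and a load timestamp to every row in memory, then copies the result with psycopg2's `copy_expert`. The helper names `augment_csv` and `load_csv` are hypothetical, and the sketch assumes the two extra columns are the last two columns of `csv_import`.

```python
import csv
import io
from datetime import datetime, timezone


def augment_csv(src, filename, loaded_at):
    """Return an in-memory CSV with filename and load time appended to each data row."""
    out = io.StringIO()
    reader = csv.reader(src)
    writer = csv.writer(out, lineterminator="\n")
    next(reader, None)  # skip the header row, as the question's code does
    for row in reader:
        writer.writerow(row + [filename, loaded_at.isoformat()])
    out.seek(0)  # rewind so the buffer can be read from the start
    return out


def load_csv(cur, path):
    """Copy the augmented rows into csv_import (cur is an open psycopg2 cursor)."""
    with open(path, "r", newline="") as f:
        buf = augment_csv(f, path, datetime.now(timezone.utc))
    # Assumption: filename and timestamp are the last two columns of csv_import.
    cur.copy_expert("COPY csv_import FROM STDIN WITH (FORMAT csv)", buf)
```

Using `FORMAT csv` lets PostgreSQL handle quoted fields, which the plain text format used by `copy_from` does not.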
Sorry, gonna stick to pseudo-code (hopefully got the right raw sql syntax, don’t usually write sql raw).
What I have often seen done is to use 2 tables: one a staging (import) table, one the real table.
So…
Don’t try to get fancy with * on the right side to skip the field list. That depends on column order in the db, which may change with ALTER TABLEs later. Been there, done that, picking up after someone did just that.
This supports higher volumes easily and does not clutter your db with triggers. You can also reuse the staging table to get feeds from elsewhere, say an email gateway.
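The staging-table approach could be sketched roughly as follows. This is a hedged illustration, not the answerer's exact code: the staging table name `csv_staging`, the columns `filename` and `loaded_at`, and the helpers `build_insert_sql` and `load_via_staging` are hypothetical, `csv_import` is taken to be the real table, and `cur` is assumed to be an open psycopg2 cursor.

```python
def build_insert_sql(columns):
    """Build the staging-to-real INSERT with an explicit column list on both sides."""
    # Column names are interpolated directly, so they must come from trusted code,
    # never from user input.
    col_list = ", ".join(columns)
    return (
        f"INSERT INTO csv_import ({col_list}, filename, loaded_at) "
        f"SELECT {col_list}, %s, now() FROM csv_staging"
    )


def load_via_staging(cur, path, columns):
    """Load the CSV into the staging table, then move it into the real table."""
    cur.execute("TRUNCATE csv_staging")  # start from an empty staging table
    with open(path, "r") as f:
        next(f)  # skip the header row
        cur.copy_from(f, "csv_staging", sep=",", columns=columns)
    # The filename is passed as a query parameter; now() supplies the load time.
    cur.execute(build_insert_sql(columns), (path,))
```

Listing the columns explicitly on both sides keeps the statement independent of the physical column order in either table.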