I am collecting a csv file from a server.
On the server the daily file gets rows of data appended to it.
I need to update my data-owner every ten minutes with ONLY the new data (if any).
So I need to collect the csv file, compare it with the previously collected version to see if there are any new lines in it, and then, if there are, send them to the data-owner.
I am using PHP.
How would you go about this?
Would you count the number of rows?
The first field is a timestamp, so could that be useful?
This is what the rows look like:
"2023-08-01 05:54:18","Lolla","[email protected]", .... 20 fields more
"2023-08-01 17:44:27","Dave","[email protected]", .... 20 fields more
"2023-08-01 17:42:23","John","[email protected]", .... 20 fields more
Any suggestions are much appreciated!
2 Answers
If the pair of timestamp and user name is a unique identifier, you can run a join to extract only the unpaired rows.
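Since the question is about PHP, the same unpaired-rows idea can also be sketched there directly. This is only a minimal sketch, assuming the first two fields (timestamp and user name) together identify a row and that both versions of the file are available locally; the function name `unpairedRows` and its arguments are made up for illustration. It does not rely on the rows being in timestamp order.

```php
<?php
// Hypothetical helper: returns the rows of $newFile whose (timestamp, name)
// key does not appear in $oldFile. Assumes fields 1 and 2 identify a row.
function unpairedRows(string $oldFile, string $newFile): array
{
    // Reads a CSV file and yields its non-empty rows as arrays of fields.
    $readRows = function (string $path): Generator {
        $handle = fopen($path, 'r');
        if ($handle === false) {
            throw new RuntimeException("Cannot open $path");
        }
        while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
            if ($row !== [null]) { // fgetcsv returns [null] for a blank line
                yield $row;
            }
        }
        fclose($handle);
    };

    // Join key built from the first two fields.
    $key = fn(array $row): string => $row[0] . "\0" . ($row[1] ?? '');

    // Collect the keys already present in the previously collected file.
    $seen = [];
    foreach ($readRows($oldFile) as $row) {
        $seen[$key($row)] = true;
    }

    // Keep only the rows of the new file whose key was not seen before.
    $unpaired = [];
    foreach ($readRows($newFile) as $row) {
        if (!isset($seen[$key($row)])) {
            $unpaired[] = $row;
        }
    }
    return $unpaired;
}
```

Calling `unpairedRows('previous.csv', 'current.csv')` (both file names are placeholders) would return the new rows as arrays of fields, ready to be sent on to the data-owner.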
If you have, for example,
and then
You can use Miller and run
to get
-j to set fields 1 and 2 as keys
--ur to emit unpaired records from the right file
--np to not emit paired records

A sure way would be to save the last known line, search for it in the current file and get the new content after it:
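A minimal PHP sketch of that idea follows. It assumes the file is append-only, that no field contains an embedded line break, and that the last line seen can be kept in a small state file between runs; the function name `newLinesSince` and both paths are hypothetical. If the remembered line is not found (first run, or a fresh daily file), every line is treated as new.

```php
<?php
// Hypothetical helper: returns the lines of $csvFile that follow the last line
// seen on the previous run, then remembers the current last line in $stateFile.
function newLinesSince(string $csvFile, string $stateFile): array
{
    $lines = file($csvFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        throw new RuntimeException("Cannot read $csvFile");
    }

    // Last line sent on the previous run, or '' if there was no previous run.
    $last = is_file($stateFile) ? (string) file_get_contents($stateFile) : '';

    // Search from the end: in an append-only file the remembered line
    // is normally close to the bottom.
    $pos = -1;
    if ($last !== '') {
        for ($i = count($lines) - 1; $i >= 0; $i--) {
            if ($lines[$i] === $last) {
                $pos = $i;
                break;
            }
        }
    }

    // Everything after the remembered line is new; if it was not found,
    // $pos stays -1 and the whole file is returned.
    $new = array_slice($lines, $pos + 1);

    // Remember the current last line for the next run.
    if ($lines !== []) {
        file_put_contents($stateFile, $lines[count($lines) - 1]);
    }
    return $new;
}
```

Run from a ten-minute cron job, this returns an empty array when nothing has been appended, so the data-owner only needs to be contacted when the result is non-empty.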