
I am collecting a CSV file from a server.
On the server, the daily file gets rows of data appended to it.

I need to update my data-owner every ten minutes with ONLY the new data (if any).

So I need to collect the CSV file, compare it with the previously collected version to see whether it contains any new lines, and, if it does, send them to the data-owner.

I am using PHP.
How would you go about this?
Would you count the number of rows?
The first field is a timestamp, so could that be useful?

This is what the rows look like:

"2023-08-01 05:54:18","Lolla","[email protected]", .... 20 fields more
"2023-08-01 17:44:27","Dave","[email protected]", .... 20 fields more
"2023-08-01 17:42:23","John","[email protected]", .... 20 fields more

Any suggestions much appreciated!

2 Answers


  1. If the pair of timestamp and user name is a unique identifier, you can run a join to extract only the unpaired rows.

    For example, if you first have

    "2023-08-01 05:54:18","Lolla"
    "2023-08-01 17:44:27","Dave"
    "2023-08-01 17:42:23","John"
    

    and then

    "2023-08-01 05:54:18","Lolla"
    "2023-08-01 17:44:27","Dave"
    "2023-08-01 17:42:23","John"
    "2023-08-01 17:42:28","Pam"
    "2023-08-01 19:38:15","John"
    

    You can use Miller and run

    mlr --csv -N join --ur --np -j 1,2 -f first.csv second.csv
    

    to get

    2023-08-01 17:42:28,Pam
    2023-08-01 19:38:15,John
    
    • -j 1,2 to use fields 1 and 2 as the join keys
    • --ur to emit unpaired records from the right file (second.csv; the left file is the one given with -f)
    • --np to not emit paired records
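
    Since the question asks for PHP, the same unpaired-rows idea could be sketched without Miller roughly as follows. This is only an illustration under the same assumption (fields 1 and 2 together identify a row); the function name unpaired_rows and the file names are hypothetical.

    ```php
    <?php
    // Hypothetical helper: returns the rows of $newFile whose (field 1, field 2)
    // key does not occur in $oldFile. Assumes that pair is unique per row.
    function unpaired_rows(string $oldFile, string $newFile): array
    {
        $seen = [];
        // A missing old file (first run) simply means every row is new.
        $fp = @fopen($oldFile, 'r');
        if ($fp !== false) {
            while (($row = fgetcsv($fp, 0, ',', '"', '\\')) !== false) {
                if ($row === [null]) {
                    continue; // fgetcsv returns [null] for a blank line
                }
                $seen[$row[0] . "\0" . ($row[1] ?? '')] = true;
            }
            fclose($fp);
        }

        $fp = fopen($newFile, 'r');
        if ($fp === false) {
            throw new RuntimeException("Cannot open $newFile");
        }
        $new = [];
        while (($row = fgetcsv($fp, 0, ',', '"', '\\')) !== false) {
            if ($row === [null]) {
                continue;
            }
            $key = $row[0] . "\0" . ($row[1] ?? '');
            if (!isset($seen[$key])) {
                $new[] = $row; // unpaired: only present in the new file
            }
        }
        fclose($fp);
        return $new;
    }
    ```

    Called as unpaired_rows('first.csv', 'second.csv') on the two samples above, this should return the Pam row and the second John row as arrays of fields. Unlike a line-count comparison, it does not depend on the rows being in the same order in both files.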
  2. A sure way would be to save the last known line, search for it in the current file and get the new content after it:

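    <?php

    $file = '/some/watched/file';          // the freshly collected CSV
    $tail = './file_with_the_last_line';   // holds the last line seen on the previous run

    // Last line from the previous run (false on the first run, when $tail does not exist yet).
    $last_line = @file_get_contents($tail);

    // Work on a copy of the watched file.
    copy($file, $tail);

    $fp = fopen($tail, 'r+');

    if ( $last_line !== false ) {
        // Read forward until the previously saved line is found.
        do {
            $line = fgets($fp);
            $found = ($line === $last_line);
        } while ( !$found && $line !== false );
        // Not found: treat the whole file as new.
        if ( !$found ) {
            rewind($fp);
        }
    }

    // Everything after the saved line is new.
    $new_content = stream_get_contents($fp);

    if ( $new_content !== "" ) {
        // Find the newline that precedes the last line, skipping the trailing newline.
        $pos = strrpos($new_content, "\n", -2);
        $last_line = false === $pos ? $new_content : substr($new_content, $pos+1);
    }

    // Replace the copy with just the last line, ready for the next run.
    ftruncate($fp, 0);
    rewind($fp);
    fwrite($fp, (string) $last_line);
    fclose($fp);

    # NOW DO WHAT YOU WANT WITH $new_content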
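
    A different technique from the one above, offered only as a sketch: because the file is append-only, you could remember how many bytes were already processed instead of the last line. The function name read_appended and the separate state file are assumptions made for this example, not part of the answer's code.

    ```php
    <?php
    // Hypothetical helper: returns the complete lines appended to $file since
    // the previous call. $stateFile stores the byte offset already processed.
    function read_appended(string $file, string $stateFile): string
    {
        $offset = is_file($stateFile) ? (int) file_get_contents($stateFile) : 0;

        $fp = fopen($file, 'r');
        if ($fp === false) {
            throw new RuntimeException("Cannot open $file");
        }
        $size = fstat($fp)['size'];
        if ($offset > $size) {
            $offset = 0; // file shrank: assume a new daily file has started
        }
        fseek($fp, $offset);
        $data = stream_get_contents($fp);
        fclose($fp);

        // Keep only complete lines; a half-written last row waits for the next run.
        $end = strrpos($data, "\n");
        $data = ($end === false) ? '' : substr($data, 0, $end + 1);

        file_put_contents($stateFile, (string) ($offset + strlen($data)));
        return $data;
    }
    ```

    Each call to read_appended($file, './offset_state') would then return only the rows added since the previous call, or an empty string if nothing changed. One caveat of this sketch: a new daily file is only detected when it is shorter than the stored offset, so resetting the state file at the day change may be safer.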
    