
I have a cron job that moves files from an EC2 instance to S3:

aws s3 mv --recursive localdir s3://bucket-name/ --exclude "*" --include "localdir/*"

After that, I run aws s3 sync s3://bucket-name/data1/ E:Datafolder from a .bat file, scheduled with Windows Task Scheduler.

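The issue is that the s3 sync command copies every file under the data1/ prefix, including files that were already downloaded and have since been removed from the local folder.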

So let’s say I have the following files:

Day1: file1 is synced to local.
Day2: file1 and file2 are synced to local because file1 is removed from the local machine’s folder.

I don’t want the older files to take up space on the local machine. On Day 2, I want only file2 to be copied over.

Can this be accomplished with AWS CLI commands, or do I need to write a Lambda function?

I followed the answer from "Get last modified object from S3 using AWS CLI", but on Windows the | and awk commands do not work as expected.

2 Answers


  1. Chosen as BEST ANSWER

    Modified answer, adapted to run from a Windows .bat file under cmd.exe:

    for /f "delims=" %%i in ('aws s3api list-objects-v2 --bucket BUCKET-NAME --prefix data1/ --query "sort_by(Contents, &LastModified)[-1].Key" --output text') do set object=%%i
    aws s3 cp s3://BUCKET-NAME/%object% E:Datafolder
    

  2. To obtain the name of the object that has the most recent Last Modified date, you can use:

    aws s3api list-objects-v2 --bucket BUCKET-NAME --query 'sort_by(Contents, &LastModified)[-1].Key' --output text
    

    Therefore (using shell syntax), you could use:

    object=`aws s3api list-objects-v2 --bucket BUCKET-NAME --prefix data1/ --query 'sort_by(Contents, &LastModified)[-1].Key' --output text`
    
    aws s3 cp s3://BUCKET-NAME/$object E:Datafolder
    

    You might need to tweak it to get it working on Windows.

    Basically, it gets the bucket listing, sorts by LastModified, then grabs the name of the last object in the list.
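
The same two steps can be wrapped in small shell functions so that an empty prefix is handled instead of passing a bogus key to aws s3 cp. This is only a sketch under assumptions: the function names latest_key and copy_latest are made up for illustration, BUCKET-NAME, data1/ and the destination path are placeholders, and it copies just the single newest object, as in the commands above.

```shell
#!/usr/bin/env bash
# Sketch: copy only the most recently modified object under a prefix.
# latest_key and copy_latest are hypothetical helper names.

# Print the key of the newest object under the prefix; fail if there is none.
latest_key() {
  local bucket="$1" prefix="$2" key
  key=$(aws s3api list-objects-v2 \
    --bucket "$bucket" \
    --prefix "$prefix" \
    --query 'sort_by(Contents, &LastModified)[-1].Key' \
    --output text) || return 1
  # A null result is printed as "None" with --output text; treat it,
  # like an empty string, as "nothing found".
  if [ -z "$key" ] || [ "$key" = "None" ]; then
    return 1
  fi
  printf '%s\n' "$key"
}

# Copy the newest object under the prefix into a local directory.
copy_latest() {
  local bucket="$1" prefix="$2" dest="$3" key
  key=$(latest_key "$bucket" "$prefix") || {
    echo "no objects found under s3://$bucket/$prefix" >&2
    return 1
  }
  aws s3 cp "s3://$bucket/$key" "$dest"
}

# Example call (placeholders, not executed here):
# copy_latest BUCKET-NAME data1/ /path/to/Datafolder
```

Keeping the lookup separate from the copy makes it easy to log or inspect the selected key before anything is downloaded.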
