
I have an S3 bucket which, due to a client decision, has neither versioning nor lifecycle rules set, and it contains data as old as 10 years. However, we also want to keep a backup of the files that have been worked on in the last 30 days.

I am planning to create a new S3 bucket, turn on versioning, and set a lifecycle rule to delete files older than 30 days. After that, I will run a cron job that does an aws s3 sync from the source bucket to the destination bucket.
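
Roughly, this is the destination-bucket setup I have in mind (a sketch with boto3; the bucket name is a placeholder):

```python
import boto3

s3 = boto3.client("s3")
DEST = "my-backup-bucket"  # placeholder name

# Turn on versioning for the backup bucket.
s3.put_bucket_versioning(
    Bucket=DEST,
    VersioningConfiguration={"Status": "Enabled"},
)

# Expire current objects after 30 days and clean up old versions as well.
s3.put_bucket_lifecycle_configuration(
    Bucket=DEST,
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-after-30-days",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # applies to the whole bucket
                "Expiration": {"Days": 30},
                "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
            }
        ]
    },
)
```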

So files older than 30 days will get deleted from the destination bucket, which is fine. However, my concern is that the next aws s3 sync run will copy those deleted old files from the source back to the destination. Is that correct? If so, how do I resolve this and keep only the last 30 days of files?

2 Answers


  1. This isn’t possible with aws s3 sync, but it is a perfect use case for Amazon S3 Event Notifications:

    You can use the Amazon S3 Event Notifications feature to receive notifications when certain events happen in your S3 bucket. To enable notifications, add a notification configuration that identifies the events that you want Amazon S3 to publish. Make sure that it also identifies the destinations where you want Amazon S3 to send the notifications. You store this configuration in the notification subresource that’s associated with a bucket.

    https://docs.aws.amazon.com/AmazonS3/latest/userguide/EventNotifications.html
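
    For example, attaching such a notification with boto3 might look like this (the bucket name and function ARN are placeholders, and the Lambda must separately grant S3 permission to invoke it):

    ```python
    import boto3

    s3 = boto3.client("s3")

    # Placeholder values; substitute the real bucket name and function ARN.
    s3.put_bucket_notification_configuration(
        Bucket="bucket-a",
        NotificationConfiguration={
            "LambdaFunctionConfigurations": [
                {
                    "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:copy-to-backup",
                    "Events": ["s3:ObjectCreated:*"],
                }
            ]
        },
    )
    ```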

    The idea would be to have an event notification fire on new file creation or modification in bucket A, and have it trigger a Lambda function that copies the file from bucket A to bucket B, as sketched below.
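
    A minimal sketch of such a function in Python (boto3), assuming a placeholder destination bucket name:

    ```python
    import urllib.parse

    import boto3

    s3 = boto3.client("s3")
    DESTINATION_BUCKET = "bucket-b"  # placeholder

    def lambda_handler(event, context):
        """Copy each newly created object from the source bucket to the backup bucket."""
        for record in event["Records"]:
            source_bucket = record["s3"]["bucket"]["name"]
            # Keys in S3 event records are URL-encoded (spaces arrive as '+').
            key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
            s3.copy_object(
                Bucket=DESTINATION_BUCKET,
                Key=key,
                CopySource={"Bucket": source_bucket, "Key": key},
            )
    ```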

  2. You could build your own replication.

    • Create an Amazon S3 Event notification on the source bucket that triggers an AWS Lambda function whenever a new object is created
    • Code the AWS Lambda function to copy the object to the other bucket

    This way, you will not require Versioning, and you can also add logic to only copy objects under particular paths or with particular extensions (e.g. just .csv files; see the filter sketch below).

    The code is very simple, see: AWS-Lambda function (Python) to copy file from S3 – perform manipulation – store output in another S3 – Stack Overflow
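
    To illustrate the filtering idea, here is a hypothetical check the Lambda handler could run before copying (the prefix and extension are made-up examples):

    ```python
    def should_copy(key: str, prefix: str = "reports/", extension: str = ".csv") -> bool:
        """Copy only objects under a given prefix that have a given extension."""
        return key.startswith(prefix) and key.endswith(extension)

    # e.g. the handler would skip anything outside reports/*.csv:
    assert should_copy("reports/2024/sales.csv")
    assert not should_copy("logs/app.log")
    ```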
