I’m trying to add a lifecycle rule that deletes all objects in an S3 bucket after a certain number of days, but I get the following error when I run my code. Could this be caused by the target prefix being empty?

Error:

com.amazonaws.services.s3.model.AmazonS3Exception: The XML you provided was not well-formed or did not validate against our published schema (Service: Amazon S3; Status Code: 400; Error Code: MalformedXML; Request ID: 4FR7C3BE85YVEW57; S3 Extended Request ID: fcYaw7u//7o843GjDtGGIQRjYxAMbn7f1iepEIas/Yt5bybM9BjDZ0JbG+SVz/vvE1k/KjaKadQ=; Proxy: null), S3 Extended Request ID: fcYaw7u//7o843GjDtGGIQRjYxAMbn7f1iepEIas/Yt5bybM9BjDZ0JbG+SVz/vvE1k/KjaKadQ=

Code used to create the lifecycle rule and apply it to the bucket:

BucketLifecycleConfiguration.Rule rule = new BucketLifecycleConfiguration.Rule()
        .withId("Delete objects in " + expirationInDays + " days")
        .withFilter(new LifecycleFilter(new LifecyclePrefixPredicate("")))
        .withExpirationInDays(expirationInDays)
        .withExpiredObjectDeleteMarker(true)
        .withStatus(BucketLifecycleConfiguration.ENABLED);

configuration = new BucketLifecycleConfiguration()
        .withRules(Collections.singletonList(rule));
s3Client.setBucketLifecycleConfiguration(bucketName, configuration);

2 Answers


  1. I’d recommend offloading the management of object expiry to AWS by using the built-in S3 feature called lifecycle policies. You can use lifecycle policies to:

    • Move objects from the Standard tier to less expensive tiers such as Glacier or Standard-IA
    • Most importantly, delete objects after a certain number of days

    You can implement this with any IaC tool, but for the sake of sharing an example, here is a Terraform version:

    resource "aws_s3_bucket" "bucket" {
      bucket = "my-bucket"
    }
    
    resource "aws_s3_bucket_acl" "bucket_acl" {
      bucket = aws_s3_bucket.bucket.id
      acl    = "private"
    }
    
    resource "aws_s3_bucket_lifecycle_configuration" "bucket-config" {
      bucket = aws_s3_bucket.bucket.id
    
      rule {
        id = "log"
    
        # Empty filter: apply the rule to every object in the bucket.
        filter {}
    
        expiration {
          days = 90
        }
    
        status = "Enabled"
      }
    }
    

    By using the built-in expiry policies of S3, you get the following advantages:

    • Save development and testing time spent on expiring objects
    • You can be confident that objects actually expire
    • Less code means less to maintain
    • Faster time to market
    • More time to spend building more important features for your project
  2. I was able to get this to work with the current Java SDK v2, as follows. The code creates a rule with a default, empty filter instead of a prefix filter on "". I’d assume that you can do the same with SDK v1 (though ideally you would no longer be using SDK v1, since v2 has superseded it).

    import software.amazon.awssdk.regions.Region;
    import software.amazon.awssdk.services.s3.S3Client;
    import software.amazon.awssdk.services.s3.model.BucketLifecycleConfiguration;
    import software.amazon.awssdk.services.s3.model.ExpirationStatus;
    import software.amazon.awssdk.services.s3.model.LifecycleRuleFilter;
    import software.amazon.awssdk.services.s3.model.LifecycleRule;
    import software.amazon.awssdk.services.s3.model.LifecycleExpiration;
    import software.amazon.awssdk.services.s3.model.S3Exception;
    import software.amazon.awssdk.services.s3.model.PutBucketLifecycleConfigurationRequest;
    
    import java.util.Collections;
    
    public class App {
    
        public static void setLifecycle(S3Client s3, String bucketName) {
    
            try {
                // Create a rule to delete all objects after 30 days.
                LifecycleRule rule = LifecycleRule.builder()
                        .id("Delete after 30 days rule")
                        .filter(LifecycleRuleFilter.builder().build())
                        .expiration(LifecycleExpiration.builder().days(30).build())
                        .status(ExpirationStatus.ENABLED)
                        .build();
    
                BucketLifecycleConfiguration lifecycleConfiguration = BucketLifecycleConfiguration.builder()
                        .rules(Collections.singletonList(rule))
                        .build();
    
                PutBucketLifecycleConfigurationRequest request = PutBucketLifecycleConfigurationRequest.builder()
                        .bucket(bucketName)
                        .lifecycleConfiguration(lifecycleConfiguration)
                        .build();
    
                s3.putBucketLifecycleConfiguration(request);
            } catch (S3Exception e) {
                System.err.println(e.awsErrorDetails().errorMessage());
                System.exit(1);
            }
        }
    
        public static void main(String[] args) {
            Region region = Region.US_EAST_1;
            S3Client s3 = S3Client.builder().region(region).build();
            setLifecycle(s3, "mybucket");
            s3.close();
        }
    }
    

    I came to this conclusion because, after manually configuring a similar lifecycle rule in the AWS Console, the AWS CLI’s description of that lifecycle configuration looked like this:

    aws s3api get-bucket-lifecycle-configuration --bucket mybucket
    {
        "Rules": [
            {
                "Expiration": {
                    "Days": 30
                },
                "ID": "del30",
                "Filter": {},
                "Status": "Enabled"
            }
        ]
    }
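
    Since the error is MalformedXML, it can help to see roughly what XML body the JSON above corresponds to. The following is a minimal sketch using only the JDK, not code from any AWS SDK: the class and method names (LifecycleXmlSketch, expireAllAfterDays, escapeXml) are hypothetical, and the element names and namespace are taken from the S3 PutBucketLifecycleConfiguration API reference. The SDKs generate their own XML, so treat this as an illustration of the shape rather than the exact bytes sent.

```java
import java.util.Objects;

/**
 * Hypothetical helper (not part of any AWS SDK): renders an XML body of the
 * shape documented for S3's PutBucketLifecycleConfiguration API, for a single
 * rule that expires every object in the bucket after a number of days.
 */
final class LifecycleXmlSketch {

    private LifecycleXmlSketch() {}

    /** Escapes the five XML special characters so a rule ID cannot break the document. */
    static String escapeXml(String text) {
        return text.replace("&", "&amp;")   // must run first, before other entities are added
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;")
                   .replace("'", "&apos;");
    }

    /** Builds the XML for one enabled rule with an empty filter and a Days-based expiration. */
    static String expireAllAfterDays(String ruleId, int days) {
        Objects.requireNonNull(ruleId, "ruleId");
        if (days <= 0) {
            // The API reference requires Days to be a non-zero positive integer.
            throw new IllegalArgumentException("days must be positive");
        }
        return "<LifecycleConfiguration xmlns=\"http://s3.amazonaws.com/doc/2006-03-01/\">"
             + "<Rule>"
             + "<Expiration><Days>" + days + "</Days></Expiration>"
             + "<ID>" + escapeXml(ruleId) + "</ID>"
             + "<Filter></Filter>" // empty filter, matching "Filter": {} in the CLI output
             + "<Status>Enabled</Status>"
             + "</Rule>"
             + "</LifecycleConfiguration>";
    }

    public static void main(String[] args) {
        // Same rule as the CLI output above: ID "del30", expire after 30 days.
        System.out.println(expireAllAfterDays("del30", 30));
    }
}
```

    Note that the Expiration element here contains only Days. The rule in the question also calls withExpiredObjectDeleteMarker(true) alongside withExpirationInDays(...), and the S3 API reference states that ExpiredObjectDeleteMarker cannot be specified together with Days or Date, so that combination may have contributed to the MalformedXML error as well. The SDK v2 code above leaves it out. In SDK v1, the closest equivalent of the empty filter is presumably a LifecycleFilter constructed without a predicate; that is an assumption and has not been tested here.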
    