
We believe the S3 checksum is useless for large files (around 1 GB or more): for such uploads it is not a hash of the whole file, but a further hash of the hashes of chunks split at an arbitrary number of bytes.

I have a 1 GB file uploaded to AWS S3.
Its SHA256 checksum value is "o9mK1Ay32kIpvW157S40b/2siazR/+tpuz6OYCsjNBU=-2620".

Is there any way to verify that the local file and the file uploaded to S3 are identical in content?
Without downloading it, of course.

I am hoping to use the AWS SDK or CLI to calculate the same checksum value from the local file.

2 Answers


  1. Chosen as BEST ANSWER

    I was able to get the hash value of every chunk (part) with the get-object-attributes command. The byte count of each part is listed explicitly, so I assume the information is complete.

    [cloudshell-user ~]$ aws s3api get-object-attributes --bucket xxx --key "xxx.bin" --object-attributes "Checksum,ObjectParts"
    {
        "LastModified": "2024-07-10T20:56:50+00:00",
        "Checksum": {
            "ChecksumSHA256": "ZQHQKvGsIHCdKb9APtdkmY4RH/FwucuznEoShkrEPsw="
        },
        "ObjectParts": {
            "TotalPartsCount": 3106,
            "PartNumberMarker": 0,
            "NextPartNumberMarker": 1000,
            "MaxParts": 1000,
            "IsTruncated": true,
            "Parts": [
                {
                    "PartNumber": 1,
                    "Size": 5242880,
                    "ChecksumSHA256": "nkyKW7CVp1UjpQX66AHwRp3tMTJmpguNoyz+S5lcwt8="
                },
    ... (rest omitted)
    

  2. From Checking object integrity – Amazon Simple Storage Service:

    Amazon S3 uses checksum values to verify the integrity of data that you upload to or download from Amazon S3. In addition, you can request that another checksum value be calculated for any object that you store in Amazon S3. You can select from one of several checksum algorithms to use when uploading or copying your data. Amazon S3 uses this algorithm to compute an additional checksum value and store it as part of the object metadata.

    When you upload an object, you can optionally include a precalculated checksum as part of your request. Amazon S3 compares the provided checksum to the checksum that it calculates by using your specified algorithm. If the two values don’t match, Amazon S3 reports an error.
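Building on the per-part sizes and checksums that get-object-attributes returns in the first answer, here is a minimal Python sketch of how the comparison could be done without downloading the object. It rests on assumptions that are not confirmed anywhere in this thread: that the object's composite checksum is the base64 SHA-256 of the concatenated raw per-part SHA-256 digests with "-<part count>" appended, and that `s3_client` is a boto3 S3 client (boto3 is not imported here, so any object with a compatible `get_object_attributes` method works). The names `list_parts` and `local_checksums` are made up for illustration.

```python
import base64
import hashlib


def list_parts(s3_client, bucket, key):
    """Collect every part entry, following the IsTruncated / NextPartNumberMarker paging."""
    parts = []
    marker = None
    while True:
        kwargs = {
            "Bucket": bucket,
            "Key": key,
            "ObjectAttributes": ["Checksum", "ObjectParts"],
        }
        if marker is not None:
            kwargs["PartNumberMarker"] = marker
        resp = s3_client.get_object_attributes(**kwargs)
        object_parts = resp.get("ObjectParts", {})
        parts.extend(object_parts.get("Parts", []))
        if not object_parts.get("IsTruncated"):
            return parts
        marker = object_parts["NextPartNumberMarker"]


def _hash_part(f, size, block=1024 * 1024):
    """SHA-256 of the next `size` bytes of f, read in small blocks to bound memory."""
    h = hashlib.sha256()
    remaining = size
    while remaining > 0:
        chunk = f.read(min(block, remaining))
        if not chunk:
            raise ValueError("local file is shorter than the listed parts")
        h.update(chunk)
        remaining -= len(chunk)
    return h.digest()


def local_checksums(path, part_sizes):
    """Hash a local file using the same part boundaries as the upload.

    Returns (per_part, composite): the base64 SHA-256 of each part, and the
    assumed composite form "<base64 of SHA-256 over all part digests>-<count>".
    """
    per_part = []
    combined = hashlib.sha256()
    with open(path, "rb") as f:
        for size in part_sizes:
            digest = _hash_part(f, size)
            per_part.append(base64.b64encode(digest).decode("ascii"))
            combined.update(digest)
        if f.read(1):
            raise ValueError("local file is longer than the listed parts")
    composite = base64.b64encode(combined.digest()).decode("ascii")
    return per_part, f"{composite}-{len(part_sizes)}"
```

A possible way to use it: call `parts = list_parts(boto3.client("s3"), "xxx", "xxx.bin")`, then `per_part, composite = local_checksums("xxx.bin", [p["Size"] for p in parts])`, and compare `per_part` against each part's `ChecksumSHA256`. Comparing part by part also shows which region of the file differs, which the single composite value cannot.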
