
I’m using MinIO storage, but I’m having trouble understanding how the ETag is generated when the fileSize is equal to the partSize.

To illustrate, I created three zero-filled files using the following commands:

dd if=/dev/zero of=5242879 bs=1 count=5242879
dd if=/dev/zero of=5242880 bs=1 count=5242880
dd if=/dev/zero of=5242881 bs=1 count=5242881

I manually set the partSize to 5242880 using the upload parameters.

When I upload these files to MinIO via my application, here’s what I get:

Result:

fileSize  expected                            received
5242879   7c668eb59d6f0141a7863774100bfbcc    7c668eb59d6f0141a7863774100bfbcc
5242880   81485c0e873d222199469076b60f30e9-1  1faefa44d632065e8bd4fd2a462ed292-2
5242881   92f3a08aa3b1d7eb318ab9c2fc4a6ec3-2  92f3a08aa3b1d7eb318ab9c2fc4a6ec3-2

Can you help me understand what I might be missing here?

2 Answers


  1. This is a client-side issue. The default part size for mc is 16 MiB, so a 16 MiB file is uploaded as a single part:

    $ truncate -s 16M /tmp/16m

    $ ~/mc cp /tmp/16m play/test/123
    /tmp/16m: 16.00 MiB / 16.00 MiB ━━━━━━ 2.67 MiB/s 5s

    $ ~/mc stat play/test/123
    Name : 123
    Date : 2023-09-12 12:04:35 PDT
    Size : 16 MiB
    ETag : 3692065fc617ab1acea6dc7e886dc48a-1
    Type : file
    Metadata :
    Content-Type: application/octet-stream

    I also ran the following with mc, setting the part size to 5242880 bytes and uploading a file of exactly that size:

    $ truncate -s 5242880 /tmp/testobject

    $ MC_UPLOAD_MULTIPART_SIZE=5242880 ~/mc cp /tmp/testobject play/test/
    /tmp/testobject: 5.00 MiB / 5.00 MiB ━━━━━━━ 2.67 MiB/s 1s

    $ ~/mc stat play/test/testobject
    Name : testobject
    Date : 2023-09-12 13:03:12 PDT
    Size : 5.0 MiB
    ETag : 81485c0e873d222199469076b60f30e9-1
    Type : file
    Metadata :
    Content-Type: application/octet-stream

    The resulting ETag matches your expected value, so please check your client-side code or tool and fix it there.

  2. This looks like an issue with the way your application makes its requests. I can’t reproduce your results using the CLI.


    Let’s create a test file:

    (test-user) ✗ dd if=/dev/zero of=1000-bytes-file bs=1 count=1000
    1000+0 records in
    1000+0 records out
    1000 bytes transferred in 0.002021 secs (494805 bytes/sec)
    
    (test-user) ✗ md5 1000-bytes-file
    MD5 (1000-bytes-file) = ede3d3b685b4e137ba4cb2521329a75e
    

    Amazon S3:

    (test-user) ✗ aws s3api create-multipart-upload --bucket test-bucket --key 1000-bytes-file --no-cli-pager
    {
        "ServerSideEncryption": "AES256",
        "Bucket": "test-bucket",
        "Key": "1000-bytes-file",
        "UploadId": "xxxx"
    }
    
    (test-user) ✗ aws s3api upload-part --bucket test-bucket  --key 1000-bytes-file --part-number 1 --upload-id xxxx --body 1000-bytes-file --no-cli-pager
    {
        "ServerSideEncryption": "AES256",
        "ETag": "\"ede3d3b685b4e137ba4cb2521329a75e\""
    }
    
    (test-user) ✗ cat multipartupload-struct.json
    {
      "Parts": [
        {
          "ETag": "ede3d3b685b4e137ba4cb2521329a75e",
          "PartNumber": 1
        }
      ]
    }
    
    (test-user) ✗ aws s3api complete-multipart-upload --multipart-upload file://multipartupload-struct.json --bucket test-bucket --key 1000-bytes-file --upload-id xxxx --no-cli-pager
    {
        "ServerSideEncryption": "AES256",
        "Location": "https://test-bucket.s3.eu-west-1.amazonaws.com/1000-bytes-file",
        "Bucket": "test-bucket",
        "Key": "1000-bytes-file",
        "ETag": "\"c019643e056d8d687086c1e125f66ad8-1\""
    }
    
    
    (test-user) ✗ echo 'ede3d3b685b4e137ba4cb2521329a75e' | xxd -r -p | md5
    c019643e056d8d687086c1e125f66ad8
    

    Same goes for MinIO, using the same file:

    (test-user) ✗ md5 1000-bytes-file
    MD5 (1000-bytes-file) = ede3d3b685b4e137ba4cb2521329a75e
    
    (test-user) ✗ aws s3api --endpoint-url http://127.0.0.1:9000 create-multipart-upload --bucket test-bucket --key 1000-bytes-file --no-cli-pager
    {
        "Bucket": "test-bucket",
        "Key": "1000-bytes-file",
        "UploadId": "xxxx"
    }
    
    (test-user) ✗ aws s3api --endpoint-url http://127.0.0.1:9000 upload-part --bucket test-bucket  --key 1000-bytes-file --part-number 1 --upload-id xxxx --body 1000-bytes-file --no-cli-pager
    {
        "ETag": "\"ede3d3b685b4e137ba4cb2521329a75e\""
    }
    
    (test-user) ✗ cat multipartupload-struct.json
    {
      "Parts": [
        {
          "ETag": "ede3d3b685b4e137ba4cb2521329a75e",
          "PartNumber": 1
        }
      ]
    }
    
    (test-user) ✗ aws s3api --endpoint-url http://127.0.0.1:9000 complete-multipart-upload --multipart-upload file://multipartupload-struct.json --bucket test-bucket --key 1000-bytes-file --upload-id xxxx --no-cli-pager
    {
        "Location": "http://127.0.0.1:9000/test-bucket/1000-bytes-file",
        "Bucket": "test-bucket",
        "Key": "1000-bytes-file",
        "ETag": "\"c019643e056d8d687086c1e125f66ad8-1\""
    }
    
    
    (test-user) ✗ echo 'ede3d3b685b4e137ba4cb2521329a75e' | xxd -r -p | md5
    c019643e056d8d687086c1e125f66ad8
    

    In both cases, we get the expected ETag for a 1-part multipart upload: the MD5 of the binary MD5 digest of the single part, followed by -1.
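
The calculation in the transcripts above can be sketched in Python. This is a minimal sketch, assuming the S3-style convention the outputs suggest: the ETag is the MD5 of the concatenated binary MD5 digests of the parts, followed by `-` and the part count. `split_parts` and `multipart_etag` are hypothetical helper names, not part of any SDK. The last call illustrates one possible cause of a `-2` suffix when the file size equals the part size, namely a client that sends a trailing zero-length part; that is an assumption about the asker's application, not something the transcripts confirm.

```python
import hashlib
from typing import List


def split_parts(data: bytes, part_size: int) -> List[bytes]:
    """Split data into consecutive chunks of at most part_size bytes."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    chunks = [data[i:i + part_size] for i in range(0, len(data), part_size)]
    # An empty payload is treated here as one empty part (an assumption).
    return chunks or [b""]


def multipart_etag(parts: List[bytes]) -> str:
    """MD5 over the concatenated binary MD5 digests, plus '-<part count>'."""
    digests = b"".join(hashlib.md5(p).digest() for p in parts)
    return f"{hashlib.md5(digests).hexdigest()}-{len(parts)}"


if __name__ == "__main__":
    data = bytes(1000)  # 1000 zero bytes, like the dd test file above

    # Plain MD5 of the file, comparable to the `md5 1000-bytes-file` output.
    print(hashlib.md5(data).hexdigest())

    # One-part multipart ETag, comparable to the complete-multipart-upload output.
    print(multipart_etag(split_parts(data, 5242880)))

    # Hypothetical client bug: an extra zero-length part is appended,
    # which changes both the hash and the suffix (-2 instead of -1).
    print(multipart_etag([data, b""]))
```

Running the same functions over a 5242880-byte zero-filled file, once as a single part and once with an appended empty part, gives two candidate ETags to compare against the "expected" and "received" values in the question.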
