I’m using MinIO storage, but I’m having trouble understanding how the ETag is generated when the fileSize is equal to the partSize.
To illustrate, I created three zero-filled files using the following commands:
dd if=/dev/zero of=5242879 bs=1 count=5242879
dd if=/dev/zero of=5242880 bs=1 count=5242880
dd if=/dev/zero of=5242881 bs=1 count=5242881
I manually set the partSize to 5242880 using the upload parameters.
When I upload these files to MinIO via my application, here’s what I get:
| result | fileSize | expected ETag | received ETag |
|---|---|---|---|
| ✅ | 5242879 | 7c668eb59d6f0141a7863774100bfbcc | 7c668eb59d6f0141a7863774100bfbcc |
| ❌ | 5242880 | 81485c0e873d222199469076b60f30e9-1 | 1faefa44d632065e8bd4fd2a462ed292-2 |
| ✅ | 5242881 | 92f3a08aa3b1d7eb318ab9c2fc4a6ec3-2 | 92f3a08aa3b1d7eb318ab9c2fc4a6ec3-2 |
Can you help me understand what I might be missing here?
2 Answers
This is a client-side issue. The default part size for mc is 16 MiB:
$ truncate -s 16M /tmp/16m
$ ~/mc cp /tmp/16m play/test/123
/tmp/16m: 16.00 MiB / 16.00 MiB ━━━━━━ 2.67 MiB/s 5s
$ ~/mc stat play/test/123
Name : 123
Date : 2023-09-12 12:04:35 PDT
Size : 16 MiB
ETag : 3692065fc617ab1acea6dc7e886dc48a-1
Type : file
Metadata :
Content-Type: application/octet-stream
I also did the following using mc, with the part size set to 5242880:
$ truncate -s 5242880 /tmp/testobject
$ MC_UPLOAD_MULTIPART_SIZE=5242880 ~/mc cp /tmp/testobject play/test/
/tmp/testobject: 5.00 MiB / 5.00 MiB ━━━━━━━ 2.67 MiB/s 1s
$ ~/mc stat play/test/testobject
Name : testobject
Date : 2023-09-12 13:03:12 PDT
Size : 5.0 MiB
ETag : 81485c0e873d222199469076b60f30e9-1
Type : file
Metadata :
Content-Type: application/octet-stream
Please check your client-side code or tool and fix it.
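One possible client-side cause, offered only as a guess since the application code is not shown, is an off-by-one in how the number of parts is computed. The sketch below compares a hypothetical buggy formula with ceiling division; both function names are made up for illustration. The buggy variant yields 1, 2 and 2 parts for the three file sizes in the question, which matches the part-count suffixes the application received.

```python
def part_count_buggy(file_size: int, part_size: int) -> int:
    # Hypothetical bug: always adds one part after the full ones,
    # so an exact multiple of part_size gets an extra, empty part.
    return file_size // part_size + 1


def part_count(file_size: int, part_size: int) -> int:
    # Ceiling division, with at least one part for an empty file.
    return max(1, -(-file_size // part_size))


if __name__ == "__main__":
    part_size = 5242880
    for size in (5242879, 5242880, 5242881):
        print(size, part_count_buggy(size, part_size), part_count(size, part_size))
```

If the application uses something like the first formula, switching to ceiling division should make the 5242880-byte file upload as a single part.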
This seems like an issue with the way your application is making the requests – I can’t reproduce your results using the CLI.
Let’s create a test file:
Amazon S3:
Same goes for MinIO, using the same file:
In both cases, we get the expected ETag value for a 1-part multipart upload, based on the MD5 hash of the file.
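For reference, here is a minimal sketch of the ETag scheme commonly observed on S3-compatible stores for unencrypted objects. It assumes a plain PUT yields the MD5 of the body, and a multipart upload yields the MD5 of the concatenated binary MD5 digests of the parts, followed by `-` and the part count. The function names are illustrative and are not part of any SDK.

```python
import hashlib


def split_parts(data: bytes, part_size: int) -> list[bytes]:
    # Cut the payload into part_size chunks; an empty payload is one empty part.
    return [data[i:i + part_size] for i in range(0, len(data), part_size)] or [b""]


def single_put_etag(data: bytes) -> str:
    # Assumed ETag of a plain (non-multipart) upload: MD5 of the body.
    return hashlib.md5(data).hexdigest()


def multipart_etag(parts: list[bytes]) -> str:
    # Assumed multipart ETag: MD5 over the concatenated per-part MD5 digests,
    # suffixed with the number of parts.
    digests = b"".join(hashlib.md5(p).digest() for p in parts)
    return f"{hashlib.md5(digests).hexdigest()}-{len(parts)}"


if __name__ == "__main__":
    part_size = 5242880
    data = bytes(5242880)  # zero-filled, like the dd-generated test file
    print(multipart_etag(split_parts(data, part_size)))
```

Under this scheme the suffix comes directly from the number of parts the client sends, so a `-2` ETag for a file that fits in one part points to the client sending a second part. Running the sketch on the zero-filled test files is a quick way to compare against the values in the question.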