I’m creating a download link to an item that is stored on AWS / S3.
While building this link, I’ve confirmed that the data is encoded in UTF-8, but when the user tries to download the file, they get this error whenever the link contains anything other than ASCII characters:
<Error>
<Code>InvalidArgument</Code>
<Message>Header value cannot be represented using ISO-8859-1.</Message>
<ArgumentName>response-content-disposition</ArgumentName>
<ArgumentValue>attachment;filename="今日も僕は 用もないのに.mp3"</ArgumentValue>
<RequestId>12345</RequestId>
<HostId>12345</HostId>
</Error>
//first attempt - whatever encoding it IS, convert it to utf-8
$encoded_file = mb_convert_encoding($original_filename, "UTF-8", mb_detect_encoding($original_filename));
//second attempt - force filename to use html entities
$encoded_file = mb_convert_encoding($original_filename,'HTML-ENTITIES','UTF-8');
$obj_data['ResponseContentDisposition'] = 'attachment;filename="' . $encoded_file . '"';
$cmd = $s3->getCommand('GetObject', $obj_data);
$presign_url_request = $s3->createPresignedRequest($cmd, AWS_PRESIGNED_URL_EXPIRATION);
Forcing the attachment;filename value to use HTML entities works, but it’s really ugly. If I am converting the filename into UTF-8, why am I getting this error from AWS saying the header value cannot be represented using ISO-8859-1?
2 Answers
Unfortunately, HTTP message headers are more restricted than the HTTP message body.
UTF-8 is supported in the message body but not in headers (for historical and technical reasons). PHP's urlencode function is worth a try for header values, though I'm not sure it will improve things.
Allowed characters in HTTP header values
https://stackoverflow.com/a/75998796/8199678
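To see what percent-encoding does to a filename, here is a small comparison of PHP's urlencode() and rawurlencode(). It only illustrates the two built-ins on a sample string chosen here (an assumption, not a name from the question), and says nothing about what S3 will accept.

```php
<?php
// Sample filename (an assumption for illustration): e-acute, a space, ASCII.
$name = "caf\u{00E9} menu.mp3";

// urlencode() follows form encoding: spaces become "+".
echo urlencode($name), PHP_EOL;     // caf%C3%A9+menu.mp3

// rawurlencode() follows RFC 3986: spaces become "%20".
echo rawurlencode($name), PHP_EOL;  // caf%C3%A9%20menu.mp3
```

Both results are plain ASCII, so either fits in a header value; they differ in how a space is written.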
HTTP header values may not contain anything outside ISO-8859-1, so strings in any incompatible encoding must be encoded according to the relevant specification.
In this case, that is RFC 6266.
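One way to produce a value of that kind is the extended filename* parameter, which carries the charset name, an empty language tag, and the percent-encoded bytes. What follows is a minimal sketch rather than the answer's original code: the function name content_disposition_attachment is hypothetical, and it assumes the input is already valid UTF-8.

```php
<?php
/**
 * Build a Content-Disposition value using the RFC 6266 extended parameter.
 * Hypothetical helper; assumes $filename is valid UTF-8.
 */
function content_disposition_attachment(string $filename): string
{
    // rawurlencode() percent-encodes every byte except A-Z a-z 0-9 - _ . ~,
    // so the returned value is pure ASCII.
    return "attachment; filename*=UTF-8''" . rawurlencode($filename);
}

// Shortened form of the filename from the question.
echo content_disposition_attachment("今日.mp3"), PHP_EOL;
// attachment; filename*=UTF-8''%E4%BB%8A%E6%97%A5.mp3
```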
Output:
And you would use it in your code like:
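A minimal sketch of that usage, reusing the $obj_data array and the getCommand / createPresignedRequest calls from the question. The bucket and key values are placeholders, the encoding is written inline with rawurlencode(), and the SDK calls are left as comments so the snippet runs without the AWS SDK installed.

```php
<?php
// Assumed input: a filename already known to be valid UTF-8.
$original_filename = "今日も僕は 用もないのに.mp3";

// Placeholder bucket and key; in the question these are already set in $obj_data.
$obj_data = [
    'Bucket' => 'example-bucket',
    'Key'    => 'example-key',
];

// RFC 6266 extended parameter: charset, empty language tag, percent-encoded bytes.
$obj_data['ResponseContentDisposition'] =
    "attachment; filename*=UTF-8''" . rawurlencode($original_filename);

// With the AWS SDK loaded, the rest is unchanged from the question:
// $cmd = $s3->getCommand('GetObject', $obj_data);
// $presign_url_request = $s3->createPresignedRequest($cmd, AWS_PRESIGNED_URL_EXPIRATION);
```

Because the value now contains only ASCII characters, it can be represented in ISO-8859-1, which is the condition the error message complains about.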
That said, do not rely on mb_detect_encoding(), as it only makes a guess as to what the encoding might be. String encoding is metadata that must be captured alongside the data itself and preserved.
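Rather than guessing, one option is to verify that the bytes are valid in the encoding you expect and fail loudly if they are not. This is a minimal sketch using mb_check_encoding(); the function name require_utf8 is hypothetical, and it assumes UTF-8 is the encoding recorded for your filenames.

```php
<?php
/**
 * Return $value unchanged if it is valid UTF-8, otherwise throw.
 * Hypothetical helper; it validates a known encoding instead of guessing one.
 */
function require_utf8(string $value): string
{
    if (!mb_check_encoding($value, 'UTF-8')) {
        throw new InvalidArgumentException('Filename is not valid UTF-8');
    }
    return $value;
}
```

Calling require_utf8($original_filename) before building the header would surface a mismatched encoding at the point where it can still be traced back to its source.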