I want to generate text bytes in various charsets (encodings), such as ISO-8859-1, Big5, UTF-8, UTF-16, etc., mostly for testing purposes (i.e. to make sure my app script correctly handles bytes provided in these charsets). Ideally it would look something like:
new TextEncoder(mycharset).encode(mystring)
Unfortunately, TextEncoder only supports converting a string into UTF-8 bytes, unlike TextDecoder, which supports converting bytes in various charsets into a string.
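To make the asymmetry concrete, here is a minimal sketch, assuming an environment with the standard TextEncoder and TextDecoder globals (a modern browser or a recent Node.js). The sample character and variable names are only for illustration.

```javascript
// TextEncoder always produces UTF-8; its `encoding` property reports this.
const encoder = new TextEncoder();
const utf8Bytes = encoder.encode("é"); // U+00E9 becomes two UTF-8 bytes

// TextDecoder, by contrast, accepts an encoding label.
// Here the two bytes are "é" encoded as UTF-16LE.
const utf16leBytes = new Uint8Array([0xe9, 0x00]);
const decoded = new TextDecoder("utf-16le").decode(utf16leBytes);

console.log(encoder.encoding, Array.from(utf8Bytes), decoded);
```

So decoding from a non-UTF-8 charset works natively, while the reverse direction has no native counterpart.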
The behavior of TextEncoder is defined by the spec and is unlikely to change in the future. Does anyone know why the two classes behave inconsistently? And is there a way to do the job (without manually providing a conversion table/map for every target charset)?
TO REVIEWER: This issue is NOT a duplicate of the related question, which asks why the parameter of TextEncoder does not take effect, and the answer says that TextEncoder does not take a parameter by the spec.
This question specifically asks:
- why the spec specifies that TextEncoder does not take a parameter, which is inconsistent with TextDecoder
- HOW to perform the conversion described in this question
Neither is asked or explained in the related question and answer.
2 Answers
Thanks to @Kaiido for the idea. After some tests I finally found a way to do it natively:
According to this issue, the rationale for keeping only UTF-8 is that all the various Web APIs only accept UTF-8 as input, so there should be no need to produce other encodings for the Web APIs. This makes the API a lot simpler. Also, there is a goal to make everything UTF-8, so not providing tools to produce non-UTF-8 encodings makes sense for the Web API. However, it’s clear this goal isn’t reached yet, and thus it makes sense to have a decoder in the platform.
As for how to perform such encodings, you can use the polyfill made by Joshua Bell, which does have a NONSTANDARD_allowLegacyEncoding option. To use the polyfill on modern browsers, you’d need to nullify the browser’s TextEncoder though. Here I copy a snippet from a previous answer of mine:
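As a separate illustration (this is not the polyfill snippet), encodings whose mapping is purely arithmetic, such as UTF-16LE, can be produced by hand without any conversion table and then checked against the native TextDecoder. A minimal sketch follows. encodeUtf16le is a hypothetical helper written for this example, not part of any API, and it assumes the input contains no lone surrogates.

```javascript
// Hypothetical helper: encode a JS string as UTF-16LE bytes.
// JS strings are already sequences of UTF-16 code units, so each unit
// is simply split into a low byte followed by a high byte.
function encodeUtf16le(text) {
  const bytes = new Uint8Array(text.length * 2);
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    bytes[2 * i] = unit & 0xff; // low byte first (little endian)
    bytes[2 * i + 1] = unit >> 8; // then the high byte
  }
  return bytes;
}

const sample = "héllo, 世界";
const utf16Bytes = encodeUtf16le(sample);

// Round-trip through the native decoder to verify the bytes.
const roundTripped = new TextDecoder("utf-16le").decode(utf16Bytes);
console.log(roundTripped === sample);
```

This approach does not extend to table-driven charsets such as Big5, which is where the polyfill remains necessary.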