Can someone explain why the first snippet doesn’t work and the second does? The only difference is where the string is split.
Doesn’t work:
var text = '{"data":[' +
'{"name": "Gersch\u00e4ftsf\u00f' +
'chrer"}' +
']}';
document.getElementById("test").innerHTML = text;
<p id="test"></p>
This works fine:
var text = '{"data":[' +
'{"name": "Gersch\u00e4ftsf\u00fc' +
'hrer"}' +
']}';
document.getElementById("test").innerHTML = text;
<p id="test"></p>
I can get it working if I pass the character ü not as \u00fc but as %C3%BC.
4 Answers
Escape sequences (such as \u00fc) are converted to characters when the string literal is parsed (i.e. converted from source code to a string).

If you put half of an escape sequence in one string literal and half in another string literal (and then attempt to join the two parts together with the + operator), then the escape sequence is incomplete at the time JavaScript tries to convert it into a character.

The console error is telling you exactly the problem:
This is an invalid Unicode escape sequence: \u00f
But this one is valid: \u00fc
The rest of the string, such as how you split it, is irrelevant. The syntax of the Unicode escape must be correct within the string literal that uses it. This literal is invalid: '{"name": "Gersch\u00e4ftsf\u00f'
So any JavaScript code which tries to use this literal will fail because of that invalid escape syntax. It doesn’t matter that the code was going to append another string literal to this one: since this string literal is broken, it can’t be used for that or any other purpose.
Just as in the working version, split the string in such a way that each individual string literal is a valid string literal.
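To see that the failure happens at parse time, before any concatenation runs, here is a minimal sketch. It assumes an environment where `new Function` is permitted (no restrictive Content Security Policy), and the helper name `parses` is hypothetical, introduced only for this illustration.

```javascript
// Hypothetical helper: reports whether `source` compiles as JavaScript.
// new Function only compiles the body here; the body is never executed.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e; // anything else (e.g. a CSP restriction) is not what we are testing
  }
}

// The backslashes are doubled so that the escape reaches the inner parser
// as the six (or five) characters \u00fc (or \u00f), not as a character.
const broken = "var s = 'f\\u00f' + 'chrer';"; // escape cut off after three hex digits
const valid = "var s = 'f\\u00fc' + 'hrer';";  // escape complete inside one literal

console.log(parses(broken), parses(valid));
```

The first source string is rejected as a whole, so the `+` in it is never evaluated; the second compiles because each literal is valid on its own.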
The issue is that you’re splitting the string in the middle of a Unicode escape sequence. When you use the + operator with strings, you concatenate them, but Unicode escape sequences are resolved before concatenation.

So, in your first code, it’s trying to resolve the sequence \u00f, which is impossible because a Unicode escape sequence needs exactly four hexadecimal digits after \u.

In your second code, you use the full sequence \u00fc, which is correct.

Because the Unicode escape sequence for "ü" is \u00fc and you can’t split an escape sequence across string literals.
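If a long literal has to be broken near a non-ASCII character, there are a few ways to do it safely. The sketch below is illustrative only; the variable names are made up for the example. It also suggests why the %C3%BC form from the question tolerates being split: percent-encoding is plain ASCII text that is decoded at run time, after concatenation, rather than by the parser.

```javascript
// 1. Split after the complete escape, as in the working version.
const splitAfterEscape = 'Gersch\u00e4ftsf\u00fc' + 'hrer';

// 2. Build the characters at run time from their code units.
const fromCodeUnits =
  'Gersch' + String.fromCharCode(0x00e4) + 'ftsf' + String.fromCharCode(0x00fc) + 'hrer';

// 3. Percent-encoded UTF-8 is ordinary text until decodeURIComponent runs,
//    so it can be split anywhere, even inside %BC.
const fromPercent = decodeURIComponent('Gersch%C3%A4ftsf%C3%B' + 'Chrer');

// 4. For JSON text, double the backslash so the JavaScript parser leaves the
//    escape alone; JSON.parse resolves it after the pieces are joined.
const fromJson = JSON.parse('{"name": "Gersch\\u00e4ftsf\\u00f' + 'chrer"}').name;

console.log(splitAfterEscape, fromCodeUnits, fromPercent, fromJson);
```

All four variables end up holding the same 16-character string.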