Javascript - Read encoded url - PhpOut

BartVanSchil
July 6, 2023
270 views
2 votes
4 Answers

Can someone explain why the first snippet doesn’t work and the second does? The only difference is where the string is split.

Doesn’t work:

var text = '{"data":[' +
  '{"name": "Gersch\u00e4ftsf\u00f' +
  'chrer"}' +
  ']}';
document.getElementById("test").innerHTML = text;

<p id="test"></p>

This works fine:

var text = '{"data":[' +
  '{"name": "Gersch\u00e4ftsf\u00fc' +
  'hrer"}' +
  ']}';
document.getElementById("test").innerHTML = text;

<p id="test"></p>

It also works if I pass the character ü as %C3%BC instead of \u00fc.

Tags: javascript

Answers

- Quentin
- July 6, 2023 at 4:46 pm
- 0 votes
Escape sequences (such as \u00fc) are converted to characters when the string literal is parsed (i.e. converted from source code to a string).

If you put half of an escape sequence in one string literal and half in another string literal (and then attempt to join the two parts together with the + operator) then the escape sequence is incomplete at the time JavaScript tries to convert it into a character.
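As a rough illustration of the point above, the sketch below checks that a complete escape becomes a single character, and that a literal holding only part of an escape does not compile. The helper name `compiles` is made up for this example; it uses `new Function` only so the syntax error can be caught at run time instead of breaking the whole script.

```javascript
// A complete escape is turned into one character while the literal is parsed.
const whole = 'f\u00fchrer';
console.log(whole.length); // 6
console.log(whole === 'f' + String.fromCharCode(0x00fc) + 'hrer'); // true

// Hypothetical helper: reports whether a piece of source text compiles.
// The source is passed as a string so that a syntax error can be caught here.
function compiles(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) {
      return false;
    }
    throw err;
  }
}

// The doubled backslash puts a single backslash into the source text.
console.log(compiles('return "\\u00fc";'));       // true
console.log(compiles('return "\\u00f" + "c";'));  // false
```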


- David
- July 6, 2023 at 4:46 pm
- 0 votes
The console error is telling you exactly the problem:

Invalid Unicode escape sequence

This is an invalid Unicode escape sequence:
```
\u00f
```
But this one is valid:
```
\u00fc
```
The rest of the string, such as how you split it, is irrelevant. The escape sequence must be complete within the string literal that contains it. This literal is invalid:
```
'{"name": "Gersch\u00e4ftsf\u00f'
```
So any JavaScript code that contains this literal fails to parse because of the invalid escape. It doesn’t matter that the code was going to append another string literal to it: a broken string literal can’t be used for that or any other purpose.

Just as in the working version, split the string in such a way that each individual string literal is a valid string literal.
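A minimal sketch of that advice, reusing the string from the question: every literal ends after a complete escape, so each one parses on its own. The `JSON.parse` step is an addition for illustration only, assuming the text is meant to be consumed as JSON.

```javascript
// Each piece is a valid string literal: no \uXXXX escape is cut in half.
const text = '{"data":[' +
  '{"name": "Gersch\u00e4ftsf\u00fc' +
  'hrer"}' +
  ']}';

// By the time the pieces are joined, the escapes are already real characters,
// so the JSON text contains the letters themselves.
const parsed = JSON.parse(text);
console.log(parsed.data[0].name);        // Gerschäftsführer
console.log(parsed.data[0].name.length); // 16
```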

- MichaelM
- July 6, 2023 at 4:47 pm
- 0 votes
The issue is that you’re splitting the string in the middle of a Unicode escape sequence. When you use the + operator with strings, you concatenate them, but Unicode escape sequences are resolved before concatenation.

So, in your first code, it’s trying to resolve the sequence \u00f, which is impossible because a \u escape needs exactly four hexadecimal digits.

In your second code, the full sequence \u00fc sits inside a single literal, which is correct.
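To make the ordering visible, here is a small sketch (not from the original code) showing that concatenation never re-reads its result for escapes, and why the percent-encoded form mentioned in the question behaves differently: %C3%BC is ordinary text to the parser and only becomes ü when `decodeURIComponent` is called at run time.

```javascript
// Escaping the backslash keeps both pieces as plain characters.
// Joining them gives the six characters \u00fc, not the letter ü.
const joined = '\\u00f' + 'c';
console.log(joined);        // \u00fc
console.log(joined.length); // 6

// Percent-encoding is decoded at run time, after concatenation,
// so it can be split anywhere between the literals.
const decoded = decodeURIComponent('%C3' + '%BC');
console.log(decoded === '\u00fc'); // true
```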


- MarioPH
- July 6, 2023 at 4:50 pm
- 0 votes
Because the Unicode escape sequence for "ü" is \u00fc, and you can’t split an escape sequence across two string literals.
