(disclaimer: this is my first time posting a question on SO, so I apologize in advance if I did anything wrong)

I have a URI pointing to an image, with this structure:

(stuff…)/acryagl_violencia física/(more stuff…).jpg

I tried to encode it but I get two different results in two different script files and I don’t see the reason why.

// Script one:
`stuff.../${ encodeURIComponent(element.article_id_thumbnail) }/...stuff`
// I get 'acryagl_violencia%20fi%CC%81sica', which does NOT work

// Script two (and Chrome console):
`stuff.../${ encodeURIComponent(element.id) }/...stuff`
// I get 'acryagl_violencia%20f%C3%ADsica', which DOES work

// Notice the difference is on the 'í' from 'física'

According to https://www.url-encode-decode.com/, both strings decode to the same text, which seems weird to me. I am totally lost on this one.

In case it helps, this is a React + Vite project, although I don’t see how this could be related to the bundler. I am also testing everything on Chrome.

I fixed it by manually encoding the í character, but there should be a better fix.
Has anyone faced this problem before?

2 Answers


  1. String.prototype.normalize to the rescue:

    encodeURIComponent(element.article_id_thumbnail.normalize())

    and

    encodeURIComponent(element.id.normalize())

    should give you the result you are looking for.

    const a = encodeURIComponent(decodeURIComponent("acryagl_violencia%20fi%CC%81sica").normalize())
    const b = encodeURIComponent(decodeURIComponent("acryagl_violencia%20f%C3%ADsica").normalize())
    console.log(a === b) // true
  2. The code works fine; it’s the source data that seems to be inconsistent.

    You might need to call normalize somewhere in your content handling pipeline to make the two values consistent.

    console.log(
     decodeURI('%C3%AD').normalize('NFKD')
     ===
     decodeURI('i%CC%81')
    ); // true
    // both are two (same) codepoints;
    // first was decomposed from single codepoint
    
    console.log(
     decodeURI('%C3%AD')
     ===
     decodeURI('i%CC%81').normalize('NFKC')
    ); // true
    // both are same single codepoint;
    // second was composed into it from two codepoints
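
To see why the two encodings differ, it may help to look at the code points directly. This is only a sketch: `codePoints` and `encodeSegment` are hypothetical helper names made up for illustration, the two literals stand in for `element.article_id_thumbnail` and `element.id`, and choosing NFC is an assumption based on the question saying that the `%C3%AD` form is the one that works.

```javascript
// Hypothetical helper: list the Unicode code points of a string.
const codePoints = (s) =>
  Array.from(s, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );

// Stand-ins for the two source values (assumed, not taken from the real data).
const decomposed = "fi\u0301sica"; // 'i' followed by a combining acute accent
const precomposed = "f\u00EDsica"; // single precomposed 'í'

console.log(decomposed === precomposed);          // false
console.log(codePoints(decomposed).slice(1, 3));  // [ 'U+0069', 'U+0301' ]
console.log(codePoints(precomposed).slice(1, 2)); // [ 'U+00ED' ]

// Hypothetical helper: normalize to NFC before encoding, so both
// spellings produce the same percent-encoded path segment.
const encodeSegment = (segment) =>
  encodeURIComponent(segment.normalize("NFC"));

console.log(encodeSegment("acryagl_violencia " + decomposed));
// acryagl_violencia%20f%C3%ADsica
console.log(encodeSegment("acryagl_violencia " + precomposed));
// acryagl_violencia%20f%C3%ADsica
```

Calling `normalize()` with no argument also uses NFC, so this matches the first answer. Spelling the form out just makes the intent explicit.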