At the moment my program looks like this:

let final = text;
  const divElement = document.createElement('div');
  // eslint-disable-next-line max-len
  const linkRegExp = /\b(((http(s)?:\/\/)([\w-]{1,32}(\.|:)[\w-]{1,32}))|([\w-]{1,32}(@)[\w-]{1,32}(\.)[\w-]{1,32})|([\w-]{1,32}(\.)[A-Za-z]{1,32}))\b/gi;

  function replacer(url) {
    if (url.match(/\S+@\S+\.\S+/ig)) {
      const email = document.createElement('a');
      email.innerHTML = url;
      email.href = `mailto:${url}`;
      email.setAttribute('class', 'email');

      return `${email.outerHTML}`;
    }
    const link = document.createElement('span');
    link.innerHTML = url;
    link.setAttribute('class', 'link');
    link.setAttribute('style', 'color: blue; cursor: pointer');

    return `${link.outerHTML}`;
  }

  final = final.replace(RegExp(linkRegExp), replacer);

Unfortunately, replacing [\w-] with [\wА-Яа-я-] or [\p{L}\d_-] did not give positive results, and adding the u flag at the end of the expression causes an error.

2 Answers


  1. Chosen as BEST ANSWER
    /(?<=^|\W)((http|https):\/\/)?(www\.)?([A-Za-zА-Яа-я0-9]{1}[A-Za-zА-Яа-я0-9\-@]*\.?)*\.{1}[A-Za-zА-Яа-я0-9-]{2,8}(\/([\w#!:.?+=&%@!\-\/])*)?(?=$|\W)/gi
    

    Here is what I consider the ideal solution to this problem. Thanks, neural network!


  2. Latin and Cyrillic characters (Russian uses Cyrillic) are only two subsets of all characters.
    First, be precise about what you need: if it is enough to check for, say, Cyrillic versus anything else, then it is only a two-way distinction.
    Next, decide how to proceed with mixed text, i.e. Cyrillic and non-Cyrillic characters in the same text.
    I guess it would be best for you to treat any text with at least one Cyrillic character as Cyrillic and anything else as Latin (though it could just as well be, e.g., Arabic).
    There are different ways to check strings in JavaScript, but since you asked for a regular expression, here is how to match any Cyrillic character with a regex:

    const cyrillicRegex = /\p{Script_Extensions=Cyrillic}/u

    Of course, you can also use \p{Script_Extensions=Cyrillic} in character classes or with quantifiers. Unicode property escapes like this one require the u flag; they were standardized in ES2018 and are supported in all current major browsers.
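As a rough sketch of both ideas (the "at least one Cyrillic character" check, and property escapes inside a character class with a quantifier), something like the following could work. The names `hasCyrillic`, `hostLikeRegex` and `findHosts` are made up for illustration, and the pattern is deliberately much simpler than the one in the question. It uses lookarounds in place of `\b`, because in JavaScript `\b` only treats ASCII letters, digits and the underscore as word characters, so it never sees a boundary around a Cyrillic word.

```javascript
// Hypothetical helper: true if the text contains at least one Cyrillic character.
const cyrillicRegex = /\p{Script_Extensions=Cyrillic}/u;

function hasCyrillic(text) {
  return cyrillicRegex.test(text);
}

// Property escapes inside character classes, combined with quantifiers:
// 1-32 letters (any script), digits or hyphens, a dot, then 2-32 letters.
// The lookbehind/lookahead stand in for \b, which is ASCII-only.
const hostLikeRegex =
  /(?<![\p{L}\p{N}_-])[\p{L}\p{N}-]{1,32}\.\p{L}{2,32}(?![\p{L}\p{N}_-])/gu;

// Hypothetical helper: all host-like substrings in the text (empty array if none).
function findHosts(text) {
  return text.match(hostLikeRegex) || [];
}

console.log(hasCyrillic('привет'));                      // true
console.log(findHosts('сайт пример.рф и example.com'));  // [ 'пример.рф', 'example.com' ]
```

This only covers bare `name.tld` strings. Schemes, e-mail addresses and ports would need extra branches, as in the original pattern.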

    Btw: the regex in your current program uses many unnecessary capturing groups. For example, http(s)? can simply be written as https?
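To illustrate the point about capturing groups, here is a small sketch comparing the URL branch of the question's pattern with an equivalent version that captures nothing. The names `withGroups`, `withoutGroups` and `fullMatch` are hypothetical, and only this one branch is shown, not the whole expression.

```javascript
// URL branch in the question's style: every part is wrapped in a capturing group.
const withGroups = /(http(s)?:\/\/)([\w-]{1,32}(\.|:)[\w-]{1,32})/i;

// Same matches without captures: https? replaces http(s)?, [.:] replaces (\.|:).
const withoutGroups = /https?:\/\/[\w-]{1,32}[.:][\w-]{1,32}/i;

// Hypothetical helper: the full matched text, or null if there is no match.
function fullMatch(regex, text) {
  const m = text.match(regex);
  return m ? m[0] : null;
}

console.log(fullMatch(withGroups, 'see https://example.com now'));    // https://example.com
console.log(fullMatch(withoutGroups, 'see https://example.com now')); // https://example.com
```

Since the replacer in the question only uses the full match, dropping the groups should not change its behavior. If a group is needed purely for alternation, a non-capturing `(?:...)` does the same job.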
