
I need to test whether a string contains a particular word or letter, using a JavaScript regex.

Sample strings would be:

// In the first 3 strings, the letter "C" should be detected as a standalone word
C is language is required.     
We need a C language dev.
Looking for a dev who knows C!

// Keyword is Artificial Intelligence
We need looking for someone who knows Artificial Intelligence.

To check the above, I created a regex.

['C', 'Artificial Intelligence', 'D', 'Angular', 'JS'].forEach((item) => {
 const baseRex = /[!,.?": ]?/g;
 const finalRex = new RegExp(baseRex.source + item + baseRex.source); // /[!,.?": ]?<C/D/Angular...>[!,.?": ]?/

 // Consider only the first iteration, where item is 'C'.
 console.log(finalRex.test('C is required')); // true
 console.log(finalRex.test('Looking for a dev who knows C!')); // true
 console.log(finalRex.test('We need a C language dev.')); // true
 console.log(finalRex.test('Computer needed')); // also returns true, which is wrong!

});

I don’t want words that merely contain the letter C to be counted as matches.

2

Answers


  1. The regex after the concatenation with the baseRex is:

    [!,.?": ]?C[!,.?": ]?
    

    Notice that [!,.?": ]? can match 0 or 1 characters. In Computer, both [!,.?": ]? subpatterns match 0 characters, and C matches C, so the whole regex matches.
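
    To see this concretely, here is a minimal sketch that runs the concatenated pattern against Computer needed and prints what was actually matched. The variable names loose and match are only illustrative.

    ```javascript
    // The pattern produced by the concatenation when item is 'C'
    const loose = new RegExp('[!,.?": ]?C[!,.?": ]?');

    const match = loose.exec('Computer needed');
    console.log(match[0]);    // the matched text is just "C"
    console.log(match.index); // 0, i.e. the C at the start of "Computer"
    ```

    Both optional character classes matched nothing, so nothing forces the C to stand alone.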

    Presumably, you added ? so that the regex works at the start and end of the string, where there are no characters to match. However, you should use ^ and $ for the start and end instead. Your whole regex should be:

    (?:[!,.?": ]|^)C(?:[!,.?": ]|$)
    

    You can also replace the character class with \W, which means [^0-9a-zA-Z_].
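
    As a rough sketch of that pattern built dynamically, assuming the \W variant: escapeRegExp and buildDelimitedRegex are hypothetical helper names, not part of the question. Escaping matters as soon as a keyword contains regex metacharacters.

    ```javascript
    // Hypothetical helper: escape regex metacharacters inside a keyword
    function escapeRegExp(text) {
      return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
    }

    // Builds (?:\W|^)keyword(?:\W|$): the keyword must be preceded by a
    // non-word character or the start of the string, and followed by a
    // non-word character or the end of the string.
    function buildDelimitedRegex(keyword) {
      return new RegExp('(?:\\W|^)' + escapeRegExp(keyword) + '(?:\\W|$)');
    }

    const cRegex = buildDelimitedRegex('C');
    console.log(cRegex.test('C is required'));                  // true
    console.log(cRegex.test('Looking for a dev who knows C!')); // true
    console.log(cRegex.test('We need a C language dev.'));      // true
    console.log(cRegex.test('Computer needed'));                // false
    ```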

    In fact, you don’t actually need to do all of this! There is a useful zero-width matcher called “word boundary”, \b, which seems to be exactly what you want here. Your base regex can just be:

    \b
    

    It matches only at the boundary between a \w and a \W (in either order), and at the start or end of the string when the adjacent character is a \w.
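
    A minimal sketch of the \b version, applied to the keyword list from the question. The helpers escapeRegExp and containsKeyword are hypothetical names. Note the backslash has to be doubled inside a string passed to new RegExp.

    ```javascript
    // Hypothetical helper: escape regex metacharacters inside a keyword
    function escapeRegExp(text) {
      return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
    }

    // True when the keyword appears in the text as a whole word or phrase
    function containsKeyword(text, keyword) {
      const rex = new RegExp('\\b' + escapeRegExp(keyword) + '\\b');
      return rex.test(text);
    }

    const keywords = ['C', 'Artificial Intelligence', 'D', 'Angular', 'JS'];
    const sample = 'We need a C language dev.';

    // Only 'C' occurs as a whole word in the sample sentence
    console.log(keywords.filter((k) => containsKeyword(sample, k))); // [ 'C' ]
    console.log(containsKeyword('Computer needed', 'C'));            // false
    ```

    One caveat: \b only behaves this way when the keyword itself starts and ends with a word character. A keyword such as C++ ends in a non-word character, so a trailing \b would not match before a space or punctuation.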

  2. For the single keyword C

    input:

    C is language is required.     
    We need a C language dev.
    Looking for a dev who knows C!
    Computer needed
    invalidC should not match
    
    • js regex: (?<!\w)C(?!\w)
    • match result:
      • Chrome: matches only the standalone C in the first three lines
      • Safari: lookbehind is not supported in older versions

    Extended to match either C or Artificial Intelligence

    input:

    C is language is required.     
    We need a C language dev.
    Looking for a dev who knows C!
    Computer needed
    invalidC should not match
    We need looking for someone who knows Artificial Intelligence.
    not matchArtificial Intelligence
    
    • regex: (?<!\w)((C)|(Artificial Intelligence))(?!\w)
    • match result:
      • Chrome: matches the standalone C in the first three lines and Artificial Intelligence in the sixth line
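
    The lookaround pattern above can be tried against the sample input with a short sketch like the following. It assumes an engine with lookbehind support (for example a recent Chrome or Node.js); escapeRegExp is a hypothetical helper, and a non-capturing group is used in place of the capturing groups since only the whole match is needed.

    ```javascript
    // Hypothetical helper: escape regex metacharacters inside a keyword
    function escapeRegExp(text) {
      return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
    }

    const keywords = ['C', 'Artificial Intelligence'];
    const alternation = keywords.map(escapeRegExp).join('|');

    // (?<!\w) : not preceded by a word character
    // (?!\w)  : not followed by a word character
    const rex = new RegExp('(?<!\\w)(?:' + alternation + ')(?!\\w)', 'g');

    const lines = [
      'C is language is required.',
      'We need a C language dev.',
      'Looking for a dev who knows C!',
      'Computer needed',
      'invalidC should not match',
      'We need looking for someone who knows Artificial Intelligence.',
      'not matchArtificial Intelligence',
    ];

    for (const line of lines) {
      // String.prototype.match returns null when nothing matches
      const found = line.match(rex) || [];
      console.log(JSON.stringify(line), '->', found);
    }
    ```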

    Note

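    For more about lookahead and lookbehind, you can refer to my summary:

    and my (Chinese) tutorial: 环视断言 · 应用广泛的超强搜索:正则表达式 (Lookaround Assertions · A Widely Used, Powerful Search: Regular Expressions)

    and, for regex in general: 一图让你看懂和记住所有正则表达式规则 (One Picture to Help You Understand and Remember All Regular Expression Rules)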
