
I’m writing some code that extracts all links from a block of text. I already have a regex pattern; is there a way to get all matches of that pattern?

I’ve tried some code I found, but it returned nothing every time.

Any answers are appreciated.

let string = "Hello world"
let m, newtext, re = /o/g
do {
  m = re.exec(string);
  if (m) {
    newtext = (m[1], m[2]);
  }
} while (m);

console.log(m,newtext)

This is not a duplicate as the post did not solve my problem.

I’ve also tried this code, but I get an error: "findAll is not a function or its return value is not iterable"

function link() {
        const findAll = (value, expr) => {

            const iterator = value.matchAll(typeof expr === 'string' ? 
            new RegExp(expr, 'gd') : 
            new RegExp(expr.source, [...new Set((expr.flags + 'gd').split(''))].join('')));
            let _next;
            [_next, iterator.next] = [iterator.next, function(){
              const result = _next.call(iterator);
              if(!result.done){
                const {0: text, indices: [[startIndex, endIndex]]} = result.value;
                console.log(
                 'done:', false,
                 'value:', {text, startIndex, endIndex} 
                );
              }
              
              console.log (result);
            }];
            console.log(iterator);
          }
          
          console.log('matching with string:');
          for (const m of findAll(str, re)) {
            console.log(JSON.stringify(m));
          }
          
          console.log('matching with regex:');
          for (const m of findAll(str, re)) {
            console.log(JSON.stringify(m));
          }
    }

3 Answers


  1. As @Konrad mentioned, one of the best ways is to use matchAll. Example:

    const regexp = /foo[a-z]*/g;
    const str = "table football, foosball";
    const matches = str.matchAll(regexp);
    
    for (const match of matches) {
      console.log(
        `Found ${match[0]} start=${match.index} end=${
          match.index + match[0].length
        }.`,
      );
    }
    // Found football start=6 end=14.
    // Found foosball start=16 end=24.
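
Applied to the original goal of pulling links out of text, the same approach might look like the sketch below. The helper name extractLinks and the URL pattern are illustrative assumptions, not the asker's actual regex; the pattern is deliberately simple and will also capture punctuation that directly follows a URL.

```javascript
// Hypothetical helper: the name and the URL pattern are illustrative assumptions.
const extractLinks = (text) => {
  // Deliberately simple pattern: "http://" or "https://" followed by non-whitespace.
  const urlPattern = /https?:\/\/\S+/g;
  // matchAll requires the g flag; element 0 of each match is the full matched text.
  return Array.from(text.matchAll(urlPattern), (match) => match[0]);
};

const sample = 'Docs at https://example.com/docs and http://example.org/a?b=1 for details.';
console.log(extractLinks(sample));
// [ 'https://example.com/docs', 'http://example.org/a?b=1' ]
```

Array.from with a mapping function collects the iterator returned by matchAll into a plain array, which is convenient when every match is needed at once.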
  2. Here is a modified version of Jivopis’ response.

    I created a function called findAll that takes a string and a pattern (string or regex) and returns a list of objects. Each object contains the text, startIndex, and endIndex of a matched substring in the original text.

    /**
     * An object representing match info.
     * @typedef {Object} MatchInfo
     * @property {string} text - The matched substring text
     * @property {number} startIndex - The substring start index
     * @property {number} endIndex - The substring end index
     */
    
    /**
     * Returns all matching substrings, along with their start and end indices.
     *
     * @param {string} value - The text to search
     * @param {RegExp|string} expr - A pattern to search with
     * @returns {MatchInfo[]} - Matches
     */
    const findAll = (value, expr) =>
      [...value.matchAll(typeof expr === 'string' ? new RegExp(expr, 'g') : expr)]
        .map(({ 0: s, index: i }) => ({ text: s, startIndex: i, endIndex: i + s.length }))
    
    const matches = findAll('table football, foosball', 'foo[a-z]*');
    
    for (let match of matches) {
      console.log(match);
    }

    Output

    [{
      "text": "football",
      "startIndex": 6,
      "endIndex": 14
    }, {
      "text": "foosball",
      "startIndex": 16,
      "endIndex": 24
    }]
    
  3. Jumping on the bandwagon, here is a modified version of Mr. Polywhirl’s response:

    1. Return the iterator produced by .matchAll so the caller keeps the benefits of lazy iteration; for example, breaking out of the loop skips the remaining matches.
    2. Monkeypatch the iterator’s .next() so each value comes back in the format we need.
    3. Add the d flag to the regular expression so the start and end indices of each match are provided automatically.
    4. If the second argument is a RegExp, rebuild it with the g and d flags added.

    const findAll = (value, expr) => {
    
      const iterator = value.matchAll(typeof expr === 'string' ? 
      new RegExp(expr, 'gd') : 
      new RegExp(expr.source, [...new Set((expr.flags + 'gd').split(''))].join('')));
      let _next;
      [_next, iterator.next] = [iterator.next, function(){
        const result = _next.call(iterator);
        if(!result.done){
          const {0: text, indices: [[startIndex, endIndex]]} = result.value;
          return {
            done: false,
            value: {text, startIndex, endIndex} 
          }
        }
        
        return result;
      }];
      return iterator;
    }
    
    console.log('matching with string:');
    for (const m of findAll('table football, foosball', 'foo[a-z]*')) {
      console.log(JSON.stringify(m));
    }
    
    console.log('matching with regex:');
    for (const m of findAll('table football, foosball', /foo[a-z]*/g)) {
      console.log(JSON.stringify(m));
    }
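
As a possible alternative to patching .next(), the same lazy behavior can be sketched with a generator function. The name findAllLazy is hypothetical, and this version computes the end index from the match length instead of relying on the d flag, on the assumption that avoiding the flag is acceptable here.

```javascript
// Hypothetical alternative to patching iterator.next(): a generator named findAllLazy.
function* findAllLazy(value, expr) {
  // Build a global regex; for a RegExp argument, keep its flags and add g if missing.
  const regex = typeof expr === 'string'
    ? new RegExp(expr, 'g')
    : new RegExp(expr.source, expr.flags.includes('g') ? expr.flags : expr.flags + 'g');
  // matchAll is lazy, so each match is produced only when the consumer asks for it.
  for (const match of value.matchAll(regex)) {
    const text = match[0];
    yield { text, startIndex: match.index, endIndex: match.index + text.length };
  }
}

for (const m of findAllLazy('table football, foosball', /foo[a-z]*/)) {
  console.log(JSON.stringify(m));
  break; // stop after the first match; the remaining matches are never computed
}
// {"text":"football","startIndex":6,"endIndex":14}
```

Because a generator is itself an iterable iterator, it works directly in for...of loops and with spread syntax, with no need to replace methods on the built-in iterator.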