
How can I "clean" strings by removing everything from GH up to the - mark? I need only the rest of the data. The problem is that I don't know which number will follow GH (GHxxx-, GHxx- in the examples); it can be anything.

Is there a way in regex to define something like this?

GH8-30476E/30477E
GH82-23124B GH82-23100B
GH82-20900Aaa, GH82-20838A
GH82-20900C,GH82-20838C
GH962-13566A
GH82-23596C/23597C/31461C

desired result:

30476E 30477E
23124B 23100B
20900Aaa 20838A
20900C 20838C
13566A
23596C 23597C 31461C
const arr = ["GH8-30476E/30477E", "GH82-23124B GH82-23100B", "GH82-20900Aaa, GH82-20838A", "GH82-20900C,GH82-20838C", "GH962-13566A", "GH82-23596C/23597C/31461C"]

arr.forEach((e, index) => {
  arr[index] = e.replace(/GH82-/gi, '');


})
console.log(arr)

Thank you in advance.

edit:
GH82-23100B can be GH82-2310044B or GH82-23B, that is why my approach was to remove GH, not match the other 5 characters.

edit2:
I edited the examples a bit. It looks like the solution from the comments works.

3 Answers


    • You could just get the output and ignore everything else:
    const regex = /[0-9]{5}[A-Z]/gm;
    const arr = [
      "GH82-30476E/30477E",
      "GH82-23124B GH82-23100B",
      "GH82-20900A, GH82-20838A",
      "GH82-20900C,GH82-20838C",
      "GH962-13566A",
      "GH82-23596C/23597C/31461C",
    ];
    
    let res = [];
    
    arr.forEach((str) => {
      const M = str.match(regex);
      if (M) {
        res.push(M.join(" "));
      }
    });
    
    console.log(res);
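The fixed-width pattern above assumes exactly five digits followed by one uppercase letter. The question's edit mentions parts such as 2310044B or 23B, and the edited sample contains 20900Aaa, which that pattern would truncate or miss. A minimal sketch, assuming every part is a run of digits followed by one or more letters (the helper name extractParts is hypothetical, not from the answer):

```javascript
// Hypothetical helper: collect every whole "digits + letters" token.
// The GHxx prefix never matches because it starts with letters, and there
// is no word boundary between "GH" and the digits that follow it.
function extractParts(str) {
  const matches = str.match(/\b\d+[A-Za-z]+\b/g);
  // match() returns null when nothing is found, so fall back to "".
  return matches ? matches.join(" ") : "";
}

const samples = [
  "GH8-30476E/30477E",
  "GH82-20900Aaa, GH82-20838A",
  "GH82-2310044B GH82-23B",
];

console.log(samples.map(extractParts));
```

This keeps the "match what you want" approach of the answer but drops the length assumption; if parts can also end in digits, the pattern would need to be loosened further.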

    • If you have to remove those unwanted patterns, you can use \s*[A-Z]{2}[0-9]{1,3}\s*-\s*|[/,.] instead:
    const regex = /\s*[A-Z]{2}[0-9]{1,3}\s*-\s*|[/,.]/gm;
    const arr = [
      "GH82-30476E/30477E",
      "GH82-23124B GH82-23100B",
      "GH82-20900A, GH82-20838A",
      "GH82-20900C,GH82-20838C",
      "GH962-13566A",
      "GH82-23596C/23597C/31461C",
    ];
    
    let res = [];
    
    arr.forEach((str) => {
      let s = str.replace(regex, " ");
      s = s.replace(/\s{2,}/g, " ").trim();
      res.push(s);
    });
    
    console.log(res);

    You can add more restrictions to the pattern, depending on what your data may look like:

    /\s*\b[A-Z]{2}[0-9]{1,3}\b\s*-\s*|[/,.]+/g
    
    • [/,.]+: any character you want removed goes into this character class.

    • \b is a word boundary.


    • If the prefix is always GH, we can simply use: /\s*\bGH[0-9]{1,3}\b\s*-\s*|[/,.]+/g
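To illustrate what the word boundaries add: without \b, the prefix pattern can also match in the middle of a longer token. The input XGH82-1A below is invented for this illustration and does not come from the question's data.

```javascript
const loose = /GH[0-9]{1,3}-/g;       // no boundaries
const strict = /\bGH[0-9]{1,3}\b-/g;  // GH must start a word

// Invented input where GH is not at the start of a token.
const looseResult = "XGH82-1A".replace(loose, "");   // "X1A"
const strictResult = "XGH82-1A".replace(strict, ""); // unchanged
// A regular prefix is still removed by the strict pattern.
const normalResult = "GH82-1A".replace(strict, "");  // "1A"

console.log(looseResult, strictResult, normalResult);
```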
  1. There are two ways to do that.

    Method-01: find the substrings matching \d{5}[A-Z] and then join them.

    Method-02: remove the GH\d+- substrings and the / and , separators, then trim the result.

    Code as follows:

    const arr = [
      "GH82-30476E/30477E",
      "GH82-23124B GH82-23100B",
      "GH82-20900A, GH82-20838A",
      "GH82-20900C,GH82-20838C",
      "GH962-13566A",
      "GH82-23596C/23597C/31461C",
    ];
    
    // Method-01
    console.log('Method-01: ', arr.map(x => x.match(/\d{5}[A-Z]/gi).join(' ')));
    
    // Method-02
    console.log('Method-02: ', arr.map(x => x.replace(/(GH\d+-)|([/,])/gi, ' ')
                          .trim().replace(/\s+/g, ' ')));
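One caveat with Method-01: String.prototype.match returns null when a global pattern finds nothing, so calling .join on the result throws for an input without any match. A small sketch with a fallback, using an invented entry ("GH82-") that has no part number after the prefix:

```javascript
const pattern = /\d{5}[A-Z]/gi;

// "GH82-" is a made-up entry with nothing after the prefix.
const rows = ["GH82-30476E/30477E", "GH82-"];

// Fall back to an empty array so join() is always called on an array.
const cleaned = rows.map(x => (x.match(pattern) || []).join(" "));

console.log(cleaned); // first entry "30476E 30477E", second entry ""
```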
  2. You can use

    const arr = ["GH82-30476E/30477E", "GH82-23124B GH82-23100B", "GH82-20900A, GH82-20838A", "GH82-20900C,GH82-20838C", "GH962-13566A", "GH82-23596C/23597C/31461C"]
    
    arr.forEach((e, index) => {
      arr[index] = e.replace(/GH\d*-/gi, '').split(/\W+/).join(" ");
    
    
    })
    console.log(arr)

    Here

    • .replace(/GH\d*-/gi, '') – removes GH + zero or more digits and then -
    • .split(/\W+/) – splits at runs of non-word chars
    • .join(" ") – joins the resulting items with a space.
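The three steps can be traced on a single input. This is a sketch using one of the sample strings; the filter(Boolean) call is an addition not present in the answer, guarding against empty items when a string starts or ends with a separator.

```javascript
const input = "GH82-20900C,GH82-20838C";

// Step 1: drop every GH<digits>- prefix.
const stripped = input.replace(/GH\d*-/gi, ""); // "20900C,20838C"

// Step 2: split at runs of non-word characters (comma, slash, space...).
const parts = stripped.split(/\W+/);            // ["20900C", "20838C"]

// Step 3: discard empty items (added here) and join with single spaces.
const joined = parts.filter(Boolean).join(" "); // "20900C 20838C"

console.log(joined);
```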