
I want to replace every word in a text that meets a certain condition with another word, which depends on the original word. The candidate replacements are stored in an array, and the index of the element used as the replacement is calculated from the original word.

I tried to do this by splitting the original text into an array of words with .split(" "). When words that satisfy the condition are found and replaced, I lose the punctuation marks that sometimes trail them. Put less abstractly, I want to split a string into words and punctuation marks, where a punctuation mark becomes a separate element of the array only if it is followed by a space. (I clearly haven't mastered how to write questions yet.)

To clarify with an example:

The sentence "My code doesn't work and, I don't know how to get it to." should be split into ["My","code","doesn't","work","and",",","I","don't","know","how","to","get","it","to","."]

2 Answers


  1. You can split the string with a regular expression that matches whitespace and punctuation. The trick is to wrap the pattern in a capture group so that the separators are kept in the resulting array; you can then drop the items that are pure whitespace, which leaves the punctuation marks in place.

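    function splitIntoWords(s) {
      // The capture group keeps the separators in the result;
      // filter() then drops whitespace-only and empty items.
      return s.split(/([\s,.])/).filter((a) => a.trim());
    }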
    
    > splitIntoWords("My code doesn't work and, I don't know how to get it to.")
    (15) ['My', 'code', "doesn't", 'work', 'and', ',', 'I', "don't", 'know', 'how', 'to', 'get', 'it', 'to', '.']
    

    (If you need to, for some reason, glue a single . back into the preceding word as in your example, you can do that as a next step.)
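
    Building on the same capture-group split, here is a minimal sketch of the full replace step from the question. It keeps the whitespace items instead of filtering them out, so that joining the array back together restores the original spacing and punctuation. The names `replacements`, `shouldReplace` and `pickIndex` are hypothetical placeholders, as are the condition and the index rule; substitute your own.

    ```javascript
    // Hypothetical replacement table; not from the question.
    const replacements = ["foo", "bar", "baz"];
    // Assumed condition: replace four-letter words only.
    const shouldReplace = (word) => word.length === 4;
    // Assumed index rule: derive the index from the first character code.
    const pickIndex = (word) => word.charCodeAt(0) % replacements.length;

    function replaceWords(text) {
      return text
        .split(/([\s,.])/) // capture group keeps whitespace and punctuation as items
        .map((token) =>
          // Leave separators and empty strings untouched; replace matching words.
          /^[\s,.]?$/.test(token) || !shouldReplace(token)
            ? token
            : replacements[pickIndex(token)]
        )
        .join(""); // every separator is still in the array, so nothing is lost
    }

    console.log(replaceWords("My code doesn't work and, I don't know how to get it to."));
    ```

    Because nothing is discarded between the split and the join, the comma after "and" and the final full stop stay where they were.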

  2. JavaScript's internationalization API provides Intl.Segmenter for exactly this kind of task. Natural languages are complex, and a purely regex-based solution that covers every aspect of a language and the specifics of each locale can become difficult to maintain.

    Applied to the OP's problem, its usage might look similar to the following example code:

    const sampleText =
      "My code doesn't work and, I don't know how to get it to.";
    
    const segmenter =
      new Intl.Segmenter('en', { granularity: 'word' });
    
    const segmentList = [...segmenter.segment(sampleText)];
    
    const finalResult = segmentList
      .filter(({ segment }) => !!segment.trim())
      .map(({ segment }) => segment);
    
    console.log({ sampleText, finalResult, segmentList });
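
    The same segmenter can also drive the replacement itself. With word granularity, each segment carries an `isWordLike` flag, so words can be swapped while spaces and punctuation pass through unchanged. This is only a sketch: `substitutes`, `meetsCondition` and `indexFor` are hypothetical names, and the condition and index rule are assumptions standing in for the OP's real logic.

    ```javascript
    // Hypothetical replacement table; not from the question.
    const substitutes = ["foo", "bar", "baz"];
    // Assumed condition: replace four-letter words only.
    const meetsCondition = (word) => word.length === 4;
    // Assumed index rule: derive the index from the first character code.
    const indexFor = (word) => word.charCodeAt(0) % substitutes.length;

    function replaceWordsSegmented(text, locale = "en") {
      const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
      let result = "";
      for (const { segment, isWordLike } of segmenter.segment(text)) {
        // Only word-like segments are candidates; everything else is copied as-is.
        result += isWordLike && meetsCondition(segment)
          ? substitutes[indexFor(segment)]
          : segment;
      }
      return result;
    }

    console.log(
      replaceWordsSegmented("My code doesn't work and, I don't know how to get it to.")
    );
    ```

    Since every segment is appended back in order, no separate step is needed to reattach punctuation.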