I hate asking regex questions.

Subject

Here’s an example of my subject:

You should get yourself some free coconut water! It’s lovely! Because
[coconut water](/buy/) is so affordable, you should totally get some. Get
some [free coconut water today](/buy/)!

Task

I want to replace coconut water with a link: [coconut water](/buy/). However, some links have already been added to the text (in different variations), so I only want to add links where they are missing.

Summary

In human speak, here’s what I’m trying to do:

  • Replace the phrase coconut water with [coconut water](/buy/)
  • Do not replace if it’s already a link ([coconut water](/buy/))
  • Do not replace if it’s already in a link ([free coconut water!](/buy/))

Attempts

The first problem, where it may already be a link, can be avoided using this regex:

(?<!\[)coconut water(?!\])

It works for two out of the three.

  • ✅ Matches coconut water
  • ✅ Ignores [coconut water](/buy/)
  • ❌ Matches [free coconut water today](/buy/)

Just for clarity, the last one would turn [free coconut water today](/buy/) into [free [coconut water](/buy/) today](/buy/).
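
The behaviour described above can be reproduced with a short PHP sketch. The sample strings and the /buy/ target come from the question; the helper name linkFirstAttempt() is hypothetical and exists only for this illustration.

```php
<?php
// The question's first attempt, with the brackets escaped for PCRE.
// linkFirstAttempt() is a hypothetical helper name used only for this sketch.
function linkFirstAttempt(string $text): string
{
    // (?<!\[) : not directly preceded by "["
    // (?!\])  : not directly followed by "]"
    return preg_replace('/(?<!\[)coconut water(?!\])/', '[$0](/buy/)', $text);
}

echo linkFirstAttempt('Try coconut water now'), PHP_EOL;
// Try [coconut water](/buy/) now
echo linkFirstAttempt('[coconut water](/buy/)'), PHP_EOL;
// [coconut water](/buy/)
echo linkFirstAttempt('[free coconut water today](/buy/)'), PHP_EOL;
// [free [coconut water](/buy/) today](/buy/)
```

The lookarounds only inspect the single character on each side of the phrase, which is why link text with extra words around the phrase slips through.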

Next

The common pattern, because it’s Markdown, is that if the phrase is already in a link, a ] will always appear at some point after it. What I can’t figure out is how to say that in regex:

Match the phrase but only if [ appears before ] afterwards

When I’ve searched around Stack Overflow and search engines, the most common answers deal with characters directly before or after the word, but I want it to be flexible so that it would ignore:

  • [free coconut water today](/buy/)
  • [try some coconut water](/buy/)
  • [lovely coconut water for sale](/buy/)

Context

I’m using PCRE regex in PHP. There is more than one phrase to scan for, so it’s actually replace x with [x](y).

2 Answers


  1. Find and skip the links and replace the matches in all other contexts:

    \[[^][]*]\(/buy/\)(*SKIP)(*F)|\bcoconut water\b
    

    Replace with [$0](/buy/). If there can be any word instead of buy, use [^/]+ or \w+.

    I added word boundaries around coconut water to only match the phrase as a whole word.

    See the regex demo

    Details

    • \[[^][]*]\(/buy/\) – a [, then any 0+ chars other than ] and [, then a ](/buy/) text
    • (*SKIP)(*F) – PCRE verbs discarding the current match attempt and resuming the search for the next match from the current position
    • | – or
    • \bcoconut water\b – a whole word match for a coconut water phrase.
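
Since the question mentions more than one phrase, the skip-and-fail pattern can be wrapped in a loop. The following is a minimal sketch, not the answerer's code: addMissingLinks() and its phrase-to-URL map are hypothetical, and the link part is loosened from /buy/ to any target without parentheses so that links added for one phrase are skipped when the next phrase is processed. It assumes PHP 7.4+ (arrow functions) and phrases that begin and end with word characters, so that \b behaves as expected.

```php
<?php
// Hypothetical helper: link each phrase unless it already sits inside a Markdown link.
// $links maps phrase => URL, e.g. ['coconut water' => '/buy/'] (an assumed shape).
function addMissingLinks(string $text, array $links): string
{
    foreach ($links as $phrase => $url) {
        // First alternative: a whole [text](target) link, discarded via (*SKIP)(*F).
        // Second alternative: the phrase as a whole word anywhere else.
        $pattern = '~\[[^][]*]\([^()]*\)(*SKIP)(*F)|\b' . preg_quote($phrase, '~') . '\b~';

        // A callback avoids "$" or "\" in the URL being read as a backreference.
        $text = preg_replace_callback(
            $pattern,
            fn (array $m): string => '[' . $m[0] . '](' . $url . ')',
            $text
        );
    }
    return $text;
}

$subject = 'Get some free coconut water! Because [coconut water](/buy/) is so '
         . 'affordable, get some [free coconut water today](/buy/)!';

echo addMissingLinks($subject, ['coconut water' => '/buy/']), PHP_EOL;
// Get some free [coconut water](/buy/)! Because [coconut water](/buy/) is so
// affordable, get some [free coconut water today](/buy/)!
// (printed as a single line; wrapped here for readability)
```

preg_quote() is there so that phrases containing regex metacharacters are matched literally.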
  2. You can use a negative lookahead to exclude matches that are followed by any characters and a ] in your group:

    (?<!\[)coconut water(?!.*\])(?!\])

    https://regex101.com/r/kbquGp/1

    Matches coconut water and coconut water again
    Ignores [coconut water](/buy/)
    ignores [free coconut water
    today](/buy/) and matches coconut water
    ignores [coconut
    water yay](/buy/) and matches coconut water water ftw
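
A quick way to try this pattern from PHP is sketched below. The helper name linkWithLookahead() and the sample strings are assumptions made for illustration; no modifiers are set, so . does not match newlines.

```php
<?php
// The lookahead pattern from this answer, with brackets escaped for PCRE.
// linkWithLookahead() is a hypothetical helper name used only for this sketch.
function linkWithLookahead(string $text): string
{
    // (?!.*\]) : reject the match if a "]" occurs anywhere later on the same line
    return preg_replace('/(?<!\[)coconut water(?!.*\])(?!\])/', '[$0](/buy/)', $text);
}

echo linkWithLookahead('plain coconut water here'), PHP_EOL;
// plain [coconut water](/buy/) here
echo linkWithLookahead('[free coconut water today](/buy/)'), PHP_EOL;
// [free coconut water today](/buy/)
echo linkWithLookahead('coconut water first, then a [link](/buy/)'), PHP_EOL;
// coconut water first, then a [link](/buy/)
```

The last sample shows a trade-off worth knowing about: because .* scans to the end of the line, an unlinked phrase is also left alone whenever any ] appears later on that line.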
