So I have a string like this:

test //ita https://example.com lics// test // another // one

I can capture text between 2 "//" strings easy enough like so:

//(.*?)//

This returns the groups "ita https:" and " test "; however, I’m trying to get it to ignore any "//" that is part of "http://" or "https://".

So the goal is for it to return only "ita https://example.com lics" and "another".
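
To make the problem reproducible, here is a minimal sketch of the behaviour described above. It assumes PHP's preg_match_all (the question itself does not name a language) and uses ~ as the pattern delimiter so the slashes need no escaping; the variable names are only illustrative.

```php
<?php
// Input string from the question.
$text = 'test //ita https://example.com lics// test // another // one';

// Naive pattern: lazily capture whatever sits between two "//" pairs.
preg_match_all('~//(.*?)//~', $text, $matches);

// Group 1 holds the captured text of each match.
var_export($matches[1]);
```

Here the "//" inside "https://" is taken as a closing delimiter, so the first capture is cut short at "ita https:" and the next pairing lands on " test " instead of "another".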

3 Answers


  1. Looks like you're on the right track.

    (?<!https?://)//(.*?)(?=(?:\s|https?://|$))
    

    Here's how it works:

    (?<!https?://)
    

    Negative lookbehind assertion to ensure that there is no "http://" or
    "https://" before "//".

    //
    

    Matches the "//" strings.

    (.*?)
    

    Captures any text between "//" using a non-greedy match.

    (?=(?:\s|https?://|$))
    

    Positive lookahead assertion to ensure that what follows is either a
    whitespace character, "http://" or "https://", or the end of the
    string.

    You should post a few more example strings to test against, but based on what I tried, this looks like it works.

  2. You can use this regex to match your strings:

    (?<!https:|http:)//\s*((?:https?://|(?!//).)*)(?<!\s)\s*//
    

    This will match:

    • (?<!https:|http:)// : //, not preceded by https: or http:
    • \s* : some amount of whitespace
    • ((?:https?://|(?!//).)*) : capture group 1, zero or more of either:
      • https?:// : // preceded by https: or http:; or
      • (?!//). : a character which is not the start of //
    • (?<!\s)\s* : some amount of whitespace, not preceded by whitespace (this prevents capturing any whitespace before the closing // in group 1)
    • // : literal //

    Regex demo on regex101

    The strings you are interested in will be captured in group 1. In PHP:

    $text = 'test //ita https://example.com lics// test // another // one';
    $regex = '~(?<!https:|http:)//\s*((?:https?://|(?!//).)*)(?<!\s)\s*//~';
    preg_match_all($regex, $text, $matches);
    var_export($matches[1]);
    

    Output:

    array (
      0 => 'ita https://example.com lics',
      1 => 'another',
    )
    

    PHP demo on 3v4l.org

  3. (https?://)(*SKIP)(*F)|//\s*((?:(?1)|.)*?)\s*//
    

    I propose this solution with control verbs (*SKIP) and (*F).

    Here’s the regex101 proof and PHP proof
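
    As a rough sketch of how this pattern might be used from PHP: group 1 only exists so that a URL prefix can be skipped at the top level or re-used inside the body via (?1), which means the wanted text ends up in group 2 rather than group 1. The helper name extractBetweenSlashes is made up for illustration and is not part of the answer.

    ```php
    <?php
    // Hypothetical helper, named here only for illustration.
    function extractBetweenSlashes(string $text): array
    {
        // Alternative 1: match "http://" or "https://", then discard it with (*SKIP)(*F)
        //                so its "//" can never open a delimited section.
        // Alternative 2: match //...//, letting (?1) swallow URL prefixes inside the body.
        $regex = '~(https?://)(*SKIP)(*F)|//\s*((?:(?1)|.)*?)\s*//~';
        preg_match_all($regex, $text, $matches);

        // Group 2 holds the text between the delimiters.
        return $matches[2];
    }

    var_export(extractBetweenSlashes('test //ita https://example.com lics// test // another // one'));
    ```

    For the string from the question this should give 'ita https://example.com lics' and 'another'. Because the body is matched lazily and followed by \s*, whitespace before the closing // is left out of the capture.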
