
I am currently writing a PHP function which should help me to extract an upgrade notice from a given readme text.

This is my source text:

Some stuff before this notice like a changelog with versioning and explanation text.

== Upgrade Notice ==

= 1.3.0 =

When using Master Pro, 1.3.0 is the new minimal required version!

= 1.1.0 =

When using Master Pro, 1.1.0 is the new minimal required version!

= 1.0.0 =

No upgrade - just install :)

[See changelog for all versions](https://plugins.svn.wordpress.org/master-pro/trunk/CHANGELOG.md).

This is the function:

/**
 * Parse update notice from readme file
 *
 * @param string $content
 * @param string $new_version
 *
 * @return void
 */
private function parse_update_notice( string $content, string $new_version ) {
    $regexp  = '~==\s*Upgrade Notice\s*==\s*(.*?=+\s*' . preg_quote( $new_version ) . '\s*=+\s*(.*?)(?=^=+\s*\d+\.\d+\.\d+\s*=+|$))~ms';

    if ( preg_match( $regexp, $content, $matches ) ) {
        $version = trim( $matches[1] );
        $notices = (array) preg_split( '~[\r\n]+~', trim( $matches[2] ) );

        error_log( $version );
        error_log( print_r( $notices, true ) );
    }
}

I am stuck on my regex and can't get it to work. This was my initial idea:

  1. Only search after == Upgrade Notice ==
  2. Check if we have a version matching $new_version
  3. Get the matched version between the = x.x.x = as match 1 e.g. 1.1.0
  4. Get the content after the version as match 2, stopping at the first empty line. The upgrade notice can span multiple lines, but it never contains an empty line

I’m totally stuck on this task. Maybe you can spot the issue I’m unable to see.

4 Answers


  1. You don’t need to do everything with a regex. Just use a regex for the version detection. Here’s a simplified version:

    Demo: https://3v4l.org/aMdXF

    $versions = [];
    $currentVersion = '';
    $ignore = true;
    foreach(explode("\n", $md) as $line) {
        if (str_starts_with($line, '== Upgrade Notice ==')) {
            $ignore = false;
            continue;
        }
    
        if (preg_match('/^= ([0-9.]+) =/', $line, $matches)) {
            $currentVersion = $matches[1];
            continue;
        }
    
        if (true === $ignore || '' === $currentVersion) {
            continue;
        }
    
        $versions[$currentVersion][] = $line;
    }
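
    The loop above keeps every line after a version heading, including blank lines and trailing text. As a rough sketch of how the same line-by-line idea could return only the notice for one version and stop at the first empty line, something like the following might work. The function name `extractUpgradeNotice` and its signature are hypothetical, not part of the answer's demo.

    ```php
    <?php
    // Hypothetical helper: returns the notice lines for $version, or [] if not found.
    function extractUpgradeNotice(string $readme, string $version): array
    {
        $inSection = false; // true once "== Upgrade Notice ==" has been seen
        $inVersion = false; // true while inside the requested version's block
        $notice    = [];

        foreach (preg_split('/\R/', $readme) as $line) {
            if (!$inSection) {
                $inSection = (bool) preg_match('/^==\s*Upgrade Notice\s*==\s*$/', $line);
                continue;
            }

            // A version heading such as "= 1.3.0 =" starts a new block.
            if (preg_match('/^=\s*([0-9.]+)\s*=\s*$/', $line, $m)) {
                if ($inVersion) {
                    break; // the next version heading ends the requested block
                }
                $inVersion = ($m[1] === $version);
                continue;
            }

            if (!$inVersion) {
                continue;
            }

            if (trim($line) === '') {
                if ($notice !== []) {
                    break; // an empty line after the notice text ends it
                }
                continue; // skip empty lines between heading and notice
            }

            $notice[] = trim($line);
        }

        return $notice;
    }

    $readme = implode("\n", [
        'Some stuff before this notice which is not relevant.',
        '',
        '== Upgrade Notice ==',
        '',
        '= 1.3.0 =',
        '',
        'When using Master Pro, 1.3.0 is the new minimal required version!',
        'Additional line.',
        '',
        '= 1.0.0 =',
        '',
        'No upgrade - just install :)',
        '',
        '[See changelog for all versions](https://plugins.svn.wordpress.org/master-pro/trunk/CHANGELOG.md).',
    ]);

    print_r(extractUpgradeNotice($readme, '1.3.0'));
    ```

    Because the notice ends at the first empty line, the trailing changelog link is not returned for 1.0.0.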
    
  2. Here is a solution not based on regex but good old strpos():

    function getNotice($readme, $version)
    {
        $txt = str_replace("\r", '', $readme);
        $p1 = strpos($txt, "== Upgrade Notice ==");
        if($p1 !== false)
        {
            $ver = "= $version =";
            $p2 = strpos($txt, $ver, $p1);
            if($p2 !== false)
            {
                $p2 += strlen($ver) + 2;
                $p3 = strpos($txt, "\n\n", $p2);
                if($p3 !== false)
                    return substr($txt, $p2, $p3 - $p2);
                else
                    return substr($txt, $p2);
            }
        }
        return '';
    }
    
    $readme = <<<README
    Some stuff before this notice which is not relevant.
    
    == Upgrade Notice ==
    
    = 1.3.0 =
    
    When using Master Pro, 1.3.0 is the new minimal required version!
    Additional line.
    
    = 1.1.0 =
    
    When using Master Pro, 1.1.0 is the new minimal required version!
    
    = 1.0.0 =
    
    No upgrade - just install :)
    
    [See changelog for all versions](https://plugins.svn.wordpress.org/master-pro/trunk/CHANGELOG.md).
    README;
    
    echo getNotice($readme, '1.3.0');
    

    Output:

    When using Master Pro, 1.3.0 is the new minimal required version!
    Additional line.
    
  3. It seems like it’s just a mistake in the position of your parentheses:

    '~==\s*Upgrade Notice\s*==\s*.*?=+\s*(' . preg_quote( $new_version ) .
      ')\s*=+\s*(.*?)(?=^=+\s*\d+\.\d+\.\d+\s*=+|^\s*?$)~ms'
    

    https://3v4l.org/WY3aE
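
    For illustration, here is a minimal sketch of that pattern inside a standalone function. The name `parseUpgradeNotice`, the return shape, and the `'~'` delimiter argument passed to `preg_quote()` are assumptions added for this example. Note that the lookahead needs an empty line or another version heading after the notice, so the sample text keeps one.

    ```php
    <?php
    // Hypothetical wrapper around the pattern above; returns null when nothing matches.
    function parseUpgradeNotice(string $content, string $newVersion): ?array
    {
        $regexp = '~==\s*Upgrade Notice\s*==\s*.*?=+\s*(' . preg_quote($newVersion, '~')
            . ')\s*=+\s*(.*?)(?=^=+\s*\d+\.\d+\.\d+\s*=+|^\s*?$)~ms';

        if (!preg_match($regexp, $content, $matches)) {
            return null; // section or version not found
        }

        return [
            'version' => $matches[1],
            // Split the captured block into individual lines.
            'notices' => preg_split('~[\r\n]+~', trim($matches[2])),
        ];
    }

    $readme = implode("\n", [
        '== Upgrade Notice ==',
        '',
        '= 1.3.0 =',
        '',
        'When using Master Pro, 1.3.0 is the new minimal required version!',
        'Additional line.',
        '',
        '= 1.1.0 =',
        '',
        'When using Master Pro, 1.1.0 is the new minimal required version!',
        '',
        '[See changelog for all versions](https://plugins.svn.wordpress.org/master-pro/trunk/CHANGELOG.md).',
    ]);

    print_r(parseUpgradeNotice($readme, '1.1.0'));
    ```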

  4. To get the first part after "Upgrade Notice", matching only the first following block of non-empty lines, you can omit the s flag so that the dot no longer matches a newline, and capture all following lines that contain at least a single non-whitespace character.

    ^==\h*Upgrade Notice\h*==\R\s*^=\h*(1\.3\.0)\h*=\R\s*^((?:\h*\S.*(?:\R\h*\S.*)*)+)
    

    The line in PHP:

    $regexp = '~^==\h*Upgrade Notice\h*==\R\s*^=\h*(' . preg_quote( $new_version ) . ')\h*=\R\s*^((?:\h*\S.*(?:\R\h*\S.*)*)+)~m';
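
    As a small usage sketch, the pattern could be applied like this. The function name `firstUpgradeNotice` is hypothetical. Since the pattern expects the version heading directly after "== Upgrade Notice ==", it only finds the first listed version.

    ```php
    <?php
    // Hypothetical helper: matches only when $newVersion is the first version in the section.
    function firstUpgradeNotice(string $readme, string $newVersion): ?array
    {
        $regexp = '~^==\h*Upgrade Notice\h*==\R\s*^=\h*(' . preg_quote($newVersion, '~')
            . ')\h*=\R\s*^((?:\h*\S.*(?:\R\h*\S.*)*)+)~m';

        if (!preg_match($regexp, $readme, $matches)) {
            return null;
        }

        return [
            'version' => $matches[1],
            'notices' => preg_split('~\R~', $matches[2]), // one entry per notice line
        ];
    }

    $readme = implode("\n", [
        'Some stuff before this notice which is not relevant.',
        '',
        '== Upgrade Notice ==',
        '',
        '= 1.3.0 =',
        '',
        'When using Master Pro, 1.3.0 is the new minimal required version!',
        'Additional line.',
        '',
        '= 1.1.0 =',
        '',
        'When using Master Pro, 1.1.0 is the new minimal required version!',
    ]);

    print_r(firstUpgradeNotice($readme, '1.3.0'));
    var_dump(firstUpgradeNotice($readme, '1.1.0')); // no match: 1.1.0 is not the first version
    ```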
    



    If you want to choose which occurrence after "Upgrade Notice" is matched, you can use a quantifier that sets how many version headings the pattern steps through before capturing:

    ^==\h*Upgrade Notice\h*==(?:(?:\R(?!=\h*\d+\.\d+\.\d+\h*=$).*)*\R=\h*(\d+\.\d+\.\d+)\h*=$\s*){2}(^\h*\S.*(?:\R\h*\S.*)*)
    
    • ^ Start of a line (the m flag is set)
    • ==\h*Upgrade Notice\h*== The starting pattern, where \h* matches optional horizontal whitespace characters
    • (?: Non capture group
      • (?:\R(?!=\h*\d+\.\d+\.\d+\h*=$).*)* Match all lines that are not a version heading
      • \R=\h* Match a newline and = followed by optional horizontal whitespace characters
      • (\d+\.\d+\.\d+) Capture group 1, match the version
      • \h*=$\s* Match optional horizontal whitespace characters and =, assert the end of the line, and match optional whitespace characters
    • ){2} Use a quantifier (in this case {2}) to match the version pattern n times
    • ( Capture group 2
      • ^\h*\S.* Start of a line, then match a line that contains at least a single non-whitespace character
      • (?:\R\h*\S.*)* Optionally match the following lines that also contain a non-whitespace character
    • ) Close the group
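
    A hedged sketch of the quantifier variant in PHP, with the repeat count passed in as a parameter. The helper name `nthUpgradeNotice` and the 1-based `$n` argument are assumptions for illustration. The capture group for the notice uses `*` for the continuation lines so that a single-line notice also matches.

    ```php
    <?php
    // Hypothetical helper: $n is the 1-based position of the version heading to read.
    function nthUpgradeNotice(string $readme, int $n): ?array
    {
        $n = max(1, $n); // the quantifier must be at least 1

        $regexp = '~^==\h*Upgrade Notice\h*=='
            . '(?:(?:\R(?!=\h*\d+\.\d+\.\d+\h*=$).*)*\R=\h*(\d+\.\d+\.\d+)\h*=$\s*){' . $n . '}'
            . '(^\h*\S.*(?:\R\h*\S.*)*)~m';

        if (!preg_match($regexp, $readme, $matches)) {
            return null;
        }

        return [
            'version' => $matches[1], // group 1 keeps the last version heading matched
            'notices' => preg_split('~\R~', $matches[2]),
        ];
    }

    $readme = implode("\n", [
        '== Upgrade Notice ==',
        '',
        '= 1.3.0 =',
        '',
        'When using Master Pro, 1.3.0 is the new minimal required version!',
        'Additional line.',
        '',
        '= 1.1.0 =',
        '',
        'When using Master Pro, 1.1.0 is the new minimal required version!',
        '',
        '= 1.0.0 =',
        '',
        'No upgrade - just install :)',
    ]);

    print_r(nthUpgradeNotice($readme, 2));
    ```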

