
I want to write a function that accepts two parameters, $text and $keys, where $keys is an array of keys.

The output should be an array whose keys are the keys passed to the function (if they are found in the text) and whose values are the text that follows each key, up to the next key or the end of the text. If a key is repeated in the text, only its last value should be kept.

For example:

Visualized Text: Lorem Ipsum is simply one dummy text of the printing and two typesetting industry. Lorem Ipsum has been the industry’s one standard dummy text ever since the three 1500s.

$text = 'Lorem Ipsum is simply one dummy text of the printing and  two typesetting industry. Lorem Ipsum has been the industry\'s one standard dummy text ever since the three 1500s.';

$keys = ['one', 'two', 'three'];

Desired Output:

[
    'one' => 'standard dummy text ever since the',
    'two' => 'typesetting industry. Lorem Ipsum has been the industry\'s',
    'three' => '1500s.'
]

I tried writing a regular expression which will cope with this task, but without success.

Last attempt:

function getKeyedSections($text, $keys) {
    $keysArray = explode(',', $keys);
    $pattern = '/(?:' . implode('|', array_map('preg_quote', $keysArray)) . '):\s*(.*?)(?=\s*(?:' . implode('|', array_map('preg_quote', $keysArray)) . '):\s*|\z)/s';
    preg_match_all($pattern, $text, $matches);

    $keyedSections = [];
    foreach ($keysArray as $key) {
        foreach ($matches[1] as $index => $value) {
            if (stripos($matches[0][$index], $key) !== false) {
                $keyedSections[trim($key)] = trim($value);
                break;
            }
        }
    }

    return $keyedSections;
}

2 Answers


  1. Do you need to pass the keys? How about this one with the keys being appended as they appear in the text:

    <?php 
    
    
    $text = "Lorem Ipsum is simply **one** dummy text of the printing and  **two** typesetting industry. Lorem Ipsum has been the industry's  **one** standard dummy text ever since the **three** 1500s.";
    
    $matches = [];
    preg_match_all("/(\*\*(\w|\d)+\*\*)(\w|\d|\s)+/", $text, $matches);
    
    $actualMatches = $matches[0];
    $keys = $matches[1];
    $index = 0;
    
    $results = array_reduce($actualMatches, function($carry, $item) use ($keys, &$index) {
        $key = $keys[$index];
        $carry[str_replace("*", "", $key)] = trim(substr($item, strlen($key)));
        $index++;
        return $carry;
    }, []);
    
    var_dump($results);
    
    ?>
    

    If you need just the specific keys, here is an alternative:

    <?php 
    
    
    $text = "Lorem Ipsum is simply **one** dummy text of the printing and  **two** typesetting industry. Lorem Ipsum has been the industry's  **one** standard dummy text ever since the **three** 1500s.";
    
    $matches = [];
    preg_match_all("/(\*\*(\w|\d)+\*\*)(\w|\d|\s)+/", $text, $matches);
    
    $actualMatches = $matches[0];
    $keys = $matches[1];
    $index = 0;
    
    $targetKeys = ['one', 'three'];
    $results = array_reduce($actualMatches, function($carry, $item) use ($keys, &$index, $targetKeys) {
        $key = $keys[$index];
        $cleanedKey = str_replace("*", "", $key);
        if (in_array($cleanedKey, $targetKeys)) {
            $carry[str_replace("*", "", $key)] = trim(substr($item, strlen($key)));
        }
        $index++;
        return $carry;
    }, []);
    
    
    
    var_dump($results);
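
The two snippets above could also be folded into one function. This is only a sketch of that idea, not the answerer's code: the name extractMarkedSections and the optional $targetKeys parameter are hypothetical, and it assumes every key is wrapped in ** markers as in the sample text. Unlike the pattern above, it uses a lookahead so that punctuation stays in the captured value.

```php
<?php
// Hypothetical helper, not part of the answer above.
// Assumes every key is wrapped in ** markers, e.g. **one**.
function extractMarkedSections(string $text, array $targetKeys = []): array
{
    // Group 1: the key between the markers.
    // Group 2: the text up to the next marked key or the end of the text.
    $pattern = '/\*\*(\w+)\*\*\s*(.*?)(?=\s*(?:\*\*\w+\*\*|$))/s';
    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

    $results = [];
    foreach ($matches as $match) {
        $key = $match[1];
        // An empty $targetKeys means "keep every key found".
        if ($targetKeys === [] || in_array($key, $targetKeys, true)) {
            // A later occurrence of the same key overwrites the earlier one.
            $results[$key] = $match[2];
        }
    }

    return $results;
}
```

Calling extractMarkedSections($text) keeps every key found, while extractMarkedSections($text, ['one', 'three']) keeps only those two.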
    
  2. Here is an approach with preg_match_all() that extracts every segment starting after a key and ending before the next key or the end of the text. The array_column() call then builds the desired associative result; because it is keyed by the matched key, a later match overwrites an earlier one for the same key. (Demo)

    $text = "Lorem Ipsum is simply one dummy text of the printing and  two typesetting industry. Lorem Ipsum has been the industry's one standard dummy text ever since the three 1500s.";
    
    $keys = ['one', 'two', 'three'];
    
    $escaped = implode('|', array_map('preg_quote', $keys));
    
    preg_match_all('#\b(' . $escaped . ')\b\s*\K.*?(?=\s*(?:$|\b(?:' . $escaped . ')\b))#', $text, $m, PREG_SET_ORDER);
    
    var_export(array_column($m, 0, 1));
    

    Output:

    array (
      'one' => 'standard dummy text ever since the',
      'two' => 'typesetting industry. Lorem Ipsum has been the industry\'s',
      'three' => '1500s.',
    )
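
To match the signature asked for in the question, the same pattern could be wrapped in a function. This is a minimal sketch rather than a definitive implementation: the name getKeyedSections is borrowed from the question, the type declarations and the empty-keys guard are additions, and it assumes PHP 7.4 or later, keys that begin and end with word characters (so that \b applies), and sections that may span several lines (hence the added s modifier). It also passes the '#' delimiter to preg_quote(), which matters only if a key contains that character.

```php
<?php
// Sketch only: wraps the pattern from the answer above in the
// function signature used in the question.
function getKeyedSections(string $text, array $keys): array
{
    // Without keys the alternation would be empty, so return early.
    if ($keys === []) {
        return [];
    }

    // Quote each key for use inside a '#'-delimited pattern.
    $alternation = implode('|', array_map(
        fn(string $key): string => preg_quote($key, '#'),
        $keys
    ));

    // \K drops the key and following whitespace from the reported match,
    // so element 0 holds only the value and element 1 holds the key.
    $pattern = '#\b(' . $alternation . ')\b\s*\K.*?(?=\s*(?:$|\b(?:' . $alternation . ')\b))#s';
    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

    // Index the values by key; a repeated key keeps its last value.
    return array_column($matches, 0, 1);
}
```

With the sample $text and $keys from the question, this is expected to return the same three entries as the output shown above.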
    