I have to deal with Latin plant names and need to style parts of the names coming from the DB. The names are stored as raw text.
Example string: "Androsace angrenica 'Angelica' subsp. Violaceae".
I need to style it like so:
<em>Androsace angrenica</em> 'Angelica' subsp. <em>Violaceae</em>
Some specific words must not be in italics, as shown in the example above; they are listed in the array $toFind.
What I have so far ends up wrapping every single word, except for the ones in the array, in <em></em>, like so:
<em>Androsace</em> <em>angrenica</em> 'Angelica' subsp. <em>Violaceae</em>
I would like to avoid a closing </em> immediately followed by an opening <em>, as in the first part of the name, and instead join those words in one single tag pair as shown in the first example.
# Array of words not to be wrapped in italics
$toFind = ["subsp.", "var.", "f.", "(voir)", "hybride"];
# Plant name
$name = "Androsace angrenica 'Angelica' subsp. Violaceae";
# Make an array of words from the name
$words = explode( " ", $name );
$newWords = [];
foreach( $words as $key => $word ) {
    if( in_array( $word, $toFind )) {
        $newWords[] = $word;
    }else{
        # Catch the word or words surrounded by single quotes like 'Angelica'
        $isHybrid = preg_match_all( "/'([^.]*?)'/", $word, $matches, PREG_PATTERN_ORDER );
        if( $isHybrid ){
            # No tags required
            $newWords[] = $word;
        }else{
            # Tags required for these words
            $newWords[] = "<em>" . $word . "</em>";
        }
    }
}
echo implode(" ", $newWords);
Note that this example name is one of many possibilities, such as:
Allium obliquum
Allium ostrowkianum (voir) A. oreophilum
Allium senescens subsp. glaucum
Allium sikkimense
Androsace × pedemontana
Thanks!
2 Answers
You could consider processing the implode() result: replace every instance of </em> <em> with a single space after imploding $newWords.
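A minimal sketch of that post-processing step, assuming the opening tag is emitted without a trailing space (so the seam between two wrapped words is exactly "</em> <em>"). The $newWords array below is a stand-in for the one built by the loop in the question, and $merged is a name made up for this illustration.

```php
<?php
// Stand-in for the array produced by the foreach loop in the question.
$newWords = [
    "<em>Androsace</em>",
    "<em>angrenica</em>",
    "'Angelica'",
    "subsp.",
    "<em>Violaceae</em>",
];

// Join the words, then collapse each "</em> <em>" seam into a plain space
// so that adjacent italic words share one tag pair.
$merged = str_replace("</em> <em>", " ", implode(" ", $newWords));

echo $merged;
// prints: <em>Androsace angrenica</em> 'Angelica' subsp. <em>Violaceae</em>
```

This keeps the existing loop untouched and only changes the final echo.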
Your task logic is a blend of literal and non-literal word exclusions. The truth is that you don't need to explode() the string into a temporary array, compare each word against a blacklist array, then use a regex to conditionally exclude single-quote-wrapped words, then implode the potentially mutated words again.
It will be much more direct to prepare a single regex pattern with a negative lookahead to exclude disqualified words. Then preg_replace() is the best single-call tool to execute your logic.
Code:
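A minimal sketch of this approach, offered as an illustration and not necessarily the exact code of this answer. It assumes the $toFind list from the question, straight single quotes around cultivar names, and one-word quoted names (a multi-word name such as 'Angel Wings' would need a different exclusion rule). The names $word and $pattern are made up here.

```php
<?php
// Literal tokens that must stay upright (the question's $toFind list).
$toFind = ["subsp.", "var.", "f.", "(voir)", "hybride"];

// Escape the literals for use in a regex, then append non-literal exclusions.
$prepped = array_map(fn(string $token): string => preg_quote($token, '~'), $toFind);
$prepped[] = "'[^']*'"; // a single-quoted cultivar name such as 'Angelica'

// One "word": a run of non-space characters that is not an excluded token.
// The inner (?!\S) makes sure the excluded token is the whole word.
$word = '(?!(?:' . implode('|', $prepped) . ')(?!\S))\S+';

// A run of such words separated by whitespace, starting at a word start.
$pattern = '~(?<!\S)' . $word . '(?:\s+' . $word . ')*~u';

$name = "Androsace angrenica 'Angelica' subsp. Violaceae";

// Wrap every run as a whole, so consecutive italic words share one tag pair.
echo preg_replace($pattern, '<em>$0</em>', $name);
// prints: <em>Androsace angrenica</em> 'Angelica' subsp. <em>Violaceae</em>
```

Because each match covers a whole run of qualifying words, no "</em> <em>" seam is ever produced and no clean-up pass is needed.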
Output:
<em>Androsace angrenica</em> 'Angelica' subsp. <em>Violaceae</em>
Not only is this more direct and more concise, if you want to extend the literal or non-literal exclusions, you don't need to modify the pattern, only the $prepped array.
Pattern breakdown:
Here is a demo using your entire sample data set.
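A sketch of such a run over the sample names from the question, with the pattern written in extended (x) mode so that each part carries its own comment. The helper name italicizeName() is hypothetical, and the same assumptions apply as for any lookahead-based version: straight single quotes and one-word quoted cultivar names.

```php
<?php
// Hypothetical helper: wraps each run of non-excluded words in one <em> pair.
function italicizeName(string $name, array $toFind): string
{
    // Escape the literal tokens, then add the non-literal exclusion.
    $prepped = array_map(fn(string $t): string => preg_quote($t, '~'), $toFind);
    $prepped[] = "'[^']*'"; // quoted cultivar name, straight quotes only
    $excluded = implode('|', $prepped);

    $pattern = "~
        (?<!\\S)                          # only start at the beginning of a word
        (?!(?:$excluded)(?!\\S)) \\S+      # a word that is not an excluded token
        (?: \\s+                          # then whitespace
            (?!(?:$excluded)(?!\\S)) \\S+  # followed by another non-excluded word
        )*                               # repeated, so the whole run is one match
    ~xu";

    return preg_replace($pattern, '<em>$0</em>', $name);
}

$toFind = ["subsp.", "var.", "f.", "(voir)", "hybride"];
$names = [
    "Allium obliquum",
    "Allium ostrowkianum (voir) A. oreophilum",
    "Allium senescens subsp. glaucum",
    "Allium sikkimense",
    "Androsace × pedemontana",
    "Androsace angrenica 'Angelica' subsp. Violaceae",
];

foreach ($names as $name) {
    echo italicizeName($name, $toFind), PHP_EOL;
}
```

With the list as given, the hybrid sign × stays inside the italics because it is not in $toFind; adding "×" to that array would leave it upright without touching the pattern.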