My text is as follows:
9/91 a1 2a cx.papaya 94000
9/92 b2 3a x44b mango 10220
9/93 3 3a x333 pineapple
9/94 x4 cx.apple 94000
9/95 5 55 cyz cx.orange
I am trying to write a regex that extracts the parts shown in the table below, but it is not working.
My regex is `^[0-9/]+.*\s(.*)\s(\d{5})$`.
This is my expectation:
Group 1 | Group 2 | Group 3 |
---|---|---|
9/91 a1 2a | papaya | 94000 |
9/92 b2 3a x44b | mango | 10220 |
9/93 3 3a x333 | pineapple | |
9/94 x4 | apple | 94000 |
9/95 5 55 cyz | orange | |
4 Answers
Probably something like this might help:
You can experiment with regular expressions on sites like https://regex101.com/.
Here is my attempt:
Demo: regex101
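A minimal sketch of this attempt in Python, assembled from the pieces named in the explanation that follows. Python's `re` module has no `\h`, so `[ \t]` stands in for it here; that substitution and the helper name `first_attempt_groups` are assumptions of this sketch, not part of the original pattern.

```python
import re

# [ \t] stands in for PCRE's \h (horizontal whitespace), which Python's re lacks.
FIRST_ATTEMPT = re.compile(
    r"^(\d+/\d+[ \t]x\d+)"  # group 1: digits, slash, digits, space, x, digits
    r"[ \t](?:\w+\.)?"      # a space, then an optional prefix such as cx.
    r"(\w+)"                # group 2: one or more word characters
    r"[ \t]?(\d+)?$"        # optional space and optional group 3 (digits)
)


def first_attempt_groups(line):
    """Return the three groups as a tuple, or None when the line does not match."""
    match = FIRST_ATTEMPT.match(line)
    return match.groups() if match else None


print(first_attempt_groups("9/94 x4 cx.apple 94000"))
print(first_attempt_groups("9/91 a1 2a cx.papaya 94000"))
```

The second call returns `None`: `a1 2a` does not fit the single `x\d+` token, so lines of that shape need a more general first group.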
Explanation:
- `^`: start anchor
- `(\d+\/\d+\hx\d+)`: first capturing group, matches the pattern `9/91 x1` (one or more digits `\d+`, an escaped slash `\/`, one or more digits `\d+`, a space `\h`, the character `x`, one or more digits `\d+`)
- `\h(?:\w+\.)?`: a space `\h` followed by a non-capturing group that matches the optional pattern `cx.`
- `(\w+)`: second capturing group, matches word characters `\w+` one or more times
- `\h?(\d+)?`: third capturing group (which is optional): an optional space `\h?`, then the optional capturing group `(\d+)?`
- `$`: end anchor

Update: OP changed their question, so this is my new attempt:
Thanks to @The fourth bird for removing the trailing space from the third capturing group.
Demo: regex101
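One pattern consistent with the two changes listed just below, sketched in Python. `[ \t]` replaces `\h` because Python's `re` does not support it, and wrapping the space and digits as `(?:[ \t](\d+))?` is an assumption about how the trailing space was kept out of the third group. The helper name `updated_groups` is hypothetical.

```python
import re

# [ \t] stands in for PCRE's \h (horizontal whitespace), which Python's re lacks.
UPDATED = re.compile(
    r"^(\d+/\d+(?:[ \t]\w+)+)"  # group 1: 9/91 plus one or more tokens such as a1 2a
    r"[ \t](?:\w+\.)?"          # a space, then an optional prefix such as cx.
    r"([a-zA-Z]+)"              # group 2: letters only
    r"(?:[ \t](\d+))?$"         # group 3: optional number, space kept outside the group
)


def updated_groups(line):
    """Return the three groups as a tuple, or None when the line does not match."""
    match = UPDATED.match(line)
    return match.groups() if match else None


LINES = [
    "9/91 a1 2a cx.papaya 94000",
    "9/92 b2 3a x44b mango 10220",
    "9/93 3 3a x333 pineapple",
    "9/94 x4 cx.apple 94000",
    "9/95 5 55 cyz cx.orange",
]

for line in LINES:
    print(updated_groups(line))
```

When a line has no trailing number, the third element of the tuple is `None`.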
- Added `(?:\h\w+)+` to the first capturing group to match multiple character groups like `a1 2a` after `9/91`
- Changed the pattern `\w+` to `[a-zA-Z]+` to match only letters.

You forgot to create a group for the first part and to account for the `x` sequence. You should also make the last part optional and account for the optional leading prefix in your second part. The result of those changes could look like this:

You can add the lazy group `(?: \w+)+?` to the first group to reflect the additional trailing sequence in your changed question:

Since you also tagged php, I will provide a PHP solution without a regex for your problem, so you can check it out as an alternative.
The output of the `group` variables is as expected.

Basically your strings have a pattern.
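As a rough sketch of one way to use that pattern without a regex (written in Python here rather than PHP, and not necessarily the approach taken above): split the line on spaces, treat a trailing five-digit token as group 3, take the next token from the end as group 2 after dropping any prefix up to a dot, and join whatever remains as group 1. The function name `split_line` and the five-digit check are assumptions for illustration.

```python
def split_line(line):
    """Split one input line into (group 1, group 2, group 3) without a regex."""
    tokens = line.split()

    # Group 3: an optional trailing five-digit number.
    number = None
    if tokens[-1].isdigit() and len(tokens[-1]) == 5:
        number = tokens.pop()

    # Group 2: the last remaining token, minus any prefix such as "cx.".
    word = tokens.pop().rpartition(".")[2]

    # Group 1: everything that is left, joined back with single spaces.
    return " ".join(tokens), word, number


for line in [
    "9/91 a1 2a cx.papaya 94000",
    "9/93 3 3a x333 pineapple",
    "9/95 5 55 cyz cx.orange",
]:
    print(split_line(line))
```

Working from the end of the line avoids having to know how many tokens group 1 contains.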
I hope it helps you.
(updated my answer based on your latest input change in your question)