Context and Explanation
I am writing a Telegram bot, and I want to add the escape character "\" before every "_" character that is not part of a username (a word starting with "@", like "@username_"), to prevent Markdown errors (in Telegram, the "_" character is used to make a string italic).
So, for example, given this string:
"hello i like this char _ write me lol_ @myusername_"
I want only the first two "_" characters to be matched, but not the third.
Question
What's the correct way to do this with a regex pattern?
Expected Conditions and Matching
Condition | Match |
---|---|
"_" alone: ("_") | YES |
"_" in a word without "@": ("lol_") | YES |
"_" in a word starting with "@": ("@username_") | NO |
"_" in a word containing "@", after the "@": ("lol@username_") | NO |
"_" in a word containing "@", before the "@": ("lol_@username") | YES |
"_" in a word like: ("lol_@username_") | first: YES, second: NO |
What I have tried
So far I have arrived at this, but it does not work properly:
"(?=[^@]+)(?:\s[^\s]*(_)[^\s]*\s)"
EDIT
In the string "lol_@username_", I also want the first "_" character to be matched.
4 Answers
I assume you only care about @ being at the start of a word. You can use re.sub along with replace and the pattern (?:\s|^)[^@]\S+\b to match the words that fit your spec.
If you care about @ appearing anywhere in a word, try (?:\s|^)[^@\s]+\b instead.
Per the OP's comment, it sounds like the latest spec is to escape every _ except those that come after @ in a word.
Extract with the PyPI regex library:
See Python proof.
Explanation
Remove with re: See Python proof.
Replace with re: See another Python proof.
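For the replace step, one way to sketch it with only the standard re module is to visit each whitespace-delimited word and escape underscores only in the part before the first "@". This is an illustrative sketch, not the pattern from the answer above; the helper name escape_before_at is hypothetical.

```python
import re


def escape_before_at(text: str) -> str:
    """Escape "_" everywhere except after an "@" within the same word."""

    def fix_word(match: re.Match) -> str:
        # Split the word at the first "@"; only the head may be escaped.
        head, at, tail = match.group(0).partition("@")
        return head.replace("_", r"\_") + at + tail

    # \S+ visits each run of non-whitespace characters (one "word").
    return re.sub(r"\S+", fix_word, text)


print(escape_before_at("hello i like this char _ write me lol_ @myusername_"))
# hello i like this char \_ write me lol\_ @myusername_
```

Because whitespace is never matched, the spacing of the original message is left untouched.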
You could match all non-whitespace characters after matching @, and capture the _ in a group using an alternation. In the callback of re.sub, check if group 1 exists. If it does, return an escaped underscore (or the escaped group 1 value, which is also an underscore); otherwise, return the match to leave it unchanged.
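A minimal sketch of this idea follows. The exact pattern is not shown in the text above, so @\S+|(_) is an assumption that fits the description, and the function name escape_underscores is hypothetical.

```python
import re

# Left branch: "@" plus the rest of the word, consumed so its "_" are skipped.
# Right branch: any other "_", captured in group 1.
PATTERN = re.compile(r"@\S+|(_)")


def escape_underscores(text: str) -> str:
    def replace(match: re.Match) -> str:
        if match.group(1):
            # Group 1 took part in the match: this is an underscore to escape.
            return "\\" + match.group(1)
        # Otherwise the "@..." branch matched: return it unchanged.
        return match.group(0)

    return PATTERN.sub(replace, text)


print(escape_underscores("hello i like this char _ write me lol_ @myusername_"))
# hello i like this char \_ write me lol\_ @myusername_
```

The scan runs left to right, so once "@" is reached the rest of that word is swallowed by the first branch and its underscores never get a chance to match the second one.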
Regex demo
Output
Based on @OlvinRoght’s comment, with a small edit, this should do the trick:
Regex
((?:^|\s)(?:[^@\s]*?))(_)((?:[^@\s]*?))(?=@|\s|$)
Code example
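The original snippet is not available here, so the following is only a sketch of how the regex above could be applied with re.sub. The function name escape_with_groups is hypothetical.

```python
import re

# Group 1: start of the word (with its leading whitespace) up to the underscore.
# Group 2: the underscore itself.
# Group 3: the rest of the word up to "@", whitespace, or end of string.
PATTERN = re.compile(r"((?:^|\s)(?:[^@\s]*?))(_)((?:[^@\s]*?))(?=@|\s|$)")


def escape_with_groups(text: str) -> str:
    # Rebuild each match with a backslash inserted before the underscore.
    return PATTERN.sub(
        lambda m: m.group(1) + "\\" + m.group(2) + m.group(3), text
    )


print(escape_with_groups("hello i like this char _ write me lol_ @myusername_"))
# hello i like this char \_ write me lol\_ @myusername_
```

One caveat of this sketch: each match covers the whole part of a word before "@", so when a word contains several underscores there (for example "a_b_c"), a single pass escapes only the first one.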
Expected output:
Demo
See it live
Note:
Although this works, @TheFourthBird’s answer is faster. (And more elegant I think.)