Given a string such as:

"The user foo_bar has a Twitter account: https://twitter.com/foo_bar"

In order to be sent in markdown mode by the Telegram bots API, it should be formatted as:

"The user foo\_bar has a Twitter account: [https://twitter.com/foo_bar]"

(Adding the [] around the URL can be done with a regex.)

Is it possible to write a function in Python that can escape certain characters such as _ or * in a text, but only when these characters are not contained within a URL?

Here is an example without checking character location:

import re

original_text = 'The user foo_bar has a Twitter account: https://twitter.com/foo_bar'
# Wrap every URL in square brackets.
formatting_url = re.sub(
    'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*(),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', r'[\g<0>]', original_text)
# Escape * and _ everywhere, including inside the URL.
escaping_char = formatting_url.replace('*', '\\*').replace('_', '\\_')
print(escaping_char)

Output:

The user foo\_bar has a Twitter account: [https://twitter.com/foo\_bar]

Here the _ inside the URL is escaped as well, which is not what I want.

2 Answers


  1. First add brackets around the URL using a regex. Then iterate over each character of the string and add an escape character before every _ that is outside a URL. Set a flag whenever you see [ and clear it when you see ], so you know whether you are inside a URL:

    s = "The user foo_bar has a Twitter account: [https://twitter.com/foo_bar]"
    in_url = False
    output = ""
    for letter in s:
        if letter == "[":
            in_url = True
            output += letter
        elif letter == "]":
            in_url = False
            output += letter
        elif letter == "_":
            if in_url:
                output += "_"    # inside the URL: leave unchanged
            else:
                output += "\\_"  # outside the URL: escape
        else:
            output += letter
    print(output)
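
The two steps above (bracket the URLs, then escape only outside them) can also be combined into one function. The following is a minimal sketch, not a definitive implementation: the name `escape_outside_urls`, the `chars` parameter, and the simplified URL pattern `https?://\S+` are assumptions made for illustration. The pattern will also swallow punctuation that directly follows a URL, so tighten it for real input.

```python
import re

# Simplified URL pattern (an assumption; adjust it to the URLs you expect).
URL_RE = re.compile(r'(https?://\S+)')


def escape_outside_urls(text, chars='_*'):
    """Wrap each URL in [] and backslash-escape `chars` everywhere else."""
    escape_re = re.compile('([' + re.escape(chars) + '])')
    # Splitting on a pattern with one capture group keeps the matches:
    # even indices are plain text, odd indices are the captured URLs.
    parts = URL_RE.split(text)
    out = []
    for i, part in enumerate(parts):
        if i % 2 == 1:
            out.append('[' + part + ']')              # URL: bracket, do not escape
        else:
            out.append(escape_re.sub(r'\\\1', part))  # plain text: escape
    return ''.join(out)


print(escape_outside_urls(
    'The user foo_bar has a Twitter account: https://twitter.com/foo_bar'))
```

Because the URLs are separated out before any escaping happens, no flag is needed and the original text may contain [ or ] without confusing the logic.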
    
    
  2. If you are using python-telegram-bot, there is a helper for escaping text (for both Markdown versions 1 and 2). It escapes the whole string, URLs included, so if you need to add a link I suggest formatting it separately and concatenating the strings.

    >>> from telegram.utils.helpers import escape_markdown
    
    >>> a = "The user foo_bar has a Twitter account: https://twitter.com/foo_bar"
    
    >>> escape_markdown(a)
    'The user foo\\_bar has a Twitter account: https://twitter.com/foo\\_bar'
    
    >>> escape_markdown(a, version=2)
    'The user foo\\_bar has a Twitter account: https://twitter\\.com/foo\\_bar'
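
The "format separately and concatenate" suggestion could look roughly like the sketch below. To keep it runnable without the library, `escape_v1` is a hypothetical stand-in that escapes the Markdown version 1 characters (_ * ` [); in a real bot you would call the library's `escape_markdown` instead. `build_message` is likewise an illustrative name, and the `[url]` form follows the format given in the question.

```python
import re


def escape_v1(text):
    # Hypothetical stand-in for escape_markdown(text) with Markdown version 1:
    # put a backslash before each of _ * ` [
    return re.sub(r'([_*`\[])', r'\\\1', text)


def build_message(prose, url):
    # Escape only the prose, then append the bracketed URL untouched.
    return escape_v1(prose) + '[' + url + ']'


print(build_message('The user foo_bar has a Twitter account: ',
                    'https://twitter.com/foo_bar'))
```

This only works when the prose and the URL are available as separate values; if the URL is already embedded in the text, it has to be extracted first.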
    