I am working with some raw strings to avoid escape characters and came across this funny syntax highlighting in VS Code. I apologize if this is a bad question; I am merely curious about why the question marks are highlighted differently (compare s1 and s2). If it helps, I am using the GitHub Dark Default theme.
Here’s some code for your copy-paste purposes:
s1 = "hello?"
s2 = r"hello?"
s3 = r"hello\?"
Printing these strings gives, as expected, the following output:
hello?
hello?
hello\?
2 Answers
In VS Code, you can open the command palette (Ctrl+Shift+P) and search for Developer: Inspect Editor Tokens and Scopes.
Then, when you inspect the question mark inside your raw string, you can see that TextMate is running the show behind the scenes and that the question mark is being highlighted as regex.
More information about Textmate regex: https://macromates.com/manual/en/regular_expressions
While this doesn’t explain why regex highlighting is in play for a raw string, I would assume the choice was made because raw strings are often used for regex. This is admittedly odd highlighting if, instead of regex, you are putting file paths in your raw string.
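As a rough illustration of the two uses mentioned here, the sketch below puts a regex pattern and a file path in raw strings. The pattern, the word list, and the path are made-up values for illustration only; the point is that the highlighter treats both as regex, while Python itself only sees ordinary strings.

```python
import re

# Raw string used as a regex pattern: here "?" is a quantifier,
# so the "u" before it is optional.
pattern = r"colou?r"
matches = [w for w in ["color", "colour", "colr"] if re.fullmatch(pattern, w)]
print(matches)

# Raw string used as a Windows-style path (hypothetical path):
# the backslashes stay literal, so "\t" and "\n" are not escapes here.
path = r"C:\temp\notes"
print(path)
print(len(path))
```

In both cases the r prefix only changes how backslashes are read in the source code; whether the result is "a regex" depends entirely on what you pass it to.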
Why (in terms of mechanics)
You can inspect the token scopes in VS Code using the Developer: Inspect Editor Tokens and Scopes command in the command palette.
For the ? in "?", you’ll see the following TextMate scopes: string.quoted.single.python, source.python.
For the ? in r"?", you’ll see keyword.operator.quantifier.regexp, string.regexp.quoted.single.python, source.python.
For the \? in r"\?", you’ll see constant.character.escape.regexp, string.regexp.quoted.single.python, source.python.
For all three, you’ll see that the language mode is Python (i.e. there’s no language embedding going on here).
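The three cases just listed can be summarized as a small lookup. This is only a toy sketch, not the actual TextMate grammar: scopes_for is a hypothetical helper, and it knows nothing beyond the scope names reported by the inspector for these three tokens.

```python
# Toy model of the inspector results, NOT the real MagicPython grammar.
# scopes_for is a hypothetical helper that mirrors the three observed cases.
def scopes_for(is_raw: bool, token: str) -> list[str]:
    if not is_raw:
        # Plain string literal: the "?" is just part of the string.
        return ["string.quoted.single.python", "source.python"]

    # Raw string literal: the contents are tokenized as a regex.
    scopes = ["string.regexp.quoted.single.python", "source.python"]
    if token.startswith("\\"):
        # A backslash sequence such as \? is scoped as a regex escape.
        return ["constant.character.escape.regexp"] + scopes
    if token == "?":
        # A bare "?" is scoped as a regex quantifier.
        return ["keyword.operator.quantifier.regexp"] + scopes
    return scopes


print(scopes_for(False, "?"))
print(scopes_for(True, "?"))
print(scopes_for(True, "\\?"))
```

The theme then picks a color per scope, which is why the same character can look different depending on the string prefix.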
You can find the TextMate grammars for those token scopes in the following two files:
https://github.com/microsoft/vscode/blob/main/extensions/python/syntaxes/MagicPython.tmLanguage.json (source.python, string.quoted.single.python, keyword.operator.quantifier.regexp, string.regexp.quoted.single.python, constant.character.escape.regexp)
https://github.com/microsoft/vscode/blob/main/extensions/python/syntaxes/MagicRegExp.tmLanguage.json (keyword.operator.quantifier.regexp, constant.character.escape.regexp)
Why (in terms of software design)
I echo JNevill’s thoughts. Python has raw string literals, but it has no "regexp string literals", and nothing says you can’t write regexp strings using non-raw string literals in Python. The regex TextMate scopes inside Python raw string literals are probably provided by VS Code because raw string literals are a convenient way to write regexp strings in Python.
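A minimal sketch of that point, using a made-up pattern: the raw and non-raw spellings below produce the same string, so Python's re module cannot tell them apart. Only the editor's highlighting differs.

```python
import re

# Same pattern written two ways (illustrative pattern: digits followed by "?").
raw = r"\d+\?"          # raw literal: backslashes are kept as-is
non_raw = "\\d+\\?"     # non-raw literal: each backslash must be doubled

# Both literals evaluate to the identical string object contents.
print(raw == non_raw)

# Either spelling works as a regex.
print(re.fullmatch(raw, "42?") is not None)
print(re.fullmatch(non_raw, "42?") is not None)
```

The raw form is simply easier to read and write, which is a plausible reason for an editor to assume regex content inside it.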