Visual Studio Code - VS Code: remove dots from XML tag content

Max
August 14, 2024

I have an XML file in which I need to remove dots (.) from the content of specific tags only. I don't know in advance where the dots are or how many there are (e.g. "12345.6", "1.23.456.7", "ABC.456.98"). For example, if I have:

<?xml version="1.0"?>
<MyData>
   <test>A.123.236</test>
   <tag1>202400.000.0.0.17731</tag1>
   <tag2>some content</tag2>
   <tag3>some.content</tag3>
   <test>dotted.content.123</test>
   <data>
        <test>dsd456.1</test>
        <tag5>some.content</tag5>
   </data>
</MyData>

I want to remove dots within the content of the "test" tag, so:

<?xml version="1.0"?>
<MyData>
   <test>A123236</test>
   <tag1>202400.000.0.0.17731</tag1>
   <tag2>some content</tag2>
   <tag3>some.content</tag3>
   <test>dottedcontent123</test>
   <data>
        <test>dsd4561</test>
        <tag5>some.content</tag5>
   </data>
</MyData>

In VS Code, I can find the content with <test>(.+?)</test>, but I don't know what to put in the Replace field.
Thanks in advance.

Tags: regex visual-studio-code

Answers

- camilo_chart
- August 13, 2024 at 5:52 pm
Are you using the VS Code find/replace option, or can you use a programming language?

I can help you with this Python code:
```
import re

content = """
<?xml version="1.0"?>
<MyData>
   <test>A.123.236</test>
   <tag1>202400.000.0.0.17731</tag1>
   <tag2>some content</tag2>
   <tag3>some.content</tag3>
   <test>dotted.content.123</test>
   <data>
        <test>dsd456.1</test>
        <tag5>some.content</tag5>
   </data>
</MyData>
"""

# Match each whole <test>...</test> element, tags included.
# re.DOTALL lets the content span several lines.
pattern = re.compile(r'<test>.*?</test>', re.DOTALL)

# Remove the dots from every match; the tag names contain no dots,
# so only the element content changes.
fix_content = pattern.sub(lambda m: m.group(0).replace('.', ''), content)
print(fix_content)
```
The pattern matches every whole <test>...</test> element, tags included, and the callback removes the dots from each match.

Matching only the inner content with a capture group (.*?) and then replacing that text with str.replace is fine only if the content is not repeated within other tags, because str.replace changes the same text wherever it appears.

Including the tags in the match avoids this problem. It works as long as the tags don't have dots within their definition.

- WiktorStribiew
- August 14, 2024 at 3:19 pm
In the search and replace panel, with your document open and regular expression mode enabled, you can use a regex like
```
(?<=<test>[^<]*)\.(?=[^<]*</test>)
```
and leave the Replace field empty. The regex matches any . char between <test> and </test> with no < in between these tags.

Details:
- (?<=<test>[^<]*) – a positive lookbehind that matches a location that is immediately preceded by <test> and then any zero or more chars other than <
- \. – a literal dot (an unescaped . would match any character)
- (?=[^<]*</test>) – a positive lookahead that matches a location that is immediately followed by any zero or more chars other than < and then a </test> string.
