I have an XML file, and for specific tag names only I need to remove the dots (.) from the content. I don't know in advance how many dots there are or where they appear (e.g. "12345.6", "1.23.456.7", "ABC.456.98"). For example, if I have:
<?xml version="1.0"?>
<MyData>
<test>A.123.236</test>
<tag1>202400.000.0.0.17731</tag1>
<tag2>some content</tag2>
<tag3>some.content</tag3>
<test>dotted.content.123</test>
<data>
<test>dsd456.1</test>
<tag5>some.content</tag5>
</data>
</MyData>
I want to remove dots within the content of the "test" tag, so:
<?xml version="1.0"?>
<MyData>
<test>A123236</test>
<tag1>202400.000.0.0.17731</tag1>
<tag2>some content</tag2>
<tag3>some.content</tag3>
<test>dottedcontent123</test>
<data>
<test>dsd4561</test>
<tag5>some.content</tag5>
</data>
</MyData>
In VS Code, I can find the content with: <test>(.+?)</test>
but I don't know what to put in the Replace field.
Thanks in advance.
2 Answers
Are you using the VS Code find/replace option, or another programming language?
I can help you with a Python approach:
The algorithm searches for all the content within the <test> tags,
then replaces that content with a copy that has the dots removed.
This works if the content is not repeated within other tags.
If the content is repeated elsewhere, you can drop the capture group, (.*?), and use just .*? instead.
The match then covers the whole text including the tags, so the replacement works as long as the tags don't have dots within their definition.
In the search and replace panel, with your document open and regex mode enabled, you can use a regex like
(?<=<test>[^<]*)\.(?=[^<]*</test>)
and leave the replace field empty.
It matches any . char between the <test> and </test> strings, with no < between these tags.
Details:
(?<=<test>[^<]*) – a positive lookbehind that matches a location that is immediately preceded by <test> and then any zero or more chars other than <
\. – a dot
(?=[^<]*</test>) – a positive lookahead that matches a location that is immediately followed by any zero or more chars other than < and then a </test> string.