I need to write an ASP.NET C# function that removes all image attributes except "src", "align", "alt" and "title". The function must only change content inside image tags. The input is HTML used for displaying articles, in which I need to clean up the image attributes.
public static string FixImageAttributes(string htmlString)
{
    // Remove all attributes from the image tags in htmlString here, except: "src", "align", "alt" and "title".
    return htmlString;
}
Example:
If the function input is this:
<html>
<body>
<div>
<h1>Some html here</h1>
<p><img align="right" title="" border="0" hspace="7" alt="" vspace="7" src="/upload/content/images/bla/bla/test.jpg"></p>
</div>
<div>
<h2>Lorem impum</h2>
<p><img src="/upload/content/test/blah/image.jpg" width="624" height="255" alt="Text here" title="Hello" border="0" vspace="0" hspace="0"></p>
</div>
</body>
</html>
The function output should be this:
<html>
<body>
<div>
<h1>Some html here</h1>
<p><img align="right" title="" alt="" src="/upload/content/images/bla/bla/test.jpg"></p>
</div>
<div>
<h2>Lorem impum</h2>
<p><img src="/upload/content/test/blah/image.jpg" alt="Text here" title="Hello"></p>
</div>
</body>
</html>
2 Answers
You can use HtmlAgilityPack for this: load the HTML, select every img node, and remove each attribute whose name is not in the list of allowed names.
More about HtmlAgilityPack can be found here: https://html-agility-pack.net/?z=codeplex
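A minimal sketch of the HtmlAgilityPack approach follows. It is not the snippet originally posted with this answer. It assumes the HtmlAgilityPack NuGet package is referenced; the class name ImageAttributeCleaner and the field name AllowedAttributes are made up for illustration, and the parameter is named htmlString because html-string is not a valid C# identifier.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // assumption: NuGet package "HtmlAgilityPack" is installed

// Hypothetical helper class; the name is illustrative only.
public static class ImageAttributeCleaner
{
    // Attribute names to keep on <img> tags; compared case-insensitively.
    private static readonly HashSet<string> AllowedAttributes =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "src", "align", "alt", "title"
        };

    public static string FixImageAttributes(string htmlString)
    {
        if (string.IsNullOrEmpty(htmlString))
            return htmlString;

        var doc = new HtmlDocument();
        doc.LoadHtml(htmlString);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var images = doc.DocumentNode.SelectNodes("//img");
        if (images == null)
            return htmlString;

        foreach (var img in images)
        {
            // Snapshot the attributes to drop first, so the collection
            // is not modified while it is being enumerated.
            var toRemove = img.Attributes
                .Where(a => !AllowedAttributes.Contains(a.Name))
                .ToList();

            foreach (var attribute in toRemove)
                attribute.Remove();
        }

        // Serialize the whole document back to a string.
        return doc.DocumentNode.OuterHtml;
    }
}
```

Only img nodes are touched, so attributes on other elements are left alone. Because the HTML is parsed and written back out, the parser may normalize some of the surrounding markup, so it is worth comparing the result against a few real articles before relying on it.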
I modified Ran Turner’s answer a bit:
Output from console: