C# - Replacing back slash character

Dec 1, 2009 at 8:47 PM

HtmlDocument.Load fails with an illegal character error. We get the markup from an external source, so I have to parse it and clean it up. Eventually I want to extract the <table>. What is the correct way to strip out the "\"? Thanks.

HtmlAgilityPack.HtmlNodeCollection table = doc.DocumentNode.SelectNodes("//table");

result = result.Replace("\t", "");
result = result.Replace("\r", "");
result = result.Replace("\n", "");
result = result.Replace("\\", "");
result = result.Replace("\\\\", "");

<html><head><meta http-equiv=\"Content-type\" content=\"text/html;charset=UTF-8\" /><script>function Send() {if(document.mainForm.htmlArea && document.mainForm.htmlArea.value) {var rawHtml = document.mainForm.htmlArea.value;if(rawHtml) {var htmlContainerObj = false;htmlContainerObj = parent.document.getElementById(\"hortonQV_Content\");htmlContainerObj.innerHTML = rawHtml;}}return (true);}</script></head><body onload=\"Send()\"><form name=\"mainForm\"><textarea id=\"htmlArea\" cols=\"200\" rows=\"200\" > <div class=\"hortonQVQuest\">What Is Your Finish Of Choice For An Automatic Door?</div><table cellpadding=\"0\" cellspacing=\"0\"><tr class=\"hortonQVAns1\"><td class=\"hortonQVcell\">Anodized Aluminum</td><td class=\"hortonQVpercent\">37%</td><td class=\"hortonQVTotal\">14,539</td></tr><tr class=\"hortonQVAns2\"><td class=\"hortonQVcell\">Powdercoat Paint</td><td class=\"hortonQVpercent\">28%</td><td class=\"hortonQVTotal\">11,287</td></tr><tr class=\"hortonQVAns3\"><td class=\"hortonQVcell\">Steel</td><td class=\"hortonQVpercent\">35%</td><td class=\"hortonQVTotal\">13,895</td></tr><tr><td class=\"hortonQVTotalSum\" colspan=\"3\">Total Votes: 39,721</td></tr></table> </textarea> </form></body></html>

Dec 2, 2009 at 11:30 AM

OK, I suppose the best way would be to replace each \" in the string with a plain " before parsing it. Why are you getting data with escaped double quotation marks?
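
Something like the following might do it. This is only a minimal sketch: the class and method names (MarkupCleaner, UnescapeQuotes) are made up for illustration, and it assumes the backslashes are literally present in the string at runtime rather than just being how the debugger displays it. It replaces only the backslash-plus-quote pair and leaves every other backslash alone.

```csharp
using System;

// Hypothetical helper, not part of Html Agility Pack.
public static class MarkupCleaner
{
    // Replaces each escaped quote (a backslash followed by a double quote)
    // with a plain double quote. Other backslashes are left untouched.
    public static string UnescapeQuotes(string raw)
    {
        if (raw == null) throw new ArgumentNullException(nameof(raw));

        // In C# source, "\\\"" is the two-character string \" and "\"" is a single ".
        return raw.Replace("\\\"", "\"");
    }
}
```

The cleaned string can then be passed to HtmlDocument.LoadHtml before calling SelectNodes("//table"). Note that SelectNodes returns null when nothing matches, so the result should be checked before use.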

Dec 2, 2009 at 2:39 PM

The customer's ad agency is supplying the markup. They normally deliver it via iframes, but the JavaScript they use violates cross-domain rules, so it fails. I'm trying to work around that by removing the iframes, which means I have to parse out the markup I need.