not closing <option> element properly

Topics: Developer Forum
Dec 10, 2012 at 5:07 PM

var sb = new StringBuilder("<html>\r\t<body>\r\t\t<select>\r");
sb.AppendLine("\t\t\t<option value =\"foobar\" selected=\"selected\">FooBar</option>");
sb.AppendLine("\t\t\t<option value =\"foo\">foo</option>");
sb.AppendLine("\t\t\t<option value =\"bar\">bar</option>");
sb.AppendLine("\t\t</select>\r\t</body>\r</html>");
var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(sb.ToString());
var sb2 = new StringBuilder();
var sw = new StringWriter(sb2);
htmlDoc.Save(sw);
Assert.AreEqual(sb.ToString(), sb2.ToString());
Here is the input:
<html>
	<body>
		<select>
			<option value ="foobar" selected="selected">FooBar</option>
			<option value ="foo">foo</option>
			<option value ="bar">bar</option>
		</select>
	</body>
</html>

This is the output:
<html>
	<body>
		<select>
			<option value="foobar" selected="selected">FooBar
			<option value="foo">foo
			<option value="bar">bar
		</select>
	</body>
</html>
Jan 12, 2013 at 3:57 PM

Blair,

I was running into the same problem myself. My solution was to add these lines before and after I load the HTML into the document (which is when HtmlAgilityPack actually parses the tree):

// HtmlNode.ElementsFlags is an app-wide public static dictionary (a singleton, essentially).
// This tells HtmlAgilityPack to remove the "option" tag's flags.
// The flag being removed is the default registered for "option" (HtmlElementFlag.Empty in the HtmlAgilityPack source).
HtmlElementFlag oldFlags = HtmlNode.ElementsFlags["option"];
HtmlNode.ElementsFlags.Remove("option");

document.LoadHtml(htmlString);
// Or, document.Load(htmlStream);

// This part is not necessary, I just like cleaning up after myself.
HtmlNode.ElementsFlags["option"] = oldFlags;
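
Since HtmlNode.ElementsFlags is static, an exception during parsing would leave the "option" entry removed for the rest of the application. Here is a minimal sketch of the same workaround wrapped in try/finally. The class and method names (OptionFlagWorkaround, LoadWithOptionFlagRemoved) are hypothetical, made up for illustration and not part of HtmlAgilityPack, and the sketch assumes ElementsFlags is a Dictionary<string, HtmlElementFlag>, as it is in the version discussed here:

```csharp
using System.IO;
using HtmlAgilityPack;

// Hypothetical helper, not part of HtmlAgilityPack.
public static class OptionFlagWorkaround
{
    // Parses html with the "option" entry temporarily removed from the
    // static HtmlNode.ElementsFlags dictionary, then restores the entry.
    public static HtmlDocument LoadWithOptionFlagRemoved(string html)
    {
        HtmlElementFlag oldFlags;
        // TryGetValue avoids a KeyNotFoundException if the entry is already gone.
        bool hadFlags = HtmlNode.ElementsFlags.TryGetValue("option", out oldFlags);
        HtmlNode.ElementsFlags.Remove("option");
        try
        {
            var document = new HtmlDocument();
            document.LoadHtml(html);
            return document;
        }
        finally
        {
            // Restore the shared state even if LoadHtml throws.
            if (hadFlags)
            {
                HtmlNode.ElementsFlags["option"] = oldFlags;
            }
        }
    }
}
```

Usage would look something like: var document = OptionFlagWorkaround.LoadWithOptionFlagRemoved(htmlString); followed by document.Save(writer) as in the original test. Keep in mind the dictionary is shared, so this is not safe if other threads are parsing at the same time.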
Hope this works out for you!