HtmlAgilityPack closes the form tag automatically

Topics: Developer Forum, User Forum
Aug 18, 2011 at 10:00 AM

I am trying to parse an HTML file with this structure:

<div><form>...</div>...</form>

The problem is that HtmlAgilityPack automatically closes the form tag before the closing div tag:

<div><form>...</form></div>...</form>

As a result, I lose many of the form's elements. I have already tried:

htmlDoc.OptionFixNestedTags = false;
htmlDoc.OptionAutoCloseOnEnd = false;
htmlDoc.OptionCheckSyntax = false;
HtmlNode.ElementsFlags.Remove("form");
HtmlNode.ElementsFlags.Add("form", HtmlElementFlag.CanOverlap);
HtmlNode.ElementsFlags.Add("div", HtmlElementFlag.CanOverlap);

But nothing helps!

Thanks for your help!

Aug 19, 2011 at 12:18 AM

Mate, I think this markup is not correct.

 

<div>
	<form>
	</div>
</form>

Should be:

<div>
	<form>
	</form>
</div>

Perhaps that's causing the error.

 

Aug 19, 2011 at 6:30 AM

Well, I am parsing a site that I have no control over...

Sep 6, 2011 at 10:08 PM
Edited Sep 6, 2011 at 10:30 PM

HtmlAgilityPack turns out to be... not so agile.

If "The parser is very tolerant with 'real world' malformed HTML", as the front page of this project says, then why can't it handle the simple task of retrieving the inputs that belong to a particular form?

Sep 6, 2011 at 10:36 PM

@dovydasm, stop whining and try to find a solution for that.