
body node not parsed when head not closed

description

I have run into documents where the <head> tag is never closed. In that case the parser does not detect the <body> element, so it is missing from the HtmlDocument tree.

Example
<html>
<head>
   <title>this head is wrong</title>

<body>
 <p>example</p>
</body>
</html>
As a workaround, before loading the HtmlDocument I check whether the text contains a </head> tag and, if it does not, insert one directly in front of <body:
if (!html.Contains("</head>"))
    html = html.Replace("<body", "</head><body");
It would be nice to have this fixed in the parser itself.
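The same workaround can be made a little more tolerant of real-world markup. The following is only a sketch, not part of HtmlAgilityPack: HeadFixer and EnsureHeadClosed are hypothetical names, and the helper assumes that the first "<body" in the text is the real body tag (it does not look inside comments or scripts). It matches tags case-insensitively and patches only that first occurrence.

```csharp
using System;

// Hypothetical helper, not part of HtmlAgilityPack.
static class HeadFixer
{
    // Returns the markup with "</head>" inserted before the first "<body" tag
    // when no closing head tag is present; otherwise returns it unchanged.
    public static string EnsureHeadClosed(string html)
    {
        // A closing head tag already exists (any casing): nothing to do.
        if (html.IndexOf("</head", StringComparison.OrdinalIgnoreCase) >= 0)
            return html;

        // Find the first body tag, with or without attributes, in any casing.
        int bodyIndex = html.IndexOf("<body", StringComparison.OrdinalIgnoreCase);
        if (bodyIndex < 0)
            return html; // no body tag to anchor on

        // Assumption: this first "<body" is the real body tag.
        return html.Insert(bodyIndex, "</head>");
    }
}
```

The result would then be passed to the parser as before, for example doc.LoadHtml(HeadFixer.EnsureHeadClosed(html)). Unlike Replace, this touches a single position, so a later "<body" appearing in a script or comment is left alone.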

comments