Can only load xml document from file system, not from web

Topics: Developer Forum
Jan 7, 2011 at 3:49 PM

I've used HAP successfully before to download XHTML pages from the web. Now I'm trying to load and parse XML documents, but HAP will only load XML documents that are located on my file system, "C:\xml\MyXml.xml" for instance. It will not load them from the web. Using Fiddler, I can see that HAP does request the XML document over the web, and that the server responds with it. However, it stops there and nothing gets parsed: the HtmlDocument is empty, with no ChildNodes or anything. When loading from the file system, the document is parsed into an HtmlDocument successfully.


Any clues?

Jan 11, 2011 at 8:43 PM

What are you using? HtmlWeb or HtmlDocument?

HtmlDocument.Load(string path) uses StreamReader to read a file from the file system.


var doc = new HtmlDocument();
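
Since HtmlDocument.Load(string path) only reads from the file system, one possible workaround is to download the document yourself and hand the text to the parser. The following is only a sketch under assumptions: the WebXmlLoader class, its method names, and any URL you pass in are hypothetical, and it relies on WebClient.DownloadString and HtmlDocument.LoadHtml.

```csharp
using System;
using System.Net;
using HtmlAgilityPack;

// Hypothetical helper for illustration; not part of HtmlAgilityPack.
static class WebXmlLoader
{
    // Parse markup that is already in memory.
    public static HtmlDocument Parse(string markup)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(markup);
        return doc;
    }

    // Download the response body ourselves, so the Content-Type
    // of the response plays no part in whether it gets parsed.
    public static HtmlDocument LoadFromWeb(string url)
    {
        using (var client = new WebClient())
        {
            // DownloadString decodes using client.Encoding; set it
            // explicitly if the server does not send the default encoding.
            string markup = client.DownloadString(url);
            return Parse(markup);
        }
    }
}
```

Calling WebXmlLoader.LoadFromWeb with the address of the XML document should then give a document whose DocumentNode has child nodes, without touching HtmlWeb at all.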

Feb 15, 2012 at 5:57 PM

Probably much too late to help you, but I ran into this problem as well. I will post what I found in the hope of helping others who come across this.

In HtmlWeb.cs I added the second condition, the "text/xml" check:


private bool IsHtmlContent(string contentType)
{
	return contentType.ToLower().StartsWith("text/html") || contentType.ToLower().StartsWith("text/xml");
}


This matters in the following part of HtmlWeb.cs. Notice that doc is never loaded if "html" is false, and html will be false because the ContentType of an XML document served over the web will be "text/xml". That is why I made the change above.

// try to work in-memory
if ((doc != null) && (html))
{
	if (respenc != null)
		doc.Load(s, respenc);
	else
		doc.Load(s, true);
}
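
A possible variation on the IsHtmlContent change, in case the server labels its XML differently: some servers send "application/xml" or "application/xhtml+xml" rather than "text/xml". The sketch below is an assumption on my part, not HAP code. The ContentTypeCheck class and IsParseableContent name are hypothetical, and it compares the media type exactly instead of using StartsWith.

```csharp
using System;

// Hypothetical helper for illustration; not part of HtmlAgilityPack.
static class ContentTypeCheck
{
    public static bool IsParseableContent(string contentType)
    {
        // A missing Content-Type header is treated as not parseable.
        if (string.IsNullOrEmpty(contentType))
            return false;

        // Drop parameters such as "; charset=utf-8" and normalize case.
        string mediaType = contentType.Split(';')[0].Trim().ToLowerInvariant();

        return mediaType == "text/html"
            || mediaType == "text/xml"
            || mediaType == "application/xml"
            || mediaType == "application/xhtml+xml";
    }
}
```

If you patch HtmlWeb.cs yourself, the body of IsHtmlContent could delegate to a check like this. The null test also avoids an exception when a response carries no Content-Type at all.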