How to download non-HTML content with HtmlAgilityPack?

Topics: Developer Forum
Feb 7, 2011 at 9:42 AM

Hi,

I am trying to download some CSV and XLS files using HtmlAgilityPack. I use the BrowserSession class from http://refactoringaspnet.blogspot.com/2010/04/using-htmlagilitypack-to-get-and-post.html because it makes form processing really easy.

However, it's not possible to get a CSV response after a POST. HtmlWeb tries to parse the response as HTML, which obviously cannot succeed. So what I need is the raw response. Have a look at HtmlWeb.Get, line 1486:

Stream s = resp.GetResponseStream();
string rawResponse = new StreamReader(s).ReadToEnd();

It would be possible to get the raw response here. Since I don't want to change your code, I tried using HtmlWeb.PostResponseHandler instead. But then HtmlWeb crashes, because the response stream has already been read.

So any idea about how to get binary response data?

Feb 7, 2011 at 10:40 AM

Hey,

In the meantime I figured out that it works if I clone the stream (see http://stackoverflow.com/questions/147941/how-can-i-read-an-http-response-stream-twice-in-c).
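
For anyone reading along, here is a rough sketch of what I mean by cloning the stream. It is only an illustration, not code from HtmlAgilityPack: the helper name CopyToMemory is made up, and the in-memory stream in Main is just a stand-in for resp.GetResponseStream() so the sample runs without a network call. It assumes .NET 4.0 or later for Stream.CopyTo.

```csharp
using System;
using System.IO;
using System.Text;

static class StreamCloning
{
    // Hypothetical helper: copies a forward-only stream into a seekable
    // MemoryStream so the content can be read more than once.
    public static MemoryStream CopyToMemory(Stream source)
    {
        var buffer = new MemoryStream();
        source.CopyTo(buffer);   // Stream.CopyTo exists since .NET 4.0
        buffer.Position = 0;     // rewind so the next reader starts at the beginning
        return buffer;
    }

    static void Main()
    {
        // Stand-in for resp.GetResponseStream(); a real response stream is not seekable.
        Stream response = new MemoryStream(Encoding.UTF8.GetBytes("id;name\n1;foo\n"));

        using (MemoryStream copy = CopyToMemory(response))
        {
            // First read: the raw bytes, e.g. for saving a CSV or XLS file.
            byte[] raw = copy.ToArray();

            // Second read: rewind and hand the same data to a text reader or parser.
            copy.Position = 0;
            string text = new StreamReader(copy, Encoding.UTF8).ReadToEnd();

            Console.WriteLine("Bytes read: " + raw.Length);
            Console.WriteLine(text);
        }
    }
}
```

The copy holds the whole response in memory, so this is probably only reasonable for downloads of moderate size.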

However, could anyone advise me whether HtmlAgilityPack is a good choice for this case?

Thanks in advance.