Html Agility Pack

Topics: Developer Forum
Jun 17, 2010 at 8:56 PM
Does Html Agility Pack support scraping a password-protected site? I have not taken a thorough look at it yet. Has anyone done this using Html Agility Pack?
Jun 17, 2010 at 9:05 PM

It depends on how the site is password-protected. It does support passing in credentials for sites that use HTTP Basic authentication. It also supports using a proxy and, to a degree, supplying cookies yourself.
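
For anyone unsure what "basic auth" means on the wire: the client sends the username and password joined by a colon, Base64-encoded, in an Authorization header. Here is a minimal sketch in Python (chosen only for illustration, since the idea is the same in .NET). The function name is hypothetical, and the sample credentials are the well-known ones from the HTTP Basic authentication specification.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of the Authorization header for HTTP Basic auth."""
    # Basic auth is just "username:password", Base64-encoded (not encrypted).
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"


if __name__ == "__main__":
    # Sample credentials only; substitute your own.
    print("Authorization:", basic_auth_header("Aladdin", "open sesame"))
```

If the target site shows an HTML login form instead of a browser password prompt, it is most likely not using Basic auth, and this header will not help.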

Jun 17, 2010 at 9:11 PM

Essentially, I would like to use Html Agility Pack to scrape HTML from a site whose pages have a .do extension. I believe this is a Java extension. I have a username and password for the target site, so once I log in I need to browse to a couple of pages and scrape information from them.

I am not sure what is meant by it supporting passing in credentials that use basic auth. Thank you for any information provided.

Jun 18, 2010 at 8:49 AM

Use HttpWebRequest and HttpWebResponse to talk to the server: send the GET and POST requests and read back the HTML pages. Html Agility Pack is good for parsing HTML, but it is not the best choice as an HTTP client.
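
The flow suggested above (log in with a POST, keep the session cookie, then GET the pages and hand the HTML to a parser) might look roughly like this. It is a sketch in Python using only the standard library, not .NET code. The URL, the form field names and the helper names are all assumptions, so check the real login form for the actual action URL and input names. In .NET the same role would be played by HttpWebRequest with a shared CookieContainer.

```python
import http.cookiejar
import urllib.parse
import urllib.request

# Hypothetical login URL; use the action URL of the site's real login form.
LOGIN_URL = "https://example.com/login.do"


def build_session():
    """Create an opener that stores cookies and resends them on later requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(username: str, password: str) -> urllib.request.Request:
    """Build the form POST that submits the credentials.

    The field names "username" and "password" are assumptions.
    """
    body = urllib.parse.urlencode({"username": username, "password": password})
    return urllib.request.Request(LOGIN_URL, data=body.encode("ascii"), method="POST")


def fetch_after_login(username: str, password: str, page_urls):
    """Log in once, then download each page using the same cookie-aware opener."""
    opener, _jar = build_session()
    opener.open(build_login_request(username, password)).close()
    pages = []
    for url in page_urls:
        with opener.open(url) as response:
            pages.append(response.read())  # raw HTML bytes, ready for a parser
    return pages
```

Nothing here touches the network until fetch_after_login is called. The point of the design is that the HTTP client owns the login and the cookies, and the HTML parser only ever sees the downloaded page content.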