1.4.0 Beta 2

Rating: Based on 19 ratings
Reviewed: 15 reviews
Downloads: 36685
Released: Oct 3, 2009
Updated: Oct 3, 2009 by DarthObiwan
Dev status: Beta

Recommended Download

Application HtmlAgilityPack.1.4.0.beta2.binaries
application, 115K, uploaded Oct 3, 2009 - 20257 downloads

Other Available Downloads

Application HtmlAgilityPack.1.4.0.beta2.HAPExplorer
application, 143K, uploaded Oct 3, 2009 - 3719 downloads
Documentation HtmlAgilityPack.1.4.0.beta2.Documentation
documentation, 1276K, uploaded Oct 3, 2009 - 7608 downloads
Application HtmlAgilityPack.1.4.0.beta2.Source
application, 400K, uploaded Oct 3, 2009 - 5101 downloads

Release Notes

Html Agility Pack 1.4.0 Beta 2 is a minor update to Beta 1 that adds supporting documentation and a few more bug fixes. The two major additions are newly compiled help documentation and the Html Agility Pack Explorer (HAP Explorer), a tool meant to help visualize the node tree of the HtmlDocument object. It can open either a static file or a URL.
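To make the idea of "visualizing the node tree" concrete, here is a minimal sketch in Python using only the standard library. It is not HAP Explorer and does not use the Html Agility Pack API; the `Node`, `TreeBuilder`, `build_tree`, and `outline` names are hypothetical, and the handling of malformed markup is far simpler than what a real parser does.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class Node:
    """Hypothetical tree node for illustration; not an Html Agility Pack type."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


class TreeBuilder(HTMLParser):
    """Builds a simple node tree rooted at a '#document' node."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        if tag not in VOID_TAGS:
            self.current = node  # descend into elements that can hold children

    def handle_endtag(self, tag):
        # Close the nearest open element with this name; ignore stray end tags.
        probe = self.current
        while probe is not self.root and probe.name != tag:
            probe = probe.parent
        if probe is not self.root:
            self.current = probe.parent

    def handle_data(self, data):
        if data.strip():  # skip whitespace-only text
            Node("#text", self.current)


def build_tree(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def outline(node, depth=0):
    """Return the tree as a list of lines, indented two spaces per level."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    sample = "<html><body><p>Hello <b>world</b></p><br></body></html>"
    print("\n".join(outline(build_tree(sample))))
```

Printing an indented outline is the simplest text equivalent of a tree view; a graphical explorer would show the same parent/child structure with expandable nodes.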

  • Added a Sandcastle/DocProject documentation project. It will be used to generate CHM and HxS documentation files.
  • Added the new Html Agility Pack Explorer project, a WPF application for exploring the HtmlDocument node tree.
  • Major cleanup of the code base: ran an aggressive ReSharper code cleanup across the library, updated XML comments, and made other minor tweaks for smaller, more concise code.
  • Included a patch that enables proxies when fetching a URL for parsing.
  • Fixed the XPath property so that it no longer includes the #document node.
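The XPath fix in the last item can be illustrated with a small sketch. This is not the library's implementation: the `Node` class and `xpath_of` function are hypothetical, and the positional `name[index]` path format is an assumption made for the example. The point is only that the walk up the tree stops before the synthetic `#document` root, so the root never appears as a path step.

```python
class Node:
    """Hypothetical tree node for illustration; not an Html Agility Pack type."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def xpath_of(node):
    """Return a positional XPath for node, leaving out the '#document' root."""
    steps = []
    while node is not None and node.name != "#document":
        if node.parent is not None:
            # Index among siblings that share this node's name (1-based).
            same_name = [c for c in node.parent.children if c.name == node.name]
        else:
            same_name = [node]
        index = next(i for i, c in enumerate(same_name, 1) if c is node)
        steps.append(f"{node.name}[{index}]")
        node = node.parent
    return "/" + "/".join(reversed(steps))


if __name__ == "__main__":
    document = Node("#document")
    html = Node("html", document)
    body = Node("body", html)
    first = Node("p", body)
    second = Node("p", body)
    print(xpath_of(second))  # prints /html[1]/body[1]/p[2]
```

A path that began with a `#document` step would not be a valid element path, which is presumably why the property was changed to omit it.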

The Documentation project requires Sandcastle, DocProject, and the Visual Studio 2008 SDK to be installed. For this reason, it is included only in a separate solution.

Reviews for this release

     
I evaluated this against a collection of 30 thousand HTML pages in the wild. It did very well at converting them all to reasonable XML to store in a SQL Server 2005 database XML column type. The minor bugs I found were easy to identify and fix in the source code. The parsing speed was over 100 times better than another project on CodePlex, the System.Html software.
by publius on Mar 31, 2010 at 2:30 PM
     
A very easy-to-use and useful API.
by hasankhan on Mar 26, 2010 at 5:53 AM
     
After some initial effort to grasp the idea, I have found that this library is simple enough and a very good time saver. Thank you very much for an excellent product. Keep up your efforts, please.
by agirenko on Mar 12, 2010 at 7:57 PM
     
Excellent product. Saved my day. Easy to use and really fast to learn. Had to parse some really bad HTML (it even had two body tags) and it worked flawlessly.
by carlescs on Mar 11, 2010 at 3:31 PM
     
Fantastic! I have a complex web scraping app that used the WebBrowser object (MSHTML DOM engine). Other than some threading issues (it's based on COM, remember) it worked OK, until I had to deploy it to a remote ASP.NET server (security privileges). The HtmlAgilityPack saved the day. The object model is very similar, and all I had to do was replace API calls and my parsing logic remained intact!
by andychops on Mar 9, 2010 at 6:52 PM
     
Great library, saves me so much time.
by rsoeteman on Mar 5, 2010 at 2:57 PM
     
Brilliant package. Takes some time to get familiar with, but ideal for HTML parsing. Currently working on a parser for imdb.com pages because imdb-api is way too complex and buggy.
by loekf on Mar 2, 2010 at 2:43 PM
     
In response to some of the people out there who have said HTML Agility is useless: this might not be the very best HTML editor/parser out there, but it is definitely one of the best, and it saved me tons of time. You guys deserve appreciation for your work. Keep up the good work, guys :) Kudos!
by acharyapank on Feb 24, 2010 at 3:44 PM
     
Great piece of work. A time-saver. Look forward to its continued development.
by cwford on Feb 10, 2010 at 7:39 PM
     
Brilliant! I needed to parse a hierarchy of hundreds of linked web pages, absolute piece of cake with this library. Got my hacky little project done in under an hour, and can move on with my life. Thank you, thank you, thank you.
by specialBobby on Feb 3, 2010 at 11:21 PM
     
Great project, makes HTML parsing a breeze. Thanks!!
by mausch on Jan 27, 2010 at 8:32 PM
     
Love the direction this library is going. It still has a few rough edges and some land mines (stack overflows), but definitely on course.
by SMHoff on Jan 22, 2010 at 2:49 PM
     
Awesome set of classes to go after online content. Really does turn any bit of HTML into an XPath-enabled, surfable DOM. Awesome work, guys.
by inestyne on Dec 12, 2009 at 9:56 PM
     
Due to how close this API is to System.Xml, I felt immediately comfortable with it. It did the job I wanted and saved me lots of time on a personal project. I didn't immediately see anything to help manipulate the style attribute, but I certainly won't hold that against this release. Nice job!
by cooperpx on Dec 12, 2009 at 7:26 PM
     
Excellent, thanks for the great work!
by reteep on Nov 3, 2009 at 12:52 PM