Documentation: documentation, 873K, uploaded May 7 2010, 36827 downloads
HAP Explorer: application, 155K, uploaded May 7 2010, 15282 downloads
1.4.0 adds some serious new features to Html Agility Pack to make it work nicer in a LINQ-driven .NET world. HtmlNodeCollection and HtmlAttributeCollection are now generic ILists and expose IEnumerable<T> methods that mimic LINQ to XML. This opens an alternative to XPath for querying the HTML tree.
Beyond this, 1.4.0 introduces tons of code cleanups and removes all old non-generic classes (no more ArrayLists :). 1.4.0 also brings basic MSDN-like documentation and a new program called HAP Explorer for viewing the HTML tree.
Changes from Beta 2
The biggest changes are better support for character encoding and support for medium trust environments.
- Removed DescendantNodes() function since it was identical to the Descendants() function.
- Patch# 4706. Added a UserAgent property to the HtmlWeb class, to be used in web requests. Minor update to code supplied by radicull
- Patch# 4432. Applied HtmlEntity decoding of Unicode HTML entities, supplied by tsai
- Patch# 4396. Applied UTF-8 changes from JudahGabriel
- Applied JonGalloway's HAPExplorer patch
- Added Visual Studio 2010 Beta 2 Solution
- Fixed compatibility in Medium Trust environments. Added a default list of extensions and content types to be used when the registry is not available.
- Updated charset detection to use a Dictionary<string,string> instead of ArrayLists of NameValuePair
- Added search tag support in HAPExplorer
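The two querying styles mentioned in the release notes can be sketched roughly as follows. This is a minimal illustration rather than code from the project: the class name LinkExtractor and both method names are made up for the example, and it assumes the HtmlAgilityPack assembly is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper, not part of Html Agility Pack.
public static class LinkExtractor
{
    // LINQ style: enumerate <a> descendants and filter in C#.
    public static List<string> HrefsViaLinq(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        return doc.DocumentNode
                  .Descendants("a")
                  .Where(a => a.Attributes["href"] != null)
                  .Select(a => a.GetAttributeValue("href", ""))
                  .ToList();
    }

    // XPath style: the same query expressed as an XPath string.
    public static List<string> HrefsViaXPath(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//a[@href]");
        if (nodes == null) return new List<string>();
        return nodes.Select(a => a.GetAttributeValue("href", "")).ToList();
    }
}
```

Both methods should return the same list for the same input; which one reads better is largely a matter of taste and of how complex the query is.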
Great Tool. I'm using it on my project too :) http://tokencrawler.codeplex.com
by
tiagonmas
on
Dec 20 2011 at 8:26 PM
Great job. Turned my html string into a valid xhtml in no time.
by
kazino
on
Dec 15 2011 at 1:21 PM
This is so cool!
by
wrwcmaster
on
Nov 13 2011 at 4:59 AM
Excellent!!!
by
l_beka_l
on
Oct 28 2011 at 3:42 PM
Easy to use, works great. An effective tool. Thank you for producing.
by
Cheeso
on
Oct 24 2011 at 7:02 PM
It takes three lines of code to give it some HTML and get back the attribute value of some particular node, and it seems pretty snappy as well.
The HAPExplorer app is great for locating a node of interest within some HTML that you then extract using the XPath to it.
by
davepl1968
on
Aug 25 2011 at 10:37 PM
Very good!
by
bumm
on
Aug 25 2011 at 12:58 PM
Good in general, but lacks customization in some places. E.g. there is no option for indenting the output HTML and no option to specify the encoding in the HtmlWeb.Load method.
by
pabdulin
on
May 11 2011 at 6:44 PM
I'd give it 4+ rating if it didn't cause stack overflows for many pages out on the web. And by "many", probably less than 0.1%, but these stack overflows cause the entire app to crash and be restarted, so this is a fatal problem.
by
aaronsilvas
on
May 9 2011 at 4:18 PM
I've been forced to scrape HTML lately, and although I could have done it without HTML Agility Pack, I can't for the life of me think why I would want to. HTML Agility Pack takes the tedious parts of HTML parsing away, leaving me free to deal with the logical structure of the document I'm scraping.
This is a must-have library.
by
Randolpho
on
Apr 29 2011 at 8:29 PM
KISS: Very simple learning curve, trivial to integrate with, LINQ support very comfortable.
The only thing I was missing was "native" async support for loading the HTML documents.
by
rondarz
on
Apr 16 2011 at 12:58 PM
Hi HtmlAgilityPack Team, I came across this library while I was having trouble parsing HTML pages on the server side. I found it so easy to use, and it does its job really well. This library is a must-have for me and my team. Keep it up! You rock!
by
KyawThurein
on
Apr 1 2011 at 8:35 PM
Dear HtmlAgility Team,
Thank you for contributing this excellent library. This is a really impressive piece of code and extremely helpful. Thank you again! günther
by
qhaut
on
Mar 31 2011 at 11:22 AM
Great work !
Expecting The New Linq Version !
by
dfang
on
Mar 10 2011 at 9:52 AM
Nice work !
by
softlion
on
Feb 25 2011 at 4:05 PM
Excellent library. It's saved me having to write taxing regular expressions and also promotes cleaner code.
by
simoncoughlan
on
Feb 20 2011 at 3:35 PM
I ran into a lot of issues with "real world" HTML documents. This solves pretty much all of them.
by
Richarmeleon
on
Feb 17 2011 at 7:47 PM
Lovely bit of code.
by
mjhufford
on
Feb 1 2011 at 1:38 PM
A great library! It replaced much custom coding for parsing plain text and increased the robustness of the parsing. It's used heavily in greatvocab.com along with the Unifico project.
by
ccook
on
Jan 23 2011 at 12:59 PM
I have to convert HTML to plain text in one of my applications, and I used this library. Although I had to make a small modification to the code, this lib was really helpful.
by
johnnyjames789
on
Jan 21 2011 at 12:58 PM
Superb library! Been trying it for a while and is definitely awesome.
Thanks a lot
by
batman99
on
Jan 13 2011 at 4:58 AM
Wow. Fantastic! Just what was needed.
by
drewid
on
Jan 2 2011 at 6:53 AM
Fantastic!!! Exactly what I am searching for!! No more Webbrowser. Keep ahead!
by
bugmenot2
on
Dec 23 2010 at 6:55 PM
This project is fantastic. I am currently using it in a web scraping project and can't fault it!
Keep up the good work.
by
MattDev2000
on
Dec 21 2010 at 9:50 AM
I have started using the 1.4.0 release, and what a wonderful project. It works perfectly for HTML manipulation. This can be a kind of standard project for HTML parsing.
by
saranghadavale
on
Dec 14 2010 at 12:45 PM
excellent project!
by
tonyqus
on
Dec 5 2010 at 2:37 AM
This is a great library. I have to convert HTML to plain text in one of my applications, and I used this library. Although I had to make a small modification to the code, this lib was really helpful.
I wish the developers "All the best" for their future endeavors.
by
chowdarysway
on
Nov 26 2010 at 8:07 AM
Thanks for this nice lib. I find savetoXml feature very robust.
Do not forget to flush() in Save methods ! This fix was already given in issue tracker page by JAPoole.
by
jliettehk
on
Nov 17 2010 at 7:59 AM
Excellent work
by
xmen
on
Nov 12 2010 at 2:07 AM
Great library, excellent functionality, having attempted (and failed) to write something similar, very impressed with this!
by
JAPoole
on
Oct 21 2010 at 2:53 PM
Great library! Thanks!
by
stefan_ne
on
Sep 10 2010 at 1:25 AM
This is by far the best HTML parser I've used. It even parses ASP without any problems.
by
raygun
on
Sep 7 2010 at 6:20 AM
nice !
by
medcl
on
Sep 2 2010 at 9:54 AM
This class library is truly magnificent. I can't believe how many hours it saved me. Thanks to all the guys who worked on this.
I noticed some reviewers complain about the lack of documentation on how to do this or that: learn XPath first to use Html Agility Pack efficiently.
by
Anders_Rask
on
Aug 27 2010 at 9:35 AM
This project truly is a godsend. I've used Nokogiri, Hpricot, and others, and this project is far and away the best.
by
whitehawk
on
Jul 19 2010 at 1:06 AM
Love this project. We are using this in some text templates over at facebookgraphtoolkit.codeplex.com to automatically generate objects using Facebook's online api documentation. Works really well.
by
ntotten
on
Jul 14 2010 at 1:01 PM
This really saved me a lot of time. To be honest, I wonder why it is not included in the .NET Framework...
To be more specific, before I found this toolkit, I had spent more than 16 hours trying to do the job with XmlDocument. I found this toolkit and within minutes I got the job done :)
I just hope that maintenance will continue...
by
fel
on
May 14 2010 at 3:13 AM
The HTML Agility Pack has saved me many many hours over the years I've been using it.
THANK YOU for continuing to maintain and improve it.
You rock
by
gduncan411
on
May 9 2010 at 4:48 PM