Get Text and Anchor Text?

Jun 2, 2008 at 12:20 PM
Edited Jun 2, 2008 at 12:35 PM
Hello there everyone! First of all thanks for this great pack. It has really helped me a lot!

I have a couple of questions and if someone would be willing to help me it would be great!

I am working on a project in which I have to extract three things from a web page:
1. All of the links (URLs) inside it.
2. Its pure text (the page's HTML without the tags, etc.).
3. For each link found, the 10 words before it and the 10 words after it in the HTML file.

The first one isn't that hard. With HtmlNodeCollection myanchors = htmlData.DocumentNode.SelectNodes("//a[@href]"); I can find the links. But what about the other two parts?

Thanks in advance
Jul 11, 2008 at 10:27 AM
For 2, I would use the InnerText property of the body node.
For 3, you should probably look at the outerText property (I have never used it, so I might be wrong!)
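
To make the suggestions above concrete, here is a rough sketch of one way to approach parts 2 and 3 with Html Agility Pack. It is not an official recipe. The LinkSpan class, the Collect and SplitWords helpers, and the sample HTML are made up for illustration, and the choice to split words on whitespace and to skip script and style blocks is an assumption. As far as I know, HtmlNode exposes InnerText, InnerHtml and OuterHtml rather than an outerText property, so the sketch walks the node tree and records the word position of each link instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper type: a link plus the range of word indexes it covers.
class LinkSpan
{
    public string Url;
    public int Start; // index of the first word inside the link
    public int End;   // index of the first word after the link
}

static class LinkContextDemo
{
    static readonly char[] Whitespace = { ' ', '\t', '\r', '\n' };

    // Decodes entities such as &amp; and splits the text on whitespace.
    static string[] SplitWords(string text)
    {
        return HtmlEntity.DeEntitize(text)
            .Split(Whitespace, StringSplitOptions.RemoveEmptyEntries);
    }

    // Walks the tree in document order, appending words from text nodes
    // and remembering where each <a href="..."> starts and ends.
    static void Collect(HtmlNode node, List<string> words, List<LinkSpan> links)
    {
        if (node.NodeType == HtmlNodeType.Comment) return;
        if (node.Name == "script" || node.Name == "style") return;

        if (node.NodeType == HtmlNodeType.Text)
        {
            words.AddRange(SplitWords(node.InnerText));
            return;
        }

        string href = node.GetAttributeValue("href", string.Empty);
        bool isLink = node.Name == "a" && href.Length > 0;
        int start = words.Count;

        foreach (HtmlNode child in node.ChildNodes)
            Collect(child, words, links);

        if (isLink)
            links.Add(new LinkSpan { Url = href, Start = start, End = words.Count });
    }

    static void Main()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml("<html><body><p>one two three " +
                     "<a href=\"http://example.com\">a link</a> four five six</p></body></html>");

        // Fall back to the document root if the page has no <body>.
        HtmlNode body = doc.DocumentNode.SelectSingleNode("//body") ?? doc.DocumentNode;

        // Part 2: the text without tags (note: InnerText also includes script/style content).
        Console.WriteLine(HtmlEntity.DeEntitize(body.InnerText));

        // Part 3: up to 10 words before and after each link.
        var words = new List<string>();
        var links = new List<LinkSpan>();
        Collect(body, words, links);

        foreach (LinkSpan link in links)
        {
            int from = Math.Max(0, link.Start - 10);
            string before = string.Join(" ", words.Skip(from).Take(link.Start - from));
            string after = string.Join(" ", words.Skip(link.End).Take(10));
            Console.WriteLine(link.Url + " | before: " + before + " | after: " + after);
        }
    }
}
```

One thing to watch with the original SelectNodes("//a[@href]") call: it returns null rather than an empty collection when nothing matches, so check for null before looping over the result. The sketch avoids that by collecting links during the tree walk.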