Getting Img Src and Href

Sep 7, 2011 at 2:06 AM

I have pages that use images as links, and I am trying to get the link as well as the image's URL.

So if it was: <a href="LINK"><img src="IMAGEURL"></a>

I need to collect both the link and the image URL.

This is what I have, but I don't know how to collect the image URL from inside the existing foreach:

HtmlWeb hw = new HtmlWeb();
HtmlAgilityPack.HtmlDocument doc = hw.Load(url);
foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]"))
{
    HtmlAttribute att = link.Attributes["href"];
}

Is there a way of doing this?

Sep 7, 2011 at 3:22 AM

The image tag is contained within the <a> node, so you have to look for the attribute you want inside the child nodes of the <a> node.
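A minimal sketch of that idea, not tested against your pages: for each <a> node, search only inside that node for an <img> and read its src. The class name ImageLinkSketch, the Extract helper, and the sample HTML string are made up for illustration; the HtmlAgilityPack calls (LoadHtml, SelectNodes, SelectSingleNode, GetAttributeValue) are the library's own. It assumes C# 7 or later for the tuple syntax.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

static class ImageLinkSketch
{
    // Hypothetical helper: returns (href, src) pairs for every <a href>
    // that contains an <img src> somewhere beneath it.
    public static List<(string Href, string Src)> Extract(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var pairs = new List<(string Href, string Src)>();

        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection linkNodes = doc.DocumentNode.SelectNodes("//a[@href]");
        if (linkNodes == null)
            return pairs;

        foreach (HtmlNode linkNode in linkNodes)
        {
            // The leading "." makes the XPath relative to this <a> node.
            // Without it, "//img" is evaluated from the document root.
            HtmlNode imageNode = linkNode.SelectSingleNode(".//img[@src]");
            if (imageNode == null)
                continue; // a link with no image inside it

            pairs.Add((linkNode.GetAttributeValue("href", ""),
                       imageNode.GetAttributeValue("src", "")));
        }

        return pairs;
    }

    static void Main()
    {
        // Sample markup, assumed for the example only.
        const string html =
            "<a href=\"LINK1\"><img src=\"IMG1\"></a>" +
            "<a href=\"LINK2\"><img src=\"IMG2\"></a>";

        foreach (var (href, src) in Extract(html))
            Console.WriteLine($"{href} -> {src}");
    }
}
```

To run it against a live page, swap doc.LoadHtml(html) for the HtmlWeb.Load(url) call from the first post. The null checks matter because pages often contain text-only links.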

Sep 7, 2011 at 3:59 PM

This is what I did, but it only returns the src value of the first image and then repeats it for all the others.

HtmlWeb hw = new HtmlWeb();
HtmlAgilityPack.HtmlDocument doc = hw.Load(url);
HtmlNodeCollection linkNodes = doc.DocumentNode.SelectNodes("//a[@href]");
foreach (HtmlNode linkNode in linkNodes)
{
    HtmlAttribute link = linkNode.Attributes["href"];
    HtmlNode imageNode = linkNode.SelectSingleNode("//img");
    HtmlAttribute src = imageNode.Attributes["src"];

    string imageLink = link.Value;
    string imageUrl = src.Value;
}

What's wrong here?