Removing element by class

Topics: User Forum
Mar 7, 2011 at 9:34 AM

Hi,

I'm using the Html Agility Pack to read the contents of my HTML document into a string. Once that is done, I would like to remove certain elements from that content by their class, but I have run into a problem.

My HTML looks like this:

 

<div id="wrapper">
    <div class="maincolumn">
        <div class="breadCrumbContainer">
            <div class="breadCrumbs">
            </div>
        </div>

        <div class="seo_list">
            <div class="seo_head">Header</div>
        </div>

Content goes here...
</div>

 

I have used an XPath selector to get all the content within the <div id="wrapper"> and then read its InnerHtml property, like so:

 

                node = doc.DocumentNode.SelectSingleNode("//div[@id='wrapper']");
                if (node != null)
                {
                    pageContent = node.InnerHtml;
                }

From this point, I would like to remove the div with the class "breadCrumbContainer". However, when I use the code below, I get the error: "Node "<div class="breadCrumbContainer"></div>" was not found in the collection"

                node = doc.DocumentNode.SelectSingleNode("//div[@id='wrapper']");
                node = node.RemoveChild(node.SelectSingleNode("//div[@class='breadCrumbContainer']"));

                if (node != null)
                {
                    pageContent = node.InnerHtml;
                }

Can anyone shed some light on this please?

Thanks,

Dave

 

May 21, 2012 at 4:49 AM

I'm sure Dave is past his problem by now, but I was hitting this tonight and eventually found a solution. Posting to help others:

Here's what worked for me to remove an element using its class:

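                HtmlAgilityPack.HtmlDocument htmldoc = new HtmlAgilityPack.HtmlDocument();
                // Source HTML; in my case it is the value of an XML element
                htmldoc.LoadHtml(value.Element(aw + "content").Value);

                // SelectNodes returns null (not an empty collection) when nothing matches
                var divs = htmldoc.DocumentNode.SelectNodes("//div");
                if (divs != null)
                {
                    foreach (var tag in divs)
                    {
                        // GetAttributeValue falls back to "" when there is no class attribute
                        string cssClass = tag.GetAttributeValue("class", "");
                        if (string.Equals(cssClass, "feedflare", StringComparison.InvariantCultureIgnoreCase))
                        {
                            tag.Remove();  // Detaches the div from its parent
                        }
                    }
                }

                Description = htmldoc.DocumentNode.OuterHtml;  // Gets the output as a string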
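
For the original question, the error appears to come from RemoveChild only accepting a direct child of the node it is called on. In the posted markup, the breadCrumbContainer div is a child of the maincolumn div, not of the wrapper div, so it is "not found in the collection" of the wrapper's children. RemoveChild also returns the removed node, so assigning its result back to node would leave you holding the removed div rather than the wrapper. Separately, an XPath starting with "//" searches from the document root even when it is run on a node; ".//" keeps the search below that node. Here is a minimal sketch of one way around this, assuming the markup from the question with the closing tags filled in (that part is my assumption). RemoveDivsByClass is a hypothetical helper, not part of the Html Agility Pack, and it assumes the class name contains no quote characters.

```csharp
using System;
using HtmlAgilityPack;

static class Example
{
    // Hypothetical helper (not part of Html Agility Pack): removes every <div>
    // beneath root whose class attribute contains className as a whole token,
    // so class="breadCrumbContainer extra" would also match.
    static void RemoveDivsByClass(HtmlNode root, string className)
    {
        // ".//" restricts the search to descendants of root.
        string xpath = ".//div[contains(concat(' ', normalize-space(@class), ' '), ' "
                       + className + " ')]";

        var matches = root.SelectNodes(xpath);
        if (matches == null)
        {
            return; // SelectNodes returns null when nothing matches
        }

        foreach (var match in matches)
        {
            // Remove() detaches the node from its actual parent,
            // so it does not matter how deeply it is nested under root.
            match.Remove();
        }
    }

    static void Main()
    {
        var doc = new HtmlDocument();
        // Markup from the question; the closing tags at the end are assumed.
        doc.LoadHtml(
            "<div id=\"wrapper\"><div class=\"maincolumn\">" +
            "<div class=\"breadCrumbContainer\"><div class=\"breadCrumbs\"></div></div>" +
            "<div class=\"seo_list\"><div class=\"seo_head\">Header</div></div>" +
            "Content goes here...</div></div>");

        var wrapper = doc.DocumentNode.SelectSingleNode("//div[@id='wrapper']");
        if (wrapper != null)
        {
            RemoveDivsByClass(wrapper, "breadCrumbContainer");
            string pageContent = wrapper.InnerHtml; // wrapper contents without the breadcrumb div
            Console.WriteLine(pageContent);
        }
    }
}
```

Note that, unlike the loop above, the XPath match here is case-sensitive.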