Feb 12, 2012 at 2:35 PM
Edited Feb 12, 2012 at 2:39 PM
I have a SQL database containing the HTML for the pages in my site. I use ADO to pull this out and display it.
I am in the process of developing a feature which allows me to replace certain phrases with links. This works well at a basic level but I need to do some extra checking to make it work perfectly.
Basically, I'm pulling out the HTML and, before I pass it to the page, assigning it to a string variable in C#. I then have a second table with two columns: one containing the phrases I want to look for, the other containing the full URLs I want to link each phrase to. So, for example, if the phrase BBC is found, I replace it with <a href="http://www.bbc.co.uk">BBC</a>.
I'm looping through this second table looking at every phrase, like so:
tempText = tempText.Replace((string)rdr2["phrase"], "<a href=\"" + (string)rdr2["url"] + "\">" + (string)rdr2["phrase"] + "</a>");
I need to replace the phrase with the link ONLY when the phrase is neither already within a link nor inside a header.
I've tried checking for "phrase</a>", but this doesn't work because the phrases in links are often inside <span>s within the link. The same goes for headers.
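To illustrate the problem, here is a minimal, self-contained sketch of what my loop effectively does. The NaiveLinker class, its Apply method, and the in-memory dictionary are made up for this example (the dictionary stands in for the rows I read from rdr2); only the Replace call mirrors my real code.

```csharp
using System;
using System.Collections.Generic;

static class NaiveLinker
{
    // Same string replacement as in my loop, but the phrase/url pairs come
    // from a dictionary (hypothetical stand-in for the rdr2 data reader).
    public static string Apply(string html, IDictionary<string, string> links)
    {
        foreach (KeyValuePair<string, string> link in links)
        {
            string anchor = "<a href=\"" + link.Value + "\">" + link.Key + "</a>";
            html = html.Replace(link.Key, anchor);
        }
        return html;
    }
}

static class Program
{
    static void Main()
    {
        var links = new Dictionary<string, string>
        {
            { "BBC", "http://www.bbc.co.uk" }
        };

        // Plain text: the phrase is wrapped in a link, which is what I want.
        Console.WriteLine(NaiveLinker.Apply("<p>I watch the BBC.</p>", links));

        // Phrase already inside a link (via a span): it gets wrapped again,
        // producing an <a> nested inside the existing <a>.
        Console.WriteLine(NaiveLinker.Apply("<a href=\"/news\"><span>BBC</span></a>", links));

        // Phrase inside a header: it is linked too, which I don't want.
        Console.WriteLine(NaiveLinker.Apply("<h1>BBC news</h1>", links));
    }
}
```

The first case is fine; the second and third are the ones I need to skip. Because Replace only sees a flat string, it has no idea which element the phrase sits in, which is presumably why a real HTML parser is being suggested.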
I have been advised that the Html Agility Pack is the way to go.
I can't find any documentation on it, however, so I would appreciate it if someone could suggest modifications to my code to take care of this and, perhaps, give a brief explanation so I can understand it, or point me to some documentation.