How to: several xpaths for 1 variable

Aug 17, 2009 at 4:23 AM

Hello, this is a design question.

I want to extract some text from HTML files. The problem is that in some of the HTML files the XPath is different from the one in the other pages.

What I am trying to achieve is to keep trying to extract the data until I succeed. Say I try XPATH1 and it fails; then I want to try XPATH2, and so on.

How can I do this?

 

Aug 18, 2009 at 2:37 PM

 

/*
xpath.for.pagename.txt:
START CONTENT
# Retrieves the headline.
# If XYZ, this element exists.
/html/body/div[@id='head']/text()
# If ZYX, this element exists instead of the one above.
/html/body/div[@id='headline']/text()
END CONTENT
*/

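// How you organize these XPaths is up to you.
// Needs: using System.IO; using HtmlAgilityPack;
const string XPathFile = "xpath.for.pagename.txt";

// Assumes doc is an HtmlDocument that has already been loaded.
HtmlNode result = null;
foreach(string line in File.ReadAllLines(XPathFile))
{
    string xpath = line.Trim();
    // Skip blank lines and comments
    if( xpath.Length == 0 || xpath.StartsWith("#") )
        continue;
    HtmlNode candidate = doc.DocumentNode.SelectSingleNode(xpath);
    if( candidate == null )
        continue;
    // Make sure you got the right node, let's say you only want TextNodes.
    // Only keep the node if it passes the check, so a wrong match never leaks out.
    if( candidate.Name == HtmlNode.HtmlNodeTypeNameText )
    {
        result = candidate;
        break;
    }
}

if( result != null )
{
    MessageBox.Show(result.InnerText);
}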

It's very basic, but you can build around this idea.
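
One way to build on it, as a rough sketch rather than a finished implementation: wrap the "try each XPath in order" loop in a small helper so the candidates can come from a file, a config entry, or a hard-coded array. The names `XPathFallback` and `SelectFirstMatch` are made up for this example (they are not part of HtmlAgilityPack), and the sample HTML and XPaths are placeholders.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class XPathFallback
{
    // Hypothetical helper: returns the first node matched by any of the
    // candidate XPaths, tried in the order given, or null if none match.
    public static HtmlNode SelectFirstMatch(HtmlNode root, IEnumerable<string> xpaths)
    {
        foreach (string xpath in xpaths)
        {
            HtmlNode node = root.SelectSingleNode(xpath);
            if (node != null)
                return node;
        }
        return null;
    }
}

public static class Demo
{
    public static void Main()
    {
        // Placeholder page: it only contains the second layout's element.
        var doc = new HtmlDocument();
        doc.LoadHtml("<html><body><div id='headline'>Hello</div></body></html>");

        string[] candidates =
        {
            "//div[@id='head']/text()",      // first layout, absent in this page
            "//div[@id='headline']/text()",  // second layout
        };

        HtmlNode hit = XPathFallback.SelectFirstMatch(doc.DocumentNode, candidates);
        Console.WriteLine(hit != null ? hit.InnerText : "(no match)");
    }
}
```

Since the order of the list is the priority, put the most specific XPath first and the most general one last.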