SelectSingleNode returns null

Topics: User Forum
Jan 21, 2012 at 2:39 PM

Hello, I'm a beginner in using HtmlAgilityPack and XPath.

I have downloaded and installed the Pack and the Pack Tester. Using the Tester, I obtained an XPath.

My problem is that in my code (see below) SelectSingleNode will always return null when using the xpath that I got from the Tester!

I have tried it in HAP Testbed v 1.1 and it works there, so what am I doing wrong?

The code below should set result to "1960".

// Anders

// Requires: using HtmlAgilityPack;
string result = null; // stays null if the node is not found
HtmlWeb hw = new HtmlWeb();
HtmlAgilityPack.HtmlDocument doc = hw.Load("http://allmusic.com/artist/buddy-emmons-p631/credits");
string xpath = "/html[1]/body[1]/div[1]/div[5]/div[1]/div[1]/div[3]/table[1]/tr[7]/td[1]";
HtmlNode htmlnode = doc.DocumentNode.SelectSingleNode(xpath);
if (htmlnode != null)
{
    result = htmlnode.InnerText;
}
Jan 21, 2012 at 8:38 PM

Ok, I have found the problem! This time it wasn't me!!!

I installed HtmlAgilityPack via NuGet, and it installed version 1.4.3.

This version has an error when handling tables!

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html> 
<head>   
<title>hap test table</title> 
</head> 
<body>   
<table>     
<tr>       
<td>foo</td>       
<td>bar</td>     
</tr>   
</table> 
</body>
</html>

becomes

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html> 
<head>   
<title>hap test table</title> 
</head> 
<body>   
<table>     
<tr>       
<td>foo               
<td>bar
          </td></td></tr></table> 
</body>
</html>

If I go back to version 1.4.0, it works as it should.

I guess I will have to file an issue for this.
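
For anyone who wants to check their own installed version, a small console program along these lines may help. It is only a sketch: the class name TableRepro and the helper Parse are made up for illustration, the DOCTYPE line from the test page is left out, and the output depends on which HtmlAgilityPack version is referenced.

```csharp
using System;
using HtmlAgilityPack;

// Hypothetical helper class, not part of HtmlAgilityPack.
public static class TableRepro
{
    // Same table as the test page above, written on fewer lines (DOCTYPE omitted).
    public const string Html =
        "<html><head><title>hap test table</title></head>" +
        "<body><table><tr><td>foo</td><td>bar</td></tr></table></body></html>";

    // Parse an HTML string into a document without touching the network.
    public static HtmlDocument Parse(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        return doc;
    }

    public static void Main()
    {
        HtmlDocument doc = Parse(Html);

        // Dump the tree as the parser built it; nested <td> elements would show up here.
        Console.WriteLine(doc.DocumentNode.OuterHtml);

        // With sibling cells this selects the second <td>; if the cells were nested it returns null.
        HtmlNode second = doc.DocumentNode.SelectSingleNode("//table/tr/td[2]");
        Console.WriteLine(second == null ? "td[2] not found" : second.InnerText);
    }
}
```

Running it once against 1.4.0 and once against 1.4.3 should make the difference in the parsed tree visible without depending on the live allmusic.com page.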

// Anders

Jan 31, 2012 at 3:11 AM

The same problem occurs with list elements in revision 94773.

Feb 2, 2012 at 4:39 AM
Edited Feb 2, 2012 at 4:40 AM

I found that the table-handling error is caused by logic that differs from the 1.4 stable version.

The difference is in the name used for the lookup: the 1.4 stable version uses the name passed in as a parameter, while the current revision uses _currentnode.Name.

1.4 stable version
HtmlDocument.cs line 1122

        private HtmlNode FindResetterNode(HtmlNode node, string name)
        {
            HtmlNode resetter = (HtmlNode)_lastnodes[name];
            if (resetter == null)
                return null;
            if (resetter.Closed)
            {
                return null;
            }
            if (resetter._streamposition < node._streamposition)
            {
                return null;
            }
            return resetter;
        }

        private bool FindResetterNodes(HtmlNode node, string[] names)
        {
            if (names == null)
            {
                return false;
            }
            for (int i = 0; i < names.Length; i++)
            {
                if (FindResetterNode(node, names[i]) != null)
                {
                    return true;
                }
            }
            return false;
        }

Current revision 9477
HtmlDocument.cs line 1190

        private HtmlNode FindResetterNode(HtmlNode node)
        {
            HtmlNode resetter = Utilities.GetDictionaryValueOrNull(Lastnodes, _currentnode.Name);
            if (resetter == null)
                return null;

            if (resetter.Closed)
                return null;

            return resetter._streamposition < node._streamposition ? null : resetter;
        }

        private bool FindResetterNodes(HtmlNode node, string[] names)
        {
            if (names == null)
                return false;

            for (int i = 0; i < names.Length; i++)
            {
                if (FindResetterNode(node) != null)
                    return true;
            }
            return false;
        }
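
To make the difference concrete, here is a small self-contained sketch of the two lookup styles. It does not use HtmlAgilityPack and is not a patch: the Node and ResetterDemo types are hypothetical stand-ins, and the stream positions and the resetter list { "tr", "table" } are made up for illustration. It only shows that a lookup keyed on each candidate name and a lookup keyed on the current node's name can give different answers for the same state.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for HtmlNode; only the fields the lookup needs.
public class Node
{
    public string Name;
    public bool Closed;
    public int StreamPosition;
}

public static class ResetterDemo
{
    // Shared lookup: the last node with this name, if it is open and not before 'node'.
    static Node Find(Dictionary<string, Node> lastNodes, Node node, string name)
    {
        Node resetter;
        if (!lastNodes.TryGetValue(name, out resetter) || resetter.Closed)
            return null;
        return resetter.StreamPosition < node.StreamPosition ? null : resetter;
    }

    // 1.4-stable style: every candidate name is looked up in turn.
    public static bool AnyByName(Dictionary<string, Node> lastNodes, Node node, string[] names)
    {
        if (names == null) return false;
        foreach (string name in names)
            if (Find(lastNodes, node, name) != null) return true;
        return false;
    }

    // Revision style: 'names' only controls the loop count; the key is the current node's name.
    public static bool AnyByCurrent(Dictionary<string, Node> lastNodes, Node node,
                                    string[] names, Node current)
    {
        if (names == null) return false;
        for (int i = 0; i < names.Length; i++)
            if (Find(lastNodes, node, current.Name) != null) return true;
        return false;
    }

    public static void Main()
    {
        // Made-up state: a <tr>, then a first <td>, then a second <td> being opened.
        var tr = new Node { Name = "tr", StreamPosition = 10 };
        var firstTd = new Node { Name = "td", StreamPosition = 20 };
        var secondTd = new Node { Name = "td", StreamPosition = 30 };
        var lastNodes = new Dictionary<string, Node> { { "tr", tr }, { "td", secondTd } };
        string[] resetters = { "tr", "table" }; // assumed resetter names

        Console.WriteLine(AnyByName(lastNodes, firstTd, resetters));              // False
        Console.WriteLine(AnyByCurrent(lastNodes, firstTd, resetters, secondTd)); // True
    }
}
```

In this made-up state the name-based lookup finds no resetter after the first cell, while the lookup keyed on the current node finds the second cell itself and reports a resetter. Whether that is exactly what happens inside the parser would need to be confirmed in a debugger.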