exception when attempting xslt transformation

Topics: User Forum
Dec 26, 2006 at 10:20 PM
Hi,

I'm running into an ArgumentOutOfRangeException when attempting to run an XSLT transformation on an HTML page. The exception is thrown in HtmlNodeNavigator.LocalName, where an index of 0 is used on an empty list:

_currentnode.Attributes[_attindex]

In other words, Attributes is an empty list, and _attindex is 0.

I believe the root cause is in the class HtmlNodeNavigator, where _currentnode is getting out of sync with _attindex. There are many places in this class where _currentnode is set without any change to _attindex. For example, the MoveToNext method sets _currentnode to _currentnode.NextSibling but neither resets _attindex nor checks whether it now points beyond the end of the new _currentnode's Attributes list.
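To make the failure mode concrete, here is a minimal, self-contained sketch of it. The Node and Navigator classes are hypothetical stand-ins, not the actual HtmlNodeNavigator source; only the field names _currentnode and _attindex come from the description above. The sketch also shows one possible guard (resetting the attribute index whenever the current node changes), which is an assumption about how a fix could look, not the library's fix.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an HTML node: a name, attributes, and a sibling link.
public class Node
{
    public string Name;
    public List<string> Attributes = new List<string>();
    public Node NextSibling;

    public Node(string name, params string[] attributes)
    {
        Name = name;
        Attributes.AddRange(attributes);
    }
}

// Hypothetical navigator. An _attindex of -1 means "on the element, not on an attribute".
public class Navigator
{
    private Node _currentnode;
    private int _attindex = -1;
    private readonly bool _resetOnMove;

    public Navigator(Node start, bool resetOnMove)
    {
        _currentnode = start;
        _resetOnMove = resetOnMove;
    }

    public bool MoveToFirstAttribute()
    {
        if (_currentnode.Attributes.Count == 0) return false;
        _attindex = 0;
        return true;
    }

    public bool MoveToNext()
    {
        if (_currentnode.NextSibling == null) return false;
        _currentnode = _currentnode.NextSibling;
        // The guard: moving to a new node invalidates any attribute position.
        if (_resetOnMove) _attindex = -1;
        return true;
    }

    public string LocalName
    {
        get
        {
            // Without the guard, a stale _attindex indexes into the new node's list.
            return _attindex < 0 ? _currentnode.Name : _currentnode.Attributes[_attindex];
        }
    }
}

public static class Demo
{
    public static void Main()
    {
        Node first = new Node("style", "type");
        first.NextSibling = new Node("td"); // sibling with no attributes

        Navigator guarded = new Navigator(first, true);
        guarded.MoveToFirstAttribute();
        guarded.MoveToNext();
        Console.WriteLine(guarded.LocalName); // name of the sibling element

        Navigator unguarded = new Navigator(first, false);
        unguarded.MoveToFirstAttribute();
        unguarded.MoveToNext();
        try { Console.WriteLine(unguarded.LocalName); }
        catch (ArgumentOutOfRangeException) { Console.WriteLine("stale index"); }
    }
}
```

With the guard off, the second navigator reproduces the same situation as in the report: an index of 0 applied to an empty Attributes list.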

Many of the HTML files I have tried fail this way. Here is one of them:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

<html>
<head>
<title></title>
<style type="text/css">
td {
border: 1px solid black;
padding: 0px;
font: 10pt Arial
}
table {
border-collapse: collapse;
}
</style>
</head>

<body>
<table>
<tr><td>test</td><td>testing</td></tr>
<tr><td>TESTingPPP</td><td>Instrument Model No:</td></tr>
<tr><td>test</td><td>testing</td></tr>
</table>


</body>
</html>

Here is the XSLT:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="UTF-8"/>
<xsl:template match="/">
<xsl:copy-of select="*"/>
</xsl:template>
</xsl:stylesheet>

Here is the code:

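using System.IO;
using System.Xml.Xsl;
using HtmlAgilityPack;

HtmlDocument doc = new HtmlDocument();
doc.Load(@"c:\temp\test.html");
XslCompiledTransform xslt = new XslCompiledTransform();
xslt.Load(@"c:\temp\test.xsl");
// FileMode.Create truncates an existing file, so no stale bytes remain after the new output.
using (FileStream fileStream = new FileStream(@"c:\temp\test.xml", FileMode.Create))
{
    // The ArgumentOutOfRangeException is thrown during this call.
    xslt.Transform(doc, new XsltArgumentList(), fileStream);
}   // the using block closes the stream, so no explicit Close() is needed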

David Pirkle
Coordinator
Jan 1, 2007 at 1:31 PM
Hi,

You are right, there is a design error in the library. It's an old one, and it will actually take some work to fix.

Because an HtmlAttribute is not an HtmlNode, by design, SelectNodes cannot directly retrieve HTML attributes.
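One way to work within that limitation is to select the elements that carry the attribute and then read the value from each node. A minimal sketch, assuming the HtmlAgilityPack assembly is referenced; the markup and the class attribute are made up for illustration and are not from this thread:

```csharp
using System;
using HtmlAgilityPack;

public static class AttributeDemo
{
    public static void Main()
    {
        HtmlDocument doc = new HtmlDocument();
        // Illustrative markup only.
        doc.LoadHtml("<table><tr><td class='a'>x</td><td>y</td></tr></table>");

        // Select the owning elements rather than the attributes themselves.
        HtmlNodeCollection cells = doc.DocumentNode.SelectNodes("//td[@class]");
        if (cells == null) return; // SelectNodes returns null when nothing matches

        foreach (HtmlNode cell in cells)
        {
            // The second argument is the fallback used when the attribute is absent.
            Console.WriteLine(cell.GetAttributeValue("class", ""));
        }
    }
}
```

The null check matters because SelectNodes returns null rather than an empty collection when the expression matches nothing.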

Funny though, not many people have noticed this so far (that I am aware of) :-)