The website is the Taiwan Stock Exchange.
I have had no problems using Html Agility Pack on other websites.
However, I ran into difficulties with this one.
I can't get any value out of it.
The error is a NullReferenceException.
Here is my code:
Public Sub Main()
    Dim client As New WebClient()
    Dim ms As New MemoryStream(client.DownloadData("http://bsr.twse.com.tw/bshtm/bshtm_report_Messages.aspx?strDate=20100413&StartNumber=2475&FocusIndex=1"))
    Dim doc As New HtmlDocument()
    Dim docStockContext As New HtmlDocument()
    Dim values As String() = docStockContext.DocumentNode.SelectSingleNode("./tbody/tr/td").InnerText.Trim().Split(ControlChars.Lf)
    My.Response.Write(values(0).Trim() & "<br/>")
    doc = Nothing
    docStockContext = Nothing
    client = Nothing
End Sub
Please help me solve this problem. Thanks!
Hello. I'll try to give you some advice on how to solve this, based on my own experience crawling websites.
At first I tried to do it just like you, and I ran into a lot of trouble with null references. I don't have VS at hand right now, so the syntax may be a bit off, but I think it will be enough to get you started.
You didn't mention which element throws the exception, but I assume it's the values line. Do the following:
HtmlNodeCollection hnc = docStockContext.DocumentNode.SelectNodes("//tbody/tr/td");
Then, in the debugger, inspect the structure of hnc and use it to find the values you want. And, of course, before using it:
if (hnc != null) // C#, sorry... long time no vb :-)
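In VB the same check might look roughly like the sketch below. I have not tested it against that page, so treat it as a starting point. Two things in it are my own assumptions: in the code as posted neither HtmlDocument is ever loaded from ms, so I load the stream explicitly, and I use "//tr/td" instead of "//tbody/tr/td" because the raw markup may not contain a tbody element (browsers often add one that is not in the source).

```vb
Imports System
Imports System.IO
Imports System.Net
Imports HtmlAgilityPack

Module NullCheckSketch
    Sub Main()
        Dim url As String = "http://bsr.twse.com.tw/bshtm/bshtm_report_Messages.aspx?strDate=20100413&StartNumber=2475&FocusIndex=1"

        Using client As New WebClient()
            Using ms As New MemoryStream(client.DownloadData(url))
                Dim doc As New HtmlDocument()
                ' Assumption: the stream has to be loaded into the document;
                ' an HtmlDocument that was never loaded has nothing to select from.
                doc.Load(ms)

                ' SelectNodes returns Nothing when the XPath matches no node.
                ' Assumption: "//tr/td" is used because the raw HTML may lack <tbody>.
                Dim cells As HtmlNodeCollection = doc.DocumentNode.SelectNodes("//tr/td")
                If cells Is Nothing Then
                    Console.WriteLine("No <td> nodes matched; inspect doc.DocumentNode.OuterHtml.")
                    Return
                End If

                ' Print every cell so the structure can be inspected.
                For Each cell As HtmlNode In cells
                    Console.WriteLine(cell.InnerText.Trim())
                Next
            End Using
        End Using
    End Sub
End Module
```

If the "No <td> nodes matched" branch is hit, the XPath is the thing to adjust, not the code after it.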
Regards. I hope this helps; let me know if anything is unclear.
May 4, 2010 at 1:29 PM
Edited May 4, 2010 at 1:53 PM
I'm having a similar problem: SelectNodes doesn't seem to work with a MemoryStream. It does seem to work with files, though. Instead of DownloadData(), use DownloadFile() and rewrite your code accordingly.
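A rough, untested sketch of the file-based variant, in case it helps. The temp file name (tempFile) and the "//tr/td" XPath are placeholders of mine, not something taken from the original code.

```vb
Imports System
Imports System.IO
Imports System.Net
Imports HtmlAgilityPack

Module DownloadFileSketch
    Sub Main()
        Dim url As String = "http://bsr.twse.com.tw/bshtm/bshtm_report_Messages.aspx?strDate=20100413&StartNumber=2475&FocusIndex=1"
        ' Hypothetical location: a temp file created just for this download.
        Dim tempFile As String = System.IO.Path.GetTempFileName()

        Try
            Using client As New WebClient()
                ' Save the page to disk instead of holding it in a MemoryStream.
                client.DownloadFile(url, tempFile)
            End Using

            Dim doc As New HtmlDocument()
            ' Load the document from the saved file.
            doc.Load(tempFile)

            ' Assumption: placeholder XPath; adjust it to the page's real structure.
            Dim cells As HtmlNodeCollection = doc.DocumentNode.SelectNodes("//tr/td")
            If cells IsNot Nothing Then
                For Each cell As HtmlNode In cells
                    Console.WriteLine(cell.InnerText.Trim())
                Next
            End If
        Finally
            ' Remove the temp file whether or not parsing succeeded.
            File.Delete(tempFile)
        End Try
    End Sub
End Module
```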