Parsing only shows one cell of data from table

Dec 7, 2010 at 3:01 PM

I have a dynamically created table on a web page, and I need to extract only its first 2 columns. The layout of the table is this:

<table summary = "" id="data" border="1" cellpadding="2" cellspacing="1">
<tr>
	<th>column1</th>
	<th>column2</th>
	<th>column3</th>
	<th>column4</th>
	<th>column5</th>
	<th>column6</th>
	<th>column7</th>
</tr></thead>
<tr>
	<td>data1</td>
	<td>data2</td>
	<td>data3</td>
        <td>data4</td>
	<td>data5</td>
	<td>data6</td>
	<td>data7</td>
	</tr>
<tr>
	<td>data1</td>
	<td>data2</td>
	<td>data3</td>
        <td>data4</td>
	<td>data5</td>
	<td>data6</td>
	<td>data7</td>
	</tr>

Once the table is loaded dynamically, I only need the first 2 columns of data to be shown in a textbox. I have the following code, but it is displaying only one cell: the 7th column of the 6th row.

HtmlWeb web = new HtmlWeb();
HtmlAgilityPack.HtmlDocument doc = web.Load("http://www.website.com");

 // Get all columns in the document
HtmlNodeCollection tables = doc.DocumentNode.SelectNodes("//table[1]");

// Iterate all rows in the first table
HtmlNodeCollection rows = tables[0].SelectNodes("//tr");
for (int i = 0; i < rows.Count; ++i)
{
    // Iterate all columns in this row
    HtmlNodeCollection cols = rows[i].SelectNodes("//td");
    for (int j = 0; j < cols.Count; ++j)
    {
         // Get the value of the column and print it
         string value = cols[j].InnerText;
         txtBox1.Text = value;
     }
}

Any suggestions? Thanks.

Dec 8, 2010 at 7:41 AM
Edited Dec 8, 2010 at 7:42 AM

I don't see anything in your code that says "show me the first cell". What I see is this: iterate through the cells and write each one to the textbox in turn. Every assignment replaces the previous text, so in the end only the last cell is displayed.

BTW, the thead tag is never opened and the table tag is never closed.
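
Which cell counts as "last" also depends on how the XPath is scoped. In Html Agility Pack, an expression that starts with // is evaluated from the document root even when SelectNodes is called on a single row, so rows[i].SelectNodes("//td") returns every td on the page rather than that row's cells. A minimal sketch of the difference, assuming an inline HTML string as a stand-in for the real page (the class and method names are hypothetical):

```csharp
using System;
using HtmlAgilityPack;

static class XPathScopeDemo
{
    // Stand-in for the real page: two data rows with three cells each.
    const string Html =
        "<table id='data'>" +
        "<tr><td>a1</td><td>a2</td><td>a3</td></tr>" +
        "<tr><td>b1</td><td>b2</td><td>b3</td></tr>" +
        "</table>";

    // Runs the given XPath from the first row and returns how many nodes it matched.
    public static int CountFromFirstRow(string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(Html);
        HtmlNode firstRow = doc.DocumentNode.SelectSingleNode("//table[@id='data']//tr");
        HtmlNodeCollection cells = firstRow.SelectNodes(xpath);
        return cells == null ? 0 : cells.Count; // SelectNodes returns null when nothing matches
    }

    static void Main()
    {
        // "//td" ignores the row and searches the whole document.
        Console.WriteLine(CountFromFirstRow("//td"));
        // "./td" stays inside the row it is called on.
        Console.WriteLine(CountFromFirstRow("./td"));
    }
}
```

With the relative form, the inner loop in the original code would see only the cells of the current row.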

Dec 8, 2010 at 2:05 PM

So what should be changed? I'm rather new to all this so any help would be appreciated. Thanks for your reply.

Dec 9, 2010 at 7:48 AM

Maybe I was unclear in my first post. Here is what your code does:

  1. Load the page
  2. Find the first table
  3. Find the rows in the table
  4. For each row in rows:
    1. Find the cells in the row
    2. Write each cell's content to the textbox

So where in that do you see the answer to your requirement?
"I only need the first 2 columns of data to be shown in a textbox."


You have i rows with j columns, and somewhere you must say something like: textbox1.Text = rowCells[1].InnerText (in your code, rowCells is cols).
Also, you write that you need 2 columns, but you have only one textbox. So what do you put into that textbox? The first column? The second? Both merged? The table has many rows: what do you want to show there?

What you need is something like this:

  1. Load the page
  2. Find the first table
  3. Find the rows in the table
  4. In each row:
    1. Do something with cell[neededFirstColumnIndex]
    2. Do something with cell[neededSecondColumnIndex]
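
The steps above can be sketched roughly as follows. This is only an illustration under some assumptions: the HTML is passed in as a string instead of being loaded from the site, "do something" is taken to mean "collect the text", and the names FirstTwoColumns and Extract are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

static class FirstTwoColumns
{
    // Returns the text of the first two td cells of every row in the first table.
    // Rows with fewer than two td cells (such as a th-only header row) are skipped.
    public static List<string[]> Extract(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var result = new List<string[]>();

        HtmlNode table = doc.DocumentNode.SelectSingleNode("//table");
        if (table == null) return result;

        // Relative XPath (leading ".") so the search stays inside this table.
        HtmlNodeCollection rows = table.SelectNodes(".//tr");
        if (rows == null) return result;

        foreach (HtmlNode row in rows)
        {
            HtmlNodeCollection cells = row.SelectNodes("./td");
            if (cells == null || cells.Count < 2) continue;
            result.Add(new[] { cells[0].InnerText.Trim(), cells[1].InnerText.Trim() });
        }
        return result;
    }

    static void Main()
    {
        string html =
            "<table><tr><th>column1</th><th>column2</th><th>column3</th></tr>" +
            "<tr><td>data1</td><td>data2</td><td>data3</td></tr>" +
            "<tr><td>data4</td><td>data5</td><td>data6</td></tr></table>";

        foreach (string[] pair in Extract(html))
            Console.WriteLine(pair[0] + "\t" + pair[1]);
    }
}
```

In a form, each pair could then be appended to a multi-line textbox per column, for example textBox1.AppendText(pair[0] + Environment.NewLine), instead of assigning Text, which replaces the previous content.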

Dec 9, 2010 at 2:51 PM

I guess what I'm having trouble with is understanding the //th and //td. How do I differentiate between all of them? If I only need 2 columns from the table, how would I grab the info from just the first 2 //th? Btw, there is only one table on the page.

Dec 9, 2010 at 9:39 PM

If I have 2 textboxes, I would have column 1 data in textbox1 and column 2 data in textbox2.

Dec 10, 2010 at 12:58 PM
HtmlWeb web = new HtmlWeb();
HtmlAgilityPack.HtmlDocument doc = web.Load("http://www.website.com");

// Get the first table in the document
HtmlNodeCollection tables = doc.DocumentNode.SelectNodes("//table[1]");

// Get the header cells of that table (the leading "." keeps the search inside the table)
HtmlNodeCollection headCells = tables[0].SelectNodes(".//thead/tr/th"); // if you fix your html, else you can
//HtmlNodeCollection headCells = tables[0].SelectNodes(".//tr/th");

if (headCells == null || headCells.Count < 2)
{
   MessageBox.Show("Error. Cells not found.");
   return;
}
// HtmlNode exposes its text through InnerText
textBox1.Text = headCells[0].InnerText;
textBox2.Text = headCells[1].InnerText;

I don't think that this will help you...
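
On the //th versus //td question: th matches header cells and td matches data cells, and a position() predicate limits the match to the first cells of each row. A small sketch, assuming well-formed HTML passed in as a string (the class name, method name, and comma-joined output are only for illustration):

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

static class HeaderVersusData
{
    // Selects the first two cells of the given kind ("th" or "td") from each row
    // of the first table and joins their text with commas.
    public static string FirstTwo(string html, string cellName)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        HtmlNodeCollection cells = doc.DocumentNode.SelectNodes(
            "//table[1]//tr/" + cellName + "[position() <= 2]");
        if (cells == null) return string.Empty; // nothing matched
        return string.Join(",", cells.Select(c => c.InnerText.Trim()));
    }

    static void Main()
    {
        string html =
            "<table><tr><th>column1</th><th>column2</th><th>column3</th></tr>" +
            "<tr><td>data1</td><td>data2</td><td>data3</td></tr></table>";

        Console.WriteLine(FirstTwo(html, "th")); // header cells only
        Console.WriteLine(FirstTwo(html, "td")); // data cells only
    }
}
```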

Dec 10, 2010 at 3:07 PM

This helps somewhat. But it only returns the first row of data that I need. How would I get all the rows?

Dec 10, 2010 at 5:59 PM

Read a book about C#; that will help you most. As I said before:

You have i rows with j columns, and somewhere you must say something like: textbox1.Text = rowCells[1].InnerText (in your code, rowCells is cols).
Also, you write that you need 2 columns, but you have only one textbox. So what do you put into that textbox? The first column? The second? Both merged? The table has many rows: what do you want to show there?

This is my last post in this thread.