Multiple forms with no name or id

Topics: Developer Forum
Jan 16, 2013 at 5:57 PM

I need to parse form elements from a page that has two form tags, one after another, like so:

<div class="new">
<form action="http://xxx.xxx.com/y" method="post" enctype="multipart/form-data">
<input type="hidden" name="hf1" value="one">
<input type="hidden" name="hf2" value="new">
<input type="file" name="file">
<button type="submit" name="do"  value="new image">new image</button>
</form>
</div>

<div class="done">	
<form action="http://xxx.xxx.com/y/" method="post">
<input type="hidden" name="hf1" value="one">
<input type="hidden" name="hf2" value="done">
<button type="submit" name="do" value="Done with Images">done</button>
</form>
</div>

To check it, I used the following code:

HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
HtmlNode.ElementsFlags.Remove("form");

doc.Load(new StringReader(streamData));
var forms = doc.DocumentNode.Descendants("form");

var count = 0;
foreach (var form in forms)
{
  foreach (HtmlNode element in form.SelectNodes("//input[@type='hidden']"))
  {
    if (!element.Attributes.Contains("name")) continue;
    System.Diagnostics.Debug.WriteLine("Form-" + count.ToString() + " - Name: " + element.Attributes["name"].Value + " - Value: " + element.Attributes["value"].Value);
  }
  count++;
}

The issue is that each form seems to contain the hidden elements of both forms. The output is:

Form-0 - Name: hf1 - Value: one
Form-0 - Name: hf2 - Value: add
Form-0 - Name: hf1 - Value: one
Form-0 - Name: hf2 - Value: done
Form-1 - Name: hf1 - Value: one
Form-1 - Name: hf2 - Value: add
Form-1 - Name: hf1 - Value: one
Form-1 - Name: hf2 - Value: done

What I was expecting was:

Form-0 - Name: hf1 - Value: one
Form-0 - Name: hf2 - Value: add
Form-1 - Name: hf1 - Value: one
Form-1 - Name: hf2 - Value: done

I assume it has something to do with the two forms not having any name or id.
Any idea how to fix this? FYI, I do not have control over the page source.

Regards

Al

Jan 25, 2013 at 5:59 AM

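You're getting all inputs each time because an XPath that starts with // searches from the document root for all input elements, no matter which node you call SelectNodes on.

You need to search relative to the form node instead.

I think it's the following, but I couldn't check as it's not installed on this machine: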

form.SelectNodes("./input[@type='hidden']")
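
For reference, here is a minimal sketch of the whole loop with a relative XPath. It assumes the HtmlAgilityPack package is referenced, and the inline HTML is a trimmed-down, hypothetical stand-in for the page in the question. It uses ".//input" rather than "./input": the latter matches only direct children of the form, while the former also matches inputs nested deeper (for example inside a div or table within the form).

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical stand-in for the real page: two forms with no name or id.
        const string html = @"
<div class=""new""><form action=""/y"" method=""post"">
<input type=""hidden"" name=""hf1"" value=""one"">
<input type=""hidden"" name=""hf2"" value=""new"">
</form></div>
<div class=""done""><form action=""/y/"" method=""post"">
<input type=""hidden"" name=""hf1"" value=""one"">
<input type=""hidden"" name=""hf2"" value=""done"">
</form></div>";

        // As in the question: let <form> nodes keep their children.
        HtmlNode.ElementsFlags.Remove("form");

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var count = 0;
        foreach (var form in doc.DocumentNode.Descendants("form"))
        {
            // The leading "." makes the query relative to this form;
            // a leading "//" would start again at the document root.
            var hidden = form.SelectNodes(".//input[@type='hidden']");

            // SelectNodes returns null (not an empty list) when nothing matches.
            if (hidden != null)
            {
                foreach (var input in hidden)
                {
                    if (!input.Attributes.Contains("name")) continue;
                    var name = input.Attributes["name"].Value;
                    var value = input.GetAttributeValue("value", "");
                    Console.WriteLine("Form-" + count + " - Name: " + name + " - Value: " + value);
                }
            }
            count++;
        }
    }
}
```

With this change each form should list only its own hidden fields, so the missing name and id attributes on the forms are not the cause.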

Lee

Jan 26, 2013 at 4:47 PM
Edited Jan 26, 2013 at 5:29 PM

Thanks Lee,

I'll give that a try.

Al