It's a horribly old bug, one that has been reported on their page since 2007 and has been in the issue list for HtmlAgilityPack since 2011. You parse a string as an HTML document, then get it back as a string from the DOM that the pack generates, and it closes the form tag as if it had no children.
Example: <form></form> gets transformed into <form/></form>

The problem lies in the HtmlNode class of the HtmlAgilityPack project. It defines the form tag as empty in this line:
ElementsFlags.Add("form", HtmlElementFlag.CanOverlap | HtmlElementFlag.Empty);
One can download the sources and remove the Empty value in order to fix the problem or, if they do not want to change the sources of the pack, they have the option of using a workaround:
HtmlNode.ElementsFlags["form"] = HtmlElementFlag.CanOverlap;
Be careful, though: the ElementsFlags dictionary is a static property, so this change applies to the entire application, not just to one document.
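To see the effect for yourself, here is a minimal sketch of a round trip before and after the workaround. The class name FormFlagDemo and the helper RoundTrip are made up for illustration; only HtmlDocument.LoadHtml, DocumentNode.OuterHtml and HtmlNode.ElementsFlags come from the pack. What the "before" line prints depends on the HtmlAgilityPack version you have installed, so I am not quoting an output here.

```csharp
using System;
using HtmlAgilityPack;

// Hypothetical demo class, not part of HtmlAgilityPack.
public static class FormFlagDemo
{
    public static void Main()
    {
        const string html = "<form><input name=\"q\"></form>";

        // Round trip with whatever flags the installed version ships with.
        Console.WriteLine("before: " + RoundTrip(html));

        // The workaround from the post. ElementsFlags is static, so this
        // affects every document parsed afterwards in the same process.
        HtmlNode.ElementsFlags["form"] = HtmlElementFlag.CanOverlap;

        Console.WriteLine("after:  " + RoundTrip(html));
    }

    // Hypothetical helper: parse a string, then serialize the DOM back.
    private static string RoundTrip(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        return doc.DocumentNode.OuterHtml;
    }
}
```

Because the flag is global, it probably makes sense to set it once at application startup rather than right before each parse, so that all the code in the process sees the same behaviour.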

Comments

Siderite

I don't know. The Codeplex site as well as the Twitter feed seem to have been inactive since August 2012. But HAP is widely used in a lot of projects.

Anonymous

Is anybody maintaining the HtmlAgilityPack now?

Siderite

http://ftp.ics.uci.edu/pub/ietf/html/rfc1866.txt - the RFC from November 1995. I quote: _The <FORM> element contains a sequence of input elements, along with document structuring elements._ Even if Simon Mourier considered some elements as possibly overlapping, that doesn't excuse assuming they always overlap or saving them with the closed format when transformed back to string. Even Simon writes in that StackOverflow issue: _you can save them back without breaking the original HTML_ which is exactly what this *bug* is about. Errare humanum est, perseverare diabolicum.

Anonymous

Html Agility Pack was designed well before HTML4. Read here for an explanation of FORM handling: http://stackoverflow.com/questions/4218847/htmlagilitypack-does-form-close-itself-for-some-reason

Siderite

Even if static flags that affect the behaviour of every class that was or ever will be instantiated were good design (and they aren't!), one can hardly call defining the _form_ element as childless, akin to _br_ or _meta_, any form of design. I am referencing here the form definition page at the w3 site, just in case you feel like arguing about it: http://www.w3.org/TR/html4/interact/forms.html

Anonymous

It's not a "bug". It was designed this way, and it's configurable via the ElementsFlags, like you said (it's not a "workaround", again, it was designed this way as well).
