I have an issue with form elements that contain duplicate attributes. I expected attributes in form elements to be deduplicated in the same way as in all other elements when parsing HTML via Parser.htmlParser(), but duplicate attributes seem to always be retained. With Parser.xmlParser(), form attributes are correctly deduplicated, so the issue only affects HTML.
Looking through old issues and pull requests, I found #1219, which seems to fix deduplication for start tags, but that fix doesn't seem to apply to form elements.
Here is a simple test case adapted from HtmlParserTest:
@Test public void dropsDuplicateAttributesInFormElement() {
    String html = "<form One=One ONE=Two Two=two one=Three One=Four two=Five></form>";
    Parser parser = Parser.htmlParser().setTrackErrors(10);
    Document doc = parser.parseInput(html, "");
    Element p = doc.selectFirst("form");
    assertEquals("<form one=\"One\" two=\"two\"></form>", p.outerHtml()); // normalized names due to lower casing
    assertEquals(1, parser.getErrors().size());
    assertEquals("Dropped duplicate attribute(s) in tag [form]", parser.getErrors().get(0).getErrorMessage());
}
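To make the expected result concrete, here is a minimal, library-free sketch of the deduplication rule the test assumes: names are lower-cased and the first occurrence of each name wins. This is not jsoup's actual implementation; the class name AttributeDedupSketch and the method dedupe are hypothetical and exist only for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; not part of jsoup.
public class AttributeDedupSketch {

    // Lower-cases each attribute name and keeps only the first value seen
    // for that name, preserving the order of first appearance.
    static Map<String, String> dedupe(List<Map.Entry<String, String>> attrs) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> attr : attrs) {
            String name = attr.getKey().toLowerCase(Locale.ROOT);
            kept.putIfAbsent(name, attr.getValue()); // later duplicates are dropped
        }
        return kept;
    }

    public static void main(String[] args) {
        // Same attributes, in the same order, as the <form> tag in the test case.
        List<Map.Entry<String, String>> attrs = new ArrayList<>();
        attrs.add(new SimpleEntry<>("One", "One"));
        attrs.add(new SimpleEntry<>("ONE", "Two"));
        attrs.add(new SimpleEntry<>("Two", "two"));
        attrs.add(new SimpleEntry<>("one", "Three"));
        attrs.add(new SimpleEntry<>("One", "Four"));
        attrs.add(new SimpleEntry<>("two", "Five"));

        System.out.println(dedupe(attrs)); // prints {one=One, two=two}
    }
}
```

The two surviving entries match the one="One" and two="two" attributes that the test expects in the serialized form element.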
perlan added a commit to perlan/jsoup that referenced this issue on May 5, 2023
Add test-case and fixes for attribute deduplication in form and empty elements
Fixes #1949
---------
Co-authored-by: Jonathan Hedley <jonathan@hedley.net>
jhy added a commit that referenced this issue on May 8, 2023