Not able to identify escaped/unescaped html entity in the text nodes #2206

Closed as not planned
@Muthukirthan

Description

There is no way to tell whether the input document contained & or &amp; in a text node, because Jsoup decodes entities when it parses and escapes the text again on output. The same applies to other pairs such as < and &lt;.

This gives Jsoup users no way to act on how a character was written in the input. For example, we may want to remove a literal < character from a text node but preserve it when it was given as the entity &lt;.

Note: Please let me know if there is already a way to tell the two apart.


An option that tells Jsoup not to modify text nodes would be very helpful. It would give users more flexibility and control.
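
To make the use case concrete, here is a minimal sketch of a possible workaround that runs outside the parser. It is not part of the Jsoup API. It assumes the raw input can be pre-processed before parsing: selected entity references are swapped for Private Use Area sentinel characters, so that after parsing a text node holds a sentinel where the input had an entity and the plain character where the input had it unescaped. The class name EntityMarker, the methods mark and dropLiteralLessThan, and the choice of sentinels are all hypothetical. Only the exact forms &amp;, &lt; and &gt; are handled. Numeric references such as &#60;, references without a trailing semicolon, and entities inside attributes or scripts would need extra care.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Jsoup: marks entity references before parsing.
final class EntityMarker {
    // Assumed sentinels from the Unicode Private Use Area; the input must not already contain them.
    static final Map<String, Character> SENTINELS = new LinkedHashMap<>();
    static {
        // "&amp;" goes first so that "&amp;lt;" is not mistaken for "&lt;".
        SENTINELS.put("&amp;", '\uE000');
        SENTINELS.put("&lt;", '\uE001');
        SENTINELS.put("&gt;", '\uE002');
    }

    /** Replaces the listed entity references in raw HTML with sentinels (run before parsing). */
    static String mark(String rawHtml) {
        String out = rawHtml;
        for (Map.Entry<String, Character> e : SENTINELS.entrySet()) {
            out = out.replace(e.getKey(), String.valueOf(e.getValue()));
        }
        return out;
    }

    /** Example policy for a parsed text node: drop unescaped '<', keep the escaped ones. */
    static String dropLiteralLessThan(String nodeText) {
        return nodeText
                .replace("<", "")          // '<' that was written unescaped in the input
                .replace('\uE001', '<')    // '<' that was written as &lt;
                .replace('\uE000', '&')    // restore the remaining marked entities
                .replace('\uE002', '>');
    }

    public static void main(String[] args) {
        String marked = mark("1 < 2 and 3 &lt; 4 &amp; more");
        // In real use, 'marked' would be parsed and each text node's text passed to the policy.
        System.out.println(dropLiteralLessThan(marked));
    }
}
```

In this sketch the marked string would be handed to the parser in place of the original input, and the policy method applied to the text of each text node afterwards. A built-in option would still be preferable, since string-level marking cannot tell markup context apart from text.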

@jhy