Support for dropping \u0000 during de-serialization #2165

Closed
vghero opened this issue Oct 24, 2018 · 2 comments

Comments

@vghero

vghero commented Oct 24, 2018

It would be nice to have a de-serialization feature that drops Unicode null characters (\u0000) from JSON during deserialization. I consider this character harmful: for example, it breaks persisting to Postgres, and support for strings containing such chars is limited. WDYT?

@cowtowncoder
Member

This sounds like a rather specific feature, and probably challenging from an optimization perspective (since no content within String values is dropped, although there is escape handling).

But wouldn't this be something that could quite easily be handled at the level of a UTF-8 (etc.) backed InputStream? I assume it'd be fine to drop them also as part of insignificant whitespace?

For InputStream and Reader, one can register an InputDecorator, which is called by JsonFactory before constructing the parser; it could wrap the input source in such a filter.
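A minimal sketch of the kind of filter described above, using only the JDK. The class name `NulDroppingReader` and the helper `readAll` are hypothetical, not part of Jackson. A reader like this could presumably be returned from the `decorate(IOContext, Reader)` method of an `InputDecorator` registered on the `JsonFactory`; that wiring is not shown here. Note the assumption about what it covers: it drops only raw U+0000 characters in the input. An escaped `\u0000` written out as six characters inside a JSON string passes through untouched and is decoded later by the parser.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper: a Reader wrapper that silently drops raw U+0000 characters.
public class NulDroppingReader extends FilterReader {

    public NulDroppingReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c;
        do {
            c = super.read();   // -1 at end of input
        } while (c == 0);       // skip raw NUL characters
        return c;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (true) {
            int n = super.read(cbuf, off, len);
            if (n <= 0) {
                return n;       // end of input (or nothing read)
            }
            // Compact the chunk in place, keeping everything except NUL.
            int kept = 0;
            for (int i = 0; i < n; i++) {
                char ch = cbuf[off + i];
                if (ch != 0) {
                    cbuf[off + kept++] = ch;
                }
            }
            if (kept > 0) {
                return kept;
            }
            // The whole chunk was NUL characters: read the next chunk.
        }
    }

    // Hypothetical helper for the demo: drain a Reader into a String.
    public static String readAll(Reader r) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[16];
        int n;
        while ((n = r.read(buf, 0, buf.length)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Raw NUL character inside the value: removed by the filter.
        String raw = "{\"name\":\"a\u0000b\"}";
        System.out.println(readAll(new NulDroppingReader(new StringReader(raw))));
        // prints {"name":"ab"}

        // JSON escape sequence (backslash, 'u', four zeros): not touched by the filter.
        String escaped = "{\"name\":\"a\\u0000b\"}";
        System.out.println(readAll(new NulDroppingReader(new StringReader(escaped))));
    }
}
```

Because the escaped form survives a stream-level filter, input that carries the null character as a JSON escape would still need handling after parsing, for example in a String deserializer.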

@vghero
Author

vghero commented Oct 29, 2018

I also thought about a custom Jackson String deserializer. But as you pointed out, that might come with a performance penalty :(. On the other hand, dropping such chars from incoming requests in general might not always be a good idea (binaries etc.), so a filter wrapping the request input stream would have to be restricted to certain content types only.
Hm. Leaving aside wrapping the input stream (which might also be a problem, since the stream's length would no longer match the Content-Length header) and ignoring performance, could a Jackson String deserializer be a way to go?
