Support for dropping \u0000 during de-serialization #2165

vghero · 2018-10-24T16:22:38Z

It would be nice to have a deserialization feature that drops Unicode null characters (\u0000) from JSON during deserialization. I consider this character harmful: for example, it breaks persisting to PostgreSQL, and support for strings containing such characters is limited in general. WDYT?

cowtowncoder · 2018-10-26T04:57:17Z

This sounds like a rather specific feature, and probably a challenging one from an optimization perspective (currently no content within String values is dropped, although there is escape handling).

But wouldn't this be something that could be handled quite easily at the level of the UTF-8 (etc.) backed InputStream? I assume it would be fine to also drop these characters where they appear as part of insignificant whitespace?

For InputStream and Reader sources, one can register an InputDecorator, which JsonFactory calls before constructing the parser; it could wrap the input source in such a filter.
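
To make the filter idea concrete, here is a minimal sketch of a Reader wrapper that drops raw U+0000 characters, using only the JDK. The class name NulDroppingReader and the filter helper are hypothetical and not part of Jackson. One caveat under this approach: it only removes raw null characters in the stream. A null written as the six-character JSON escape sequence is decoded later by the parser and would pass through such a filter unchanged.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical filter (not a Jackson class): drops raw U+0000 characters from a character stream. */
final class NulDroppingReader extends FilterReader {

    NulDroppingReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c;
        do {
            c = in.read();          // skip over any run of NUL characters
        } while (c == 0);
        return c;                   // a real character, or -1 at end of stream
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (true) {
            int n = in.read(buf, off, len);
            if (n <= 0) {
                return n;           // -1 signals end of stream
            }
            // Compact the chunk in place, keeping only non-NUL characters.
            int kept = off;
            for (int i = off; i < off + n; i++) {
                if (buf[i] != 0) {
                    buf[kept++] = buf[i];
                }
            }
            if (kept > off) {
                return kept - off;
            }
            // The whole chunk consisted of NULs: read the next chunk.
        }
    }

    /** Convenience helper for trying the filter on an in-memory string. */
    static String filter(String text) {
        try (Reader r = new NulDroppingReader(new StringReader(text))) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[256];
            int n;
            while ((n = r.read(buf, 0, buf.length)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

An InputDecorator implementation could return such a wrapper from its decorate method for Reader sources; a byte-level equivalent for InputStream would follow the same pattern.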

vghero · 2018-10-29T08:49:04Z

I also thought about a custom Jackson deserializer for the String type. But as you pointed out, that might come with a performance penalty :(. On the other hand, dropping such characters from all incoming requests might not always be a good idea (binary payloads, etc.), so a filter wrapping the request input stream would have to be restricted to certain content types.
Hm. Wrapping the input stream might also be a problem, because the stream length would no longer match the Content-Length header. Leaving that option and performance aside, could a Jackson String deserializer be the way to go?
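
As a rough illustration of the string-level alternative, here is a minimal sketch of the helper such a deserializer could apply to each decoded String value. NulStripper is a hypothetical name, and the wiring into a Jackson deserializer is deliberately left out. The fast path returns the original instance when no null character is present, so under that assumption the common case costs one scan and no allocation.

```java
/** Hypothetical helper (not part of Jackson): removes U+0000 from an already decoded string. */
final class NulStripper {

    private NulStripper() {
    }

    static String strip(String value) {
        if (value == null) {
            return null;
        }
        int first = value.indexOf('\0');
        if (first < 0) {
            return value;           // fast path: nothing to remove, no allocation
        }
        StringBuilder sb = new StringBuilder(value.length() - 1);
        sb.append(value, 0, first); // copy the clean prefix in one step
        for (int i = first + 1; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c != '\0') {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

Because this runs after the parser has decoded escape sequences, it also catches null characters that arrived in escaped form, and it leaves the raw request body and its length untouched.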

cowtowncoder closed this as completed Dec 22, 2024
