Description
Background and Motivation
Streaming large JSON data to a client sometimes requires streaming the response in order to get ideal latency during loading.
For example, one usually knows the total data size and the current position within it, and can easily display a progress element client-side during such streaming.
This becomes especially hard when you have an object that contains nested objects. Unfortunately (depending on your serializer settings, of course) TimeSpan
is serialized as a nested object. Consider:
public class MyModel
{
    public string Name { get; set; } = "Whatever";
    public int Age { get; set; } = 777;
    public DateTime Birthday { get; set; } = new DateTime(1986, 11, 12);
    public TimeSpan SomeDuration { get; set; } = TimeSpan.Zero;
}
The resulting JSON would be:
{
"name": "Whatever",
"age": 777,
"birthday": "1986-11-12T00:00:00.0000000",
"someDuration": {"ticks":0,"days":0,"hours":0,"milliseconds":0,"minutes":0,"seconds":0,"totalDays":0,"totalHours":0,"totalMilliseconds":0,"totalMinutes":0,"totalSeconds":0}
}
You can see that, in this case, the serialized duration is more likely to be split across chunks than anything else in my model. What usually happens is that part of a key is written at the end of a chunk, i.e. tot, total, totalS,
etc., which means I have to go back and find whatever started my object (which can be really hard depending on the nesting), delimit/slice there, attempt to partially parse what I do have (if anything), and then wait for the rest before proceeding.
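As a concrete sketch of the failure mode (the chunk contents and the boundary position here are made up for illustration), a key split across chunks makes per-chunk parsing impossible, and the only recourse today is to accumulate and re-parse:

```typescript
// Hypothetical chunk boundary: the nested duration object is split
// in the middle of the "totalDays" key, as described above.
const chunk1 = '{"name":"Whatever","someDuration":{"ticks":0,"tot';
const chunk2 = 'alDays":0}}';

// Naive per-chunk parsing fails, because chunk1 alone is not valid JSON.
let naiveFailed = false;
try {
  JSON.parse(chunk1);
} catch {
  naiveFailed = true;
}

// The workaround today: accumulate chunks and re-attempt a full parse
// after each one, succeeding only once the object is complete.
let buffer = "";
let result: any = null;
for (const chunk of [chunk1, chunk2]) {
  buffer += chunk;
  try {
    result = JSON.parse(buffer);
  } catch {
    // still incomplete; wait for more data
  }
}
console.log(naiveFailed, result !== null); // → true true
```

Note that the accumulate-and-re-parse fallback re-scans the whole buffer on every chunk, which is exactly the "complex parsing logic" this proposal aims to avoid.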
Proposed API
namespace System.Text.Json
{
    public sealed class JsonSerializerOptions
    {
        // existing members ..
+       public bool WriteIndividualObjects { get; init; }
    }
}
Usage Examples
I have a list of 1..N objects (or just one really large object).
I want to continue to let ASP.NET handle response serialization, and I don't want to manually write a JsonConverter
to get the semantics I want for each individual type.
I want the JsonSerializer
to respect WriteIndividualObjects:
although it cannot be guaranteed that the client will have an adequate buffer to receive such data in a single receive, I will only write in those chunks specified.
The goal is to avoid, when chunking the data, that the receiver has to deal with partial objects, if at all possible:
Extreme Case
[{...................... | Chunk (partial)
....},{...........}| Chunk (Split, End and Start Partial)
,{.......................| Chunk (partial)
.}] |EOS
Typical Case
[{......},{.........},{....... | Chunk (2 objects 1 Partial)
....},{...........}| Chunk (Split, End and Start Partial)
,{.......................| Chunk (partial)
.}] |EOS
I would like to avoid splitting JSON objects (especially keys) across chunks if at all possible.
I realize this is highly dependent on the client's receive buffer, among other things; however, if there were at least a strategy that did not involve manually writing each object into the response stream, then I would be satisfied.
After setting the proposed property, I would expect the writes/flushes to the stream to occur at object boundaries, making it less likely that I have to deal with split nested object graphs (yes, I know that if a single object is very large this is hard to avoid).
Result Case
[{......},| Start of list and 1 complete object and delimiter
{.........},| 1 Complete object and delimiter
{...........},| 1 Complete object and delimiter
{...........},| 1 Complete object and delimiter
{.....................}]|EOS 1 Complete object and end of list
Perhaps the start and end of the list should also be small separate writes to allow for even easier parsing (especially on fast connections), i.e.:
[|Start of list
{......},| 1 complete object and delimiter
{.........},| 1 Complete object and delimiter
{...........},| 1 Complete object and delimiter
{...........},| 1 Complete object and delimiter
{.....................}| 1 Complete object (user knows the end of the list must be next)
]|end of list
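To illustrate what this write pattern would buy the client, here is a hypothetical sketch (parseChunk is a made-up helper, not an existing API) in which every received chunk can be handled in isolation, with no cross-chunk buffering:

```typescript
// Each chunk is either "[", "]", or one complete object plus an
// optional trailing comma, per the Result Case above.
function parseChunk(chunk: string): any | null {
  const trimmed = chunk.trim();
  if (trimmed === "[" || trimmed === "]") return null; // list delimiters
  const body = trimmed.endsWith(",") ? trimmed.slice(0, -1) : trimmed;
  return JSON.parse(body); // guaranteed complete by the writer
}

// Chunks shaped like the second Result Case above.
const chunks = ["[", '{"age":1},', '{"age":2},', '{"age":3}', "]"];
const objects = chunks.map(parseChunk).filter((o) => o !== null);
console.log(objects.length); // → 3
```

Because each chunk is self-contained, the receiver can render progress (objects parsed so far) without ever holding more than one chunk in memory.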
Alternative Designs
namespace System.Text.Json
{
    public sealed class JsonSerializerOptions
    {
        // existing members ..
+       public int FlushSize { get; init; } // or a char[] / Span<char> FlushMarker(s), etc.
    }
}
This seems to complement DefaultBufferSize nicely; in theory I would then set DefaultBufferSize to something like 8192 and FlushSize
to 4096, but that still cannot guarantee I never write a chunk containing invalid JSON, i.e. one chunk ending in {.......,"SomeKe
and the next starting with y": null}.
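A quick sketch of why a size-based flush alone is insufficient: slicing a serialized payload at a fixed size (10 characters here, standing in for a FlushSize of 4096) routinely cuts keys and values mid-token:

```typescript
// Slice a serialized payload into fixed-size pieces and check whether
// any piece happens to be valid JSON on its own.
const payload = JSON.stringify([{ name: "Whatever", age: 777 }]);
const flushSize = 10; // stand-in for a realistic FlushSize
const pieces: string[] = [];
for (let i = 0; i < payload.length; i += flushSize) {
  pieces.push(payload.slice(i, i + flushSize));
}

const invalid = pieces.filter((p) => {
  try {
    JSON.parse(p);
    return false;
  } catch {
    return true;
  }
});
console.log(pieces.length, invalid.length); // every piece is invalid alone
```

Only flush boundaries aligned to token or object boundaries can guarantee parseable chunks, which is why the size alone cannot express the desired semantics.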
Consider Also
namespace System.Text.Json
{
    public sealed class JsonSerializerOptions
    {
        // existing members ..
+       public void OnStartWrite(); // Called when a single complete object or primitive starts being written to the underlying writer
+       public void OnEndWrite();   // Called when a single complete object or primitive has been completely written to the underlying writer
+       public void OnStartRead();  // Called when a single complete object or primitive starts being read from the underlying reader
+       public void OnEndRead();    // Called when a single complete object or primitive has been completely read from the underlying reader
    }
}
This would allow manual flushing of the underlying Stream,
or whatever other action might be required.
It could be offered in addition to the aforementioned property (or technically instead of it, though that would require sizes to be known in advance), and is probably useful in its own right.
One could also use WebSockets / SignalR, though that requires somewhat of a paradigm shift in the API and overhead for running the endpoint.
One could also use a client-side streaming parser such as oboe.js, which provides a variety of options and is very small.
I believe there are also several variations of this approach, most of which are spec compliant, e.g.:
- Using a newline to delimit chunks of complete data only, e.g. {....}\r\n
or [\r\n{....}\r\n,\r\n{....}\r\n,\r\n]\r\n
- Using a newline to delimit all pieces of an object, e.g. WriteIndented
(just need to ensure a partial object key or value is never emitted; if in a large graph, stop at the last valid key which can fit in the buffer and start the next write at that key)
- Using a special value to delimit completed data, e.g. [{...},true,{....}],
where true
is the delimiter (still error prone: without changes, the true
delimiter may itself be split across chunks, no matter what delimiter you choose)
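The newline-delimited variant is easy to consume client-side even when chunks still split objects, because resynchronizing only requires buffering the last partial line. A sketch (makeLineParser is a made-up helper for illustration):

```typescript
// Keep the last, possibly partial, line buffered and parse every
// completed line. Even a chunk cut mid-key resynchronizes cleanly.
function makeLineParser() {
  let buffer = "";
  return (chunk: string): any[] => {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // last piece may be incomplete
    return lines.filter((l) => l.trim() !== "").map((l) => JSON.parse(l));
  };
}

const feed = makeLineParser();
const first = feed('{"age":1}\n{"ag'); // second object split mid-key
const second = feed('e":2}\n'); // completed by the next chunk
console.log(first.length, second.length); // → 1 1
```

The delimiter never needs to survive chunking intact here, since splitting on "\n" works regardless of where the transport cuts the stream.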
Risks
Low; depending on the design choice, the OnWrite
methods could get chatty very quickly.
Benefits
Using fetch and ReadableStream, a client can then rely on this setting to avoid complex parsing logic that must deal with incomplete JSON data while streaming.
Other notes
It may be beneficial to control the tab, newline, and other characters emitted by the readers and writers through the options.
I would not be opposed to FlushIndividualObjects
as the name, as that is perhaps more in line with the actual semantics of its operation.
I would not be opposed to FlushDelimiters
as the name if that approach were taken.