-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add .chunk()
associated function to blocking::Response
, add .json_chunk()
method to Response
and blocking::Response
#2000
Open
LoganDark
wants to merge
2
commits into
seanmonstar:master
Choose a base branch
from
LoganDark:json_chunk
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
2 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, this is assuming that every "chunk", which is basically a
read()
call on the socket, is a full JSON message? It's a fragile assumption, since the data could be combined into a single read, or it could be too large and be broken up into multiple reads...There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
The assumption is that
chunk()
corresponds to one chunk fromTransfer-Encoding: chunked
. Not that it corresponds to whatever the OS decidesread()
is.Since the chunk size is dictated by the server, they should be complete JSON messages when you are using an endpoint that returns a complete JSON message each chunk.
If this assumption is incorrect, then
reqwest
probably needs to be extended to supportTransfer-Encoding: chunked
, because otherwise I can't consume token streams fromtext-generation-inference
.AFAICT,
reqwest
has tests that expect.body_mut().next()
to be an entire chunk (fromTransfer-Encoding: chunked
), and the implementation ofchunk()
defers to that exact same method call, so if that correspondence is not true then I don't know what is even going on.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reqwest (and hyper) does support chunked transfer-encoding. But it doesn't buffer up every "chunk" that way. The decoder in hyper will do 1 OS read, and then pass on either up to the chunked delimiter, or the full thing if the delimiter is not yet reached.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you suggest I should tell when a delimiter is reached, then? (Is there a method for this?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Chunked delimiters have no semantic significance, so they can also be changed as they go through gateways/proxies. I'm not familiar with the project you linked. Most often, JSON streaming is done by delimiting JSON objects with newlines. Then what you do, is read and buffer until you get a newline, and then you can decode the object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a server-side issue, isn't it? If the server hosts an endpoint that doesn't use the encoding properly, that's a bug on their end.
If
reqwest
needs to be resilient to this sort of issue, I would still want to support endpoints that actually work properly - so we'd end up having two families of functions, one that uses chunks and one that uses newlines.In that case, I'll still need a way to tell the end of a chunk apart from an arbitrary read boundary.
Is there any way to test for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, there isn't a way to test for it. The properties of the transfer are something that hyper handles, they are not considered part of the content (vs
Content-Encoding
).It's sort of similar to how calling
read()
on an OS TcpStream won't tell you which bytes were in each segment, since the OS will combine segments into a single buffer if it receives multiple in-between calls toread()
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to perform a couple tests and report back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, it looks like this server actually uses Server-Sent Events, not just sending raw chunks. So it is delimited by two newlines between each message. That's a bit better.
In that case, this will probably need a better name. What do you say to this: in addition to the
chunk
family of functions, there'sevent
that strips thedata:
prefix and reads until\n\n
- then havejson_event
that decodes from there?There wouldn't be any need for
json_chunk
at all then, or I could just make it read until a single newline like you suggested, just in case there is an API out there that doesn't use SSE.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's my plan - let's remove the new
json_chunk
methods from this PR and narrow the scope to just adding that blockingchunk
method - and then I'll work on some Server-Sent Events support in a separate PR.It looks like reqwest's
Decoder
doesn't support it - we can read chunks until\n\n
, but we can't just give portions of the chunk back to theDecoder
if the server happens to send multiple events in a single chunk; if chunking is an implementation detail and we have to be agnostic, then we have to support that scenario.Maybe we can provide an iterator over each event in an SSE stream? Then the iterator can keep track of whether the last event ended in the middle of some chunk.