
rpc will delay more than 20 seconds before it returns #88

Closed
qiyun opened this issue Feb 13, 2012 · 6 comments
qiyun commented Feb 13, 2012

This issue was introduced by the Jan 2 2012 commit; before that it worked fine. Now for each rpc, if even a small amount of data is involved, say 10 KB sent from client to server, the server blocks for more than 20 seconds before returning, and it is doing absolutely nothing during that time. This occurs even when the client and server are on the same machine.
I wonder if the server is waiting for a timeout to occur.


qiyun commented Feb 13, 2012

BTW, this kind of issue has a huge impact on performance. Maybe we should add more tests to guarantee the quality of yaws in the future.

I'd also like to thank you all for making this great software.


vinoski commented Feb 13, 2012

Do you have a test case that shows the problem? I can't figure out what problem you're seeing given your description.

It's unlikely the Jan 2 2012 commit broke this, since that commit just moved a few files around.


qiyun commented Feb 14, 2012

I debugged into yaws_server.erl and found the problem:

  1. deliver_dyn_part() calls my mod:out().
  2. My mod:out() has already read all the client data during processing.
  3. handle_out_reply() returns undefined.
  4. A flush is then performed, using the stale CliDataPos.
  5. The flush hangs until a timeout occurs.

I think consuming all the client data at once in my mod:out function is more efficient than returning get_more. So how can I change the deliver_dyn_part() function to reflect this need? Thank you for your suggestions.
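For contrast, the get_more protocol being discussed could look roughly like this (a minimal sketch, not code from this thread; the module name `acc_post` is hypothetical, and it assumes the `#arg{}` record from yaws_api.hrl with its `clidata`, `cont`, and `state` fields):

```erlang
-module(acc_post).                 %% hypothetical module name
-export([out/1]).
-include("yaws_api.hrl").          %% #arg{} record definition

%% Accumulate the POST body block by block via the get_more protocol,
%% instead of reading from the socket directly inside out/1.
out(A) ->
    Acc = case A#arg.state of      %% chunks accumulated so far
              undefined -> [];
              Chunks    -> Chunks
          end,
    case A#arg.clidata of
        {partial, Data} ->
            %% More data is pending: ask Yaws to read the next block
            %% and call us again, threading our state through.
            {get_more, A#arg.cont, [Data | Acc]};
        Data when is_binary(Data) ->
            %% Final (or only) block: assemble the whole body.
            Body = iolist_to_binary(lists:reverse([Data | Acc])),
            {html, io_lib:format("received ~p bytes", [byte_size(Body)])}
    end.
```

With this pattern Yaws always knows how much of the request body has been consumed, which is exactly the bookkeeping (CliDataPos) that goes stale when the script reads the socket itself.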


capflam commented Feb 14, 2012

Well, the calls to flush() after handle_out_reply() were added to let a script return without having to take care of remaining data on the socket. A good example is a file-upload script that must return an error if the request's content length exceeds an upper bound.

The current code is not designed to handle your use case; it is Yaws's responsibility to read incoming data and pass it to scripts. It reads data in blocks to save memory (and avoid memory exhaustion) and to handle chunked transfer-encoding requests. Retrieving more data by returning 'get_more' can add a small overhead, but I suspect it is negligible compared to a TCP read.

To limit or avoid the block-wise reads, you can play with the 'partial_post_size' value. If you control the client, or if you know that request sizes will never be a problem, you can set it to 'nolimit', but that's generally not a good choice. In most cases, letting Yaws read the data is the better option.
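As a sketch, 'partial_post_size' is set per server in yaws.conf (the server name, port, and docroot below are placeholders; only the partial_post_size line is the point):

```
# Hypothetical yaws.conf fragment. partial_post_size controls the block
# size Yaws uses when reading a request body before each out/1 call.
<server example.com>
    port = 8000
    listen = 0.0.0.0
    docroot = /var/www
    partial_post_size = nolimit   # deliver the whole body at once (use with care)
</server>
```

Setting a numeric value instead of 'nolimit' caps each block at that many bytes, which is the memory-safety trade-off described above.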

On the other hand, maybe you have good reasons to read the data on your side, and it may seem unfair to forbid that. So I'll try to quickly find an elegant solution to handle this case.


qiyun commented Feb 15, 2012

In your example, when uploading files, if the request's content length exceeds the upper bound, isn't it better to just return an error code and then close the TCP connection, rather than flushing the data? I assume closing the connection discards all buffered data and releases all the resources for that connection. With flushing, if the incoming data is huge (say > 1 GB), do we flush all of it before closing the connection?


capflam commented Feb 15, 2012

You're right, closing the connection is a better solution in this case. However, some clients (e.g. Firefox and Google Chrome) do not handle connection close gracefully when the connection is closed while they are sending data; they do not read the server response. So when the response is important, for example to warn the end user that an error occurred, we must be fair and flush the data before closing the connection.

Systematically flushing data is probably not a good idea, so I'll revert commit d09ed3d. But I'll also offer a way to flush data on demand, to avoid the get_more/Mod:out() round trips.

ghost assigned capflam Feb 15, 2012
capflam pushed a commit that referenced this issue Feb 15, 2012
2 changes here:

 * Revert "Flush remaining data when dynamic content is delivered"

This reverts commit d09ed3d.
Systematically flushing data is not a good idea.

 * Add "flush" as a possible return value of the out/1 function

Some clients (e.g. Firefox and Google Chrome) do not handle connection
close gracefully when the connection is closed while they are sending
data; they do not read the server response. So, when the response is
important, for example to warn the end user that an error occurred, we
must be fair and flush the data before closing the connection.
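The new return value might be used along these lines (a hedged sketch, not the commit's own test code; the module name, the 10 MB cap, and the assumption that `#headers.content_length` arrives as a string are all illustrative):

```erlang
-module(upload_limit).             %% hypothetical module name
-export([out/1]).
-include("yaws_api.hrl").          %% #arg{} and #headers{} records

-define(MAX_BYTES, 10 * 1024 * 1024).   %% illustrative 10 MB cap

%% Reject an oversized upload, but include 'flush' in the return so
%% Yaws drains the remaining body before the connection is closed and
%% browsers still get to read the error response.
out(A) ->
    H = A#arg.headers,
    case H#headers.content_length of
        Len when is_list(Len) ->
            case list_to_integer(Len) > ?MAX_BYTES of
                true ->
                    [flush, {status, 413}, {html, "upload too large"}];
                false ->
                    handle_upload(A)
            end;
        _ ->
            handle_upload(A)
    end.

handle_upload(_A) ->
    {html, "ok"}.                  %% placeholder for the real upload path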
qiyun closed this as completed Feb 16, 2012
jgrinstead pushed a commit to jgrinstead/yaws that referenced this issue Apr 23, 2015