
rpc will delay more than 20 seconds before it returns #88

Closed
qiyun opened this issue Feb 13, 2012 · 6 comments
qiyun commented Feb 13, 2012

This issue was introduced by the Jan 2 2012 commit; before that it worked fine. Now for each rpc, if even a small amount of data is involved, say 10 KB sent from client to server, the server blocks for more than 20 seconds before returning, and it is doing absolutely nothing during that time. This occurs even when the client and server are on the same machine.
I wonder if the server is waiting for a timeout to occur.


qiyun commented Feb 13, 2012

BTW, this kind of issue has a huge impact on performance. Maybe we should add more tests to guarantee the quality of yaws in the future.

I'd also like to thank you all for making this great software.


vinoski commented Feb 13, 2012

Do you have a test case that shows the problem? I can't figure out what problem you're seeing given your description.

It's unlikely the Jan 2 2012 commit broke this, since that commit just moved a few files around.


qiyun commented Feb 14, 2012

I debugged into yaws_server.erl and found the problem:

  1. deliver_dyn_part() calls my mod:out().
  2. My mod:out() has already read all the client data during processing.
  3. handle_out_reply() returns undefined.
  4. A flush is then performed, using the stale CliDataPos.
  5. The flush hangs until a timeout occurs.

I think consuming all the client data at once in my mod:out function is more efficient than returning get_more. So how can I change the deliver_dyn_part() function to reflect this need? Thank you for your suggestions.
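For contrast, the get_more protocol being discussed could look roughly like this (a minimal sketch, not code from this thread; the module name `acc_post` is hypothetical, and it assumes the `#arg{}` record from yaws_api.hrl with its `clidata`, `cont`, and `state` fields):

```erlang
-module(acc_post).                 %% hypothetical module name
-export([out/1]).
-include("yaws_api.hrl").          %% #arg{} record definition

%% Accumulate the POST body block by block via the get_more protocol,
%% instead of reading from the socket directly inside out/1.
out(A) ->
    Acc = case A#arg.state of      %% chunks accumulated so far
              undefined -> [];
              Chunks    -> Chunks
          end,
    case A#arg.clidata of
        {partial, Data} ->
            %% More data is pending: ask Yaws to read the next block
            %% and call us again, threading our state through.
            {get_more, A#arg.cont, [Data | Acc]};
        Data when is_binary(Data) ->
            %% Final (or only) block: assemble the whole body.
            Body = iolist_to_binary(lists:reverse([Data | Acc])),
            {html, io_lib:format("received ~p bytes", [byte_size(Body)])}
    end.
```

With this pattern Yaws always knows how much of the request body has been consumed, which is exactly the bookkeeping (CliDataPos) that goes stale when the script reads the socket itself.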


capflam commented Feb 14, 2012

Well, the calls to flush() after handle_out_reply() were added to let a script return without having to take care of remaining data on the socket. A good example is a file-upload script that must return an error if the request's content length exceeds an upper bound.

The current code is not designed to handle your use case; it is Yaws's responsibility to read incoming data and pass it to scripts. It reads data in blocks to save memory (and avoid memory exhaustion) and to handle chunked transfer-encoding requests. Retrieving more data by returning 'get_more' can add a small overhead, but I suspect it is negligible compared to a TCP read.

To limit or avoid the block-wise reads, you can play with the 'partial_post_size' value. If you control the client, or if you know that request sizes will never be a problem, you can set it to 'nolimit', but that's generally not a good choice. In most cases, letting Yaws read the data is the better option.
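As a sketch, 'partial_post_size' is set per server in yaws.conf (the server name, port, and docroot below are placeholders; only the partial_post_size line is the point):

```
# Hypothetical yaws.conf fragment. partial_post_size controls the block
# size Yaws uses when reading a request body before each out/1 call.
<server example.com>
    port = 8000
    listen = 0.0.0.0
    docroot = /var/www
    partial_post_size = nolimit   # deliver the whole body at once (use with care)
</server>
```

Setting a numeric value instead of 'nolimit' caps each block at that many bytes, which is the memory-safety trade-off described above.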

On the other hand, maybe you have good reasons to read the data on your side, and it may seem unfair to forbid that. So I'll try to quickly find an elegant solution to handle this case.


qiyun commented Feb 15, 2012

In your example, when uploading files, if the request's content length exceeds the upper bound, isn't it better to just return an error code and then close the TCP connection, rather than flushing the data? I assume closing the connection discards all buffered data and releases all the resources for that connection. With flushing, if the incoming data is huge (say > 1 GB), do we flush all of it before closing the connection?


capflam commented Feb 15, 2012

You're right, closing the connection is a better solution in this case. However, some clients (e.g. Firefox and Google Chrome) do not handle connection close gracefully when the connection is closed while they are sending data; they do not read the server response. So when the response is important, for example to warn the end user that an error occurred, we must be fair and flush the data before closing the connection.

Systematically flushing data is probably not a good idea, so I'll revert commit d09ed3d. But I'll also offer a way to flush data on demand, to avoid the get_more/Mod:out() round trips.

ghost assigned capflam Feb 15, 2012
capflam pushed a commit that referenced this issue Feb 15, 2012
2 changes here:

 * Revert "Flush remaining data when dynamic content is delivered"

This reverts commit d09ed3d.
Systematically flushing data is not a good idea.

 * Add "flush" as a possible return value of the out/1 function

Some clients (e.g. Firefox and Google Chrome) do not handle connection
close gracefully when the connection is closed while they are sending
data; they do not read the server response. So, when the response is
important, for example to warn the end user that an error occurred, we
must be fair and flush the data before closing the connection.
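The new return value might be used along these lines (a hedged sketch, not the commit's own test code; the module name, the 10 MB cap, and the assumption that `#headers.content_length` arrives as a string are all illustrative):

```erlang
-module(upload_limit).             %% hypothetical module name
-export([out/1]).
-include("yaws_api.hrl").          %% #arg{} and #headers{} records

-define(MAX_BYTES, 10 * 1024 * 1024).   %% illustrative 10 MB cap

%% Reject an oversized upload, but include 'flush' in the return so
%% Yaws drains the remaining body before the connection is closed and
%% browsers still get to read the error response.
out(A) ->
    H = A#arg.headers,
    case H#headers.content_length of
        Len when is_list(Len) ->
            case list_to_integer(Len) > ?MAX_BYTES of
                true ->
                    [flush, {status, 413}, {html, "upload too large"}];
                false ->
                    handle_upload(A)
            end;
        _ ->
            handle_upload(A)
    end.

handle_upload(_A) ->
    {html, "ok"}.                  %% placeholder for the real upload path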
qiyun closed this as completed Feb 16, 2012
jgrinstead pushed a commit to jgrinstead/yaws that referenced this issue Apr 23, 2015