Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support callable body for explicit streaming support. #1745

Merged
merged 2 commits into from
Jan 21, 2022

Conversation

ioquatix
Copy link
Member

@ioquatix ioquatix commented Apr 19, 2021

Allow a rack response body to explicitly opt into streaming by implementing #call instead of #each. This interface is desirable for a number of reasons. However, this PR should be considered an experimental spike as there are also a number of downsides, and this essentially represents a breaking change to some typical but unwise use cases.

An example would look something like this:

run lambda{|env|
  body = lambda do |stream|
    stream.write("Hello World")
  ensure
    stream.close
  end
  
  [200, [], body]
}

Semantically, it is almost identical to partial hijack. Full hijack is removed. Of course, servers are welcome to support both Rack 2.x hijack, and 3.x whatever.

Current Design

Rack currently supports streaming using rack.hijack. There are two ways to hijack the underlying socket for a given request: a partial hijack and a full hijack. Not all servers support both modes of operation and neither method is fully compatible with HTTP/2.

Partial Hijack

A partial hijack allows the web server to respond with the response status, headers, and then yields the stream back to the application code. It does this by using a special rack.hijack header:

def call(env)
	body = proc do |stream|
		stream.write("Hello World")
		stream.close
	end
	
	if env['rack.hijack?']
		return [200, {'rack.hijack' => body}, nil]
	end
end
  • It is not clear whether the body needs to handle the formatting of the response, i.e. chunked encoding, compression, transfer encoding, etc.
  • It mixes rack. specific headers with normal HTTP headers which is computationally expensive for the server to detect.
  • It's not clear what the returned body should be.

Full Hijack

A full hijack allows the web server to completely take control of the underlying socket. This occurs at the point the rack.hijack is invoked.

def call(env)
	if hijack = env['rack.hijack']
		socket = hijack.call
		
		Thread.new do
			socket.write("Hello World")
			socket.close
		end
		
		return nil # ???
	end
end
  • The socket might not actually be a raw socket, if the server is using TLS.
  • Extreme care is required in order to handle the protocol correctly.
  • It's not clear if such a design can work with anything other than HTTP/1.
  • The user code must return from the call(env) method in order for the server continue handling requests.
  • It's not clear what the response should be.

Supporting HTTP/2

It is possible to support partial hijack in HTTP/2. The hijacked stream is multiplexed the same as any other connection. However, such a design cannot be thread safe without significant design trade-offs. Full hijack is semantically incompatible with HTTP/2.

Typical Scenarios

There are several typical scenarios where rack.hijack has been used.

ActionCable

ActionCable uses full hijack in order to negotiate and communicate using WebSocket protocol. Such a design requires the ActionCable server to run in the same process as the web server.

MessageBus

Actually on Falcon
MessageBus uses full hijack in order to keep the socket alive and feed messages occasionally.

Design Options

There are three possible directions we can go in.

Rack Hijack

We can continue to adapt and bend this model to suit. It currently works, but it will present issues going forward w.r.t. HTTP/2 and generally feels like a hack.

Callable Body

As proposed here, some method on the body other than #each is used to determine a streaming response.

  • Most intuitive design IMHO.
  • Can break some questionably designed "body" middlewares: [200, [], self.dup].

Typed Body

As an alternative, using a module or class to indicate that a response body is streamable, e.g. Rack::Streamable. Streaming would be determined by:

body.is_a?(Rack::Streamable)
  • More explicit design.
  • How does this work with BodyProxy?
  • How does one implement this without an explicit Rack dependency?

Explicit Enumeration

It's entirely possible to return a body which can yield chunks of data, i.e.

class StreamingBody
  def each(&stream)
    stream.call "Hello"
    stream.call "World"
  end
end

run lambda{|env|
  [200, [], StreamingBody.new]
}

This is already possible with Rack 2.x, however it's not as flexible as a full stream interface. This has been discussed.

  • Works today.
  • Inefficient handling of bi-directional I/O (need to use rack.input).
  • Less flexible, both in terms of interface & lifetime/scope.
  • Less intuitive.

@jeremyevans
Copy link
Contributor

This callable body proposal breaks both of the typical scenarios where rack.hijack is used (full hijacking), so I would not consider it as a replacement for rack.hijack. This just breaks the common scenarios where rack.hijack is used, with no consideration for how those cases should be handled. Are we just going to tell MessageBus and ActionCable to stick with Rack 2.x? Considering ActionCable runs in the same process as Rails, that would be equivalent to telling Rails to stick with Rack 2.x. I don't think that is a wise choice.

Is it necessary to remove rack.hijack to implement this proposal? I see how the proposals are related, but it doesn't seem necessary. It seems like you are removing rack.hijack simply because you don't like it, and not because it is required to implement support for #call.

As streaming responses can already be handled via #each, I think also supporting #call for streaming adds little value. Could you go into more detail what advantages #call support adds for streaming, that you cannot get using #each? Is it just that the #call API is nicer in some cases, or is there something more?

@tenderlove
Copy link
Member

This callable body proposal breaks both of the typical scenarios where rack.hijack is used (full hijacking), so I would not consider it as a replacement for rack.hijack.

Can you elaborate? I don't understand why this is true. Is it due to headers?

Regardless, I don't see why we can't support rack.hijack as well as this proposal.

Is it just that the #call API is nicer in some cases, or is there something more?

I think making the API nicer is a legit reason for change. I imagine #each as a "pull" operation (iow, the webserver says "give me more data") when most people interact with sockets and files via a "push" (writing to the file or socket).

@jeremyevans
Copy link
Contributor

Is it just that the #call API is nicer in some cases, or is there something more?

I think making the API nicer is a legit reason for change. I imagine #each as a "pull" operation (iow, the webserver says "give me more data") when most people interact with sockets and files via a "push" (writing to the file or socket).

I agree with you that even if the change only makes the API nicer without adding any features, we should still consider it. In my opinion, a large change to SPEC should require the API be made much nicer, not just a little nicer. How much nicer the API is made by this change is going to be subjective, of course.

I also want to check my understanding that this change is only for a nicer API, and it doesn't enable anything not already possible. Maybe this does enable something not currently possible and I missed it?

@ioquatix
Copy link
Member Author

ioquatix commented Apr 19, 2021

Maybe this does enable something not currently possible and I missed it?

It's semantically identical to partial hijack.

This callable body proposal breaks both of the typical scenarios where rack.hijack is used (full hijacking)

I assert that all existing systems which use full hijack can be implemented better (simpler, safer, more compatible) using streaming responses. We already have some examples of this, e.g. async_cable.

Servers can continue to support the Rack 2.x hijack model as well as the Rack 3.x streaming model. The Falcon PoC does this. However, removing hijack is about communicating that the hijack interface should not be used going forward. It's already completely optional so this isn't really materially any different...

@tenderlove
Copy link
Member

AFAICT full hijack can't be implemented on H2 servers (because there is no real socket to hand out). Mainly I care about:

  • Not breaking apps
  • Communicating to web server authors the API they should implement
  • Communicating to web server consumers what API they should expect

If we could accomplish all of these goals, then I am happy. Removing hijack makes me worried because I don't want to break apps. If we clearly communicate to web server authors and web server consumers that it's OK for a web server to implement multiple versions of the Rack SPEC, then I think my list is satisfied.

Giving web server authors explicit encouragement may even help them to optimize their servers (for example puma --rack2 could be implemented differently).

@jeremyevans
Copy link
Contributor

This callable body proposal breaks both of the typical scenarios where rack.hijack is used (full hijacking)

I assert that all existing systems which use full hijack can be implemented better (simpler, safer, more compatible) using streaming responses. We already have some examples of this, e.g. async_cable.

Current applications that use an evented approach to full hijack and a single threaded approach to streaming responses will not work if you switch full hijack to streaming responses. For example, applications currently using Unicorn with MessageBus. You may argue that a simpler, safer, more compatible approach would just be switch to another web server, but I don't find that argument compelling. I'm assuming by more compatible you are referring to HTTP 2/3 or TLS support, since any approach that removes rack.hijack support will be facto be less compatible at a Rack level, and not more.

Servers can continue to support the Rack 2.x hijack model as well as the Rack 3.x streaming model. The Falcon PoC does this. However, removing hijack is about communicating that the hijack interface should not be used going forward. It's already completely optional so this isn't really materially any different...

Removing full hijack support breaks apps. As rack.hijack is currently optional, I think we should keep it in the SPEC instead of removing it, even if we decide to add #call support. Servers such as Falcon can already not support rack.hijack if they want, or only support it via a flag.

If we clearly communicate to web server authors and web server consumers that it's OK for a web server to implement multiple versions of the Rack SPEC, then I think my list is satisfied.

I'm against encouraging servers to support multiple versions of SPEC, and against removing optional full hijacking from SPEC without replacement. rack.hijack (both full and partial) is already optional, so there doesn't seem to be a reason to remove it.

Theoretically, we could add #call and make it optional as well. Considering that partial hijacking is optional and not commonly used, and #call is a replacement for partial hijacking, making #call optional makes sense to me. I could even consider dropping partial hijacking from SPEC in replacing it with #call, assuming it handles exactly the same use cases.

@tenderlove
Copy link
Member

I'm against encouraging servers to support multiple versions of SPEC,

Why?

and against removing optional full hijacking from SPEC without replacement. rack.hijack (both full and partial) is already optional, so there doesn't seem to be a reason to remove it.

I think this is fair. Though I'm still interested in why we shouldn't encourage web server authors to support multiple SPEC versions. It would be nice if we could iterate on the SPEC, and allowing web servers to implement multiple versions would allow us to do that. Only ever being able to add to the SPEC seems painful.

@ioquatix
Copy link
Member Author

ioquatix commented Apr 19, 2021

For example, applications currently using Unicorn with MessageBus.

@SamSaffron are you able to comment on this proposal w.r.t. this use case?

Removing full hijack support breaks apps.

I think it's more nuanced than this.

Full hijack is completely optional, so apps that depend on it are already liable to be broken - i.e. it's not a given that the model works or makes sense from one server to the next, or even any other protocol other than HTTP/1.x. The proposed interface is mandatory and thus can be depended on in a Rack 3 compliant server. I am completely against making it optional, this sends an incredibly confusing message to consumers of the interface.

To elaborate on the above point, it's not clear how we could even make full hijack work with HTTP/2, since the way headers are sent is entirely different. It gives a false sense of security because it ties apps to a very specific protocol. The entire point of Rack is to be an abstract wrapper, yet we provide this protocol specific escape hatch which violates all the goals of the project. It's like "we couldn't figure out how to do streaming, so here is the socket so you can figure it out for yourself, good luck with that!". The reason to remove full hijack is because it's fundamentally unworkable as an abstraction going forward, without emulating the entire HTTP/1 behaviour at the wire level for HTTP/2 and HTTP/3 clients. We basically dug our own hole on this, and this PR is a way to try and get out of it.

The only "valid" use cases I've seen for full hijack is upgraded HTTP/1 connections, including server sent events (SSE) and WebSockets. Both concerns are handled better with streaming responses managed by the server itself, so application authors would be wise to migrate to the new model.

Existing applications can continue working using Rack 2.x and this should be around for a long time if Rack 1.x is anything to go by. Falcon supports Rack 1.x and 2.x, and I imagine that servers will want to support both interfaces. This is not the only breaking change that's been adopted for Rack 3.x, so from a compatibility PoV, servers may want to retain Rack 2 and Rack 3 compatibility shims, or may prefer to have major versions that continue to be supported which implement different SPECs.

Completely in agreement with @tenderlove on iterating on the SPEC (e.g. as proposed) and servers being compatible with Rack 2 and Rack 3 makes total sense to me. Assuming this, there is no risk to backwards compatibility except when Rack 2.x becomes a distant memory.

@jeremyevans
Copy link
Contributor

I'm against encouraging servers to support multiple versions of SPEC,

Why?

Removing something from SPEC without replacement and then telling servers to implement an old version of SPEC if they want it sends the wrong message in my opinion. If we have a replacement for full hijacking, then we can replace it in SPEC and encourage web servers to implement the new SPEC. What if a server wants to continue to support full hijacking and also wants to support #call? If both are supported in the Rack 3 SPEC (with full hijacking being optional, and #call being either mandatory or optional), then a server can do both. If full hijacking is removed from Rack 3 SPEC, then a server would have to choose between using Rack 2 SPEC for full hijacking and using Rack 3 SPEC for #call.

and against removing optional full hijacking from SPEC without replacement. rack.hijack (both full and partial) is already optional, so there doesn't seem to be a reason to remove it.

I think this is fair. Though I'm still interested in why we shouldn't encourage web server authors to support multiple SPEC versions. It would be nice if we could iterate on the SPEC, and allowing web servers to implement multiple versions would allow us to do that. Only ever being able to add to the SPEC seems painful.

I'm not proposing we never remove anything from SPEC. I'm simply proposing we don't remove something optional without a replacement when it has common use cases (even if we don't like those use cases).

I am completely against making it optional, this sends an incredibly confusing message to consumers of the interface.

By your own admission, by far the most common use of hijacking is for full hijacking and not partial hijacking. If partial hijacking was not required by SPEC, and isn't used much, why should #call be required and not optional? How is making #call optional more confusing than making hijacking optional?

@SamSaffron
Copy link
Contributor

I assert that all existing systems which use full hijack can be implemented better (simpler, safer, more compatible) using streaming responses. We already have some examples of this, e.g. async_cable.

Hmm, what if it takes me 30 seconds to tell if I should return a 200 vs 408? How do you inform Rack that later on once you figure stuff out you will have your status code and headers ready?

Overall I am very much with @jeremyevans . It is possibly we could re-engineer message bus so it works with the new interface, it does not defer decisions on status code and headers.

That said, getting chunking right under all conditions is hard, see: https://github.com/discourse/message_bus/blob/e30ed86c9e29679b7fd090bb797007f3933efbef/lib/message_bus/client.rb#L258-L259

There is huge potential for having disparity in implementation of chunked encoding between servers which can break apps erratically.

Killing off full hijack would certainly break this hack, with no option of a replacement: https://github.com/discourse/discourse/blob/42f6c9b6b9db81e2a6a9ca45e76f067980525f67/lib/hijack.rb , that would be a huge shame.

Regarding HTTP/2 ... to me this feels like an elusive pipe dream to have this built into the app for many upcoming years, just chuck NGINX in front and your are done.

@tenderlove
Copy link
Member

Removing something from SPEC without replacement and then telling servers to implement an old version of SPEC if they want it sends the wrong message in my opinion.

Ya, that makes sense. And removing a feature with no replacement doesn't seem to align well with my "don't want to break apps" line item 😆

My biggest problems with full hijack are (and I'm quoting the SPEC here):

There is a deliberate lack of full specification around
rack.hijack_io, as semantics will change from server to server.
Users are encouraged to utilize this API with a knowledge of their
server choice, and servers may extend the functionality of
hijack_io to provide additional features to users. The purpose of
rack.hijack is for Rack to "get out of the way", as such, Rack only
provides the minimum of specification and support.

This seems awful from a user's perspective and a portability perspective (users now have to know how the webserver internals work, and apps written to use this are coupled to the implementation). The other thing is, from what I understand full hijacking is for when people just want the websever to act essentially as a TCP connection manager. Rack is targeted towards the web, so I think this particular feature is a poor fit.

Despite my feelings towards the hijack feature, I don't see the harm keeping it (and it's still optional!!) until we figure out either a replacement or a good way to deprecate. @ioquatix is keeping the optional full hijack feature around a show stopper for adding #call? I can tell you aren't a fan of the hijack feature either, but I feel like this shouldn't stop us from making improvements.

@SamSaffron
Copy link
Contributor

Perhaps we can change the spec to inform consumers that they will always get a minimal IO instance?

You may not have #write and #flush , cause reasons ...

Certainly does not fill with confidence.

Maybe:

An instance of IO that implements: #write , #flush , #close

Also ... returning 418 "I am a teapot" is getting tired, in some ways throw :async that thin implemented was so much clearner. Maybe the spec should dictate what status should be returned? That way just by looking at status middleware below could tell that something was hijacked?

All for making another options for streaming, but against killing off the old stuff.

@tenderlove
Copy link
Member

That way just by looking at status middleware below could tell that something was hijacked?

Ya, I think we should probably define this, but lets do it in a different ticket. The inability for middleware to understand that a request was hijacked is definitely a problem.

@ioquatix
Copy link
Member Author

ioquatix commented Apr 19, 2021

Hmm, what if it takes me 30 seconds to tell if I should return a 200 vs 408?

Why should this matter? On an event driven server, you can take 30 minutes or 30 days if you want. On a server like Puma, are you assuming that putting it into a background thread mitigates the issue? Because unfortunately, if you don't handle this correctly you will run out of threads.

By your own admission, by far the most common use of hijacking is for full hijacking and not partial hijacking.

Not sure where I said this. Completely disagree with this as an argument for why full hijack is a good idea. It's a terrible idea. Partial hijacking is by far the most compatible and useful, because it fits the semantic model of HTTP.

The other thing is, from what I understand full hijacking is for when people just want the webserver to act essentially as a TCP connection manager. Rack is targeted towards the web, so I think this particular feature is a poor fit.

This 100x.

If we can get this merged just by retaining hijack, I'm okay with that, but I think it sends a very confusing message. We should definitely continue this discussion with a plan to deprecate and remove it. It doesn't belong in Rack IMHO.

@jeremyevans
Copy link
Contributor

By your own admission, by far the most common use of hijacking is for full hijacking and not partial hijacking.

Not sure where I said this.

Sorry, maybe I misunderstood. In the section on Typical Scenarios in your original post, you only list full hijacking scenarios and not partial hijacking scenarios. This led me to believe you thought partial hijacking was not nearly as common, which matches my own experience.

Completely disagree with this as an argument for why full hijack is a good idea. It's a terrible idea. Partial hijacking is by far the most compatible and useful, because it fits the semantic model of HTTP.

I never said full hijacking was a good idea, and I apologize if I implied that. Even if everyone agrees it is a bad idea/terrible hack, it is commonly used and we should support it until we can replace it. I think the consensus is that partial hijacking is not a replacement for full hijacking.

While it is only supposition on my part, I would think if partial hijacking was far more useful, it would be more common than full hijacking.

@ioquatix
Copy link
Member Author

ioquatix commented Apr 20, 2021

I have restored hijack to the lint code. I understand why you might want to retain it in terms of compatibility, but from my perspective I don't see that we want to carry this baggage forward. Rack 3 is allowed to and has made make breaking changes. Servers can and will support both for a long time. To me, that's good enough and a fair reason to remove it from Rack 3 which sends a clear message about it's state going forward. That being said, I don't want to hold up this PR and they are actually largely independent from a technical perspective (but not a developer usage perspective). I will still argue for removal and if not that then at least clear deprecation. I don't see that there is any future in "full hijack".

@ioquatix ioquatix marked this pull request as ready for review April 20, 2021 00:26
@SamSaffron
Copy link
Contributor

Why should this matter?

Mainly people are using puma/unicorn/passenger, all the mainstream appservers out there are heavily constrained on parallel rack requests, for many reasons. Hijack provides an outlet where you can bolt on some evented processing to appservers with limited concurrency.

We should definitely continue this discussion with a plan to deprecate and remove it.

We need a usable outlet with full parity (not partial). The long term solution can not be "just stop using unicorn / puma and start using falcon"

hijack is about providing a common interface for mixed processing - some stuff is within the threadpool the appserver provides, other stuff is in a consumer managed pool. I agree entirely it is hairy, but disagree with junking it and providing a partial solution instead.

@ioquatix
Copy link
Member Author

I understand the use case, I just don’t think it belongs in rack IMHO.

@ioquatix ioquatix self-assigned this Apr 20, 2021
@matthewd
Copy link
Contributor

Why should this matter? On an event driven server, you can take 30 minutes or 30 days if you want. On a server like Puma, are you assuming that putting it into a background thread mitigates the issue? Because unfortunately, if you don't handle this correctly you will run out of threads.

This seems to be an effective summary of my general objection to the no-full-hijack issue: you're seeking to remove something that can be correctly implemented on existing and widely-used servers, and offering in exchange an API that can only work with a newer, differently-shaped, server model.


Hijacking aside, while I like the idea of simplifying the API that a streaming response needs to implement, I still really dislike the fact that doing so means any middleware must account for two different possible body shapes. If we're going to force this degree of API change, I'd hope we could go further: switch the app call argument (currently env) to an object, which has methods, and upon which the response can be assembled. Instead of requiring rote returning of effectively-opaque values, switch to an API that can directly offer the services that applications (and especially, middlewares) want: yes, setting the body to an enumerable or stream-writing-callable.. but also "register a callback for when the response is finished" so we can stop all the BodyProxy allocations. This object could also keep properly-setup instances of Request and Response (+ headers), so we can stop leaving every middleware to rebuild them from scratch.

It'd also be neat if some middlewares could move off the main app call stack: if they just want to be notified before and/or after the request occurs [and modify headers at that point, say], it sure seems like they could more easily live in an array somewhere. And as a bonus, if we exist in between middleware invocations instead of building them into an opaque linked-list of @app references, we gain the ability to upgrade their available API in isolation in future minor versions (like new ways to define a streaming body), instead of any iteration being a stop-the-world breaking change.

These are not new ideas, so I'm sure they're been written here before, and this PR is clearly not the place to explore other specific API changes. But anyway: whether something like this is worth the effort, I don't know... but I contend it would work toward improving performance and enriching the API made available to the Rack project's users -- who are middleware, application, and framework developers -- in a way that this also-breaking change does not. I guess I'm just being blinkered, but apart from attempting to force current hijackers to rewrite onto a more server controlled partial-hijack-alike, I'm not seeing how the disruption of introducing callable-body in isolation pays off for our downstreams... I agree it's more intuitive that the explicit enumeration style for the set of developers who need it, but that in my mind that has to weigh against the intuitiveness of the dual interface existing for all middleware authors. (You also twice dismissed explicit enumeration as "less flexible" without elaborating. I recall the discussion in #1600 getting somewhat distracted, but do not recall discussing a way in which stream.write is inherently more flexible than block.call?)


Perhaps the core of my feeling is: I don't want to see Rack evolve into a more elegant project, I want Rack to evolve in a way that allows the downstream ecosystem-at-large to be more elegant. I feel that this tends more to the former.*

* again, it absolutely does make streaming-body-producing downstreams look nicer, but it does so at the expense of every other potentially-body-aware middleware

Copy link
Contributor

@jeremyevans jeremyevans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation wise, I have a few inline comments which should be simple to address.

In terms of the general idea, I do not think the benefits of adding this streaming API outweigh the complexity costs of supporting an additional type of body and breaking any middleware that currently deals with the body.

I'm not actually sure that streaming is easier for an application user with this. We would be expecting users of callable bodies to correctly deal with low-level operations such as 'write', close, flush, closed?, and any exception that any of those methods could raise. Those seem way more like server-level issues than application-level issues. This is similar to the problem with full hijacking, just with a less hacky API.

The minimal application example given is:

run lambda{|env|
  [200, [], lambda{|stream| stream.write("Hello World"); stream.close}]
}

What happens if stream.close is omitted in this case? What happens if stream.write raises an exception?

While not perfect, I think explicit enumeration via #each already handles the streaming use case well enough, and I am against this proposal.

test/spec_lint.rb Outdated Show resolved Hide resolved
REQUIRED_METHODS = [
:read, :write, :flush, :close,
:close_read, :close_write, :closed?
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing there is a reason, but can you explain why read and close_read would be needed for response streaming, which should be write-only? Is this due to TLS read-before-write issues?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The provided stream can do both input and output, just like partial hijack.

@ioquatix
Copy link
Member Author

ioquatix commented Apr 20, 2021

While not perfect, I think explicit enumeration via #each already handles the streaming use case well enough, and I am against this proposal.

The problem with explicit enumeration via #each is that:

  • It doesn't handle input well - e.g. WebSockets which need full I/O.
  • It's cumbersome and inefficient to use.
  • It doesn't signal the intent to perform streaming and thus we have middleware which will try to enumerate it and hang.

As you can imagine, with Falcon being on the cutting edge of what's possible, I'm exposed to a lot of use cases which don't fall into the typical "request -> response" model. So, there is definitely a need for something better. Ideally we can standardise that in Rack because that will have a lot of value for people.

@matthewd I agree most of your points and introducing a body which has two distinct ways to be used might be considered poor design. Considering partial hijack, we have the similar issue though:

def call(env)
  return [200, {'rack.hijack' => body}, nil]
end

In this case, should middleware check the headers to see if there is a hijacked body? You can certainly wrap a streaming response but you have to be mindful of how you do it. The conclusions we came to previously is that middleware should never be buffering the response body, and this is why Rack::ETag has been causing so many problems recently, because there is no distinction. This PR introduces that distinction and yes it might be more complex but that's because fundamentally the surface area of the problem is actually more complex than the current interface suggests.

@jeremyevans
Copy link
Contributor

Thanks for bringing up websockets. I can see how websockets would not be handled by #each enumeration. That does make this proposal more attractive for that use case.

@ioquatix Can you please respond to the following part of my previous response?

I'm not actually sure that streaming is easier for an application user with this. We would be expecting users of callable bodies to correctly deal with low-level operations such as 'write', close, flush, closed?, and any exception that any of those methods could raise. Those seem way more like server-level issues than application-level issues. This is similar to the problem with full hijacking, just with a less hacky API.

The minimal application example given is:

run lambda{|env|
  [200, [], lambda{|stream| stream.write("Hello World"); stream.close}]
}

What happens if stream.close is omitted in this case? What happens if stream.write raises an exception?

@ioquatix
Copy link
Member Author

ioquatix commented Apr 20, 2021

Here is some additional scope/background.

Here is the abstraction I ended up with in protocol-http gem:

# Read the next available chunk.
def read
	nil
end

# Should the internal mechanism prefer to use {call}?
# @returns [Boolean]
def stream?
	false
end

# Write the body to the given stream.
def call(stream)
	while chunk = self.read
		stream.write(chunk)
	end
ensure
	stream.close($!)
end

# Enumerate all chunks until finished, then invoke `#close`.
def each
	while chunk = self.read
		yield chunk
	end
ensure
	self.close($!)
end

This is a challenging part of the interface and I've gone around in circles on it quite a bit. There are performance and usability constraints layered across different concerns of the system.

The interface above is used for both request and response bodies. There are elements of symmetry so that streaming just works everywhere. The most basic interface is #read which returns a chunk of data from the underlying stream.

This interface is actually sufficient to do full bi-directional streaming.

#!/usr/bin/env ruby

require 'async'
require 'async/http'

Async do
	endpoint = Async::HTTP::Endpoint.parse("http://localhost:9292")

	server = Async::HTTP::Server.for(endpoint) do |request|
		input = request.body
		output = Async::HTTP::Body::Writable.new
		
		Async do
			while chunk = input.read
				output.write chunk
			end
			
			output.close
		end
		
		Protocol::HTTP::Response[200, [], output]
	end
	
	client = Async::HTTP::Client.new(endpoint)
	
	Async do
		server.run
	end
	
	input = Async::HTTP::Body::Writable.new
	response = client.get("/", [], input)
	output = response.body
	
	Async do |task|
		3.times do
			input.write("Hello World")
			task.sleep(1)
		end
		
		input.close
	end
	
	output.each do |chunk|
		Console.logger.info(response, chunk)
	end
end

The problem with this design is that it's inefficient. You end up with at least one internal queue for the "writable body", when you could be writing directly to the stream itself. The approach above is essentially similar to, say Rails streaming, when you return a body which encapsulates a queue, and that queue is enumerated by the server and someone else (thread, task, etc) is responsible for writing to it.

When I tried to implement 1 million WebSocket connections like this, the overhead was enough to cause problems, like, it is basically a 2x or more multiplier on your per connection cost.

The streaming model proposed here is returning a closure which takes the output stream as an argument, and that's a fundamentally different model and requires that the user code is executed in the context of the server request/response loop. This is more tricky and the server needs to know that this is what's being requested of it. In addition, the code is directly exposed to the underlying stream, rather than a wrapper around the input and output bodies (although that is semantically equivalent, just less efficient).

That's the model exposed by something like rack.hijack and because we do that in the response headers, the body is nil and middleware is generally oblivious to it.

The problem with rack.hijack (partial) is that it's an optional feature and it's limited to the concurrency of the server. I see the odd comment here and there implying that I'm trying to force people to use Falcon but I resent that somewhat - there is absolutely no reason why Puma can't adopt an event driven model for executing user code in the request/response loop. In fact this is basically what full hijack is enabling but everyone is writing their own incompatible event loops which IMHO is way worse since all of it is specific to solving one class of problem. This is actually about moving the ecosystem as a whole forward.

The biggest issue we face here is middleware which assumes that the response body is basically an Array of limited size. The solution to that is to provide something like Rack::Body which follows a similar model to that above. It should be sufficient to say something like:

class ETag
  def call(env)
    _, _, body = @app.call(env)
    if Rack::Body[body].streaming?
      # Ignore.
    else
      # generate ETag?
    end
  end
end

When I tried to make WebSockets work within a rails controller method, it was very ugly and unintuitive:

https://github.com/socketry/live/blob/5a28b8a2a8585e1bb9ffa8a85bd3731f9cc5085c/example/rails/app/controllers/application_controller.rb#L11-L31

What we need here is a standard interface for saying: I have a response which needs direct access to the underlying connection. And the server needs to be aware of what to do. And middleware needs to not break. If partial hijack is the answer, why was it not adopted by Rails?

Let me also add that WebSockets add their own layer of complexity:

https://github.com/socketry/async-websocket/blob/2e575f59033297b6704afd0743e320a2ab1e2103/lib/async/websocket/request.rb#L52-L60

Basically, HTTP/1 and HTTP/2 have completely different mechanisms for negotiating a WebSocket stream.

The adaptor for Rack basically doesn't have enough abstraction here to deal with the complexity.

What happens if stream.close is omitted in this case? What happens if stream.write raises an exception?

If stream.close is omitted, the stream will remain open.

If stream.write raises an exception, it will need to be handled. If it's ignored, it will propagate up. In terms of how that works, depends on what the user has done with the stream. The lifetime of the stream is not dictated by the closure, as in, it's entirely acceptable to do this:

def initialize(app)
  # ...
  
  # Ping all active connections
  Async do
    @connections.each do |connection|
      begin
        connection.ping # exception raised
      rescue
        connection.close
      end
    end
    sleep 1
  end
end

def start(stream)
  @connections << prepare_websocket(stream)
end

def call(env)
  return [200, [websocket headers], self.method(:start)]
end

I don't think this is materially different from a partial hijack. If you don't call #close the resources within the server won't be freed, and the lifetime is not bound to the closure (i.e. #start).

@ioquatix ioquatix requested a review from tenderlove December 13, 2021 05:18
@tenderlove
Copy link
Member

@ioquatix can you update your PR to address @jeremyevans's comment here?

I read through this thread again, and AFAICT that's the only thing outstanding (someone please correct me if I'm wrong).

@ioquatix
Copy link
Member Author

@tenderlove I am about to go swimming but after that I'll review/address the comment and update/rebase the PR.

@tenderlove
Copy link
Member

Thank you!

@ioquatix ioquatix force-pushed the streaming branch 7 times, most recently from 5d19f19 to 32f0fe5 Compare January 15, 2022 04:23
@ioquatix
Copy link
Member Author

ioquatix commented Jan 15, 2022

Okay, I've added @jeremyevans's suggestion regarding each/call verbatim as I believe it's a very appropriate statement for compatibility. We should also update the PoC implementations to take this into account.

@ioquatix
Copy link
Member Author

ioquatix commented Jan 19, 2022

I've rebased falcon on this proposal and it works as expected. With the proposed language around each/call, it also now works with Rack 1.x and 2.x where we use Lint.new.dup as a response body, where as before this was not possible/compatible.

@ioquatix
Copy link
Member Author

I've updated the PR to fix some duplicated text in the spec and updated the commit message to better reflect the changes.

@ioquatix
Copy link
Member Author

Okay, both PoCs are updated, Puma and Falcon, and both are working.

@tenderlove tenderlove merged commit 1760292 into rack:master Jan 21, 2022
@ioquatix ioquatix deleted the streaming branch January 23, 2022 20:27
dentarg added a commit to dentarg/testssl.web that referenced this pull request Oct 28, 2022
Uses "callable body"
- rack/rack#1745
- puma/puma#2740

This doesn't work in Puma 5, it will error like this

    2022-10-28 08:35:53 +0200 Read: #<NoMethodError: undefined method `each' for #<Proc:0x00000001050b2fd8 /Users/dentarg/local_code/testssl.web/app.rb:44 (lambda)>

              res_body.each do |part|
                      ^^^^^>
dentarg added a commit to dentarg/testssl.web that referenced this pull request Oct 28, 2022
Uses "callable body"
- rack/rack#1745
- puma/puma#2740

This doesn't work in Puma 5, it will error like this

    2022-10-28 08:35:53 +0200 Read: #<NoMethodError: undefined method `each' for #<Proc:0x00000001050b2fd8 /Users/dentarg/local_code/testssl.web/app.rb:44 (lambda)>

              res_body.each do |part|
                      ^^^^^>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants