-
-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ADDED] In-process connections #2360
[ADDED] In-process connections #2360
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
love this, been keen to have something like this for quite some time :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the contribution!
@@ -184,6 +184,7 @@ type Options struct { | |||
ServerName string `json:"server_name"` | |||
Host string `json:"addr"` | |||
Port int `json:"port"` | |||
DontListen bool `json:"dont_listen"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of adding a new option, I wonder if we could decide on a port that would disable listening. For instance, port set to 0 means that we use default port 4222. Setting to -1 means that we let OS pick a random free port. We could say anything negative lower than -1 (say -2) means disabled?
You don't have to update the PR for that, just wondering if that would be better or not and let's see what others think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose this could work, although it does create the risk of the API shape having more "magic numbers". I'll be happy to accept direction on this from whoever has strong enough opinions on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look good to me. Let's see if other maintainers have an opinion on the DontListen vs special port number.
The magic ports seem a bit unclear and hard to discover to me, otoh we have too many settings already but I think a documented setting might be better. |
Hm, ultimately it doesn't appear as if the It might make sense to re-add the |
I've gone ahead and done that. There's no new API surface this time — |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a go with this change and bringing the changes in the nats.go to create a quick test in the server package and my main concern is the behavior of the net.Pipe when one side closes the connection.
It appears that it results in internal pipe write timeout, which is not that good. For instance, I tried to connect with wrong user/password and the server would report a WriteDeadline exceeded (when trying to write the authorization violation error).
Without more testing (which is going to be a bit annoying since it would require changes from the client that I would not want to commit if we end-up rejecting the whole idea), I think we can't merge this PR.
If you could have some tests and figure out what the issue is with those time outs, then we could reconsider.
Looking at the source for https://github.com/golang/go/blob/d8f348a589b3df4bb48636023be4ccf9ac96d307/src/net/pipe.go#L183-L188 I haven't made any changes to deadlines anywhere so wondering if there is some existing deadline being set elsewhere?
Have you got an example or a specific test that I can look at? We've been experimenting with in-process conns using these commits in our codebase for the last couple of days and we have yet to see this. We'll ultimately need to abandon NATS if we can't figure out a way to perform in-process connections which would be a shame, as it's looking to be a good fit so far. |
I was trying to connect with wrong username/password, so what I see is that the server blocks on flushOutbound() writing So the server reads the CONNECT (not the PING yet), fails the client by trying to write So if I have the client sends only CONNECT and then PING later, the issue is the same, this time the client blocks writing PING because the server is also blocked writing |
Lack of buffering is probably the crux of the issue here —
I suppose a custom implementation could be made which wraps |
Just curious, what is failing? Is there a test that consistently fails? Sometimes tests can be flaky, so you may need to rerun them a few times. For example, I did this 537a5bc |
@variadico No, it was just that in the case of "DontListen" (part of this PR), ReadyForConnections() should ignore the client port, so I asked to add |
@neilalexander We will discuss this PR internally and come up with a decision maybe later next week. Thanks again! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Switching to Draft until we decide internally (so it does not get merged by mistake).
Thanks @kozlovic, will wait to hear back on a way forward. Hoping we can come up with something suitable — NATS is shaping up very well for us so far, we just need to be able to get it to work in obscure places like WASM as well as cross-process! |
Is there any update on whether this was discussed further? We're optimistically hoping for good news. 😄 |
@kozlovic and I plan to discuss this week IIRC. |
Actually beginning of next since a member of the team is on vacation this week and we wanted his inputs. |
@neilalexander Sorry for the delay. We just had a discussion today about this, and we feel it is important and part of future kind of deployments, but we would need a bit more information. The wasm limitation may be temporary, are you following wasi developments? Finally, would you have 15/20 minutes to speak with @derekcollison? He would like to have a short discussion with you. Thanks! |
We are starting to see more interest in this and we ourselves are very interested in the WASM/WASI ecosystem. Possible to get 20 minutes of your time to chat live via Zoom or GH? |
Hey @kozlovic, @derekcollison — more than happy to have a chat. I'm in GMT but please feel free to drop me a mail at At a high level, our project is Dendrite, which is a Matrix homeserver implementation. We basically need to be able to run everywhere — within the web browser, on mobile devices and on servers alike. We see the in-process connections as a way of getting around the limitations with not being able to open arbitrary sockets with WASM (and this probably still holds true in WASI as of today, not sure?) and also not opening unnecessary sockets on mobile devices either. I can explain more certainly about what we're hoping to do on the call. |
Neil, Have not forgotten about you, promise. |
Hey Derek, Is there any update? Cheers! |
Still very much on our list, just prioritizing KV and ObjectStore abstractions atm. |
Hi all, is there any update on this? |
Unfortunately nothing has changed atm. Still a nice to have I think for us but not a high priority. |
@neilalexander if you want to update this to the main branch (rebase etc). We can see if we can get this one in.. |
@derekcollison Have rebased against |
Thanks! We still have the hang issues that were mentioned yes? |
I’m not entirely sure what issues it might introduce, but we’ve been using this in Dendrite for months without any particular issues. Admittedly we don’t ever stop NATS or close the connection until the point that the entire process shuts down. We’ve also seen one or two cases of With an “experimental” label attached it’s usable as-is, but you’d probably be better qualified to figure out if there’s anything I’m missing. |
I think Ivan had mentioned some tests he was doing that could block on failed auth or something. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind rebasing again (sorry about that). There was a test failure and I tried to recycle it on Travis but it complains about the script not being found (was changed during the Go 1.18 PR). Thanks!
…ne call after createClient
Have rebased again. |
server/server.go
Outdated
for time.Now().Before(end) { | ||
s.mu.RLock() | ||
chk["server"] = info{ok: s.listener != nil, err: s.listenerErr} | ||
s.mu.Lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to change that to an RLock(). This has changed in main so that is causing issue now with your PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
The "InProcess" change make readyForConnections() possibly return much faster than it used to, which could cause tests to fail. Restore the original behavior, but in case of DontListen option wait on the startupComplete gate. Also fixed some missing checks for leafnode connections. Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Rework startupComplete gate from PR #2360
This PR adds three things:
InProcessConn()
function toServer
which builds anet.Pipe
to get a connection to the NATS server without using TCP socketsDontListen
option which tells the NATS server not to listen on the usual TCP listenerstartupComplete
channel, which is closed right before we startAcceptLoop
, andreadyForConnections
will wait for itThe main motivation for this is that we have an application that can run either in a monolith (single-process) mode or a polylith (multi-process) mode. We'd like to be able to use NATS for both modes for simplicity, but the monolith mode has to be able to cater for a variety of platforms where opening socket connections either doesn't make sense (mobile) or just isn't possible (WASM). These changes will allow us to use NATS entirely in-process instead.
An accompanying PR nats-io/nats.go#774 adds support to the client side.
This is my first PR to this project so apologies in advance if I've missed anything obvious anywhere.
/cc @nats-io/core