-
Notifications
You must be signed in to change notification settings - Fork 757
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix re-login behavior of remote clients to a server #3486
Conversation
// -> so the response should go thru prHandleNotifyFailString | ||
|
||
remote = Server.remote(\remTest, homeServer.addr, homeServer.options); | ||
// Server.default = remote; // to test IDE server display |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove
remote.serverRunning.not and: { timeout > 0 } | ||
} { | ||
timeout = timeout - dt; | ||
dt.wait; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use a condition variable here
|
||
// make s play dead now, but leave the process running | ||
homeServer.statusWatcher.stopStatusWatcher.stopAliveThread.serverRunning_(false); | ||
// homeServer.serverRunning.postln; // client thinks it is dead |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove
|
||
this.assert(homeServer.clientID == 3, | ||
"homeServer gets requested clientID back from server process." | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be a separate test or removed altogether
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, done
); | ||
|
||
// make s play dead now, but leave the process running | ||
homeServer.statusWatcher.stopStatusWatcher.stopAliveThread.serverRunning_(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
chaining these calls is somewhat confusing IMO
@@ -544,7 +544,8 @@ Server { | |||
case | |||
{ failString.asString.contains("already registered") } { | |||
"% - already registered with clientID %.\n".postf(this, msg[3]); | |||
statusWatcher.notified = true; | |||
statusWatcher.prRecoverRemoteLogin(msg[3]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could use a comment - without looking I don't know what msg[3] is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
}, AppClock) | ||
} | ||
|
||
prRecoverRemoteLogin { |clientID| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs to be added to the list of private methods in the help file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is this name supposed to mean, btw? not immediately apparent to me, sorry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm, maybe really not a good name.
would you find prHandleRemoteLoginWhenAlreadyRegistered clearer?
// -> this client netaddr is already registered, and should say so! | ||
// -> so the response should go thru prHandleNotifyFailString | ||
|
||
remote = Server.remote(\remTest, homeServer.addr, homeServer.options); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flow of logic would make more sense to me if this server were created right after the homeServer; does this constructor call depend on preceding code in a way I'm not seeing?
"homeServer gets requested clientID back from server process." | ||
); | ||
|
||
// make s play dead now, but leave the process running |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s -> homeServer
Just two quick questions @adcxyz: If clientIDs are handed out by the server, then if the server recognises the client address, it can just give the same ID or? Or is this for the backwards compatible manually specified ID case? Does it make sense that a client necessarily gets the same ID? If you have a lang crash, leaving orphaned nodes, then getting the same ID could lead to resource conflicts. There are various ways of dealing with that of course (e.g. gracefully free orphaned nodes using predictable default group), but you might want different ones for different use cases. |
Yes, exaclty. This is what the server program has been doing all along, and I think is the clearest and most predictable behavior for most cases, as listed below.
I think keeping previously known IDs is the clearest and most predictable behavior for most cases I can think of: hope that answers your questions? |
hi @brianlheim, thanks for the detailed comments, |
Indeed! Thanks for the detailed and patient explanation. :-) |
Jealous! Have a great time! |
Hi @brianlheim and @muellmusik, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one minor request, should be easy to fix. Looks very good -- it'll be a great relief not having to teach the kids to kill servers on a daily basis.
"// so you may want to release currently running synths by hand:\n".postln; | ||
"%.defaultGroup.release;\n".postf(server.cs); | ||
"// and you may want to redo server boot finalization by hand:\n".postln; | ||
"%.statusWatcher.prFinalizeBoot;\n".postf(server.cs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to suggest this line as something to actually do, it shouldn't involve a private message. What about adding:
+ Server {
finalizeBoot { this.statusWatcher.prFinalizeBoot }
}
then the line above would read:
"%.finalizeBoot;\n".postf(server.cs);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about the name ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC this solution is a hack, so it's ok to call a private function. Adding access to a private function as a public function seems worse because it has to be walked back with deprecation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, makes some sense. My only worry is that the public use of private methods might be perceived as the standard way to go.
So let's note it in the comment that this is a temporary measure which will be replaced in future versions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
like the comment above the method says, the whole thing is about making it possible to recover in an emergency, so yes, it is a hack.
// cleanup | ||
remote1.remove; | ||
remote2.remove; | ||
unixCmd("kill -9" + serverPid); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not portable at all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Is there a crossplatform version of kill?
or is it OK to use Server.killAll in the testsuite?
I would have preferred that, but assumed it is cleaner to only kill the exact process and nothing else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a crossplatform version of kill?
Windows uses taskkill.
is it OK to use Server.killAll in the testsuite?
No. The single process approach is much more sound. Since we have Platform.killAll
already, it would probably not be such a bad idea to add killProcess
that accepts a pid. Just a suggestion. That should be done in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have near zero windows shell skills - would the methods below work?
in UnixPlatform:
kill { |pid|
("kill -9 " ++ pid).unixCmd;
}
and in WindowsPlatform:
kill { |pid|
("taskkill /F /pid " ++ pid).unixCmd;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so.. I can't test at the moment
} { | ||
// same clientID, so leave all server resources in the state they were in! | ||
"// This seems to be a login after a loss of network contact - \n" | ||
"// - reconnected with the same clientID as before, so probably all is well.\n".postln; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this really need to be posted? i think this is reaching a point where it's going to be regarded as noise by users
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, it is trying to be helpful in a rare emergency scenario - you are in the middle of playing a show, network drops out, you try to reconnect, and likely happy to see anything informative posted. I doubt that anyone would find that noisy in that situation.
But maybe that is just me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gotcha, that makes sense. I think it would be nice if the post message was consistent with other messages coming out of this class, though. No other messages that I've seen start with "//", it seems arbitrary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually since it should be noticeable, how about making it begin with *** for bold posting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just tried it - that does make the post-crash hints pop out nicely in the post window!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all right!
// simulate a remote server process by starting a server process independently of SC | ||
serverPid = unixCmd(Server.program + options.asOptionsString); | ||
|
||
1.wait; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is this wait for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
leftover from an earlier version of the test, unneeded now - can remove it.
|
||
this.assert(remote1.clientID == 3, | ||
"after first login, remote client should have its requested clientID." | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are these asserts necessary? they seem to be covered well by other tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK the Server:remote method is not tested anywhere else, so testing both its normal and its emergency use seemed efficient and appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then it should be in a separate test...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so have a test for remoteLogin, which tests the first login only,
and a test for repeatedRemoteLogin, which will needs a first login as setup,
and then tests the repeated login only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I think all requests done,
and submitted separate PR for kill method
OK - I believe I addressed all requests for changes on this one, |
#3499 has been merged, this can be updated now. |
ok, updated - thanks @brianlheim ! |
.notified_(false) | ||
.serverRunning_(false); | ||
|
||
// 1.wait; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please remove this? Everything else looks good, thanks for the tests!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, done!
Thanks, this is ready for merge IMO. @telephon - want to give it a look since you've got a review already? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good!
@telephon please approve when you get the chance 👍 |
(sorry for the delay, I had somehow missed the message) |
*** PR Info text completely redone ***
Thanks to questions by @muellmusik and @brianlheim,
the cases covered in this PR became much clearer to me.
The general issue is handling repeated logins in network performance situations,
to recover as gracefully as possible from loss of connection between client and server,
whether by network glicht or client crash.
(server crash is outside the scope of this PR, but should be fully thought through elsewhere.)
Case 1: network connection is lost,
and the same client, same client-side server object logs in again.
-> so, just post that connection is back, and all is well.
Case 2: client crashes, and is restarted;
then the new client logs in again from the same network address:
so the server program sends its previous clientID back to the new client.
the server program response may trigger new allocators - which is correct
that the new server object does not know about:
depending on the specific setup used,
-> so, post about successful reconnect, and advice about release/reinit code.
(Note for later: maybe develop functionality to retrieve these resources
from the server program in the client.)
Case 3: loss of connection and login from a different network address:
This will look like a new login to the server, so the old defaultGroup
and everything in it keeps going, and the new client gets a new clientID.
In the future, this case could be handled better by releasing the previous notify registration,
and then asking for the same clientID it had earlier, to continue seamlessly from there.
This would need some more effort to work.
The tests provided also cover cases 1 and 2:
TestServer_clientID_booted:test_repeatedLogin tests case 1
TestServer_clientID_booted:test_remoteLogins tests a first login by the Server:remote method,
and a repeated login from a new server object after a simulated crash.