Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C#: misuse of InvalidOperationException on network error #7223

Closed
kkm000 opened this issue Jul 2, 2016 · 13 comments
Closed

C#: misuse of InvalidOperationException on network error #7223

kkm000 opened this issue Jul 2, 2016 · 13 comments
Assignees

Comments

@kkm000
Copy link
Contributor

kkm000 commented Jul 2, 2016

Consider

Channel channel = new Channel(machine, ChannelCredentials.Insecure);
var client = new Compiler.CompilerClient(channel);
using (var call = client.Compile())
  call.RequestStream.WriteAsync(new CompileRequest()).Wait();

(the .Wait() here is abused for demo purposes only)

If machine is not listening, them the WriteAsync call will throw InvalidOperationException with the message Send failed. There is a huge problem with that.

  1. [TL;DR: gRPC is using a wrong exception class]. InvalidOperationException semantics is quite close to that of C++ ASSERT; it says, essentially, "you are doing something wrong." There is no explicit guideline published by MS as far as I know (and, unlike Win32, I cannot commend .NET docs on exhaustiveness), but the documentation opposes the IOE to the ArgumentException. Other development community guidelines (a good example) talk about NotSupportedException as opposed to NotImplementedException, and there is also of course ObjectDisposedException. What is common among these presumed opposites is that all of them tell the programmer that the code is not using the object correctly. Kind of what the ASSERT does.

    Now this is clearly not our case. I the programmer cannot control the state of network! InvalidOperationException is out of place here. I cannot think of a good predefined class for an exception in this case. For one, RpcException with the Unavailable status would certainly not be out of place here (and I think the previous gRPC did do just that, in v0.11).

  2. [TL;DR: gRPC is throwing out information about error condition]. Consider

    try {
     . . .
    } catch (InvalidOperationException ex) {
     if (ex.Message == "Send failed") {
       . . .

    I almost hear you saying ugh! as in ugly. Unfortunately, the exception message is the only additional piece of information that I have here. And, the scariest thing, this is how my production code currently looks. Now, using another exception type would be a great help to avoid this ugliness. I am talking only an enhancement here, so this is a lesser point. In the debug log there is (with long paths snipped for clarity):

    I0701 17:25:24.184428 0 c:\jenkins\[snip]\subchannel.c:642: Connect failed:
    {"created":"@1467419124.184000000","description":"OS Error","file":"c:\jenkins\[snip]\tcp_client_windows.c","file_line":114,"os_error":"No connection could be made because the target machine actively refused it.\r\n","syscall":"ConnectEx","wsa_error":10061}
    

    Would not I love to have access to this information through exception properties! Even in a form of loosely defined key/value pairs. I can make decisions like whether I should complain about invalid configuration, or keep calm and retain the address in the load balancer list to retry later.

@Maverik
Copy link

Maverik commented Jul 4, 2016

+1 to both points. I was left scratching my head as well and it didn't occur to me to dig into logging just to find out the why of Send failed. I just assumed I was doing something wrong (just starting out with grpc here).

RpcException indeed would be a good starting substitution here instead of IOE imho. If MSDN is any nudge to how MS tends to handle these case, DirectoryException / CommunicationException are good examples of this sort of situation handling. They're all encompassing within their realm and throw more specialized exceptions where they make sense (for example LdapException or SecurityAccessDeniedException)

@jtattermusch
Copy link
Contributor

@kkm000, thanks for filing this issue, I fully agree with your thoughts. Seems like this is something we need to fix. Also, I apologize for not reacting sooner, I was out of office for extended period of time.

Here are my thoughts on what needs to be done and how. I will start with the client side calls

  • For unary calls, everything works as expected because if such call fails, one receives an RpcException with appropriate status and InvalidOperationException is only thrown if programmer breaks the contract and does something not admissible.
  • For streaming calls, I think for a failed write we should deliver an RpcException with a StatusCode corresponding to the status code of the entire call. As a principle, a single gRPC call should only have a single status code associated with it - so in C# API we need to honor that. Internally in C core, finishing of a write operation and finishing of the entire call with a status code are two different events. So, what needs to happen is when a write operation fails, we also wait for the entire call to finish and then throw RpcException with the correct status because we already obtained it. That should work fine because
    • C core guarantees that if a write operations fails, the entire call will fail "immediately" (soon enough) afterwards. So we wouldn't starve waiting on the status code to be delivered.
    • there should be no performance impact, because we only wait for two events instead of one in case the RPC call fails and that's not really something we need to optimize for.

Does that make sense? If so, things seem to be fairly clear on the client side.

On the server side, things works similarly with the biggest difference being that we cannot throw and RpcException (that comes with a status code) simply because server side calls don't have a status code associated with them - if they fail, they finish with a binary success/non-success flag reported back to the server code. So on the server side, we need to figure out which exception type is the most fitting here - my personal favorite seems to be IOException (but do be honest I am not 100% convinced I like that choice either).

What is your opinion on what the server side behavior should be (and what exception type to use)?

@jtattermusch
Copy link
Contributor

CC @apolcyn @jskeet

@apolcyn
Copy link
Contributor

apolcyn commented Sep 7, 2016

Would it be worth it to try and do this in a non-breaking way, or at least less breaking? Possibly sub-classing the current InvalidOperationException? (got rid of one idea like that in #7861).

@jskeet
Copy link
Contributor

jskeet commented Sep 7, 2016

So what server-side failure type are we talking about? Something like "client killed connection"?

@jtattermusch
Copy link
Contributor

@jskeet yes, if something went wrong with the call and we weren't able to finish the call normally (by sending client a status). That could be network error, client died, etc.

@jskeet
Copy link
Contributor

jskeet commented Sep 7, 2016

Sounds like IOException makes sense in that case then.

@jtattermusch
Copy link
Contributor

Another option would be to throw parameterless "RpcException" for more consistency with client side (but it's just an idea).

@hinaria
Copy link

hinaria commented Sep 7, 2016

I like the idea of throwing an RpcException, a single exception type to signify "this rpc has failed".

IOException feels a little out of place given that the grpc does not model IO, instead we model the invocation of methods that happen to require IO. It would be the only place in grpc that would throw an IOException requiring that exception handlers must now catch two types to handle the "this rpc has failed [and potentially retry]". RpcException seems semantically okay here.

@jtattermusch
Copy link
Contributor

Actually RpcException on server side turned out to be a bit tricky because it already has field Status that only makes sense on the client side. I had the options to introduce a new "ServerRpcException" or stick with "IOException". It seems that IOException is the best choice, because it required the least changes and expresses the concept quite well. I didn't want to add just another exception type for no good reason.

See #8084 for the PR with the fix.

@hinaria
Copy link

hinaria commented Sep 15, 2016

Sounds great. So looking briefly at #8084, we now throw an RpcException on the client, and an IOException on the server? That's perfect.

@jtattermusch
Copy link
Contributor

@imiuka Yes, exactly (once we merge and release of course).

@jtattermusch
Copy link
Contributor

#8084 has been been merged.

@lock lock bot locked as resolved and limited conversation to collaborators Sep 30, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

7 participants