Interfaces of Python GA API #6327
Conversation
status code and connectivity magic numbers

I'm strongly opposed to flattening our magic numbers. We have two separate documented and enforced-by-convention discriminants that are conceptually distinct. They should remain in separate enumerations (although they perhaps should acquire their values from the Cython layer, as those values will always be in sync with the core).

flattening the package structure

I feel like this swings the pendulum a bit too far in the opposite direction in terms of package hierarchy. Yes, before we had to go 5+ packages/modules deep to find interface definitions that the user would regularly interact with, and yes, that was a problem. However, I think there's a happy balance between an entirely flat hierarchy and what we had before. The Python community appears to tend towards organizing packages first by concept or idea and then by implementation detail. I wouldn't be against having, for example, a module for user-facing invocation-side things, a module for user-facing service-side things, and a module for common things. Without bikeshedding too much, I'd like something akin to:
where concepts would include the separation of stubs from services from transports and handshakes (all the most basic elements required to get something up and running), specialized concepts would include codes and security plugins... tl;dr: If someone were to take our package hierarchy and draw any arbitrary path from root to (public) leaf, I'd want that path to describe a useful tutorial for the project. |
Some answers/opinions (I haven't yet read code):
|
Lots more here to discuss and review now. Highlights:
Questions:
|
Completely legit not to invoke core on serialization failure. You may want

It's not obvious to me that one can't write a multi-threaded server with

Need specifics for memory leaks. Hope to dig into the code later today.

On Mon, May 16, 2016, 6:09 AM Nathaniel Manista notifications@github.com
|
Whoops, I left out the "without the server maintaining bookkeeping on what RPCs are active and what tags are due". The Python beta server is another example of a single-completion-queue multithreaded server, so my having forgotten the qualification is even more glaring. Here's what I wrote the other day for issue 6597 (but wound up not posting because in the course of trying to prove "I think the workaround is to carry forward the bookkeeping present in the beta server" I went in the plus-completion-queue-per-active-RPC direction instead):

Let's ask a precise question: must a core-wrapping server, implemented as one thread looping over a completion queue and application code run in separate threads, maintain its own records of what RPCs are in progress and what events are "due" from the completion queue? I would like the answer to be "no", but the following scenario suggests to me that it might be "yes".
This leaves the application thread hung waiting for notification on its condition that will never come. Changing the semantics of the server's loop to "call completion queue shutdown when handling the server shutdown tag" just defers the problem:
So if the right time to call completion queue shutdown isn't when processing the server shutdown event, when is it? I don't think I can tell without full bookkeeping.

So what seems to me right now to be the case is that I can't rely solely on events coming out of the completion queue to determine whether or not it is safe for the application to take actions like requesting a read or requesting that status be sent, because the event alerting the application to cancellation of the RPC is sitting in "completion queue limbo", having been made available to the server loop thread but not yet obtained by the server loop thread.

So I think the workaround is to carry forward the bookkeeping present in the beta server: each RPC keeps track of what events are due for it; the server maintains a record of all active RPCs and, in the course of handling an event related to an RPC, "asks" the RPC whether the event concludes the RPC, removing the RPC from its records if it does; the server also maintains a record of which new-RPC and server-shutdown events are due from the queue; and the server exits its loop only after its shutdown, the completion of all RPCs, and its having secured its last new-RPC event and its shutdown event.

@ctiller: what am I missing, and where have I erred? |
As for the memory leaks: (1) Taking the current draft and removing the
in its output. This clearly has to do with the initial metadata from the client, but given that I've checked and rechecked and superchecked that the associated tags are getting pulled off their completion queues on each side of the connection, I'm not sure what. (2) Hmm; I'm having trouble reproducing the other one right now. Should have made a more detailed note... |
class UnaryUnaryMultiCallable(six.with_metaclass(abc.ABCMeta)):
  """Affords invoking a unary-unary RPC."""
I personally find it really strange that this is exposed at the top level. This is not something I think the user should be learning about when skimming the interfaces before they become aware of Server. This sort of reasoning applies to all of the *Callables. Maybe even Future.

Things that the user does not have the direct power of creating and managing should not be among the things the user has to learn about first if skimming the code, IMO.
I agree with the sentiment. I think I'm leaning the other way, but I'm open to being persuaded to move these code elements to another module.
It comes down to the question of whether we expect this module to be friendlier to read or friendlier to use. If we want it friendlier to read, that means leaving the code elements that users are going to directly use in the file and migrating other code elements out to a less prominent place in some other module or modules. But since those other code elements are referenced from the code elements directly used by the users, users still see them (in Returns: and Raises: documentation, at least).

I think I like having everything in this module because it contributes to ease of use, even if that means the text of the module is less friendly to read. When you make an asynchronous RPC call, what's the type of the future that you get back? grpc.Future. Not grpc.future.Future or grpc.interfaces.Future. I like buying that benefit at the cost of the module's text being long to read.
Well, to start, I'd say grpc.utility.Future, where grpc.utility would include every interface we don't expect the user to create themselves, sans maybe the error types (else we'd just have a heavily populated grpc or interfaces namespace all over again, with future, server, blahblah and the user not knowing which things said user cares about, i.e. the root objects said user must self-manage).

If you lean in the current direction, consider comment section headers: now that we're traveling into the long-and-flat-file-structure land that we've historically been stylistically abhorrent of, where there's no easy-to-recognize syntactic difference between code elements of different kinds (especially in this case, e.g. errors vs. contexts vs. callables vs. servers/clients differ only by the natural-language semantics of their identifiers/comments, and maybe length for errors), that's what those comments are for.
I'm going to roll this around while other comments come to and go from the review.
Okay, wrapping this up: I think we've heard quite loudly from our users that the grpc Python package is where they expect every developer-user-facing code element to be found. I can think of two ways to go about satisfying that requirement:

- put everything in grpc/__init__.py, as is currently drafted, or
- spread code elements out reasonably into files like _future.py, _multi_callables.py, _context.py, and then alias them in grpc/__init__.py, being careful that the names by which they are officially known are the aliases in the grpc namespace and that their having been defined in submodules is just an implementation artifact.

I favor the first choice, but I could live with the second, because both deliver the desired developer-user experience of

>>> import grpc
>>> help(grpc)
>>> help(grpc.Future)
>>> help(grpc.Channel)
>>> help(grpc.insecure_channel)

What's not part of the developer-user experience that we need to deliver is "the text of grpc/__init__.py has to be not too long". Developer-users should be learning gRPC Python from the documentation or maybe from the read-eval-print loop as above.

The length of grpc/__init__.py is only a problem for us who maintain it and for power developer-users who want to change it. If it were an implementation module, I'd feel like it was on the long side, but since it's nearly all non-executable specification text I think it's just fine.

Please insist on breaking it up if you feel strongly; otherwise, have I persuaded you at all?
I've seriously considered it, and... I think I like section comments even less than factoring the code elements out into private implementation modules and then aliasing them at top-level. :-P
The grpc package will wind up with about twenty-five code elements, of which users will directly use about a third. More than I'd like in an implementation module, but I'm okay with it for a top-level API.
Well, comment section headers, unlike re-exports, don't wreck auto-completion tools' already hampered abilities to find doc-strings in a dynamic language, so there's that marginal comparison (and all its simpler forms like grepping and whatever). Meh.
An aside to an aside: turns out that our auto-completion is smart enough to propagate the requisite info. Welp.
Seriously, though - what's wrong with comment section headers? Which of those reasons apply here? Why do you believe the cons outweigh the pros in this scenario?
Strong personal distaste rooted in the belief that if a body of code is large enough to warrant breaking up it's better to break it up according to the language's mechanism of division (modules, in this case) rather than according to something not at all structural (like section comments).
But I've gotten over it and added section comments. It's far more important that this move forward.
8f4a629 to adad6fa
I've dropped the

I think I've answered all questions asked up to this point; please come at me with more? Lots more? I'd much rather have approval than consent. Happy to break it up if you'd like as well. |
Oh, and the memory leak question did find an answer: accidental object retention counter to the design of the ( |
@ctiller: now would be a good time for that link you promised me about "target" - right now I just have "the target to which to connect" in my specification; that could definitely be fleshed out. |
Quick question about the
. What's bothering me is that I don't think
? |
READY = (_cygrpc.ConnectivityState.ready, 'ready')
TRANSIENT_FAILURE = (
    _cygrpc.ConnectivityState.transient_failure, 'transient failure')
FATAL_FAILURE = (_cygrpc.ConnectivityState.fatal_failure, 'fatal failure')
It's been SHUTDOWN since we've been implementing it, but when the document renamed FATAL_FAILURE to SHUTDOWN one spot was missed, and that spot was in a table that made it easiest to copy...

I'd be for the rename in principle, but I don't know how much effort is involved.
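The rename, combined with the earlier suggestion that the enumeration acquire its values from the Cython layer, might look like the following sketch (the `_StandInCyConnectivityState` constants are invented stand-ins; the real code would read them from `_cygrpc`):

```python
import enum

# Invented stand-in for values the real code would read from the Cython
# layer (_cygrpc), so that Python-level values stay in sync with core.
class _StandInCyConnectivityState:
    ready = 2
    transient_failure = 3
    shutdown = 4

@enum.unique
class ChannelConnectivity(enum.Enum):
    """Connectivity states; each value pairs a core value with a label."""
    READY = (_StandInCyConnectivityState.ready, 'ready')
    TRANSIENT_FAILURE = (
        _StandInCyConnectivityState.transient_failure, 'transient failure')
    # FATAL_FAILURE renamed to SHUTDOWN to match the connectivity
    # semantics document.
    SHUTDOWN = (_StandInCyConnectivityState.shutdown, 'shutdown')
```

Because the member values come from the core-facing layer, a renumbering in core would propagate without touching this enumeration's definitions.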
+1 to rename.
+1 to rename.
Needs resolving. |
On Wed, Apr 27, 2016 at 9:31 PM, Nathaniel Manista <notifications@github.com
C# is actually using the terms requestHeaders, responseHeaders, and trailers.

Here's what C# is doing: For the most commonly used options (like deadline), there's also an
|
On Tue, May 24, 2016 at 8:48 AM, Nathaniel Manista <notifications@github.com
|
Gonna strongly nix |
adad6fa to 8b14c48
All outstanding discussion threads addressed and I've broken up the code so that this pull request is just the GA API interfaces. Please everyone review and come forward with any last concerns - it would be nice if this could go in today before the long weekend. @soltanmm, @jtattermusch: okay, back to |
Also since trailing metadata, details, and status code are fellow-travelling objects, I've changed the |
One last question for @ctiller, @jtattermusch, and maybe @ejona86: the current draft shows |
re |
8b14c48 to 85442b7
@soltanmm: not exposing all information available at call time would save us one interface (call it a

I just don't want to proceed without information and implement something different from the other languages that then has us either missing support for use cases that other gRPC languages support, or missing simplicity for the use cases that other gRPC languages support because we've got a larger API able to serve complex use cases that are unlikely to ever come to pass. |
@nathanielmanistaatgoogle, in general, routing should be based solely on the method (which is the path in HTTP) and, if supported, "virtual hosting" based on the

I don't know if/when/how various languages support different message content types for the same service (e.g., a single service supporting both JSON and protobuf encodings), but it's possible that may be implemented as separate handlers that do some conversion into a common format and then call the "real" handler. There are multiple alternatives, though. In Java, we just pass the method name and authority when looking up a handler. For different message content types, in Java we would most likely improve our marshaller interface and not touch method lookup/handlers at all. |
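A small sketch in the spirit of the Java lookup described above (class and handler names are invented for illustration): route on method name, with an optional authority key for virtual hosting, falling back to an authority-agnostic default.

```python
# Hypothetical handler registry keyed on (authority, method): a
# virtual-host-specific registration wins, otherwise the default
# (authority=None) registration is used.
class HandlerRegistry:
    def __init__(self):
        self._handlers = {}

    def add(self, method, handler, authority=None):
        """Registers a handler for a method, optionally scoped to an authority."""
        self._handlers[(authority, method)] = handler

    def lookup(self, method, authority=None):
        """Finds a handler, preferring an authority-specific match."""
        handler = self._handlers.get((authority, method))
        if handler is None:
            handler = self._handlers.get((None, method))
        return handler
```

Different message content types would then be a concern of the marshalling layer, not of this lookup, matching the Java approach described above.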
+1 to @ejona86's reply. |
|
Instances where we would want to expand this are very rare, but exist. One

Both will have implications for how the request should be routed and

Would advise having some expandable type available in the interface,

On Tue, May 31, 2016 at 10:56 AM Nathaniel Manista notifications@github.com
|
This method is only safe to call before the server is started.

Args:
  generic_rpc_handlers: Some number of GenericRpcHandlers that will be used
I'm going to assume this means some kind of iterable.
Nice catch; fixed.
85442b7 to 1904f4a
1904f4a to 0a10d6c
Back over to y'all. |
lgtm |
@jtattermusch: the macOS failure is beyond the scope of this pull request; please merge or toss back to me with action items. |
The macOS failures seemed unrelated, so I just merged. Thanks, Nathaniel. |
Lots of minutia to discuss here, but of course the most important question for this is whether or not it has all the growth surfaces that are needed and whether or not future changes are going to be as disruptive as past feature growth was to the alpha and beta APIs.
(Security (mostly) not part of this code review.)
Changes from Beta:
Questions:
- concurrent.futures.Future that motivated grpc.framework.foundation.future.Future haven't been solved. "Maximizing compatibility" with concurrent.futures.Future would mean compounding its design flaw and violating the public API of concurrent.futures - please no? Is it a problem putting our Future interface where it is currently drafted in this API? Should Future appear as a code element of grpc, or should only code elements that we expect to be used in application code be code elements of grpc?
- async/await, they'd be the right place, right?
- Channel have the RPC invocation convenience methods depicted here, or is it just enough to have the MultiCallable methods? How much do we expect users to make direct use of Channels? How hard (and/or weird) are we making their lives if we make them write my_response = my_channel.unary_unary(my_method_name)(my_request) instead of my_response = my_channel.blocking_unary_unary(my_method_name, my_request)?
- RpcError also a Call object? Should the RpcError have a Call object instead?
- Channel construction parameter be named "target" or "address" or... something else? What exact wording would we like describing it?
- bytes in and bytes out? my_response = my_stub.my_method(my_request) at the read-eval-print loop, we're going to have to support native strings in at least one place.)
- metadata_transformer present in alpha and beta and make security entirely done with the plug-in API? (I suspect yes?)
- host is a stub option in the beta API; should it remain a stub option or become a channel option or per-call option?
- CallOptions class that can be passed on RPC invocation and that encapsulates disable_compression, subcall_of, and per-call credentials. Do we want to do away with this aggregate object and just have those values as keyword arguments on each RPC invocation (as is drafted here)? If so, is there any need to worry about every future feature being added as yet another keyword argument and eventually having an unmanageable number?

@soltanmm and @jtattermusch please also look? Thanks much.
class that can be passed on RPC invocation and that encapsulates disable_compression, subcall_of, and per-call credentials. Do we want to do away with this aggregate object and just have those values as keyword arguments on each RPC invocation (as is drafted here)? If so, is there any need to worry about every future feature being added as yet another keyword argument and eventually having an unmanageable number?@soltanmm and @jtattermusch please also look? Thanks much.