
Fix the client-side ping policy #11587

Merged
merged 6 commits into from
Sep 19, 2017

Conversation

y-zeng
Contributor

@y-zeng y-zeng commented Jun 23, 2017

chttp2_transport was too strict about sending pings, which could prevent legitimate keepalive pings from being sent. This PR relaxes the restriction as described in Client-side Keepalive:

  • The number of ping frames allowed between data frames is no longer limited.
  • If no data frame is received, the client may send a ping every 5 minutes.
  • Once a data frame is received, the client may send the next ping immediately.
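The rules above amount to a small admission check before each keepalive ping. Here is a minimal sketch in C; the names (`ping_state`, `may_send_ping`) and the 5-minute constant are illustrative, not the actual chttp2 internals:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative: minimum interval between keepalive pings when no data
 * frame has been received since the last ping (5 minutes). */
#define MIN_PING_INTERVAL_WITHOUT_DATA_MS (5 * 60 * 1000)

typedef struct {
  int64_t last_ping_sent_ms;     /* when the previous ping was sent */
  bool data_received_since_ping; /* data frame seen since that ping? */
} ping_state;

/* Returns true if the client may send a keepalive ping at now_ms. */
static bool may_send_ping(const ping_state *s, int64_t now_ms) {
  /* A received data frame lifts the limit: the next ping is allowed
   * immediately, with no cap on pings between data frames. */
  if (s->data_received_since_ping) return true;
  /* Otherwise, at most one ping per 5 minutes. */
  return now_ms - s->last_ping_sent_ms >= MIN_PING_INTERVAL_WITHOUT_DATA_MS;
}
```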

@grpc-kokoro

[trickle] No significant performance differences

@grpc-kokoro

[qps] No significant performance differences

@y-zeng
Contributor Author

y-zeng commented Jun 28, 2017

Jenkins: test this please

@grpc-kokoro

[trickle] No significant performance differences

@grpc-kokoro

[microbenchmarks] No significant performance differences

@y-zeng y-zeng requested a review from sreecha June 30, 2017 01:13
@grpc-kokoro

[trickle] No significant performance differences

@grpc-kokoro

[microbenchmarks] No significant performance differences

@grpc-kokoro

[trickle] No significant performance differences

@grpc-kokoro

[microbenchmarks] No significant performance differences

@@ -2398,6 +2418,36 @@ void grpc_chttp2_config_default_keepalive_args(grpc_channel_args *args,
            &args->args[i],
            (grpc_integer_options){g_default_keepalive_permit_without_calls,
                                   0, 1});
      } else if (0 ==
                 strcmp(args->args[i].key, GRPC_ARG_HTTP2_MAX_PING_STRIKES)) {
        g_default_max_ping_strikes = grpc_channel_arg_get_integer(
Member

Setting global state from transport construction is the wrong thing to do:

  1. It's not thread safe (what if transports are created from two threads?).
  2. It's order dependent (max_ping_strikes now depends on the construction order of transports, which can change even due to network latency, effectively making it a random number).

Better would be to hold these values as properties of the transport (preferably as some sub-struct... grpc_chttp2_ping_enforcement_args?)
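The suggestion above, sketched as a struct. The names here are hypothetical (modeled on the grpc_chttp2_ping_enforcement_args idea), not the code as merged:

```c
/* Hypothetical sketch: per-transport ping-enforcement settings held
 * in a sub-struct, instead of mutable process-wide globals. */
typedef struct {
  int max_ping_strikes;                  /* strikes tolerated before acting */
  int min_ping_interval_without_data_ms; /* ping pacing without data frames */
} ping_enforcement_args;

typedef struct {
  /* ... other transport state ... */
  ping_enforcement_args ping_enforcement; /* owned copy, per transport */
} transport_sketch;

/* Each transport takes its own copy at construction, so the settings
 * are neither shared across threads nor dependent on the order in
 * which transports happen to be constructed. */
static void transport_init(transport_sketch *t,
                           const ping_enforcement_args *args) {
  t->ping_enforcement = *args;
}
```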

Contributor Author

Thanks for the review!
These global values are not set during transport construction; they are set in grpc_chttp2_config_default_keepalive_args(). It's an internal interface and must only be called at the initialization stage, as stated here.
The transport struct has corresponding properties in grpc_chttp2_repeated_ping_policy. When a transport is constructed, g_default_max_ping_strikes is used as the default value of grpc_chttp2_transport.ping_policy.max_ping_strikes.
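That flow, sketched with hypothetical names (the real code lives in chttp2_transport.c): a process-wide default is set once at initialization, then copied into each transport's ping policy at construction, where a per-channel argument can override it.

```c
#include <stddef.h>

/* Process-wide default, set once at initialization (mirrors
 * grpc_chttp2_config_default_keepalive_args writing the global). */
static int g_default_max_ping_strikes = 2;

typedef struct {
  int max_ping_strikes;
} repeated_ping_policy;

/* Initialization stage only: record the configured default. */
static void config_default_max_ping_strikes(int strikes) {
  g_default_max_ping_strikes = strikes;
}

/* Transport construction: start from the global default, then let a
 * per-channel argument (if present) override it. */
static void init_ping_policy(repeated_ping_policy *p,
                             const int *channel_arg) {
  p->max_ping_strikes = g_default_max_ping_strikes;
  if (channel_arg != NULL) p->max_ping_strikes = *channel_arg;
}
```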

Member

What does initialization stage mean?

  • before grpc_init?
  • after grpc_init?
  • before transport construction?

If it's an internal interface, who is supposed to call it? (no code can, because it's internal)

What is the justification for introducing yet another configuration mechanism?

@ctiller
Member

ctiller commented Sep 6, 2017

Needs conflict resolution

@y-zeng
Contributor Author

y-zeng commented Sep 11, 2017

Resolved the conflict. Please see the email thread for the explanation of grpc_chttp2_config_default_keepalive_args().

@grpc-kokoro

[microbenchmarks] No significant performance differences

@grpc-kokoro

Corrupt JSON data (indicates timeout or crash): 
    bm_fullstack_trickle.BM_PumpStreamServerToClient_Trickle_8_4k.opt.new: 1


[trickle] No significant performance differences

@grpc-kokoro

[trickle] No significant performance differences

@grpc-kokoro

[microbenchmarks] Performance differences noted:
Benchmark                                                                       atm_add_per_iteration
------------------------------------------------------------------------------  -----------------------
BM_StreamCreateSendInitialMetadataDestroy<RepresentativeClientInitialMetadata>  +7%

@ctiller
Member

ctiller commented Sep 18, 2017

Please let me know once merged.

@y-zeng
Contributor Author

y-zeng commented Sep 18, 2017

Known failures:
Asan: #12245
Asan (Internal CI): #12544
Basic Tests Linux (opt): #12313, #12428
Basic Tests MacOS [dbg] (internal CI): #12138, #12538
Basic Tests MacOS [opt] (internal CI): #12138, #12538
Basic Tests Multi-language Linux (internal CI): #9538
Basic Tests Windows (internal CI): #12595
Msan C (internal CI): #12394
gRPC_interop_pull_requests: #12261
Tsan C: #11618
UBsan C: #12544
Basic Tests Mac: #12138
Basic Tests Windows: #11745

@ctiller
Member

ctiller commented Sep 19, 2017

Bad news: This isn't actually fixing the problem I was seeing with #11072

@ctiller ctiller mentioned this pull request Sep 19, 2017
@y-zeng y-zeng merged commit 0aedb81 into grpc:master Sep 19, 2017
@y-zeng y-zeng mentioned this pull request Sep 22, 2017
@lock lock bot locked as resolved and limited conversation to collaborators Jan 22, 2019
@lock lock bot unassigned sreecha Jan 22, 2019
5 participants