Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ISSUE #4643]Fix Grpc client can't reconnection when runtime after crashing the runtime and restarting it #4644

Merged
merged 2 commits into from
Dec 14, 2023

Conversation

mxsm
Copy link
Member

@mxsm mxsm commented Dec 13, 2023

Fixes #4643

Motivation

Explain the content here.
Explain why you want to make the changes and what problem you're trying to solve.

  • Fix Grpc client can't reconnection when runtime after crashing the runtime and restarting it

Modifications

Describe the modifications you've done.

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)
  • If a feature is not applicable for documentation, explain why?
  • If a feature is not documented yet in this PR, please create a followup issue for adding the documentation

…ter crashing the runtime and restarting it.
Copy link

codecov bot commented Dec 13, 2023

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (87ed40d) 17.38% compared to head (0532991) 17.39%.

Files Patch % Lines
...sh/client/grpc/consumer/EventMeshGrpcConsumer.java 50.00% 2 Missing ⚠️
...entmesh/client/grpc/consumer/SubStreamHandler.java 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##             master    #4644   +/-   ##
=========================================
  Coverage     17.38%   17.39%           
  Complexity     1757     1757           
=========================================
  Files           797      797           
  Lines         29776    29774    -2     
  Branches       2573     2573           
=========================================
+ Hits           5177     5178    +1     
+ Misses        24121    24118    -3     
  Partials        478      478           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -85,7 +85,8 @@ public EventMeshGrpcConsumer(final EventMeshGrpcClientConfig clientConfig) {
}

public void init() {
this.channel = ManagedChannelBuilder.forAddress(clientConfig.getServerAddr(), clientConfig.getServerPort()).usePlaintext().build();
this.channel = ManagedChannelBuilder.forAddress(clientConfig.getServerAddr(), clientConfig.getServerPort()).enableRetry().usePlaintext()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't gRPC generally supports retry by default? Or rather, it's not like that, we must manually enable retry?

gRPC一般不是默认支持重试吗?还是说并非如此,必须手动开启重试?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't gRPC generally supports retry by default? Or rather, it's not like that, we must manually enable retry?

gRPC一般不是默认支持重试吗?还是说并非如此,必须手动开启重试?

This can be deleted.

@@ -99,8 +99,7 @@ public void onNext(final CloudEvent message) {

@Override
public void onError(final Throwable t) {
LogUtils.error(log, "Received Server side error", t);
close();
log.error("Received Server side error", t);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You removed the close() method, will it affect the acceptance of messages when the completed method cannot be called in the end under abnormal circumstances?

您移除了close()方法,那异常情况下最终调用不到completed方法,会不会影响消息的接受?

Copy link
Member Author

@mxsm mxsm Dec 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You removed the close() method, will it affect the acceptance of messages when the completed method cannot be called in the end under abnormal circumstances?

您移除了close()方法,那异常情况下最终调用不到completed方法,会不会影响消息的接受?

My understanding here is that it is not necessary to close it in the case of a stream. Once closed, it will directly disconnect from the server. The SDK will directly exit.
Here is the gRPC description of StreamObserver#onCompleted.
image
This is also the reason why retries are not possible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After StreamObserver#onCompleted is called, the connection should not be disconnected, and the client can continue to receive responses from the server. But this source code comment reminded me that although the connection was not disconnected, gRPC indicates that the client is no longer sending messages to the server. (@hhuang1231 This can be seen as a supplement to this reply #4587 (comment), which may be the reason why heartbeat cannot be sent.)

StreamObserver#onCompleted被调用以后,应该并不会断开连接,客户端可以继续接收服务端的响应。但是这段源码注释提醒了我,虽然连接没断开,但是gRPC这时标识客户端不再发送消息给服务端。(@hhuang1231 这可以看作是对这条回复#4587 (comment) 的补充,这是我认为的可能导致无法发送心跳的原因。)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gRPC keepAlive connections have the ability to automatically close the connection when C/S times out, without the need to manually call close().

@mxsm mxsm requested a review from pandaapo December 14, 2023 01:44
Comment on lines 87 to 90
public void init() {
this.channel = ManagedChannelBuilder.forAddress(clientConfig.getServerAddr(), clientConfig.getServerPort()).usePlaintext().build();
this.channel = ManagedChannelBuilder.forAddress(clientConfig.getServerAddr(), clientConfig.getServerPort()).usePlaintext()
.build();
this.consumerClient = ConsumerServiceGrpc.newBlockingStub(channel);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The additional line break here is not that good~

@xwm1992 xwm1992 added this to the 1.10 milestone Dec 14, 2023
@xwm1992 xwm1992 merged commit 6115009 into apache:master Dec 14, 2023
13 checks passed
@mxsm mxsm deleted the bug-mxsm-4643 branch December 14, 2023 08:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants