Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

balancer/endpointsharding: Ignore empty endpoints #7674

Merged

Conversation

zasweq
Copy link
Contributor

@zasweq zasweq commented Sep 26, 2024

This PR ignores empty endpoints in endpoint sharding, and adds a validation helper in the resolver package that all petiole policies should call.

RELEASE NOTES:

  • resolver: add ValidateEndpoints
  • balancer/endpointsharding: ignore endpoints with zero addresses.

@zasweq zasweq requested a review from dfawley September 26, 2024 22:21
@zasweq zasweq added this to the 1.68 Release milestone Sep 26, 2024
@zasweq zasweq added the Type: Feature New features or improvements in behavior label Sep 26, 2024
Copy link

codecov bot commented Sep 26, 2024

Codecov Report

Attention: Patch coverage is 81.81818% with 2 lines in your changes missing coverage. Please review.

Project coverage is 81.85%. Comparing base (4084b14) to head (3c568e2).
Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
balancer/endpointsharding/endpointsharding.go 0.00% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7674      +/-   ##
==========================================
+ Coverage   81.68%   81.85%   +0.17%     
==========================================
  Files         374      374              
  Lines       38136    38140       +4     
==========================================
+ Hits        31150    31219      +69     
+ Misses       5710     5661      -49     
+ Partials     1276     1260      -16     
Files with missing lines Coverage Δ
resolver/resolver.go 100.00% <100.00%> (ø)
balancer/endpointsharding/endpointsharding.go 64.86% <0.00%> (+1.46%) ⬆️

... and 21 files with indirect coverage changes

Copy link
Contributor

@arjan-bal arjan-bal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a unit test for this behaviour to catch future regressions?

balancer/endpointsharding/endpointsharding.go Outdated Show resolved Hide resolved
Copy link
Member

@dfawley dfawley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to the request for a test case.

// Check/return early if any endpoints have no addresses.

addrs := resolver.NewAddressMap()
// Check/return early if any endpoints have no addresses, or there is a duplicate address map.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/map//.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@dfawley dfawley assigned zasweq and unassigned dfawley Sep 27, 2024
@zasweq
Copy link
Contributor Author

zasweq commented Sep 30, 2024

Added a unit test, thanks.

@zasweq zasweq assigned dfawley and unassigned zasweq Sep 30, 2024
es := NewBalancer(nil, balancer.BuildOptions{})
addr := resolver.Address{Addr: "addr1"}
err := es.UpdateClientConnState(balancer.ClientConnState{
ResolverState: resolver.State{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty limited in what it covers. E.g. it doesn't cover addresses across endpoints, or endpoints with multiple addresses. Please add some more test cases to cover more possibilities -- probably make it table driven where []Endpoint is the input and the result is probably just a bool to indicate whether there should be an error or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added more coverage for passing and failing tests.

@dfawley dfawley assigned zasweq and unassigned dfawley Oct 1, 2024
@zasweq zasweq assigned dfawley and unassigned zasweq Oct 1, 2024
@dfawley dfawley assigned zasweq and unassigned dfawley Oct 7, 2024
@zasweq zasweq assigned dfawley and unassigned zasweq Oct 7, 2024
@zasweq
Copy link
Contributor Author

zasweq commented Oct 8, 2024

Updated endpointsharding as per offline discussion:

  • Validate errors if empty endpoints or no addresses present in endpoints
  • UpdateClientConnState simply ignores endpoints with no addresses

Comment on lines 81 to 83
if len(endpoints) == 0 {
return errors.New("endpoints list is empty")
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can just delete this? It's a subset of the other validation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept this here soley for the sake of the more specific error string. I think the two are distinct, which is why I left this clause for a different error message.

Comment on lines 78 to 79
// ValidateEndpoints returns an error if the endpoints list is empty, or no
// addresses are present in endpoint list.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment should indicate how/why/when to use this, not what it does.

// The EndpointSharding balancer assumes ValidateEndpoints has been used to validate all Endpoints
// passed to UpdateCLientConnState.

?

Or is this scenario actually not a problem for the endpointsharding balancer itself? In which case maybe this function belongs in package resolver or balancer?

Copy link
Contributor Author

@zasweq zasweq Oct 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I can move it to resolver or balancer, as this should trigger failure case in all petiole policies. Mark - "All petiole LB policies reject updates with no addresses."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But also since it's gatekeeped in all petiole policies maybe it does make sense to keep it here? Although looks like we won't use this in petiole like ring hash so I'll move it to resolver.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated comment as well, let me know what you think.

Comment on lines 93 to 83
// UpdateClientConnState creates a child for new endpoints and deletes children
// for endpoints that are no longer present. It also updates all the children,
// and sends a single synchronous update of the childrens' aggregated state at
// the end of the UpdateClientConnState operation. If any endpoint has no
// addresses it will ignore that endpoint. Otherwise, returns first error found
// from a child, but fully processes the new update.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nobody will see this godoc comment here since it's on an unexported type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you ok with that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with comments that aren't visible to anyone but someone reading the code. But make sure they aren't supposed to be showing up in godoc. Godoc should have enough info that the package is usable by someone who doesn't know what it does before viewing the godocs, and unexported comments should focus on how the implementation works and why it's the way it is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reread this comment and I think the language here appropriate for this not public facing operation (all relate to the UpdateClientConnState() operation), and the exported comments seem good enough to me. Do you want me to take some of this information and put it in exported go doc?

@dfawley dfawley assigned zasweq and unassigned dfawley Oct 8, 2024
@zasweq zasweq assigned dfawley and unassigned zasweq Oct 8, 2024
@zasweq zasweq changed the title balancer/endpointsharding: Return error if duplicate addresses balancer/endpointsharding: Ignore empty endpoints Oct 8, 2024
@purnesh42H purnesh42H removed this from the 1.68 Release milestone Oct 16, 2024
@purnesh42H purnesh42H added this to the 1.69 Release milestone Oct 16, 2024
@@ -330,3 +331,19 @@ type AuthorityOverrider interface {
// typically in line, and must keep it unchanged.
OverrideAuthority(Target) string
}

// ValidateEndpoints validates endpoints are valid from a petiole policies
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Valid" now appears 3 times. And spelling mistake:
ValidateEndpoints validates endpoints from a petiole _policy's_ perspective.

But now you should also explain what that actually means?

How about:

ValidateEndpoints validates that the endpoints are valid to use in a petiole policy - i.e. they must be non-empty and each must contain at least one address.

And what's a petiole policy? Is this defined somewhere? Maybe add: "See [A61] for details"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm ok. At first I thought I needed to add Outlier Detection to this, but that isn't a petiole policy (can't use it directly on top of pick first, that is invalid), and this is meant to trigger reresolution and reject the update. This happens from the petiole policy layer so the only thing important in OD is to ignore endpoints with no addresses and to have the undefined behavior in shared addresses, and leave petiole policies to trigger this error.

Copy link
Contributor Author

@zasweq zasweq Oct 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched to

// ValidateEndpoints validates endpoints from a petiole policy's perspective.
// Petiole policies should call this before calling into their children. See
// [gRPC A61](https://github.com/grpc/proposal/blob/master/A61-IPv4-IPv6-dualstack-backends.md)
// for details.

Comment on lines 93 to 83
// UpdateClientConnState creates a child for new endpoints and deletes children
// for endpoints that are no longer present. It also updates all the children,
// and sends a single synchronous update of the childrens' aggregated state at
// the end of the UpdateClientConnState operation. If any endpoint has no
// addresses it will ignore that endpoint. Otherwise, returns first error found
// from a child, but fully processes the new update.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with comments that aren't visible to anyone but someone reading the code. But make sure they aren't supposed to be showing up in godoc. Godoc should have enough info that the package is usable by someone who doesn't know what it does before viewing the godocs, and unexported comments should focus on how the implementation works and why it's the way it is.

@dfawley dfawley assigned zasweq and unassigned dfawley Oct 18, 2024
@zasweq zasweq assigned dfawley and unassigned zasweq Oct 18, 2024
@zasweq zasweq force-pushed the enforce-address-uniqueness-in-endpoint-sharding branch from e3164e3 to 849735f Compare October 28, 2024 20:31
@dfawley dfawley assigned zasweq and unassigned dfawley Oct 28, 2024
@zasweq zasweq force-pushed the enforce-address-uniqueness-in-endpoint-sharding branch from 849735f to 3c568e2 Compare October 28, 2024 20:49
@zasweq zasweq merged commit e7435d6 into grpc:master Oct 28, 2024
16 checks passed
misvivek pushed a commit to misvivek/grpc-go that referenced this pull request Nov 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Feature New features or improvements in behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants