Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cdsbalancer: switch cluster watch to generic xDS client API #6600

Merged
merged 8 commits into from
Sep 12, 2023

Conversation

easwars
Copy link
Contributor

@easwars easwars commented Aug 30, 2023

This is a significant simplication of the existing implementation of the CDS LB policy:

  • Replaces the existing clusterHandler, clusterNode types with an extremely simple, dumbed down type called clusterWatcher that simply forwards updates from the xDS client to the parent LB policy
  • The logic for building the discovery mechanisms is now solely under the responsibility of the type implementing the CDS LB policy, and it does so by maintaining a set of clusterWatchers, and tries to recursively traverse the cluster graph every time one of the underlying clusters receives an update
  • Also, added a couple of new tests for scenarios relating to cycles in the aggregate graph

And as part of all this, replaced the call to xdsclient.WatchCluster() with the helper function for the generic xDS API, xdsresource.WatchCluster().

In a follow-up PR, I will delete the WatchCluster method from the xdsclient.XDSClient interface. I wanted to keep that separate to keep the focus of this PR on the changes to the CDS LB policy.

#resource-agnostic-xdsclient-api

RELEASE NOTES: none

@easwars easwars requested a review from zasweq August 30, 2023 03:20
@easwars easwars added the Type: Internal Cleanup Refactors, etc label Aug 30, 2023
@easwars easwars added this to the 1.58 Release milestone Aug 30, 2023
@arvindbr8 arvindbr8 modified the milestones: 1.58 Release, 1.59 Release Sep 6, 2023
Copy link
Contributor

@zasweq zasweq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a good direction. Mark mentioned that since it's not reused it makes sense for this logic to be part of the cds balancer, especially since the graph like structure can be handled from a traversal owned by cds balancer.

xds/internal/balancer/cdsbalancer/cdsbalancer.go Outdated Show resolved Hide resolved
xds/internal/balancer/cdsbalancer/cdsbalancer.go Outdated Show resolved Hide resolved
xds/internal/balancer/cdsbalancer/cluster_handler_test.go Outdated Show resolved Hide resolved
xds/internal/balancer/cdsbalancer/cluster_handler_test.go Outdated Show resolved Hide resolved
@@ -657,8 +657,10 @@ func (s) TestAggregatedClusterSuccess_IgnoreDups(t *testing.T) {
// Tests the scenario where the aggregate cluster graph has a node that has
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file can be renamed now :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought about it, but resisted the urge. Will probably do it after I this PR is approved, as the last step before merging.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright sg.

@zasweq zasweq assigned easwars and unassigned zasweq Sep 8, 2023
@easwars easwars assigned zasweq and unassigned easwars Sep 8, 2023
@easwars
Copy link
Contributor Author

easwars commented Sep 8, 2023

Thanks for the review @zasweq

@@ -657,8 +657,10 @@ func (s) TestAggregatedClusterSuccess_IgnoreDups(t *testing.T) {
// Tests the scenario where the aggregate cluster graph has a node that has
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright sg.

xds/internal/balancer/cdsbalancer/cdsbalancer.go Outdated Show resolved Hide resolved
xds/internal/balancer/cdsbalancer/cdsbalancer.go Outdated Show resolved Hide resolved
@zasweq zasweq assigned easwars and unassigned zasweq Sep 11, 2023
@easwars easwars assigned zasweq and unassigned easwars Sep 12, 2023
Copy link
Contributor

@zasweq zasweq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Some minor nits. Great change overall, just scales up the generation of DMs to recurse through list. Fun one to review :).


// Handles a good Cluster update from the xDS client. Kicks off the discovery
// mechanism generation process from the top-level cluster and if the cluster
// graph is resolved, generated child policy config and pushes it down.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: generates*

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 444 to 445
b.logger.Warningf("Failed to generate discovery mechanisms: %v", err)
b.onClusterError(b.lbCfg.ClusterName, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 450 and 451 send the same error string down as the logging string. I think the prefix "Failed to generate discovery mechanisms" adds valuable information here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Realized that onClusterError also logs the error. So, got rid of the log statement here, and in other places where I was logging and passing the error to onClusterError.

if ok {
continue
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: nix newline.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 549 to 551
// Generates discovery mechanisms corresponding to the clusters in the aggregate
// cluster graph. If a new cluster is encountered when traversing the aggregate
// cluster graph, a watcher is created for it.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: this doesn't necessarily apply only to aggregate clusters (root cluster can be a leaf, or it can recurse and hit a leaf). I'm fine as is though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified the comment a bit to indicate that this is a recursive function and that the base case for the recursion is a leaf cluster.

dms, ok, err = b.generateDMsForCluster(child, depth+1, dms, clustersSeen)
if err != nil || !ok {
missingCluster = true

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nix newline please.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@zasweq zasweq assigned easwars and unassigned zasweq Sep 12, 2023
@easwars
Copy link
Contributor Author

easwars commented Sep 12, 2023

@bitcoinfinancier : Just get out of here.

@easwars
Copy link
Contributor Author

easwars commented Sep 12, 2023

LGTM. Some minor nits. Great change overall, just scales up the generation of DMs to recurse through list.

Thanks for the detailed review.

Fun one to review :).

I have a few more coming up to switch LDS and RDS :)

@easwars easwars merged commit 82a568d into grpc:master Sep 12, 2023
1 check passed
@easwars easwars deleted the cdsbalancer_cleanup branch September 12, 2023 17:02
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jun 3, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants