Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xds/internal/resolver: switch to generic xDS API for LDS/RDS #6729

Merged

Conversation

easwars
Copy link
Contributor

@easwars easwars commented Oct 13, 2023

#resource-agnostic-xdsclient-api

RELEASE NOTES: none

@easwars easwars requested a review from zasweq October 13, 2023 22:40
@easwars easwars added the Type: Internal Cleanup Refactors, etc label Oct 13, 2023
@easwars easwars added this to the 1.60 Release milestone Oct 13, 2023
@codecov
Copy link

codecov bot commented Oct 13, 2023

Codecov Report

Merging #6729 (4fccb12) into master (6e9c88b) will increase coverage by 0.52%.
Report is 45 commits behind head on master.
The diff coverage is 77.94%.

Additional details and impacted files

Copy link
Contributor

@zasweq zasweq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments

xds/internal/resolver/xds_resolver.go Outdated Show resolved Hide resolved
xds/internal/resolver/watch_service.go Outdated Show resolved Hide resolved
xds/internal/resolver/watch_service.go Outdated Show resolved Hide resolved
xds/internal/resolver/watch_service.go Outdated Show resolved Hide resolved
xds/internal/resolver/watch_service.go Outdated Show resolved Hide resolved
xds/internal/resolver/xds_resolver.go Show resolved Hide resolved
xds/internal/resolver/xds_resolver.go Show resolved Hide resolved
xds/internal/resolver/xds_resolver.go Show resolved Hide resolved
xds/internal/resolver/xds_resolver.go Outdated Show resolved Hide resolved
xds/internal/resolver/xds_resolver.go Outdated Show resolved Hide resolved
@zasweq zasweq assigned easwars and unassigned zasweq Oct 18, 2023
@arvindbr8 arvindbr8 modified the milestones: 1.60 Release, 1.61 Release Nov 14, 2023
@easwars easwars assigned zasweq and unassigned easwars Nov 29, 2023
@easwars
Copy link
Contributor Author

easwars commented Nov 29, 2023

Thanks for the review @zasweq. I think I've gotten to all of your comments.

Copy link
Contributor

@zasweq zasweq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor nits.

}
endpoint = strings.TrimPrefix(endpoint, "/")
r.ldsResourceName = bootstrap.PopulateResourceTemplate(template, endpoint)
r.listenerWatcher = newListenerWatcher(r.ldsResourceName, r)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have an e2e test for LDS Accept, and then dynamic RDS, and then RPC. Even this simple test failed server side. We should have a list of test cases somewhere for this, like I did for server side (I feel like there are some corner cases here - switching over to new one only when new RDS is ready etc.).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have tests in watch_service_test.go that test scenarios you mention here. But these are not e2e style in the sense that you are looking for, i.e they don't make RPCs. But they are e2e from the point of view of the resolver, because they verify that the resolver produces appropriate service config corresponding to the new update.

Having a list of test cases for the e2e style tests for the xDS resolver would be a great thing to have, but I wouldn't block the current change for that because that is going to take some time. And if you really want to block this change on having such e2e style tests, then unfortunately someone else will have to take over this PR and take it to completion with those e2e style tests.

I'll be happy to start that effort of coming up with a list of e2e style tests, but I cannot guarantee that it will see the light of day before I take off for my next block of leave. Let me know what you think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah let's not block this PR, but I think the e2e test cases will be useful to have. Especially if we do Mark's change with respect to moving all xDS into this resolver.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, ok. For context, we had code coverage in the listener wrapper server side, but not e2e test. So the RDS codepath hit, but the moment accept() hit, it paniced.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The listener wrapper is a component inside the server. So, having a unit test there is not the same as the test that I'm talking about where we ensure that the output expected of the resolver is achieved, i.e service config. I agree that there is still a chance for things to fail, since we are not checking the config selector here.

And yes, my full support for e2e style tests, and I can get started on compiling a list after this PR.

Comment on lines 124 to 125
// - Verifies that the bootstrap configuration contains certificate providers if
// the use of xDS credentials is specified by the user.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see what I was saying here, the bullet below has the if clause first, then what happens if the if conditional hits. This is super minor though.

xds/internal/resolver/xds_resolver.go Show resolved Hide resolved
test/xds/xds_client_federation_test.go Show resolved Hide resolved
xds/internal/resolver/watch_service.go Outdated Show resolved Hide resolved
xds/internal/resolver/xds_resolver.go Show resolved Hide resolved
xds/internal/resolver/xds_resolver.go Outdated Show resolved Hide resolved
xds/internal/resolver/xds_resolver.go Outdated Show resolved Hide resolved
xds/internal/resolver/xds_resolver.go Outdated Show resolved Hide resolved
Comment on lines 416 to 426
r.logger.Warningf("Failed to build a config selector for resource %q: %v", r.ldsResourceName, err)
r.cc.ReportError(err)
return
}

if !r.sendNewServiceConfig(cs) {
// JSON error creating the service config (unexpected); erase
// this config selector and ignore this update, continuing with
// the previous config selector.
cs.stop()
return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think either of these error conditions should hit, but is this what we want? Should these have two separate flows?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think either of these error conditions should hit

Which error conditions are you talking about?

If you talking about the following two:

  • newConfigSelector: this can fail when extracting the matcher from the route. See xdsresource.RouteToMatcher
  • sendNewServiceConfig: this probably should never fail because we generate the service config in the xDS resolver. But defensive programming, yo!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these have two separate flows?

What do you mean by this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the first error case, which should also never hit (route to matcher), should we be ignoring the error and continuing to use old configuration? That is my question. Two separate flows meaning we do different things on these two errors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you are saying. But here are my thoughts:

  • This code is unchanged
  • When we fail in newConfigSelector, we don't have a config selector, so we can't call cs.stop()
  • If we don't want to make newConfigSelector return an error, we need to make some changes to the xdsresource package and do what is done in RouteToMatcher during resource unmarshaling. It basically checks the same conditions there, so you are right in the sense that RouteToMatcher cannot fail. But there is no good reason for it to be done from here instead of being done at resource unmarshaling time, and the matcher being part of the internal representation of Route.

What do you think? I'm ok making this change as a follow-up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with not calling cs.stop() since it's not there. I just think it should continue to use the old working configuration (this is an edge case that will never hit as it's written), as a JSON errors does. I'm not arguing changing the error plumbed back from route to matcher, just thinking it should have the same behavior as the JSON marshaling error, both errors that can never happen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with not calling cc.ReportError when we cannot build a config selector. But I will be happy to make that change when we can guarantee that building the matchers from the provided route cannot fail, and the only way I can think of guaranteeing that is to build the matchers at resource unmarshal time and have it ready in the internal representation of the route.

Now, we have two places where there are checks to see that the path matchers and header matchers are what we support, and although they seem to be doing the same thing today, there is no guarantee that these two implementations will not diverge tomorrow. So, it is better to do the conversion at unmarshal time, and that way, we can be dead sure that building a config selector will not fail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this is equivalent to an nack, which we will ignore and continue to use old configuration. "when we can guarantee that building the matchers from the provided route cannot fail" - I feel like what I'm arguing is to remove error, it's to treat this as an nack error and continue to use old configuration. Why should a user stop having his client work if we somehow don't nack but fail here, vs. an nack which will leave his configuration as such.

@zasweq zasweq assigned easwars and unassigned zasweq Dec 7, 2023
Copy link
Contributor Author

@easwars easwars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pass.

}
endpoint = strings.TrimPrefix(endpoint, "/")
r.ldsResourceName = bootstrap.PopulateResourceTemplate(template, endpoint)
r.listenerWatcher = newListenerWatcher(r.ldsResourceName, r)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have tests in watch_service_test.go that test scenarios you mention here. But these are not e2e style in the sense that you are looking for, i.e they don't make RPCs. But they are e2e from the point of view of the resolver, because they verify that the resolver produces appropriate service config corresponding to the new update.

Having a list of test cases for the e2e style tests for the xDS resolver would be a great thing to have, but I wouldn't block the current change for that because that is going to take some time. And if you really want to block this change on having such e2e style tests, then unfortunately someone else will have to take over this PR and take it to completion with those e2e style tests.

I'll be happy to start that effort of coming up with a list of e2e style tests, but I cannot guarantee that it will see the light of day before I take off for my next block of leave. Let me know what you think.

Comment on lines 124 to 125
// - Verifies that the bootstrap configuration contains certificate providers if
// the use of xDS credentials is specified by the user.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hahaha. Ok, I see what you meant here. Ok, changed it.

xds/internal/resolver/xds_resolver.go Show resolved Hide resolved
xds/internal/resolver/xds_resolver.go Show resolved Hide resolved
xds/internal/resolver/xds_resolver.go Show resolved Hide resolved
xds/internal/resolver/xds_resolver.go Outdated Show resolved Hide resolved
xds/internal/resolver/xds_resolver.go Show resolved Hide resolved
xds/internal/resolver/xds_resolver.go Show resolved Hide resolved
xds/internal/resolver/watch_service.go Outdated Show resolved Hide resolved
test/xds/xds_client_federation_test.go Show resolved Hide resolved
@easwars easwars assigned zasweq and unassigned easwars Dec 7, 2023
Copy link
Contributor

@zasweq zasweq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@easwars easwars merged commit 477bd62 into grpc:master Dec 7, 2023
13 of 14 checks passed
@easwars easwars deleted the xds_resolver_cleanup_for_xds_generic_api branch December 7, 2023 22:39
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jun 5, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants