integration test flaky due to missing endpoints #6045
This is failing quite often now, esp. on v1beta3 builds. Not sure yet whether it's 100% broken, though.
I'm seeing the error Dawn wrote above, as well as the following error, near-constantly on v1beta3.
Not 100% broken. Last build passed after being restarted.
So I spent some time digging into this. Both errors seem to share a root cause: the endpoints controller (pkg/service/endpoints_controller.go) is taking forever (on the order of tens of seconds) to do its job of synchronizing the endpoint objects with the current state of services and pods. If you increase the timeout from 30 seconds to 60, both errors go away.

There are a few reasons it's so slow. The endpoints controller is a serial loop that makes 4-5 calls to the apiserver for each service whose endpoints it synchronizes. I added some logging to pkg/service/endpoints_controller.go and found that, at least on the Linux VM on my Mac, each and every apiserver call was taking about 2-3 seconds. That's also why this has gotten worse recently: the addition of patchservice as a test object gives the endpoints controller even more work to do, and delays the creation of endpoints for

There are a few ways to resolve this issue, and we should probably do them sequentially:
cc @thockin
Also discussed in #6518
Do we have any idea why the apiserver is so slow in the integration test? (referring back to 2. in #6045 (comment))
I ran the integration tests 100 times from today's head and they all passed, so I'm closing the issue.
Yes, 2. looks much better: most calls now take a few milliseconds. https://github.com/GoogleCloudPlatform/kubernetes/blob/master/pkg/service/endpoints_controller.go#L337 still takes about 200ms, but that's not a big deal atm.
I0327 07:02:03.729736 732 integration.go:287] default/service2 endpoint: 1.2.3.4:1234 &api.ObjectReference{Kind:"Pod", Namespace:"default", Name:"foo", UID:"2823d7fd-d44f-11e4-9efc-02420a0000c3", APIVersion:"", ResourceVersion:"186", FieldPath:""}
I0327 07:02:03.730711 732 integration.go:283] Error on creating endpoints: endpoints "service1" not found
F0327 07:02:03.730752 732 integration.go:785] FAILED: service in other namespace should have no endpoints: false
!!! Error in hack/test-integration.sh:47
'"${KUBE_OUTPUT_HOSTBIN}/integration" --v=2 --apiVersion="$1"' exited with status 255
Call stack: