Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Federation e2e support for AWS #27791

Merged
merged 1 commit into from
Jun 23, 2016

Conversation

colhom
Copy link

@colhom colhom commented Jun 21, 2016

I've observed e2e test failures on the two local runs I did, but the framework seems to come up successfully. logs. Ideas on this?

I'm in the process of validating GKE as well, and will modify the title if it succeeds.

\cc @nikhiljindal @quinton-hoole

Analytics

@k8s-bot
Copy link

k8s-bot commented Jun 21, 2016

Can one of the admins verify that this patch is reasonable to test? If so, please reply "ok to test".
(Note: "add to whitelist" is no longer supported. Please update configurations in kubernetes/test-infra/jenkins/job-configs/kubernetes-jenkins-pull instead.)

This message may repeat a few times in short succession due to jenkinsci/ghprb-plugin#292. Sorry.

Otherwise, if this message is too spammy, please complain to ixdy.

3 similar comments
@k8s-bot
Copy link

k8s-bot commented Jun 21, 2016

Can one of the admins verify that this patch is reasonable to test? If so, please reply "ok to test".
(Note: "add to whitelist" is no longer supported. Please update configurations in kubernetes/test-infra/jenkins/job-configs/kubernetes-jenkins-pull instead.)

This message may repeat a few times in short succession due to jenkinsci/ghprb-plugin#292. Sorry.

Otherwise, if this message is too spammy, please complain to ixdy.

@k8s-bot
Copy link

k8s-bot commented Jun 21, 2016

Can one of the admins verify that this patch is reasonable to test? If so, please reply "ok to test".
(Note: "add to whitelist" is no longer supported. Please update configurations in kubernetes/test-infra/jenkins/job-configs/kubernetes-jenkins-pull instead.)

This message may repeat a few times in short succession due to jenkinsci/ghprb-plugin#292. Sorry.

Otherwise, if this message is too spammy, please complain to ixdy.

@k8s-bot
Copy link

k8s-bot commented Jun 21, 2016

Can one of the admins verify that this patch is reasonable to test? If so, please reply "ok to test".
(Note: "add to whitelist" is no longer supported. Please update configurations in kubernetes/test-infra/jenkins/job-configs/kubernetes-jenkins-pull instead.)

This message may repeat a few times in short succession due to jenkinsci/ghprb-plugin#292. Sorry.

Otherwise, if this message is too spammy, please complain to ixdy.

@k8s-github-robot k8s-github-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. release-note-label-needed labels Jun 21, 2016
@ghost ghost assigned ghost and unassigned roberthbailey Jun 21, 2016
@@ -22,7 +22,11 @@ spec:
- --etcd-servers=http://localhost:2379
- --service-cluster-ip-range={{.FEDERATION_SERVICE_CIDR}}
- --secure-port=443
{{if eq .KUBERNETES_PROVIDER "aws"}}
- --external-hostname={{.FEDERATION_API_HOST}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure why is this AWS specific.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does not work if we do not set this?
Reading the documentation, I get the impression that if its a public IP, we can set it to external-hostname.
And if its a cluster IP, we can set it to advertise-address.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, it's something to do with the automatic certificate generation logic and AWS loadbalancers giving your DNS names instead of IP addresses.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, the DNS name is cluster internal only? Doesnt seem right to set it as external-hostname.

cc @lavalamp

@nikhiljindal
Copy link
Contributor

ok to test

@kubernetes/sig-cluster-federation

@nikhiljindal
Copy link
Contributor

Looking at the logs, federation apiserver test passed, and the 2 federation service tests failed. Those 2 tests are failing on GCE as well right now. federation apiserver test passing means that bring up worked fine. So this is good!

@ghost
Copy link

ghost commented Jun 21, 2016

Awesome! Thanks for this @colhom. I'll take a deeper look into the logs to confirm that the reasons for those two tests failures are the same reasons for failures on GCE and GKE. If so, we already have fixes in the works.

@madhusudancs @matchstick FYI

@ghost
Copy link

ghost commented Jun 21, 2016

@colhom FYI manual GKE validation was done by Kelsey Hightower recently. See:

https://github.com/kelseyhightower/kubernetes-cluster-federation
https://youtu.be/86jZdmAjWns

@ghost
Copy link

ghost commented Jun 21, 2016

@colhom You seem to be running as an old version of the tests, as the line numbers in your log file don't line up with the line numbers in upstream/master:HEAD.

Could you perhaps rebase and rerun?

Note also that you're probably running into #27802 which is now merged.

So all-in-all, a rebase and rebuild would be very useful right now.

@@ -135,6 +135,30 @@ function create-federation-api-objects {
fi
sleep 5
done

# Poll until DNS record for federation-apiserver is propagated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if we do not wait?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kubectl will fail because the DNS name for the ELB could not be resolved. I guess we could soak that up by retrying kubectl for a few minutes, but this seems more explicit.

@ghost
Copy link

ghost commented Jun 21, 2016

@colhom You definitely also want to rebase to get #27794
Pretty sure you'll run into that one too.

@colhom colhom force-pushed the gke-aws-federation branch from 3a14e1b to 5bacc48 Compare June 22, 2016 01:23
@colhom
Copy link
Author

colhom commented Jun 22, 2016

@quinton-hoole rebased, I'll let you know how it goes.

@k8s-bot
Copy link

k8s-bot commented Jun 22, 2016

GCE e2e build/test passed for commit 5bacc48.

@ghost
Copy link

ghost commented Jun 22, 2016

@colhom How do your results look after the rebase?

@ghost ghost added this to the v1.3 milestone Jun 22, 2016
@ghost ghost added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Jun 22, 2016
@@ -1491,7 +1491,7 @@ function test-build-release {
# Assumed vars:
# Variables from config.sh
function test-setup {
"${KUBE_ROOT}/cluster/kube-up.sh"
. "${KUBE_ROOT}/cluster/kube-up.sh"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change really necessary?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without it, subsequent calls to things like detect-security-group will fail as VPC_ID is set within kube-up and hence will not be availabile if kube-up.sh executes in a sub-process.

@ghost ghost added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-label-needed labels Jun 23, 2016
@ghost ghost added lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Jun 23, 2016
@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot
Copy link

k8s-bot commented Jun 23, 2016

GCE e2e build/test passed for commit 5bacc48.

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit 890ac5f into kubernetes:master Jun 23, 2016
@nikhiljindal
Copy link
Contributor

Our e2e tests on jenkins are failing right now and I suspect this is the culprit.
The cluster fails to come up with the following logs:

attempting to get federation-apiserver loadbalancer hostname (1 / 30)
attempting to get federation-apiserver loadbalancer hostname (2 / 30)
attempting to get federation-apiserver loadbalancer hostname (3 / 30)
attempting to get federation-apiserver loadbalancer hostname (4 / 30)
attempting to get federation-apiserver loadbalancer hostname (5 / 30)
attempting to get federation-apiserver loadbalancer hostname (6 / 30)
attempting to get federation-apiserver loadbalancer hostname (7 / 30)
attempting to get federation-apiserver loadbalancer hostname (8 / 30)
attempting to get federation-apiserver loadbalancer hostname (9 / 30)
2016/06/24 09:15:29 e2e.go:218: Error running up: exit status 1
2016/06/24 09:15:29 e2e.go:214: Step 'up' finished in 14m7.817992497s
2016/06/24 09:15:29 e2e.go:114: Error starting e2e cluster. Aborting.

cc @kubernetes/sig-cluster-federation

@nikhiljindal
Copy link
Contributor

Verified that the test turned red right after this PR was merged.
Reverting
fedred

k8s-github-robot pushed a commit that referenced this pull request Jun 24, 2016
Automatic merge from submit-queue

Revert "Federation e2e supports aws"

Reverting #27791 to get our Jenkins tests green again.

cc @kubernetes/sig-cluster-federation
@ghost
Copy link

ghost commented Jun 29, 2016

@colhom Have you managed to figure out what went wrong here?
@nikhiljindal FYI

@ghost
Copy link

ghost commented Jun 29, 2016

Sorry, just saw your comments on #28030. We'll have to fix the nslookup issue.

eparis pushed a commit to eparis/kubernetes that referenced this pull request Jun 29, 2016
Automatic merge from submit-queue

Revert "Federation e2e supports aws"

Reverting kubernetes#27791 to get our Jenkins tests green again.

cc @kubernetes/sig-cluster-federation

(cherry picked from commit e93608c)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants