Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure object returned by volume getCloudProvider incorporates cloud config #23769

Merged

Conversation

saad-ali
Copy link
Member

@saad-ali saad-ali commented Apr 2, 2016

This PR addresses #23517.

Problem
The existing GCE PD and AWS EBS volume plugin code were fetching cloud provider without specifying a cloud config: cloudprovider.GetCloudProvider("gce", nil)
This caused the cloud provider to use default auth mechanism, which is not acceptable for the provisioning controller running on GKE master.

Fix
This PR does the following:

  • Modifies the GCE PD and AWS EBS volume plugin code to use the cloud provider object pre-constructed by the binary with a cloud config.
  • Enable provisioning E2E test for GKE (to catch future issues).

Thanks to @cjcullen for debugging and finding the root cause! 👍

This should be cherry-picked into the v1.2 branch for the next release.

@saad-ali saad-ali added release-note Denotes a PR that will be considered when it comes time to generate release notes. cherrypick-candidate labels Apr 2, 2016
@k8s-cherrypick-bot
Copy link

Removing label cherrypick-candidate because no release milestone was set. This is an invalid state and thus this PR is not being considered for cherry-pick to any release branch. Please add an appropriate release milestone and then re-add the label.

@k8s-github-robot
Copy link

Labelling this PR as size/M

@k8s-github-robot k8s-github-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Apr 2, 2016
@saad-ali saad-ali added this to the v1.2 milestone Apr 2, 2016
@k8s-bot
Copy link

k8s-bot commented Apr 2, 2016

GCE e2e build/test passed for commit e95dd021bca7c6b1a357b492badac3b364b17b67.

@roberthbailey roberthbailey added the priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. label Apr 2, 2016
@roberthbailey
Copy link
Contributor

/cc @justinsb since this also makes some minor changes to the AWS cloud provider.

}

cloudProvider := plugin.host.GetCloudProvider()
awsCloudProvider, ok := cloudProvider.(*aws.AWSCloud)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm seeing weird indentation here.. gofmt needed?

@justinsb
Copy link
Member

justinsb commented Apr 2, 2016

Needs gofmt (CI picked it up too - I thought I was going crazy!). But changes LGTM on the AWS side.

I also like that the provider is now a real singleton - I wonder if that was causing issues in the past for the AWS volume mount cache (before I removed it in 1.2)

@alex-mohr alex-mohr added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 4, 2016
@alex-mohr
Copy link
Contributor

@saad-ali can you please get the gofmt and build failures fixed, submit this, cherrypick it to 1.2 and get @david-mcmahon to signoff and start the 1.2.2 build please? @a-robinson is GKE oncall this week.

After that's done, let's start a postmortem to figure out what happened and how we can improve the process so similar issues are less likely to re-occur?

@saad-ali
Copy link
Member Author

saad-ali commented Apr 4, 2016

Yep, will do.

@saad-ali saad-ali force-pushed the fixVolumeCloudProvider branch from e95dd02 to e7b14e7 Compare April 4, 2016 17:59
@k8s-github-robot
Copy link

PR changed after LGTM, removing LGTM.

@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 4, 2016
@roberthbailey roberthbailey added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 4, 2016
@k8s-bot
Copy link

k8s-bot commented Apr 4, 2016

GCE e2e build/test passed for commit e7b14e7.

@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot
Copy link

k8s-bot commented Apr 4, 2016

GCE e2e build/test passed for commit e7b14e7.

@saad-ali
Copy link
Member Author

saad-ali commented Apr 4, 2016

@k8s-bot unit test this issue: #21954

@saad-ali
Copy link
Member Author

saad-ali commented Apr 4, 2016

@k8s-bot unit test this issue: #21640

@cjcullen
Copy link
Member

cjcullen commented Apr 5, 2016

LGTM

@saad-ali
Copy link
Member Author

saad-ali commented Apr 5, 2016

Verified dynamic provision on GKE works with this PR:

2016/04/04 17:23:05 e2e.go:194: Running: Ginkgo tests
Setting up for KUBERNETES_PROVIDER="gke".
... in gke:prepare-e2e()
... in gke:detect-master()
... in gke:detect-project()
Your active configuration is: [default]

... Using project: cloud-kubernetes-dev
... in gke:detect-node-instance-group()
Apr  4 17:23:10.161: INFO: Fetching cloud provider for "gke"

[...]
Running Suite: Kubernetes e2e suite
===================================
Random Seed: 1459815789 - Will randomize all specs
Will run 1 of 268 specs

Apr  4 17:23:10.683: INFO: >>> testContext.KubeConfig: /usr/local/google/home/saadali/.kube/config

Apr  4 17:23:10.809: INFO: Waiting up to 10m0s for all pods (need at least 0) in namespace 'kube-system' to be running and ready
Apr  4 17:23:11.073: INFO: 10 / 10 pods in namespace 'kube-system' are running and ready (0 seconds elapsed)
Apr  4 17:23:11.073: INFO: expected 3 pod replicas in namespace 'kube-system', 3 are Running and Ready.
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
------------------------------
[k8s.io] Dynamic provisioning [k8s.io] DynamicProvisioner 
  should create and delete persistent volumes
  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/volume_provisioning.go:122
[BeforeEach] [k8s.io] Dynamic provisioning
  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework.go:101
STEP: Creating a kubernetes client
Apr  4 17:23:11.074: INFO: >>> testContext.KubeConfig: /usr/local/google/home/saadali/.kube/config

STEP: Building a namespace api object
Apr  4 17:23:11.120: INFO: Waiting up to 2m0s for service account default to be provisioned in ns e2e-tests-volume-provisioning-lr4pk
Apr  4 17:23:11.158: INFO: Service account default in ns e2e-tests-volume-provisioning-lr4pk with secrets found. (38.350711ms)
STEP: Waiting for a default service account to be provisioned in namespace
Apr  4 17:23:11.158: INFO: Waiting up to 2m0s for service account default to be provisioned in ns e2e-tests-volume-provisioning-lr4pk
Apr  4 17:23:11.196: INFO: Service account default in ns e2e-tests-volume-provisioning-lr4pk with secrets found. (38.250266ms)
[BeforeEach] [k8s.io] Dynamic provisioning
  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/volume_provisioning.go:49
[It] should create and delete persistent volumes
  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/volume_provisioning.go:122
STEP: creating a claim with a dynamic provisioning annotation
Apr  4 17:23:11.240: INFO: Waiting up to 5m0s for PersistentVolumeClaim pvc-t5h6s to have phase Bound
Apr  4 17:23:11.285: INFO: PersistentVolumeClaim pvc-t5h6s found but phase is Pending instead of Bound.
Apr  4 17:23:13.324: INFO: PersistentVolumeClaim pvc-t5h6s found but phase is Pending instead of Bound.
Apr  4 17:23:15.362: INFO: PersistentVolumeClaim pvc-t5h6s found and phase=Bound (4.12245163s)
STEP: checking the claim
STEP: checking the created volume is writable
Apr  4 17:23:15.486: INFO: Waiting up to 15m0s for pod pvc-volume-tester-27jl5 status to be success or failure
Apr  4 17:23:15.525: INFO: Nil State.Terminated for container 'volume-tester' in pod 'pvc-volume-tester-27jl5' in namespace 'e2e-tests-volume-provisioning-lr4pk' so far
Apr  4 17:23:15.525: INFO: Waiting for pod pvc-volume-tester-27jl5 in namespace 'e2e-tests-volume-provisioning-lr4pk' status to be 'success or failure'(found phase: "Pending", readiness: false) (38.769729ms elapsed)
[...]
Apr  4 17:25:16.467: INFO: Nil State.Terminated for container 'volume-tester' in pod 'pvc-volume-tester-o4avb' in namespace 'e2e-tests-volume-provisioning-lr4pk' so far
Apr  4 17:25:16.467: INFO: Waiting for pod pvc-volume-tester-o4avb in namespace 'e2e-tests-volume-provisioning-lr4pk' status to be 'success or failure'(found phase: "Pending", readiness: false) (1m48.597845805s elapsed)
STEP: Saw pod success
STEP: Sleeping to let kubelet destroy all pods
STEP: deleting the claim
Apr  4 17:26:18.605: INFO: Waiting up to 10m0s for PersistentVolume pv-gce-o4lw5 to get deleted
Apr  4 17:26:18.643: INFO: PersistentVolume pv-gce-o4lw5 found and phase=Released (38.376811ms)
Apr  4 17:26:19.682: INFO: PersistentVolume pv-gce-o4lw5 found and phase=Released (1.077469187s)
Apr  4 17:26:20.722: INFO: PersistentVolume pv-gce-o4lw5 found and phase=Released (2.116694121s)
Apr  4 17:26:21.761: INFO: PersistentVolume pv-gce-o4lw5 found and phase=Released (3.156407539s)
Apr  4 17:26:22.800: INFO: PersistentVolume pv-gce-o4lw5 was removed
[AfterEach] [k8s.io] Dynamic provisioning
  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework.go:102
Apr  4 17:26:22.838: INFO: Waiting up to 1m0s for all nodes to be ready
STEP: Destroying namespace "e2e-tests-volume-provisioning-lr4pk" for this suite.

• [SLOW TEST:196.964 seconds]
[k8s.io] Dynamic provisioning
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework.go:420
  [k8s.io] DynamicProvisioner
  /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework.go:420
    should create and delete persistent volumes
    /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/volume_provisioning.go:122
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
Ran 1 of 268 Specs in 197.356 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 267 Skipped PASS

@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot
Copy link

k8s-bot commented Apr 5, 2016

GCE e2e build/test passed for commit e7b14e7.

@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot
Copy link

k8s-bot commented Apr 5, 2016

GCE e2e build/test passed for commit e7b14e7.

@saad-ali saad-ali closed this Apr 5, 2016
@saad-ali saad-ali reopened this Apr 5, 2016
@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit 02e0b29 into kubernetes:master Apr 5, 2016
@saad-ali saad-ali deleted the fixVolumeCloudProvider branch April 5, 2016 22:21
@roberthbailey roberthbailey added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Apr 6, 2016
@zmerlynn
Copy link
Member

zmerlynn commented Apr 6, 2016

@saad-ali: I tried to create an automated cherry-pick for this on release-1.2 and the following conflict arose:

Applying: Ensure volume GetCloudProvider code uses cloud config
Using index info to reconstruct a base tree...
M       pkg/volume/aws_ebs/aws_util.go
M       pkg/volume/gce_pd/gce_util.go
A       test/e2e/volume_provisioning.go
Falling back to patching base and 3-way merge...
CONFLICT (modify/delete): test/e2e/volume_provisioning.go deleted in d5fcbb8a2cea9a780242187aeb810969fdb64562 and modified in Ensure volume GetCloudProvider code uses cloud config. Version Ensure volume GetCloudProvider code uses cloud config of test/e2e/volume_provisioning.go left in tree.
Auto-merging pkg/volume/gce_pd/gce_util.go
Auto-merging pkg/volume/aws_ebs/aws_util.go
error: Failed to merge in the changes.
Patch failed at 0001 Ensure volume GetCloudProvider code uses cloud config
The copy of the patch that failed is found in: .git/rebase-apply/patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

Please create a cherry-pick ASAP if you want this in 1.2.2 or the release train will leave without you.

@saad-ali
Copy link
Member Author

saad-ali commented Apr 6, 2016

Please create a cherry-pick ASAP if you want this in 1.2.2 or the release train will leave without you.

Working on it.

@saad-ali
Copy link
Member Author

saad-ali commented Apr 6, 2016

#23943

zmerlynn added a commit that referenced this pull request Apr 6, 2016
#23769-upstream-release-1.2

Automated cherry pick of #21140 #23769 upstream release 1.2
@k8s-cherrypick-bot
Copy link

Commit found in the "release-1.2" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error find help to get your PR picked.

shyamjvs pushed a commit to shyamjvs/kubernetes that referenced this pull request Dec 1, 2016
…ck-of-#21140-kubernetes#23769-upstream-release-1.2

Automated cherry pick of kubernetes#21140 kubernetes#23769 upstream release 1.2
shouhong pushed a commit to shouhong/kubernetes that referenced this pull request Feb 14, 2017
…ck-of-#21140-kubernetes#23769-upstream-release-1.2

Automated cherry pick of kubernetes#21140 kubernetes#23769 upstream release 1.2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.