VSphere cloud provider code refactoring #49164

BaluDontu · 2017-07-18T23:43:30Z

The current PR tracks the vSphere Cloud Provider code refactoring which includes the following changes.

VCLib Package - A framework used by vSphere cloud provider for managing the vSphere entities. VCLib package mainly does the following:
- Volume management on datastore (Create/Delete)
- Volume management on Virtual Machines (Attach/Detach)
- Storage Policy Management
vSphere Cloud Provider changes to implement the cloud provider interfaces by calling into VCLib package.
Modifications to e2e tests to accommodate the latest design changes.

vSphere cloud provider: vSphere cloud provider code refactoring

k8s-ci-robot · 2017-07-18T23:43:38Z

Hi @BaluDontu. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

BaluDontu · 2017-07-19T01:18:57Z

@brendandburns @saad-ali @jingxu97: Can any one of you run the tests? ("ok-to-test")

jingxu97 · 2017-07-19T20:32:31Z

/ok-to-test

msau42 · 2017-07-20T22:16:14Z

/ok-to-test

BaluDontu · 2017-07-20T23:38:06Z

I have resolved the conflicts and rebased the branch to the latest.

divyenpatel

Need a fix in vsphere.go -> newVSphere()

divyenpatel · 2017-07-21T23:49:47Z

pkg/cloudprovider/providers/vsphere/vsphere.go

+	// Create context
+	ctx, cancel := context.WithCancel(context.TODO())
+	defer cancel()
+	err = vSphereConn.Connect(ctx)


On the worker node we should not connect to vCenter if vm-name is specified in the vsphere.conf file.

I used the following vsphere.conf file on Kubernetes 1.7.1 and restarted kubelet. The node was able to register successfully.

# cat vsphere.conf
[Global]
vm-name = "node1"

When the same file is used with this refactored code, kubelet fails to restart with the following errors:

{"log":"E0721 23:34:39.438996 11722 connection.go:75] Failed to create new client. err: Post https:///sdk: http: no Host in request URL\n","stream":"stderr","time":"2017-07-21T23:34:39.439139608Z"}
{"log":"E0721 23:34:39.439033 11722 connection.go:41] Failed to create govmomi client. err: Post https:///sdk: http: no Host in request URL\n","stream":"stderr","time":"2017-07-21T23:34:39.439156709Z"}
{"log":"E0721 23:34:39.439041 11722 vsphere.go:202] Failed to connect to vSphere\n","stream":"stderr","time":"2017-07-21T23:34:39.439160254Z"}
{"log":"Error: failed to run Kubelet: could not init cloud provider \"vsphere\": Post https:///sdk: http: no Host in request URL\n","stream":"stderr","time":"2017-07-21T23:34:39.439163376Z"}

We should fix this as part of this PR.

I fixed it in the latest commit. Now we don't connect to vCenter (VC) on worker nodes if vm-name is available in the vsphere.conf file.

divyenpatel · 2017-07-24T20:58:15Z

pkg/cloudprovider/providers/vsphere/vsphere_util.go

 	if err != nil {
-		return nil, err
+		glog.Errorf("Failed to Profile ID by name: %s. err: %+v", storagePolicyName, err)


Failed to Profile ID by name => Failed to get Profile ID by name

I have addressed it in latest commit.

BaluDontu · 2017-07-25T20:34:42Z

Fixed bazel errors and review comments from @divyenpatel .

BaluDontu · 2017-07-26T15:25:50Z

/retest

BaluDontu · 2017-07-26T15:26:47Z

@brendandburns @saad-ali @jingxu97: Can one of you review this PR?

BaluDontu · 2017-07-26T18:57:00Z

/retest

divyenpatel · 2017-07-26T23:17:07Z

LGTM

BaluDontu · 2017-07-27T15:37:11Z

/assign @jingxu97

divyenpatel · 2017-08-02T19:50:10Z

This PR was reviewed internally by the vSphere Cloud Provider team. Here are the reference PRs:

vmware-archive#154
vmware-archive#155
vmware-archive#158
vmware-archive#164
vmware-archive#166
vmware-archive#170
vmware-archive#174
vmware-archive#191
vmware-archive#192
vmware-archive#193
vmware-archive#201

divyenpatel · 2017-08-02T19:50:20Z

LGTM

divyenpatel · 2017-08-02T20:57:45Z

/lgtm

k8s-ci-robot · 2017-08-02T20:57:52Z

@divyenpatel: changing LGTM is restricted to assignees, and only kubernetes org members may be assigned issues.

In response to this:

/lgtm

divyenpatel · 2017-08-02T21:04:30Z

/approve

BaluDontu · 2017-08-03T19:25:12Z

@childsb @saad-ali Can you please take a look at this PR?

divyenpatel · 2017-08-09T14:33:25Z

@brendandburns @davidopp @jingxu97 Can you please approve this PR?
We have other changes in the pipeline based on this PR.

All tests passed. PR needs approval label.

luomiao · 2017-08-09T17:39:43Z

LGTM
(I reviewed this PR during the internal vsphere cloud provider team review)

luomiao · 2017-08-09T17:39:58Z

/lgtm
/approve

divyenpatel · 2017-08-09T17:51:21Z

@saad-ali we need approval from pkg/volume/vsphere_volume/OWNERS
Can you help?

kerneltime · 2017-08-09T22:47:29Z

/approve

kerneltime · 2017-08-09T23:25:07Z

/approve no-issue

k8s-github-robot · 2017-08-09T23:25:09Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: BaluDontu, divyenpatel, jingxu97, kerneltime, luomiao

Associated issue requirement bypassed by: kerneltime

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

~~pkg/cloudprovider/providers/vsphere/OWNERS~~ [kerneltime,luomiao]
~~pkg/volume/vsphere_volume/OWNERS~~ [kerneltime]
~~test/e2e/storage/OWNERS~~ [jingxu97]

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

k8s-github-robot · 2017-08-10T00:14:35Z

/test all [submit-queue is verifying that this PR is safe to merge]

k8s-github-robot · 2017-08-10T01:18:58Z

Automatic merge from submit-queue

@luomiao

Automatic merge from submit-queue

Attempt to Attach Volume fails for very first time on vSphere in 1.6 release

Before every attach, vSphere cloud provider checks if the SCSI controller is present on the VM. If not present, it will try to create one and attach the disk to that SCSI controller on the VM. If already present, it will use the existing SCSI controller.

For the very first time, when the SCSI controller is not present on the VM, we try to create one and retrieve the SCSI controller for the disk to be created on that SCSI controller. But in release 1.6, after successful creation of the SCSI controller, we are not assigning the SCSI controller back to the existing "scsicontroller" variable. Because of this, the very first attach of the disk will fail.

This problem is not observed on master, as we have taken care of this in vSphere cloud provider refactoring - #49164

@luomiao @divyenpatel @rohitjogvmw

```release-note
vSphere: Fix attach volume failing on the first try.
```

@luomiao

Automatic merge from submit-queue

Attempt to Attach Volume fails for very first time on vSphere in 1.7 release

Before every attach, vSphere cloud provider checks if the SCSI controller is present on the VM. If not present, it will try to create one and attach the disk to that SCSI controller on the VM. If already present, it will use the existing SCSI controller.

For the very first time, when the SCSI controller is not present on the VM, we try to create one and retrieve the SCSI controller for the disk to be created on that SCSI controller. But in release 1.7, after successful creation of the SCSI controller, we are not assigning the SCSI controller back to the existing "scsicontroller" variable. Because of this, the very first attach of the disk will fail.

This problem is not observed on master, as we have taken care of this in vSphere cloud provider refactoring - #49164

@luomiao @divyenpatel @rohitjogvmw

```release-note
vSphere: Fix attach volume failing on the first try.
```
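The first-attach failure described in these two referenced fixes comes down to a get-or-create pattern where the created value is never stored. The sketch below reproduces that shape with in-memory stand-ins. The types `controller` and `vm` and the functions `attachBuggy` and `attachFixed` are hypothetical names chosen for illustration; the real provider code works against vSphere objects and differs in detail.

```go
package main

import "fmt"

// controller is a hypothetical stand-in for a VM's SCSI controller.
type controller struct{ key int }

// vm is a hypothetical in-memory stand-in for a virtual machine.
type vm struct{ scsi *controller }

// getController returns the VM's SCSI controller, or nil if none exists.
func (v *vm) getController() *controller { return v.scsi }

// createController adds a SCSI controller to the VM and returns it.
func (v *vm) createController() (*controller, error) {
	v.scsi = &controller{key: 1000}
	return v.scsi, nil
}

// attachBuggy mirrors the described bug: the controller is created on the
// VM, but the local variable is never updated, so the first attach fails.
func attachBuggy(v *vm, disk string) error {
	scsiController := v.getController()
	if scsiController == nil {
		if _, err := v.createController(); err != nil {
			return err
		}
		// Bug: the newly created controller is not assigned back.
	}
	if scsiController == nil {
		return fmt.Errorf("cannot attach %s: no SCSI controller", disk)
	}
	return nil
}

// attachFixed assigns the created controller back before using it.
func attachFixed(v *vm, disk string) error {
	scsiController := v.getController()
	if scsiController == nil {
		created, err := v.createController()
		if err != nil {
			return err
		}
		scsiController = created
	}
	if scsiController == nil {
		return fmt.Errorf("cannot attach %s: no SCSI controller", disk)
	}
	return nil
}

func main() {
	v := &vm{}
	fmt.Println("buggy, first attach: ", attachBuggy(v, "disk-1"))
	fmt.Println("buggy, second attach:", attachBuggy(v, "disk-1"))
	fmt.Println("fixed, first attach: ", attachFixed(&vm{}, "disk-1"))
}
```

In the buggy variant the second attach succeeds because the controller created during the first call is already present on the VM, which matches the report that only the very first attach fails.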

k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jul 18, 2017

k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jul 18, 2017

k8s-github-robot assigned brendandburns and davidopp Jul 19, 2017

k8s-github-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. release-note-label-needed labels Jul 19, 2017

k8s-github-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-label-needed labels Jul 19, 2017

k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jul 20, 2017

k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 20, 2017

BaluDontu force-pushed the vSphereCloudProviderCodeRefactoring branch from ae8077d to 8f80be8 Compare July 20, 2017 23:25

k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 20, 2017

BaluDontu force-pushed the vSphereCloudProviderCodeRefactoring branch from 8f80be8 to 9198ec0 Compare July 20, 2017 23:36

divyenpatel suggested changes Jul 21, 2017

divyenpatel reviewed Jul 24, 2017

BaluDontu force-pushed the vSphereCloudProviderCodeRefactoring branch from 9198ec0 to b55543b Compare July 25, 2017 20:33

BaluDontu force-pushed the vSphereCloudProviderCodeRefactoring branch 2 times, most recently from f33d537 to 1c00005 Compare July 26, 2017 00:02

BaluDontu mentioned this pull request Jul 26, 2017

Blog highlighting the new vSphere Cloud Provider Code Refactoring changes vmware-archive/kubernetes-archived#219

Closed

k8s-ci-robot assigned luomiao Aug 9, 2017

divyenpatel mentioned this pull request Aug 9, 2017

Mark volume as detached when node does not exist for vsphere #50281

Merged

k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 9, 2017

k8s-github-robot merged commit a881405 into kubernetes:master Aug 10, 2017

BaluDontu deleted the vSphereCloudProviderCodeRefactoring branch September 5, 2017 23:08

This was referenced Sep 8, 2017

vSphere Cloud Provider code refactoring - vSphere Cloud Provider metrics support vmware-archive/kubernetes-archived#199

Closed

vSphere Cloud Provider code refactoring - DetachDisk + DeleteVolume Implementation vmware-archive/kubernetes-archived#197

Closed
