Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VSphere cloud provider code refactoring #49164

Conversation

BaluDontu
Copy link
Contributor

@BaluDontu BaluDontu commented Jul 18, 2017

The current PR tracks the vSphere Cloud Provider code refactoring which includes the following changes.

  • VCLib Package - A framework used by vSphere cloud provider for managing the vSphere entities. VCLib package mainly does the following:
    • Volume management on datastore (Create/Delete)
    • Volume management on Virtual Machines (Attach/Detach)
    • Storage Policy Management
  • vSphere Cloud Provider changes to implement the cloud provider interfaces by calling into VCLib package.
  • Modifications to e2e tests to accomodate the latest design changes.

@divyenpatel @rohitjogvmw @luomiao

vSphere cloud provider: vSphere cloud provider code refactoring

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jul 18, 2017
@k8s-ci-robot
Copy link
Contributor

Hi @BaluDontu. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jul 18, 2017
@k8s-github-robot k8s-github-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. release-note-label-needed labels Jul 19, 2017
@BaluDontu
Copy link
Contributor Author

@brendandburns @saad-ali @jingxu97 : Can anyone of you run the tests? ("ok-to-test")

@jingxu97
Copy link
Contributor

jingxu97 commented Jul 19, 2017

/ok-to-test

@k8s-github-robot k8s-github-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-label-needed labels Jul 19, 2017
@msau42
Copy link
Member

msau42 commented Jul 20, 2017

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jul 20, 2017
@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 20, 2017
@BaluDontu BaluDontu force-pushed the vSphereCloudProviderCodeRefactoring branch from ae8077d to 8f80be8 Compare July 20, 2017 23:25
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 20, 2017
@BaluDontu BaluDontu force-pushed the vSphereCloudProviderCodeRefactoring branch from 8f80be8 to 9198ec0 Compare July 20, 2017 23:36
@BaluDontu
Copy link
Contributor Author

I have resolved the conflicts and rebased the branch to the latest.

Copy link
Member

@divyenpatel divyenpatel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a fix in vsphere.go -> newVSphere()

// Create context
ctx, cancel := context.WithCancel(context.TODO())
defer cancel()
err = vSphereConn.Connect(ctx)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the worker node we should not connect to vCenter if vm-name is specified in the vsphere.conf file.

Used following vsphere.conf file on Kubernetes 1.7.1, and restarted kubelet. Node was able to register successfully.

# cat vsphere.conf
[Global]
        vm-name = "node1"

When same file is used on this Refactored code. kubelet failed to restart with following errors

{"log":"E0721 23:34:39.438996   11722 connection.go:75] Failed to create new client. err: Post https:///sdk: http: no Host in request URL\n","stream":"stderr","time":"2017-07-21T23:34:39.439139608Z"}
{"log":"E0721 23:34:39.439033   11722 connection.go:41] Failed to create govmomi client. err: Post https:///sdk: http: no Host in request URL\n","stream":"stderr","time":"2017-07-21T23:34:39.439156709Z"}
{"log":"E0721 23:34:39.439041   11722 vsphere.go:202] Failed to connect to vSphere\n","stream":"stderr","time":"2017-07-21T23:34:39.439160254Z"}
{"log":"Error: failed to run Kubelet: could not init cloud provider \"vsphere\": Post https:///sdk: http: no Host in request URL\n","stream":"stderr","time":"2017-07-21T23:34:39.439163376Z"}

We should fix this as part of this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed it in latest commit. Now we don't connect to VC on worker nodes if we have vm-name available in vsphere.conf file.

if err != nil {
return nil, err
glog.Errorf("Failed to Profile ID by name: %s. err: %+v", storagePolicyName, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Failed to Profile ID by name => Failed to get Profile ID by name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have addressed it in latest commit.

@BaluDontu BaluDontu force-pushed the vSphereCloudProviderCodeRefactoring branch from 9198ec0 to b55543b Compare July 25, 2017 20:33
@BaluDontu
Copy link
Contributor Author

Fixed bazel errors and review comments from @divyenpatel .

@BaluDontu BaluDontu force-pushed the vSphereCloudProviderCodeRefactoring branch 2 times, most recently from f33d537 to 1c00005 Compare July 26, 2017 00:02
@BaluDontu
Copy link
Contributor Author

/retest

@BaluDontu
Copy link
Contributor Author

@brendandburns @saad-ali @jingxu97 : Can one of you guys review this PR ?

@BaluDontu
Copy link
Contributor Author

/retest

@divyenpatel
Copy link
Member

LGTM

@BaluDontu
Copy link
Contributor Author

/assign @jingxu97

@divyenpatel
Copy link
Member

@divyenpatel
Copy link
Member

LGTM

@divyenpatel
Copy link
Member

/lgtm

@k8s-ci-robot
Copy link
Contributor

@divyenpatel: changing LGTM is restricted to assignees, and only kubernetes org members may be assigned issues.

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@divyenpatel
Copy link
Member

/approve

@BaluDontu
Copy link
Contributor Author

BaluDontu commented Aug 3, 2017

@childsb @saad-ali can you please take a look at this PR.

@divyenpatel
Copy link
Member

@brendandburns @davidopp @jingxu97 Can you please approve this PR?
We have other changes in the pipeline based on this PR.

All tests passed. PR needs approval label.

@luomiao
Copy link

luomiao commented Aug 9, 2017

LGTM
(I reviewed this PR during the internal vsphere cloud provider team review)

@luomiao
Copy link

luomiao commented Aug 9, 2017

/lgtm
/approve

@divyenpatel
Copy link
Member

@saad-ali we need approval from pkg/volume/vsphere_volume/OWNERS
Can you help?

@kerneltime
Copy link

/approve

@kerneltime
Copy link

/approve no-issue

@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: BaluDontu, divyenpatel, jingxu97, kerneltime, luomiao

Associated issue requirement bypassed by: kerneltime

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 9, 2017
@k8s-github-robot
Copy link

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit a881405 into kubernetes:master Aug 10, 2017
k8s-github-robot pushed a commit that referenced this pull request Aug 24, 2017
Automatic merge from submit-queue

Attempt to Attach Volume fails for very first time on vSphere in 1.6 release

Before every attach, vSphere cloud provider checks if the SCSI controller is present on the VM. If not present, it will try to create one and attach the disk to that SCSI controller on VM. If already present, if will use the SCSI controller.

For the very first time when SCSI controller is not present on the VM, we try to create one and retrieve the SCSI controller for the disk to created on that SCSI controller. But in release 1.6, after successful creation of SCSI controller, we are not assigning back the SCSI controller to the existing "scsicontroller" variable. Because of this the very first time, attach of the disk will fail.

This problem is not observed on master, as we have taken care of this in vSphere cloud provider refactoring - #49164

@luomiao @divyenpatel @rohitjogvmw 

```release-note
vSphere: Fix attach volume failing on the first try.
```
k8s-github-robot pushed a commit that referenced this pull request Aug 25, 2017
Automatic merge from submit-queue

Attempt to Attach Volume fails for very first time on vSphere in 1.7 release

Before every attach, vSphere cloud provider checks if the SCSI controller is present on the VM. If not present, it will try to create one and attach the disk to that SCSI controller on VM. If already present, it will use the SCSI controller.

For the very first time when SCSI controller is not present on the VM, we try to create one and retrieve the SCSI controller for the disk to created on that SCSI controller. But in release 1.7, after successful creation of SCSI controller, we are not assigning back the SCSI controller to the existing "scsicontroller" variable. Because of this the very first time, attach of the disk will fail.

This problem is not observed on master, as we have taken care of this in vSphere cloud provider refactoring - #49164

@luomiao @divyenpatel @rohitjogvmw 

```release-note
vSphere: Fix attach volume failing on the first try.
```
@BaluDontu BaluDontu deleted the vSphereCloudProviderCodeRefactoring branch September 5, 2017 23:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants