Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

attachdetach controller: attach volumes immediately when Pod's PVCs are bound #66863

Merged
merged 2 commits into from
Aug 14, 2018

Conversation

cofyc
Copy link
Member

@cofyc cofyc commented Aug 1, 2018

What this PR does / why we need it:

Let attachdetach controller to attach volumes immediately when Pod's PVCs are bound.

Current attachdetach controller calls util.ProcessPodVolume to add pod volumes into desiredStateOfWorld on these events:

  • podAdd event
  • podUpdate event
  • podDelete event
  • periodical desiredStateOfWorldPopulator.findAndAddActivePod

But if a pod is created with PVCs not bound, no volumes will be added into desiredStateOfWorld because PVCs not bound. When pv controller binds PVCs successfully, attachdetach controller will not add pod volumes immediately because it does not watch on PVC events.

It will wait until a pod update event is triggered (normally will not happen because no new status will be reported by kubelet) or desiredStateOfWorldPopulator.findAndAddActivePod is called (maybe 0~3 minutes later, see timer configs).

In bad case, pod start time will be very long (~3 minutes + ~2 minutes (kubelet max exponential backoff)), for example: #64549 (comment).

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #64549

Special notes for your reviewer:

Release note:

attachdetach controller attaches volumes immediately when Pod's PVCs are bound

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 1, 2018
@cofyc cofyc changed the title attachdetach controller: attach volumes immediately when Pod's PVCs are bound WIP: attachdetach controller: attach volumes immediately when Pod's PVCs are bound Aug 1, 2018
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 1, 2018
// Skip unbound PVCs.
return
}
pods, err := dswp.podLister.List(labels.Everything())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another way - perhaps we can solve this is, by implementing work queues in attach/detach controller. The problem here is - any random PVC update will cause iteration of all pods in the cluster. One can perhaps DOS the cluster by constantly updating the pvc?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right. I'll update it.

Copy link
Member Author

@cofyc cofyc Aug 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because PVC does not contain Pod references, but we can index pods by PVC locally, then we don't need to iterate all pods every time. For example, daemon controller indexes pods based on their NodeName.

I think this will improve performance in large cluster.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 2, 2018
@cofyc cofyc changed the title WIP: attachdetach controller: attach volumes immediately when Pod's PVCs are bound attachdetach controller: attach volumes immediately when Pod's PVCs are bound Aug 2, 2018
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 2, 2018
@cofyc
Copy link
Member Author

cofyc commented Aug 2, 2018

/assign @saad-ali
Ready to review, summary of changes:

  • Use queue to process PVCs on add/update events
  • Index pods by PVC key then we don't need to iterate to find pods
  • Add integration test for this feature

cc @jingxu97 @msau42

@cofyc
Copy link
Member Author

cofyc commented Aug 2, 2018

/test pull-kubernetes-integration
/test pull-kubernetes-kubemark-e2e-gce-big
/test pull-kubernetes-e2e-kops-aws

@redbaron
Copy link
Contributor

redbaron commented Aug 2, 2018

@cofyc , do you think it is cherry-pickable to 1.10, 1.11?

@msau42
Copy link
Member

msau42 commented Aug 2, 2018

@kubernetes/sig-storage-pr-reviews

@k8s-ci-robot k8s-ci-robot added the sig/storage Categorizes an issue or PR as relevant to SIG Storage. label Aug 2, 2018
Copy link
Member

@gnufied gnufied left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mostly looks good to me.

UpdateFunc: func(old, new interface{}) {
adc.enqueuePVC(new)
},
DeleteFunc: nil,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think you need to specify DeleteFunc if you don't want to.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

@cofyc
Copy link
Member Author

cofyc commented Aug 3, 2018

@redbaron
Patch of attachdetach is easy to be cherry-picked, test part have some conflicts.
I'll try to cherry-pick it into old releases when it's accepted.

@gnufied
Copy link
Member

gnufied commented Aug 3, 2018

@cofyc can you please squash your commits?

@cofyc
Copy link
Member Author

cofyc commented Aug 4, 2018

@gnufied
Done. I split it into two parts:

waitForPodFuncInDSWP(t, ctrl.GetDesiredStateOfWorld(), 10*time.Second, "expected 1 pod in dsw", 1)
createPVForPVC(t, testClient, pvc)
waitForPodFuncInDSWP(t, ctrl.GetDesiredStateOfWorld(), 20*time.Second, "expected 2 pods in dsw after PVCs are bound", 2)
close(stopCh)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder how well 10s and 20s setting will work? What is the typical delay under normal load, and heavy load?

Copy link
Member Author

@cofyc cofyc Aug 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

attachdetach controller processes Pod's PVs to attach very fast, the delay should be almost same as apiserver watch event delay, because it does nothing except adding Pod's Volumes into DSWP (no api calls) on event.
In my local test, delay to attach PV is less than 100ms. The most delay is waiting PV controller to bound PVC, normally 4~5 seconds.
Perhaps we should increase timeout of waitForPodFuncInDSWP, and it will not increase execution time of test.
time.Sleep(10 * time.Second) is a hack to verify the second pod cannot be added into DSWP, because no object state or event can be checked to verify. 10 seconds should be enough here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

timeout of waitForPodFuncInDSWP is increased to 60 seconds now.

Copy link
Member

@msau42 msau42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome! I just have two nits

}
pvc, err := adc.pvcLister.PersistentVolumeClaims(namespace).Get(name)
if apierrors.IsNotFound(err) {
glog.V(2).Infof("error getting pvc %q from informer: %v", key, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this really be log level 2?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, increased to log level 4.

if !ok {
return []string{}, nil
}
// We are only interested in active pods with nodeName set
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add a comment that this cache is only used for attaching

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

Copy link
Member

@saad-ali saad-ali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Largely LGTM, a couple of questions.

func (adc *attachDetachController) enqueuePVC(obj interface{}) {
key, err := kcache.DeletionHandlingMetaNamespaceKeyFunc(obj)
if err != nil {
runtime.HandleError(fmt.Errorf("Couldn't get key for object %+v: %v", obj, err))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious: Why runtime.HandleError? Why not just glog.Errorf?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I followed https://github.com/kubernetes/kubernetes/blob/v1.12.0-alpha.1/pkg/controller/daemon/daemon_controller.go#L316, and found sample-controller uses runtime.HandleError to log errors too. Perhaps because runtime.HandleError is dedicated to log non-user facing ignored errors and can be extended by adding error handlers.

return nil
}

objs, err := adc.podIndexer.ByIndex(pvcKeyIndex, key)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to verify, if there are multiple pods using the same PVC, this will return all pods? If so make sure there is a test case for that.

Copy link
Member Author

@cofyc cofyc Aug 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, test has been updated. Created more pvc notbound pods for testing in 74cfffc.

cofyc added 2 commits August 14, 2018 01:03
…re bound

- Use queue to process PVCs on add/update events
- Index pods by PVC key then we don't need to iterate to find pods
…re bound

- Add integration test for this feature
@cofyc
Copy link
Member Author

cofyc commented Aug 13, 2018

Squashed into two commits:

@msau42
Copy link
Member

msau42 commented Aug 13, 2018

/test pull-kubernetes-integration

@saad-ali
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 13, 2018
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cofyc, saad-ali

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 13, 2018
@redbaron
Copy link
Contributor

/test pull-kubernetes-integration

@redbaron
Copy link
Contributor

/retest

@k8s-github-robot
Copy link

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

k8s-github-robot pushed a commit that referenced this pull request Aug 23, 2018
…upstream-release-1.11

Automatic merge from submit-queue.

Automated cherry pick of #66863: attachdetach controller: attach volumes immediately when

Cherry pick of #66863 on release-1.11.

#66863: attachdetach controller: attach volumes immediately when
k8s-github-robot pushed a commit that referenced this pull request Aug 23, 2018
…upstream-release-1.10

Automatic merge from submit-queue.

Automated cherry pick of #66863: attachdetach controller: attach volumes immediately when

Cherry pick of #66863 on release-1.10.

#66863: attachdetach controller: attach volumes immediately when
k8s-github-robot pushed a commit that referenced this pull request Aug 31, 2018
…upstream-release-1.9

Automatic merge from submit-queue.

Automated cherry pick of #66863: attachdetach controller: attach volumes immediately when

Cherry pick of #66863 on release-1.9.

#66863: attachdetach controller: attach volumes immediately when
openstack-gerrit pushed a commit to openstack/openstack-helm-infra that referenced this pull request Oct 6, 2018
This PS moves to use k8s 1.10.8, which includes a couple of fixes
for PVC mounts.

* kubernetes/kubernetes#66863

Change-Id: Ica30950a8200f5755897b51fd2b4d24c69a10e61
Signed-off-by: Pete Birley <pete@port.direct>
@cofyc cofyc deleted the fix64549 branch May 4, 2019 07:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PV randomly take up to few minutes to attach
8 participants