
Fix DownwardAPI refresh race. #59873

Merged: 1 commit into kubernetes:master on Feb 16, 2018

Conversation

jsafrane
Member

WaitForAttachAndMount should only mark the pod for reprocessing in the DesiredStateOfWorldPopulator (DSWP), and the DSWP should mark the pod's volumes for remount only after the new pod has been processed.

Otherwise the DSWP and the reconciler race over who gets the new pod first. If the reconciler wins, the pod's DownwardAPI and Projected volumes are not refreshed with the new content; they are only updated after the next periodic sync (60-90 seconds).
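
To make the race concrete, here is a minimal, self-contained Go sketch. All names and types are hypothetical simplifications; the real players are kubelet.syncPod, volumeManager.WaitForAttachAndMount, the DSWP, and the reconciler in kubelet's volume manager.

```go
package main

import "fmt"

// state is a stand-in for kubelet's desired/actual state of world.
type state struct {
	specSeenByDSW string // pod spec the populator has copied into the desired state
	remountDue    bool   // set by MarkRemountRequired, consumed by the reconciler
}

// Old behavior: WaitForAttachAndMount (called from kubelet.syncPod) marked the
// remount immediately, before the populator had re-read the updated pod.
func waitForAttachAndMountOld(s *state) {
	s.remountDue = true // asw.MarkRemountRequired(...), too early
}

// The reconciler re-mounts using whatever spec the desired state holds now.
func reconcile(s *state) {
	if s.remountDue {
		s.remountDue = false
		fmt.Println("re-mounted with:", s.specSeenByDSW)
	}
}

// The populator eventually copies the new pod spec into the desired state.
func populate(s *state, newSpec string) { s.specSeenByDSW = newSpec }

func main() {
	s := &state{specSeenByDSW: "old annotations"}
	waitForAttachAndMountOld(s) // the pod was just updated to "new annotations"
	reconcile(s)                // reconciler wins the race: re-mounts stale content
	populate(s, "new annotations")
	reconcile(s) // no remount pending; new content waits for the next periodic sync
}
```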

Fixes #59813

/assign @jingxu97 @saad-ali
/sig storage
/sig node

Release note: None

@k8s-ci-robot added the release-note-none, sig/storage, sig/node, size/S, and cncf-cla: yes labels on Feb 14, 2018
@errordeveloper
Member

/lgtm
/ok-to-test

@k8s-ci-robot added the lgtm label on Feb 14, 2018
@jsafrane
Member Author

/retest

@jingxu97
Contributor

/lgtm

@jsafrane
Member Author

/assign @derekwaynecarr
for approval. This should fix a lot of the flakes we see in DownwardAPI tests.

@derekwaynecarr
Member

Awesome to have a fix.

/lgtm
/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: derekwaynecarr, errordeveloper, jingxu97, jsafrane

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved label on Feb 16, 2018
@k8s-github-robot

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot

Automatic merge from submit-queue (batch tested with PRs 59873, 59933, 59923, 59944, 59953). If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot merged commit c7c5d89 into kubernetes:master on Feb 16, 2018
openshift-merge-robot added a commit to openshift/origin that referenced this pull request Feb 16, 2018
Automatic merge from submit-queue.

Picks for volume manager

Thanks to @jsafrane for these fixes

kubernetes/kubernetes#59873
kubernetes/kubernetes#59923

59923 is modified from upstream because some logging levels were already higher in 1.9

xref https://bugzilla.redhat.com/show_bug.cgi?id=1538216

Fixes #17605
Fixes #17556

@derekwaynecarr
openshift-publish-robot pushed a commit to openshift/kubernetes that referenced this pull request Feb 27, 2018
Origin-commit: e4f2115102c01124cc7f168b7f1ae4c65f190875
@@ -302,6 +305,9 @@ func (dswp *desiredStateOfWorldPopulator) processPodVolumes(pod *v1.Pod) {
 	// some of the volume additions may have failed, should not mark this pod as fully processed
 	if allVolumesAdded {
 		dswp.markPodProcessed(uniquePodName)
+		// New pod has been synced. Re-mount all volumes that need it
+		// (e.g. DownwardAPI)
+		dswp.actualStateOfWorld.MarkRemountRequired(uniquePodName)
Member

this commit showed up in a bisect of a scalability-test regression in pod startup time (#60589 (comment)), and this change looks odd... does this force a double mount setup for all pods?

Member Author

The flow that leads to the MarkRemountRequired call (sketched in the Go snippet below) is:

• kubelet.syncPod is called (on a pod update, or every 90s)
  • volumeManager.WaitForAttachAndMount is called and marks the pod for reprocessing
• desiredStateOfWorldPopulator.findAndAddNewPods is called periodically
  • desiredStateOfWorldPopulator.processPodVolumes is called (only when the pod was marked for reprocessing in syncPod)
    • actualStateOfWorld.MarkRemountRequired is called
      • the reconciler re-mounts the volumes (updating Secrets, DownwardAPI, ...); refreshing DownwardAPI is the reason syncPod is involved

Notice that before this PR, MarkRemountRequired was called directly by kubelet.syncPod, so the frequency of the calls did not change; with this PR, the calls simply happen in the right order. I don't think it could affect pod startup time.

On the other hand, syncPod is called on every pod update, and a pod changes quite often during startup. It is possible that I missed something there.
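
A condensed, self-contained Go sketch of that chain (hypothetical, simplified names and types; the real functions are the ones listed in the flow above):

```go
package main

import "fmt"

// pod carries the two flags the fix sequences: "reprocess" (set by syncPod,
// cleared by the populator) and "remountDue" (set by the populator, consumed
// by the reconciler).
type pod struct {
	name       string
	reprocess  bool
	remountDue bool
}

// syncPod -> WaitForAttachAndMount: only requests reprocessing now.
func waitForAttachAndMount(p *pod) { p.reprocess = true }

// findAndAddNewPods -> processPodVolumes: runs periodically; only after the
// pod's volumes were (re)added to the desired state does it ask for a remount.
func processPodVolumes(p *pod) {
	if !p.reprocess {
		return
	}
	// ... add the pod's volumes to the desired state of world ...
	p.reprocess = false // markPodProcessed
	p.remountDue = true // MarkRemountRequired
}

// Reconciler: refreshes DownwardAPI/Projected content, now guaranteed to see
// the new pod spec because the populator ran first.
func reconcile(p *pod) {
	if p.remountDue {
		p.remountDue = false
		fmt.Println("re-mounting volumes of", p.name)
	}
}

func main() {
	p := &pod{name: "ns/demo"}
	waitForAttachAndMount(p) // from kubelet.syncPod on a pod update
	reconcile(p)             // runs first: nothing pending, no stale refresh
	processPodVolumes(p)     // populator picks up the new spec
	reconcile(p)             // re-mounts with the new content
}
```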

KubeStacker added a commit to KubeStacker/kubernetes that referenced this pull request Jan 25, 2019

RelatedTo: kubernetes#59873