
Update pod status only when it changes. #5714

Merged: 1 commit merged into kubernetes:master from update_status_on_change on Mar 24, 2015

Conversation

@fgrzadkowski (Author) opened this pull request:

  • Update pod status only when it changes.
  • Refactor syncing logic into a separate struct

This PR is ready for review, but I haven't run the e2e tests yet. I'm sending it now so that I can address comments on Monday, CET time.

This fixes #5624 and #5693.

I plan to apply the same logic (update upon change) for node status updates.
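
The core of the change is to cache the last status pushed for each pod and skip the apiserver call when nothing has changed. Below is a minimal Go sketch of that idea, assuming a mutex-guarded map cache; the names and fields are illustrative and not necessarily the ones used in this PR.

// Sketch only: caches the last status sent per pod and reports whether an
// apiserver update is actually needed. Names here are illustrative.
package statussketch

import (
	"reflect"
	"sync"
)

// PodStatus stands in for api.PodStatus in this sketch.
type PodStatus struct {
	Phase   string
	Message string
}

// statusManager remembers the last status sent for each pod.
// The implementation is thread-safe.
type statusManager struct {
	mu          sync.Mutex
	podStatuses map[string]PodStatus
}

func newStatusManager() *statusManager {
	return &statusManager{podStatuses: map[string]PodStatus{}}
}

// SetPodStatus records the new status and reports whether it differs from the
// cached one, i.e. whether the apiserver needs to be updated.
func (s *statusManager) SetPodStatus(podFullName string, status PodStatus) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if old, found := s.podStatuses[podFullName]; found && reflect.DeepEqual(old, status) {
		return false // unchanged: skip the apiserver call
	}
	s.podStatuses[podFullName] = status
	return true
}

The other sketches further down in this conversation build on these types.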

@vmarmol @bgrant0607 @dchen1107 @smarterclayton

@vmarmol self-assigned this on Mar 20, 2015
@vmarmol (Contributor) commented Mar 20, 2015:

Shippable is complaining about gofmt.

}

-func (kl *Kubelet) generatePodStatus(podFullName string, uid types.UID) (api.PodStatus, error) {
+func (kl *Kubelet) generatePodStatus(podFullName string) (api.PodStatus, error) {
Review comment (Contributor):

Eventually we should subsume this into the new status object too. Not for this PR though :) There is some more refactoring needed before that, I think.

@fgrzadkowski (Author):

e2e tests pass (liveness.sh flaked - successful 3/4 times)

	status api.PodStatus
}

type statusManager struct {
Review comment (Contributor):

Can we comment that the implementation is thread-safe?

Reply (Author):

Done.

@vmarmol (Contributor) commented Mar 23, 2015:

Looks good, just two comments. It also needs a rebase, it seems.

	}
	_, err = s.kubeClient.Pods(namespace).UpdateStatus(name, &status)
	if err != nil {
		glog.Warningf("Error updating status for pod %q: %v", name, err)
Review comment (Contributor):

If the apiserver request fails, would we lose this status transition completely?

Reply (Contributor):

That's a good point; we should retry on failure. An easy way is to remove the status from our map so that SetStatus() will re-send it. We won't be able to lock around the send in SetStatus(), though.

Reply (Author):

Wow, that's a good catch! Thanks :) Fixed. I'm deleting the cached value to make sure it is retried next time.
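
Concretely, the fix discussed here amounts to forgetting the cached entry when the apiserver call fails, so that the next SetPodStatus with the same status no longer matches the cache and re-sends it. A rough sketch, building on the statusManager sketch in the PR description above (updateFn is a stand-in for the kubeClient.Pods(namespace).UpdateStatus call shown in the snippet; the method name is an assumption):

// push sends a status that SetPodStatus reported as changed. On failure it
// only drops the cached entry; it does not hold the lock around the send,
// matching the constraint mentioned above.
func (s *statusManager) push(podFullName string, status PodStatus, updateFn func(PodStatus) error) {
	if err := updateFn(status); err != nil {
		s.mu.Lock()
		delete(s.podStatuses, podFullName) // forget it so the next SetPodStatus retries
		s.mu.Unlock()
	}
}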

@fgrzadkowski force-pushed the update_status_on_change branch from 6af068a to f0615c1 on March 23, 2015 19:42
@fgrzadkowski (Author):

Rebased.

@fgrzadkowski (Author):

For some reason the integration test is flaky. I'm still not sure whether it's related to this PR; I'll investigate tomorrow.

@smarterclayton (Contributor):

It's fixed in #5775

We were always terminating the podRunning loop early for mirror pods.

@fgrzadkowski force-pushed the update_status_on_change branch from f0615c1 to c88c703 on March 23, 2015 21:40
@fgrzadkowski (Author):

Thanks. This helped!

	if found {
		glog.V(3).Infof("Returning cached status for %q", podFullName)
		return cachedPodStatus, nil
	}
-	return kl.generatePodStatus(podFullName, uid)
+	return kl.generatePodStatus(podFullName)
Review comment (Member):

When the Kubelet fails to retrieve a cached PodStatus from the statusManager, it regenerates it from scratch. Shouldn't this also update the cached one in the statusManager?
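
For reference, the fallback path under discussion looks roughly like this sketch, reusing the types from the earlier sketch (names are illustrative); whether to also write the regenerated status back into the cache is exactly the open question here:

// getPodStatusSketch returns the cached status if the statusManager has one,
// and otherwise regenerates it from scratch.
func getPodStatusSketch(sm *statusManager, podFullName string,
	generate func(string) (PodStatus, error)) (PodStatus, error) {
	sm.mu.Lock()
	cached, found := sm.podStatuses[podFullName]
	sm.mu.Unlock()
	if found {
		return cached, nil // cache hit: return the stored status
	}
	status, err := generate(podFullName) // cache miss: regenerate from scratch
	// sm.SetPodStatus(podFullName, status) // possible follow-up per this review comment
	return status, err
}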

@dchen1107 (Member):

LGTM overall except one small nit / question.

@fgrzadkowski (Author):

It seems that after my last rebase, e2e stopped working. It's pretty late here, so I'll investigate tomorrow.

@fgrzadkowski (Author):

FYI - they were passing before the rebase.

@vmarmol (Contributor) commented Mar 24, 2015:

There were issues with the e2e tests; it may be worth re-running them after a rebase.

@dchen1107 (Member):

You have to rebase before merging.

@dchen1107 (Member):

I thought I had proposed having the podWorker periodically update PodStatus, and we agreed to do that in a separate PR later. But now, with the statusManager introduced to populate PodStatus for all pods, it looks like we are taking a different approach. Why do we prefer this way?

@fgrzadkowski (Author):

@dchen1107 I don't see how it is different. The only thing that has changed is that the syncing logic now lives outside the kubelet itself. We still populate pod status from the pod worker, via the SyncPod method that is called in PodWorker.
Initially the logic was pretty simple: periodically update the status. Now that we want to update it only upon change, it gets slightly more complicated, so I thought it was worth a separate class.
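
A rough sketch of that wiring, reusing the types from the earlier sketch (the function and parameter names are assumptions, not the PR's exact code): the pod worker still builds the status inside its sync step, and the status manager only decides whether it has to be pushed.

// syncPodSketch: the pod worker generates the status, the status manager
// decides whether it changed, and only a changed status is sent.
func syncPodSketch(sm *statusManager, podFullName string,
	generate func(string) (PodStatus, error),
	send func(PodStatus) error) error {
	status, err := generate(podFullName)
	if err != nil {
		return err
	}
	if sm.SetPodStatus(podFullName, status) {
		return send(status) // only hit the apiserver when the status changed
	}
	return nil
}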

@fgrzadkowski (Author):

Rebased. That helped; the e2e tests pass now.

@vmarmol Let's merge before I need to rebase again :)

@vmarmol (Contributor) commented Mar 24, 2015:

Oooh man, somehow it seems we'll need to rebase again! Sorry @fgrzadkowski :( I'm waiting here by the merge button.

@fgrzadkowski force-pushed the update_status_on_change branch from c88c703 to 0dd7743 on March 24, 2015 15:37
@fgrzadkowski (Author):

Rebased :)

* Refactor syncing logic into a separate struct
@fgrzadkowski force-pushed the update_status_on_change branch from 0dd7743 to 632ca50 on March 24, 2015 15:41
@fgrzadkowski (Author):

And rebased again. This time looks OK.

@dchen1107 (Member):

@fgrzadkowski Here is the reason why: currently the PodWorker creates and runs containers, which might fail for all kinds of reasons. The sync-status logic uses docker inspect to retrieve the reason for a failure, but there is no easy way to find out why a container failed at creation. If the PodWorker did both, we could propagate creation failures much more cleanly.

I am OK with what we have here for now.

@vmarmol (Contributor) commented Mar 24, 2015:

@dchen1107 I think we do what you'd expect today. We set the pod status in the syncWorker but we actually send it in an async thread. We will propagate the creation failure.
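
A sketch of that split, with illustrative names: the sync worker only records the status (including any container-creation failure it observed), and a background goroutine drains a channel and performs the actual send.

// asyncStatusSender decouples recording a pod status from sending it to the
// apiserver: Record is called from the sync worker and never blocks on the
// network; a background goroutine performs the actual send.
type asyncStatusSender struct {
	updates chan PodStatus
}

func newAsyncStatusSender(send func(PodStatus) error) *asyncStatusSender {
	a := &asyncStatusSender{updates: make(chan PodStatus, 100)}
	go func() {
		for status := range a.updates {
			if err := send(status); err != nil {
				log.Printf("error updating pod status: %v", err) // uses the standard "log" package
			}
		}
	}()
	return a
}

// Record is called from the sync worker; creation failures end up in the
// status it records and are therefore propagated on the next send.
func (a *asyncStatusSender) Record(status PodStatus) {
	a.updates <- status
}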

@vmarmol (Contributor) commented Mar 24, 2015:

LGTM, will merge on green. Thanks @fgrzadkowski! You get the "most rebased" award :)

vmarmol added a commit referencing this pull request on Mar 24, 2015: "Update pod status only when it changes."

@vmarmol merged commit a2e2fea into kubernetes:master on Mar 24, 2015
@dchen1107 (Member):

@fgrzadkowski and @vmarmol ahh, I misread that code since it lives outside both PodWorker and Kubelet, but the setter logic is actually invoked by the PodWorker. Thanks! LGTM
