Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Track the sources that the kubelet has seen #2994

Merged
merged 1 commit into from
Dec 18, 2014

Conversation

brendandburns
Copy link
Contributor

And only delete pods when every source has been seen at least once.

Addresses #1954

Needs more unit test coverage, but here for an initial review of the design.

@davidopp
Copy link
Member

Can you add a one-sentence explanation of how this addresses #1954 for people who don't want to read the whole PR? (Including what a "source" means in this context.) Thanks!

@brendandburns
Copy link
Contributor Author

Sure:

The kubelet receives configuration information from a variety of sources:

  • static file
  • http server
  • etcd/boundPods

There are three types of updates:

  • ADD Incremental update adding pod(s) to the complete set of pods for this source
  • SET complete overwrite of pods for that source
  • DELETE Incremental delete of pod(s) from the complete set of pods for this source

These are all mixed together into a single update stream.

The kubelet itself does the following loop:

for {
  select {
     // normal watch/update case
     case update := <- updates:
        handleUpdate(update)
     // periodic resync, in case updates are missed
     case <- time.Timeout(syncTimeout)
        handleResync()
}

The trouble occurs when the resync timeout occurs before complete information from all sources have been obtained. The set of active pods on the kubelet is incomplete, and the resync uses this incomplete information, and thus kills pods that are correctly scheduled on the host, but the kubelet is not currently aware of.

This PR fixes this, by refusing to delete pods until all sources have sent at least one SET message.

Hope that helps!

@brendandburns brendandburns force-pushed the exec branch 3 times, most recently from d531eb6 to b8df6a1 Compare December 17, 2014 20:55
when every source has been seen at least once.
@brendandburns
Copy link
Contributor Author

Ok, unit test added. Empirically verified also.
This is ready to merge.
ptal.

Thanks!
--brendan

@dchen1107
Copy link
Member

config is just one of sources through which kubelet receives information. We shouldn't have this SeenAllSources() implemented in config.go to decided if all sources are received for kubelet. This is part of kubelet control loop, why not implement it in Kubelet itself? I understand that you want to have the dependency injection here, but we still can have that at kubelet.go since it already have the factory-like methods for prod and test anyway.

@brendandburns
Copy link
Contributor Author

Actually, the trouble is that the config object abstracts across all of the
different sources, so the kubelet has no visibility about the various
sources.

Also, the config object is actually the only source of bound pod
information.

We could break the config abstraction and pull the sources all the way up
into the kubelet, but that's a large scale re-factor.

--brendan

On Wed, Dec 17, 2014 at 3:25 PM, Dawn Chen notifications@github.com wrote:

config is just one of sources through which kubelet receives information.
We shouldn't have this SeenAllSources() implemented in config.go to decided
if all sources are received for kubelet. This is part of kubelet control
loop, why not implement it in Kubelet itself? I understand that you want to
have the dependency injection here, but we still can have that at
kubelet.go since it already have the factory-like methods for prod and test
anyway.


Reply to this email directly or view it on GitHub
#2994 (comment)
.

@dchen1107
Copy link
Member

ACK and LGTM.

dchen1107 added a commit that referenced this pull request Dec 18, 2014
Track the sources that the kubelet has seen
@dchen1107 dchen1107 merged commit edfae86 into kubernetes:master Dec 18, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants