[Feature] Support restarting a pod on the same node when it goes down #37250

Closed
buchireddy opened this issue Nov 21, 2016 · 3 comments

Comments

@buchireddy

buchireddy commented Nov 21, 2016

Is this a request for help?:

No.

What keywords did you search in Kubernetes issues before filing this one?

restart pod on the same node

Is this a BUG REPORT or FEATURE REQUEST?

Feature Request.

Use Case:
When running stateful applications like Kafka using PetSets and local storage (host volumes), it's preferable to restart a pod on the same node if it goes down (assuming the node itself is up and healthy). This avoids replicating a lot of data from the leaders and helps bring the application back into the cluster quickly. The problem doesn't exist with network storage, since the new pod can access the same network drive from another node.

Idea:
Can we support restarting a pod on the same node where it was running before it went down, either through some kind of rescheduling policy or through node/pod affinity? Are there any hooks in the scheduler that let the user pick or suggest the node on which the pod should be scheduled next?

Please note that this should only be a recommendation to the scheduler: the pod should be scheduled on a different node if the last node no longer meets the resource requirements or isn't healthy. Thoughts?
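
To make the requested behavior concrete, here is a minimal Python sketch of the "prefer the last node, but fall back" rule described above. It is not Kubernetes scheduler code; `Node`, `PodRequest`, `fits`, and `pick_node` are hypothetical names invented for illustration, and the fallback choice (the node with the most free memory) is an arbitrary assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical, simplified view of a node's state.
    name: str
    healthy: bool
    free_cpu_millis: int
    free_memory_mib: int


@dataclass
class PodRequest:
    # Hypothetical, simplified resource request of a pod.
    cpu_millis: int
    memory_mib: int


def fits(node: Node, req: PodRequest) -> bool:
    """A node is usable only if it is healthy and has enough free resources."""
    return (
        node.healthy
        and node.free_cpu_millis >= req.cpu_millis
        and node.free_memory_mib >= req.memory_mib
    )


def pick_node(
    nodes: List[Node], req: PodRequest, last_node_name: Optional[str] = None
) -> Optional[Node]:
    """Prefer the node the pod last ran on; otherwise pick any node that fits."""
    candidates = [n for n in nodes if fits(n, req)]
    if not candidates:
        return None  # nothing can host the pod right now
    for node in candidates:
        if node.name == last_node_name:
            return node  # soft preference satisfied
    # Assumed fallback policy: the candidate with the most free memory.
    return max(candidates, key=lambda n: n.free_memory_mib)
```

The key point is that the previous node is only a preference: it is chosen when it still passes the same health and resource checks as every other node, and ignored otherwise.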

@nebril
Contributor

nebril commented Nov 22, 2016

I think you should find what you are looking for here: http://kubernetes.io/docs/user-guide/node-selection/ . The node affinity feature is in alpha right now; check it out.
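
For illustration, a soft (preferred) node affinity term can be sketched as a Python dict. This assumes the field layout used in later Kubernetes releases, where affinity is part of the pod spec; the alpha version mentioned here may have been expressed differently. The helper name `preferred_node_affinity` and the node name `node-1` are hypothetical.

```python
import json


def preferred_node_affinity(node_name: str, weight: int = 100) -> dict:
    """Build a soft node affinity fragment that prefers one node by hostname label.

    Assumption: the node's `kubernetes.io/hostname` label matches `node_name`,
    which is common but not guaranteed on every cluster.
    """
    if not 1 <= weight <= 100:
        raise ValueError("weight must be between 1 and 100")
    return {
        "nodeAffinity": {
            # "preferred" means the scheduler may still place the pod elsewhere.
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {
                    "weight": weight,
                    "preference": {
                        "matchExpressions": [
                            {
                                "key": "kubernetes.io/hostname",
                                "operator": "In",
                                "values": [node_name],
                            }
                        ]
                    },
                }
            ]
        }
    }


if __name__ == "__main__":
    # "node-1" is a placeholder; something must supply the real node name.
    print(json.dumps(preferred_node_affinity("node-1"), indent=2))
```

Note that the node name has to be known when the pod spec is written, so a fragment like this expresses a preference for a given node, not for whichever node the pod happened to run on last.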

@buchireddy
Author

buchireddy commented Nov 22, 2016

@nebril I have read about node affinity, but it's not enough for the feature I've suggested. Node affinity only helps schedule a pod on given node(s); how do you make sure a restarted pod comes up on the same node where it was running before it went down?

@bgrant0607
Member

The ability to stay on the same node is called forgiveness (#1574), which is in the process of being added to tolerations.

There's also a proposal for "gravity" for local storage in the works. #7562
