docs: Self-hosted Kubelet proposal #23343
Conversation
Can one of the admins verify that this patch is reasonable to test? (reply "ok to test", or if you trust the user, reply "add to whitelist") This message may repeat a few times in short succession due to jenkinsci/ghprb-plugin#292. Sorry. Otherwise, if this message is too spammy, please complain to ixdy.
> To expand on this, we envision a flow similar to the following:
>
> 1. Systemd (or $init_system) continually runs “bootstrap” Kubelet in “runonce” mode with a file lock until it pulls down a “self-hosted” Kubelet pod and runs it.
assume --runonce-timeout results in a failure exit-code?
If there happens to be nothing scheduled to the node (hence hitting the --runonce-timeout) would we want that considered an error case?
In terms of the bootstrap kubelet / self-hosted kubelet pivot, I don't think coordinating around exit codes would be strictly necessary. Instead, the coordination point becomes "has another kubelet started", rather than "was I scheduled something". So $init could essentially loop on something like "if lockfile not acquired, start bootstrap kubelet; sleep X".
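A minimal sketch of that loop, assuming a hypothetical lock-file path, sleep interval, and bootstrap invocation (none of these details are taken from the proposal):

```sh
#!/bin/sh
# Hypothetical $init loop: if no kubelet currently holds the lock file,
# start the bootstrap kubelet; otherwise wait and re-check.
LOCK_FILE=/var/run/kubelet.lock   # assumed path

while true; do
  if flock -n "$LOCK_FILE" true; then
    # The lock is free, so no (self-hosted) kubelet is running yet.
    # Start the bootstrap kubelet; it is expected to use the lock itself
    # and step aside once the self-hosted kubelet pod has started.
    /usr/bin/kubelet --runonce   # bootstrap invocation, remaining flags elided
  fi
  sleep 10   # "sleep X"
done
```

The loop only encodes the "has another kubelet started?" check via the lock file; the actual handoff between the bootstrap and self-hosted kubelets is left to the kubelets themselves.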
cc: @vishh @mikedanese @dchen1107 @aaronlevy

To summarize the results of the in-person meetings last Wednesday: per @vishh's feedback, we decided that from a system administrator's perspective, having a service continually restart several times would likely be taken as an indication of failure, even though it only represents essentially a "loop iteration" until the bootstrap kubelet can pull down the "self-hosted" kubelet (again, this is until we get Taints and Tolerations). So, instead of modifying the "runonce" code path to be able to contact an API Server, we will instead modify the default code path of the kubelet with a new flag.

I will update the proposal to reflect this new approach. @vishh please let me know if I remembered anything about our discussion incorrectly :).
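To make that concrete, here is a rough sketch of how the bootstrap kubelet might be invoked under the flag-based approach. The flag name --exit-on-lock-contention is the one mentioned later in this thread; --lock-file and the path are shown as assumed companions rather than anything taken from the proposal text:

```sh
# Sketch only: the bootstrap kubelet runs on the kubelet's default code
# path while holding a lock file, and exits as soon as another
# (self-hosted) kubelet contends for that lock.
# Remaining bootstrap kubelet flags are elided.
/usr/bin/kubelet \
  --lock-file=/var/run/kubelet.lock \
  --exit-on-lock-contention
```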
@derekparker: Thanks for the summary. LGTM. During upgrades, the bootstrap kubelet might take over from the old version before it lets the upgraded version run. It is possible that the bootstrap kubelet version is incompatible with the newer versions that were run on the node. For example, the cgroup configurations might be incompatible.
@vishh, coming back to this, I'm still not fully clear on the bootstrap+upgrade scenario.
Wouldn't the health checks on the kubelet pod end up restarting the kubelet?
Ah right, that seems like it should work. Thanks.
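For illustration, a liveness-style check for the self-hosted kubelet pod might be as simple as the following; the healthz port and the use of an exec-style probe are assumptions, not something specified in this thread:

```sh
# Hypothetical health check for the self-hosted kubelet pod: probe the
# kubelet's local healthz endpoint and fail if it does not respond,
# causing the pod to be restarted.
curl -sf http://127.0.0.1:10248/healthz || exit 1
```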
I have updated the proposal to reflect the latest discussions. cc: @aaronlevy @dchen1107 @vishh @mikedanese @derekwaynecarr
> ## Abstract
>
> In a self-hosted Kubernetes deployment, we have the initial bootstrap problem. This proposal presents a solution to the kubelet bootstrap, and assumes a functioning control plane, and a kubelet that can securely contact the API server.
What does “a functioning control plane” mean?
- Could you describe the bootstrap problem?
- When in the bootstrap process is a functioning control plane (assuming the apiserver) needed?
In terms of the abstract, maybe we can elaborate a bit. Possibly something along the lines of:
When running self-hosted components, there needs to be a mechanism for pivoting from the initial bootstrap state to the kubernetes-managed (self-hosted) state. In the case of a self-hosted kubelet, this means pivoting from the initial kubelet defined & run on the host, to the kubelet pod which has been scheduled to the node.
This proposal presents a solution to the initial kubelet bootstrap, and the mechanism for pivoting to the self-hosted kubelet. This proposal assumes that the initial kubelet on the host is able to connect to a properly configured api-server.
Not sure if the above changes would answer the questions, but:
- The bootstrap problem is essentially that we want the kubelet to be managed by kubernetes, but we need an initial kubelet to do that. So we need some mechanism for us to launch a kubelet, then give up control once a new kubelet has started.
- A functioning apiserver would be required from the beginning (assuming no checkpointed pods, I guess). Otherwise the initial kubelet would behave like any other kubelet without apiserver access.
- I think it'd help to explain why we want the kubelet to be managed by kubernetes.
- It's unclear from this proposal who's going to start the initial apiserver. I assume the bootstrap kubelet will have to start the apiserver pod based on the manifest files? In that case, the apiserver wasn't actually functioning at the time the kubelet was being started.
I guess I was thinking about this from the perspective of "in terms of this proposal, assume a functioning apiserver" as a means of keeping the discussion more focused. Ultimately the apiserver could be a static pod, or just a binary run directly on the host, or docker container outside k8s, etc.
But you're right that you wouldn't strictly need an apiserver to demonstrate the same functionality. The kubelet pod could be a static pod as well (would be an odd use-case, but should work assuming the kubelet pod doesn't need secrets/configMap/etc). So maybe just drop that line, as it is somewhat orthogonal?
Also, agree that it would be helpful to add a "motivations" section to cover reasons we want self-hosted kubelet.
@derekparker Completed review pass.
> # Proposal: Self-hosted kubelet
>
> ## Abstract
nit: breaking up a paragraph with new lines will make commenting easier.
Proposal has been updated.
(commits updated from 5cb2e42 to a7f4402)
Updated proposal based on @vishh's review. @dchen1107 @derekwaynecarr any thoughts on this?
@derekparker - apologies for the delay in reviewing. I am ok with this as described.
@vishh Can you ask @dchen1107 to review this? Or perhaps we can proceed without her review, as you and @derekwaynecarr think that the approach is OK. We have been waiting for 12 days for a review.
GCE e2e build/test passed for commit a7f4402.
Automatic merge from submit-queue
@derekparker This PR should have been squashed before merge. Next time, please squash after LGTM.
@bgrant0607 will do. FWIW I didn't really have much time / wasn't notified after the lgtm tag was applied before the ok-to-merge tag was applied.
adding
@tmrts File an issue on the contrib repo for that.
PR for self-hosted kubelet
Would you provide an update on the status of the documentation for this feature, as well as add any PRs as they are created? Not Started / In Progress / In Review / Done. Thanks
I do not believe any documentation has been started. Really the only work which has come out of this proposal thus far is adding the --exit-on-lock-contention flag to the kubelet. I'm happy to add a blurb about the flag functionality as it stands if I can get a pointer to the best place to document flag functionality.
Is this a feature that would be used by anyone as-is, or one that changes existing behavior? I see that the flag itself has been documented in the --help output.
From your response, it sounds like no doc changes are required for 1.3.
Probably not. We should probably update this proposal to reflect the changed flag name though (s/--bootstrap/--exit-on-lock-contention), but I assume that isn't a blocker.
…let-proposal Automatic merge from submit-queue

docs: Self-hosted Kubelet proposal

Provides a proposal for changes needed with Kubernetes to allow for a self-hosted Kubelet bootstrap.