Add probe based mechanism for kubelet plugin discovery #63328
Conversation
@vikaschoudhary16: GitHub didn't allow me to request PR reviews from the following users: tengqm. Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/lgtm
/test pull-kubernetes-kubemark-e2e-gce-big
Overall looks good. I have been trying to understand every line of code in this change over the past two days; I have two comments below. Thanks for the contribution.
glog.Errorf("Failed to get plugin info using RPC GetInfo at socket %s, err: %v", socketPath, err) | ||
return err | ||
} | ||
resp := w.invokeRegistrationCallbackAtHandler(infoResp, socketPath) |
This function has no timeout; it can potentially block the pluginwatcher indefinitely.
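As an illustration of the concern, here is a minimal sketch (not the PR's code) of one way to bound a blocking callback with a timeout; the function name, callback signature, and duration are all hypothetical:

```go
package main

import (
	"fmt"
	"time"
)

// callWithTimeout runs a potentially slow callback but stops waiting after
// the given timeout, so the caller (e.g. the pluginwatcher loop) is never
// stuck. Note the goroutine itself keeps running after a timeout; the caller
// simply stops waiting for its result.
func callWithTimeout(invokeCallback func() error, timeout time.Duration) error {
	errCh := make(chan error, 1)
	go func() { errCh <- invokeCallback() }()
	select {
	case err := <-errCh:
		return err
	case <-time.After(timeout):
		return fmt.Errorf("callback did not complete within %v", timeout)
	}
}

func main() {
	slow := func() error { time.Sleep(2 * time.Second); return nil }
	fmt.Println(callWithTimeout(slow, 500*time.Millisecond))
}
```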
	return &registerapi.RegistrationStatus{PluginRegistered: true}
}

// Start watches for the creation of plugin sockets at the path
This is not about your patch, but about the design of watching fsnotify events. This approach relies on the assumption that fsnotify will reliably notify the pluginwatcher. Playing devil's advocate here: that assumption may not be 100% true, judging by the number of open issues (https://github.com/fsnotify/fsnotify/issues?page=3&q=is%3Aissue+is%3Aopen), and my local test on Mac also showed issues with fsnotify.
The other possible workaround is to traverse the folder at an interval. What do you think?
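For comparison, a rough sketch of the interval-based traversal being suggested here; the directory path, interval, and callback are illustrative assumptions, not anything from this PR:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// pollPluginDir lists the directory every `interval` and invokes onNew for
// any socket file it has not reported before. This trades detection latency
// for independence from inotify/fsnotify reliability.
func pollPluginDir(dir string, interval time.Duration, onNew func(path string)) {
	seen := map[string]bool{}
	for {
		if entries, err := os.ReadDir(dir); err == nil {
			for _, e := range entries {
				if e.IsDir() {
					continue
				}
				p := filepath.Join(dir, e.Name())
				if !seen[p] {
					seen[p] = true
					onNew(p)
				}
			}
		}
		time.Sleep(interval)
	}
}

func main() {
	go pollPluginDir("/var/lib/kubelet/plugins", time.Second, func(p string) {
		fmt.Println("discovered plugin socket:", p)
	})
	select {} // block forever in this toy example
}
```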
I'm not sure the number of issues is a reliable metric, as many issues seem to be questions, feature requests, or refer to an old version :)
Additionally, I believe fsnotify is production ready given the number of projects already using it (including Docker and k8s).
Given that the number of plugins we expect is small, traversal might not be a bad choice. However, inotify does provide better latency and avoids unnecessary traversals. @vikaschoudhary16 it would be worth examining the open issues against inotify prior to taking a hard dependency on it.
Thanks @figo for pointing this out.
@vishh I took a look around this. It seems fsnotify is not being actively maintained, and reading through the GitHub issues, it looks like people have been maintaining their own forks for some of their fixes that could not get merged. Here are the relevant links:
fsnotify/fsnotify#245 (comment)
On the other hand, what @RenaudWasTaken said is also true: many projects are using it, and therefore I think there is a good chance this project will find an adopter soon. I am slightly in favor of keeping fsnotify; at the same time, if the majority thinks traversal is a better option, I am open to making changes for traversal as well.
@jiayingz @saad-ali @thockin @derekwaynecarr @dchen1107 what do you think?
I'd prefer not taking on a dependency that is not being actively maintained. We should either have a plan to contribute back to that project or fork it in the k8s organization.
I would prefer staying with fsnotify for now. Mostly we just need a Go package for the inotify syscall. The fsnotify package is already included in k8s, so we are not adding a new dependency, and so far we haven't heard of issues with the included version. It will be a much easier change to switch to a more stable fsnotify Go package in the future than to build on top of a different model where the latency to detect a new plugin would be affected by how long and how often the directory traversal takes.
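For context, a minimal sketch of the fsnotify-based model under discussion: watch the (flat) plugin directory and react to create events. The path is illustrative and registration is stubbed out; this is not the actual pluginwatcher code:

```go
package main

import (
	"log"

	"github.com/fsnotify/fsnotify"
)

func main() {
	watcher, err := fsnotify.NewWatcher()
	if err != nil {
		log.Fatalf("failed to create fsnotify watcher: %v", err)
	}
	defer watcher.Close()

	// Watch a single flat directory; hierarchical paths would need
	// recursive watches, as discussed elsewhere in this PR.
	if err := watcher.Add("/var/lib/kubelet/plugins"); err != nil {
		log.Fatalf("failed to watch plugin dir: %v", err)
	}

	for {
		select {
		case event := <-watcher.Events:
			if event.Op&fsnotify.Create == fsnotify.Create {
				// A new socket showed up; this is where the GetInfo RPC
				// and registration callback would be triggered.
				log.Printf("new plugin socket: %s", event.Name)
			}
		case err := <-watcher.Errors:
			log.Printf("fsnotify error: %v", err)
		}
	}
}
```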
require.Equal(t, PluginName, name, "Plugin name mismatched!!")
require.Equal(t, []string{"v1beta1", "v1beta2"}, versions, "Plugin version mismatched!!")
// Verifies the grpcServer is ready to serve services.
_, conn, err := dial(socketPath)
should be 'sockPath' instead of 'socketPath'
Here are the general rules that Kubelet plugin developers should follow:
- Run as priviledged. Currently creating socket under PluginsSockDir requires
  priviledged access.
I don't think that's true? Doesn't that just require running as root & a HostPath mount? I think it's likely that plugins will still need to be privileged, but for different reasons...
+1. If all that's needed is access to a root-owned directory, that could come in the form of running as gid 0 too, right?
It depends on the access permissions, but I'd recommend requiring a root uid (i.e. 700 permissions). The k8s controls that limit access to gid 0 are still alpha.
- Implements the Registration service specified in
  pkg/kubelet/apis/pluginregistration/v*/api.proto.
- The plugin name sent during Registration.GetInfo grpc should be unique
  for the given plugin type (CSIPlugin or DevicePlugin).
Are there guidelines around namespacing? E.g. "mycompany.com/my-plugin"
This is a list of plugin APIs that the kubelet natively supports, whose cardinality is very limited. The names of individual plugins, on the other hand, need to adhere to existing namespacing norms.
Are there guidelines around namespacing? E.g. "mycompany.com/my-plugin"
Yes, it should be unique, because the socket name is expected to have a one-to-one mapping with the plugin name, and it should not be hierarchical. Both points are mentioned.
- The socket path needs to be unique and doesn't conflict with the path chosen
  by any other potential plugins. Currently we only support flat fs namespace
  under PluginsSockDir but will soon support recursive inotify watch for
  hierarchical socket paths.
com.mycompany.my-plugin?
Yes. The current implementation does not support hierarchical paths like "mycompany.com/my-plugin", but that is in the plan, as mentioned.
If the PluginsSockDir is flat, I think you should at least have PluginsSockDir/csi and PluginsSockDir/devices with two watcher instances. That way you can avoid name collisions between csi and device plugin names.
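A tiny sketch of that suggestion, assuming a hypothetical NewWatcher constructor and an illustrative PluginsSockDir path:

```go
package main

import (
	"fmt"
	"path/filepath"
)

// Watcher is a stand-in for the pluginwatcher type under review; only the
// path it watches matters for this sketch.
type Watcher struct{ path string }

func NewWatcher(path string) *Watcher { return &Watcher{path: path} }

func main() {
	const pluginsSockDir = "/var/lib/kubelet/plugins" // illustrative path

	// One watcher instance per plugin type, so a CSI plugin and a device
	// plugin that happen to share a name cannot collide.
	csiWatcher := NewWatcher(filepath.Join(pluginsSockDir, "csi"))
	deviceWatcher := NewWatcher(filepath.Join(pluginsSockDir, "devices"))
	fmt.Println(csiWatcher.path, deviceWatcher.path)
}
```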
defer e.wg.Done()
// Blocking call to accept incoming connections.
err := e.grpcServer.Serve(lis)
glog.Errorf("example server stopped serving: %v", err)
if err != nil ?
In that case, the unit test will fail here and plugin registration will time out. Another way could be to create a channel and listen for the error in Stop(). But since this is only an example plugin to test the actual APIs of the plugin watcher, I thought an implicit indication of failure through a unit test failure should be good enough.
I was just mentioning that you can add a check to see whether the error is non-nil before printing it.
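In code, the suggestion amounts to something like the following fragment, reusing the identifiers from the quoted snippet above:

```go
// Only log when Serve actually returned an error.
if err := e.grpcServer.Serve(lis); err != nil {
	glog.Errorf("example server stopped serving: %v", err)
}
```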
// Creates the plugin directory, if it doesn't already exist.
func (w *Watcher) createPluginDir() error {
	glog.Infof("Ensuring Plugin directory at %s ", w.path)
Is this the right log level?
func (w *Watcher) createPluginDir() error {
	glog.Infof("Ensuring Plugin directory at %s ", w.path)
	err := w.fs.MkdirAll(w.path, 0755)
	if err != nil {
if err := w.fs.MkdirAll(w.path, 0755); err != nil {
// Currently only supports flat fs namespace under the plugin directory.
// TODO: adds support for hierarchical fs namespace.
if !f.IsDir() && filepath.Base(f.Name())[0] != '.' {
	w.registerPlugin(path.Join(w.path, f.Name()))
This will have a significant impact on the kubelet startup time; consider making the call in a goroutine.
Or how about sending a create event on the fsnotify channel? (Given that it's a buffered channel, you'll probably have to do that in a goroutine though.)
Why do you think this will involve significant latency?
This call blocks with a timeout of 10s, which means the worst case is n * 10s (where n is the number of plugins in the directory).
I would prefer the first one, making the call a goroutine at L#86. If we pass create events, wouldn't they all be processed serially at L#182?
Or another option could be to pass create events and then call registerPlugin in a goroutine at L#182.
I still prefer the first one because of its simplicity. Thoughts?
I would prefer the first one, making the call a goroutine at L#86. If we pass create events, wouldn't they all be processed serially at L#182?
Isn't that already the case for all other plugins? i.e. if multiple plugins are created at the same time, they are already processed serially?
But that is not related to kubelet start time. Kubelet start time will be impacted only by those plugins whose sockets are already present, and only if we register them serially. In the updated patch, registerPlugin is being invoked in a goroutine. PTAL
This makes no sense; what the original discussion was about is doing registration outside the main thread.
Whether we choose to do registration of multiple plugins serially or in parallel should not impact the kubelet start time.
I don't have a preference for the option you choose. I do, however, think that registration of multiple plugins should follow the same pattern at startup and during runtime.
- Option 1 (sending an event in the event channel) has the nice property of having a single point of entry for both steps of the lifecycle, thus making refactoring easier.
- Option 2 (registering multiple plugins in parallel) also seems like a sensible approach.
Additionally, it seems like you could even combine both approaches.
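A rough, self-contained sketch of what registering pre-existing sockets in goroutines at startup could look like; the directory path, timing, and registerPlugin stub are illustrative and not the PR's actual implementation:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// registerExistingPlugins walks the plugin directory once at startup and
// registers every pre-existing socket in its own goroutine, so a slow or
// hung plugin cannot delay startup by n * timeout. registerPlugin is a
// stand-in for the real registration path (GetInfo RPC + callback).
func registerExistingPlugins(dir string, registerPlugin func(socketPath string)) error {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return err
	}
	for _, e := range entries {
		if e.IsDir() || e.Name()[0] == '.' {
			continue
		}
		go registerPlugin(filepath.Join(dir, e.Name()))
	}
	return nil
}

func main() {
	_ = registerExistingPlugins("/var/lib/kubelet/plugins", func(p string) {
		time.Sleep(100 * time.Millisecond) // stand-in for the GetInfo RPC
		fmt.Println("registered", p)
	})
	time.Sleep(time.Second) // give the goroutines time to finish in this toy example
}
```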
defer cancel()
infoResp, err := client.GetInfo(ctx, &registerapi.InfoRequest{})
if err != nil {
	glog.Errorf("Failed to get plugin info using RPC GetInfo at socket %s, err: %v", socketPath, err)
Either return an error or log the error
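As a fragment reusing the identifiers from the snippet above, the suggested convention would look roughly like this (wrapping and returning the error, leaving logging to the caller):

```go
infoResp, err := client.GetInfo(ctx, &registerapi.InfoRequest{})
if err != nil {
	// Return the error instead of logging and returning it.
	return fmt.Errorf("failed to get plugin info using RPC GetInfo at socket %s: %v", socketPath, err)
}
```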
require.Equal(t, []string{"v1beta1", "v1beta2"}, versions, "Plugin version mismatched!!")
// Verifies the grpcServer is ready to serve services.
_, conn, err := dial(socketPath)
require.Nil(t, err)
require.NoError?
@vikaschoudhary16 which model (identified in the design doc) does your code implement, 1 or 2? This would make code review easier.
// Registration is the service advertised by the Plugins.
service Registration {
	rpc GetInfo(InfoRequest) returns (PluginInfo) {}
nit: Can this API be just Info?
Apologies for making design suggestions so late.
Further down, someone mentions namespacing the plugin names. Another place namespacing may make sense is under the PluginsSockDir. Maybe separate the plugin socket paths based on type, i.e. $PluginsSockDir/csi/<plugin-name> or $PluginsSockDir/devices/<plugin-name>, etc.
under PluginsSockDir but will soon support recursive inotify watch for
hierarchical socket paths.
- A plugin should clean up its own socket upon exiting or when a new instance
  comes up. A plugin should NOT remove any sockets belonging to other plugins.
Is there any way to prevent a plugin from removing the sockets of another plugin? This could be a big issue.
I don't have an approach for this off the top of my head. Would love to hear suggestions.
One option would be using a directory per plugin, where a plugin can be restricted to only create sockets within a specific directory.
Unless plugins stop running as privileged or uid 0, attempts to secure plugins from one another would be moot.
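A small sketch of the per-plugin-directory idea, with an illustrative base path and plugin name; how access would actually be restricted (e.g. mounting only that subdirectory into the plugin) is out of scope here:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// ensurePluginDir creates a dedicated subdirectory for one plugin with
// restrictive permissions; the plugin would only be granted access to this
// path, so it cannot touch other plugins' sockets.
func ensurePluginDir(pluginsSockDir, pluginName string) (string, error) {
	dir := filepath.Join(pluginsSockDir, pluginName)
	if err := os.MkdirAll(dir, 0700); err != nil {
		return "", err
	}
	return dir, nil
}

func main() {
	dir, err := ensurePluginDir("/var/lib/kubelet/plugins", "example-plugin")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("plugin should create its socket under", dir)
}
```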
different types of node-level plugins such as device plugins or CSI plugins.
It discovers plugins by monitoring inotify events under the directory returned by
kubelet.getPluginsDir(). Lets refer this directory as PluginsSockDir.
For any discovered plugin, pluginwatcher issues Registration.GetInfo grpc call
Is it too late to turn the watcher into a service? The plugin calls the watcher at a prescribed UDS location to register itself, instead of the watcher doing inotify on a flat directory. That way the watcher can apply stricter registration rules, including avoiding name clashes, etc.
If I understood correctly, you are suggesting model 1, where the plugin starts the communication. As explained in the design doc, there is a problem with the re-registration case in model 1. Therefore we decided to implement model 2.
The flat directory is just for the initial implementation and will be extended to hierarchical paths in a successive PR.
I think in model 1 the onus of avoiding plugin name conflicts is also on the plugins; I am not sure how model 1 is better than model 2 at avoiding name clashes.
@vikaschoudhary16 I am concerned about plugin name clashes between the CSI and DevicePlugin plugin types. If PluginsSockDir is flat, then a plugin name from CSI may clash with a plugin name from DevicePlugin.
/retest Review the full test history for this PR.
/status approved-for-milestone
/test pull-kubernetes-kubemark-e2e-gce-big
/test pull-kubernetes-integration
/test pull-kubernetes-e2e-kops-aws
/lgtm
We can discuss channel vs interface in another PR / issue / on slack, I'll take care of it :)
)

// RegisterCallbackFn is the type of the callback function that handlers will provide
type RegisterCallbackFn func(pluginName string, endpoint string, versions []string, socketPath string) (error, chan bool)
An interface model might be more explicit? i.e:
type RegisterCallback interface {
Validate(...)
Register(...)
}
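Fleshed out with hypothetical signatures mirroring the RegisterCallbackFn parameters above (the real shape would be up to the authors), that interface might look like:

```go
// RegisterCallback is a hypothetical sketch of the interface model being
// suggested; parameter lists mirror RegisterCallbackFn from the snippet above.
type RegisterCallback interface {
	// Validate is called before the watcher acknowledges the plugin.
	Validate(pluginName string, endpoint string, versions []string, socketPath string) error
	// Register is called once validation succeeds.
	Register(pluginName string, endpoint string, versions []string, socketPath string) error
}
```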
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jiayingz, RenaudWasTaken, vikaschoudhary16, vishh, vladimirvivien The full list of commands accepted by this bot can be found here. The pull request process is described here
[MILESTONENOTIFIER] Milestone Pull Request: Up-to-date for process @RenaudWasTaken @jiayingz @vikaschoudhary16 @vishh @vladimirvivien
/retest Review the full test history for this PR.
Automatic merge from submit-queue (batch tested with PRs 63328, 64316, 64444, 64449, 64453). If you want to cherry-pick this change to another branch, please follow the instructions here.
@vikaschoudhary16: The following test failed:
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
This feature is more developers oriented than users oriented, so simply mention it in the feature gate should be fine. In future, when the design doc is migrated from Google doc to the kubernetes/community repo, we can add links to it for users who want to dig deeper. Closes: kubernetes#9108 Xref: kubernetes/kubernetes#63328, kubernetes/kubernetes#64605
Which issue(s) this PR fixes
Fixes #56944
Design Doc
Notes For Reviewers:
The original PR is #59963, but because of too many comments (171) that PR sometimes fails to load. Therefore this new PR was created to get the GitHub link working.
Related PR is #58755
For review efficiency, the commits of the original PR are separated out here.
/sig node
/area hw-accelerators
/cc @jiayingz @RenaudWasTaken @vishh @ScorpioCPH @sjenning @derekwaynecarr @jeremyeder @lichuqiang @tengqm @saad-ali @chakri-nelluri @ConnorDoyle @vladimirvivien