Add software versions to "kubectl get nodes -o wide" output #25579
I think we agreed to at least put OS and arch in SystemInfo and labels. @bgrant0607 didn't like the idea of automatically populating everything from SystemInfo into labels, but my opinion is that we could use namespacing to make that user-friendly/not-overwhelming. But for 1.3 we got agreement that at least these two can go in SystemInfo and labels.
/cc
@dchen1107 How exactly do you intend to use the information? We definitely need a way to surface more node attributes, from the node, in a customizable way. See also #9044. Should attributes be represented as API fields, as labels, or both? If our systems are going to depend on the presence of specific information, the attributes should be represented as fields or as annotations. For instance, I could imagine taking Kubelets offline or updating them based on the reported Kubelet version, so it makes sense for that to be a field. Fields are also clearly under the API contract (e.g., we won't delete them without notice) and are versioned. Fields have the additional benefit that they are easily discoverable because they appear in auto-generated docs, swagger, client libraries, etc. OTOH, for the most part, I'd like labels to not carry inherent semantics. Of course, they have meaning to users, but that's what they were intended for. Consumers should use selectors to perform the desired matching. OK, so API clients should consume information via fields and a user specifying a node constraint should use labels, at least today. Does that mean we should duplicate all of the information in both places? I don't think that's necessary, and it also causes problems. As a practical matter, not all attributes are easily expressible as labels. We deliberately constrained the label syntax:
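As a hedged sketch of why raw attribute strings don't fit in labels: per the documented Kubernetes label rules, values are capped at 63 characters and must be empty or begin and end with an alphanumeric character, with only dashes, underscores, and dots allowed in between. The helper below is illustrative, not code from this thread.

```python
import re

# Documented Kubernetes label-value rule (sketch): empty, or starts and
# ends with [A-Za-z0-9], with -, _, . permitted in the middle, max 63 chars.
LABEL_VALUE_RE = re.compile(r"^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?$")

def is_valid_label_value(value: str) -> bool:
    return len(value) <= 63 and LABEL_VALUE_RE.match(value) is not None

# A kernel version string happens to fit, but many SystemInfo strings do not:
print(is_valid_label_value("4.4.0-210-generic"))   # True
print(is_valid_label_value("Ubuntu 16.04.1 LTS"))  # False: contains spaces
```

This is the practical obstacle to mirroring all of NodeSystemInfo into labels: values like an OS image name would have to be mangled before they could be stored.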
Even the key names use different conventions (camelCase vs. hyphenated lowercase). In Borg we exported lots of constrainable attributes by default and while some people found it useful, it was also an attractive nuisance -- the existence of the constraints created risk and increased operational complexity. A lot of the attributes are low-level properties that are likely to change as the cluster evolves: software is upgraded, OS distros become obsolete and are replaced, machines are resized, etc. Therefore, unnecessary constraints on such attributes cause problems when the constraints can no longer be satisfied. Additionally, the large number of attributes makes node information harder to consume (signal/noise ratio). So we need to be selective about what properties are exposed as labels and how they are represented, and we need to be clear about what the API contract is for those labels, since they are unversioned at present. If the properties aren't necessarily needed for scheduling constraints, then I'd recommend we implement field selectors instead. #1362, #18801, #19804 If selectors are only needed for a few fields, the ad hoc approach we've used for other selectors could be used.
I agree that a good criterion is to only put things in labels if they might reasonably be used in scheduling. For the debugging use case Dawn mentioned, we can just put all of NodeStatus.NodeSystemInfo in 'kubectl describe node' (if it's not there already) -- @dchen1107 would that address the debugging use case?
Some of the properties could be added to
I am not suggesting having those for the scheduler at all. I understand how painful it is to use those labels for scheduling. Here is the scenario which I think should be very useful, especially for a large cluster which has mixed versions (kubelet, kube-proxy, docker, etc.): a cluster admin is alerted that many nodes went bad, and they want to do a quick scan of the nodes based on node attributes: OS image, kernel version, kubelet version, ... kubectl lsnode --groupbylabel=kernel_full_version $not_ready_nodes Then they can reasonably conclude that the alert is caused by a new kernel release rollout, and make a quick decision to pause, roll back, or patch the nodes while investigating the root cause. In this case, 'kubectl describe node' doesn't help, but kubectl get nodes -o wide should be fine, except people used to complain that the output is too wide and hard for a human to process.
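The grouping step in that workflow can be approximated from the existing API today. A minimal sketch, assuming the JSON shape that `kubectl get nodes -o json` emits (the `group_nodes_by_kernel` helper name is hypothetical, not a real command):

```python
import json
from collections import defaultdict

def group_nodes_by_kernel(nodes_json: str) -> dict:
    """Group node names by kernel version, given `kubectl get nodes -o json`
    output. The .status.nodeInfo.kernelVersion path is the Node API field."""
    groups = defaultdict(list)
    for item in json.loads(nodes_json)["items"]:
        kernel = item["status"]["nodeInfo"]["kernelVersion"]
        groups[kernel].append(item["metadata"]["name"])
    return dict(groups)

# Self-contained sample standing in for real cluster output:
sample = json.dumps({"items": [
    {"metadata": {"name": "node-1"},
     "status": {"nodeInfo": {"kernelVersion": "4.4.0-210-generic"}}},
    {"metadata": {"name": "node-2"},
     "status": {"nodeInfo": {"kernelVersion": "5.4.0-100-generic"}}},
]})
print(group_nodes_by_kernel(sample))
```

A similar per-node view is available directly via custom columns, e.g. `kubectl get nodes -o custom-columns=NAME:.metadata.name,KERNEL:.status.nodeInfo.kernelVersion`, though that still leaves the grouping to the operator.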
Node currently has only 3 columns. 3 more doesn't seem like a problem.
I don't see the corresponding versions in the "kubectl get nodes -o wide" output right now (code from the master branch), and I'd like to investigate adding the kubelet version/osImage/kernel version to the command's output. Please let me know if this is no longer valid.
Added two columns, "OS-IMAGE" and "KERNEL-VERSION", to the "kubectl get nodes -o wide" output. This will help provide more information for users to locate or debug issues. See discussion in ticket kubernetes#25579
Adding these node attributes to the node wide output sounds reasonable to me. @xingzhou Thanks for this PR.
Automatic merge from submit-queue (batch tested with PRs 37845, 39439, 39514, 39457, 38866) Add software versions to "kubectl get nodes -o wide" output. Added "OS-IMAGE" and "KERNEL-VERSION" two columns to "kubectl get nodes -o wide" output. This will help to provide more information for user to locate or debug issues. See discussion in ticket #25579
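To illustrate the shape of the widened output, here is a sketch only: the column set and sample values below are hypothetical, and this is not the actual kubectl printer code, just a padded-table renderer showing where the two new columns land.

```python
# Hypothetical column set for the widened view; real `-o wide` output
# includes additional columns (e.g. IPs) not shown here.
HEADERS = ["NAME", "STATUS", "AGE", "VERSION", "OS-IMAGE", "KERNEL-VERSION"]

def render_wide(rows):
    """Render rows as a space-padded table, kubectl-style."""
    widths = [max(len(h), *(len(r[i]) for r in rows))
              for i, h in enumerate(HEADERS)]
    lines = ["   ".join(h.ljust(w) for h, w in zip(HEADERS, widths))]
    for r in rows:
        lines.append("   ".join(c.ljust(w) for c, w in zip(r, widths)))
    return "\n".join(lines)

print(render_wide([
    ["node-1", "Ready", "12d", "v1.5.1",
     "Ubuntu 16.04.1 LTS", "4.4.0-210-generic"],
]))
```

The point of the two extra columns is exactly the scan-and-group workflow described above: an admin eyeballing this table can spot a kernel or OS-image rollout correlated with bad nodes.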
This has been fixed in #38866. We can close it.
Forked from #23684 ...
There was debate on that PR about the system automatically publishing the fields in NodeStatus.NodeSystemInfo. I don't fully understand the concern, but in my opinion publishing those labels will help with debugging and analyzing issues related to different OSImage, RuntimeVersion, KubeletVersion, etc., especially for a large cluster (> 100 nodes).
cc/ @bgrant0607 @thockin @davidopp @vishh