
Add software versions to "kubectl get nodes -o wide" output #25579

Closed
dchen1107 opened this issue May 13, 2016 · 10 comments
Labels
area/kubectl priority/backlog Higher priority than priority/awaiting-more-evidence. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@dchen1107
Member

Forked from #23684 ...

There was debate on that PR about whether the system should automatically publish the fields in NodeStatus.NodeSystemInfo as labels. I don't fully understand the concern, but in my opinion, publishing those labels will help with debugging and analyzing issues related to different OSImage, RuntimeVersion, KubeletVersion, etc., especially in a large cluster (> 100 nodes).

cc/ @bgrant0607 @thockin @davidopp @vishh

@davidopp
Member

I think we agreed to at least put OS and arch in SystemInfo and labels. @bgrant0607 didn't like the idea of automatically populating everything from SystemInfo into labels, but my opinion is that we could use namespacing to make that user-friendly/not-overwhelming. But for 1.3 we got agreement that at least these two can go in SystemInfo and labels.

@luxas
Member

luxas commented May 13, 2016

/cc

@bgrant0607
Member

@dchen1107 How exactly do you intend to use the information?

We definitely need a way to surface more node attributes, from the node, in a customizable way. See also #9044.

Should attributes be represented as API fields, as labels, or both?

If our systems are going to depend on the presence of specific information, the attributes should be represented as fields or as annotations. For instance, I could imagine taking Kubelets offline or updating them based on the reported Kubelet version, so it makes sense for that to be a field. Fields are also clearly under the API contract (e.g., we won't delete them without notice) and are versioned.

Fields have the additional benefit that they are easily discoverable because they appear in auto-generated docs, swagger, client libraries, etc.

OTOH, for the most part, I'd like labels to not carry inherent semantics. Of course, they have meaning to users, but that's what they were intended for. Consumers should use selectors to perform the desired matching.

Ok, so API clients should consume information via fields and a user specifying a node constraint should use labels, at least today. Does that mean we should duplicate all of the information in both places?

I don't think that's necessary, and it also causes problems.

As a practical matter, not all attributes are easily expressible as labels. We deliberately constrained the label syntax:

const qnameCharFmt string = "[A-Za-z0-9]"
const qnameExtCharFmt string = "[-A-Za-z0-9_.]"
const QualifiedNameFmt string = "(" + qnameCharFmt + qnameExtCharFmt + "*)?" + qnameCharFmt
const QualifiedNameMaxLength int = 63
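As an illustrative aside (not from the thread), the qualified-name pattern quoted above can be exercised with `grep` to show why raw system attributes often cannot be stored as label values; note that a kernel version string like the one later in this thread fails validation because of the `+` character:

```shell
# Sketch: validate candidate strings against the qualified-name pattern
# quoted above (the 63-character length limit is not checked here).
pattern='^([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9]$'

check() {
  if printf '%s' "$1" | grep -Eq "$pattern"; then
    echo "$1: valid"
  else
    echo "$1: invalid"
  fi
}

check "3.18.1"                 # valid
check "3.16.7-ckt20-1+deb8u4"  # invalid: '+' is outside the allowed set
```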

Even the key names use different conventions (camelCase vs. hyphenated lowercase).

In Borg we exported lots of constrainable attributes by default and while some people found it useful, it was also an attractive nuisance -- the existence of the constraints created risk and increased operational complexity. A lot of the attributes are low-level properties that are likely to change as the cluster evolves: software is upgraded, OS distros become obsolete and are replaced, machines are resized, etc. Therefore, unnecessary constraints on such attributes cause problems when the constraints can no longer be satisfied. Additionally, the large number of attributes makes node information harder to consume (signal/noise ratio).

So we need to be selective about what properties are exposed as labels and how they are represented, and we need to be clear about what the API contract is for those labels, since they are unversioned at present.

If the properties aren't necessarily needed for scheduling constraints, then I'd recommend we implement field selectors instead. #1362, #18801, #19804

If selectors are only needed for a few fields, the ad hoc approach we've used for other selectors could be used.
https://github.com/kubernetes/kubernetes/blob/master/pkg/api/field_constants.go
https://github.com/kubernetes/kubernetes/blob/master/pkg/api/v1/conversion.go#L137

@bgrant0607 bgrant0607 added sig/node Categorizes an issue or PR as relevant to SIG Node. team/api labels May 14, 2016
@davidopp
Member

If the properties aren't necessarily needed for scheduling constraints, then I'd recommend we implement field selectors instead

I agree that a good criterion is to put things in labels only if they might reasonably be used in scheduling. For the debugging use case Dawn mentioned, we can just put all of NodeStatus.NodeSystemInfo in 'kubectl describe node' (if it's not there already) -- @dchen1107 would that address the debugging use case?

@bgrant0607 bgrant0607 added the priority/backlog Higher priority than priority/awaiting-more-evidence. label May 16, 2016
@bgrant0607
Member

Some of the properties could be added to kubectl get nodes -o wide also.

@dchen1107
Member Author

I am not suggesting having those for the scheduler at all. I understand how painful it is to use those labels for scheduling.

Here is a scenario in which I think this would be very useful, especially for a large cluster running mixed versions (kubelet, kube-proxy, docker, etc.):

A cluster admin is alerted that many nodes have gone bad, and they want to do a quick scan of the nodes based on node attributes: OS image, kernel version, kubelet version, ...
not_ready_nodes=$(kubectl get nodes -o wide | cut ...) ==> get a list of not-ready node names.

kubectl lsnode --groupbylabel=kernel_full_version $not_ready_nodes
3.16.7-ckt20-1+deb8u4 1
3.18.1 913

Then they can reasonably conclude that the alert was caused by the new kernel release rollout. They can make a quick decision to pause, roll back, or patch the nodes while investigating the root cause.

In this case, 'kubectl describe node' doesn't help, but kubectl get nodes -o wide should be fine, except that people used to complain that the output is too wide and hard for a human to process.
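The grouping step in this workflow can be approximated with jsonpath plus standard shell tools; `kubectl lsnode --groupbylabel` is a hypothetical command, and the version counts below are simulated stand-ins rather than real cluster data:

```shell
# Against a live cluster, the grouping could look like:
#   kubectl get nodes \
#     -o jsonpath='{range .items[*]}{.status.nodeInfo.kernelVersion}{"\n"}{end}' \
#     | sort | uniq -c | sort -rn
# Simulated here with a fixed list of kernel versions, one per not-ready node:
printf '%s\n' 3.18.1 3.18.1 3.16.7-ckt20-1+deb8u4 3.18.1 |
  sort | uniq -c | sort -rn
```

With output like `913 3.18.1` versus `1 3.16.7-ckt20-1+deb8u4`, the correlation with a kernel rollout would be immediately visible.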

@bgrant0607
Member

Node currently has only 3 columns. 3 more doesn't seem like a problem.
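Purely for illustration (all values invented, and `column -t` standing in for kubectl's own table printer), the widened output with the two extra columns might render like this:

```shell
# Simulated render of the proposed wider node table; not real kubectl output.
printf '%s\n' \
  'NAME STATUS AGE OS-IMAGE KERNEL-VERSION' \
  'node-1 Ready 12d Debian-GNU-Linux-8 3.18.1' \
  'node-2 NotReady 12d Debian-GNU-Linux-8 3.16.7-ckt20-1+deb8u4' |
  column -t
```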

@bgrant0607 bgrant0607 changed the title Discuss: Automatically publish fields in NodeStatus.NodeSystemInfo as labels Add software versions to "kubectl get nodes -o wide" output May 31, 2016
@xingzhou
Contributor

xingzhou commented Dec 5, 2016

I don't see the corresponding versions in the "kubectl get nodes -o wide" output right now (code from the master branch), and I'd like to investigate adding the kubelet version/osImage/kernel version to the command's output. Please let me know if this is no longer valid.

xingzhou added a commit to xingzhou/kubernetes that referenced this issue Dec 16, 2016
Added "OS-IMAGE" and "KERNEL-VERSION" two columns to
"kubectl get nodes -o wide" output. This will help to provide
more information for user to locate or debug issues. See discussion
in ticket kubernetes#25579
@adohe-zz

Adding these node attributes to the node wide output sounds reasonable to me. @xingzhou Thanks for this PR.

xingzhou added a commit to xingzhou/kubernetes that referenced this issue Dec 27, 2016
xingzhou added a commit to xingzhou/kubernetes that referenced this issue Jan 9, 2017
k8s-github-robot pushed a commit that referenced this issue Jan 9, 2017
Automatic merge from submit-queue (batch tested with PRs 37845, 39439, 39514, 39457, 38866)

Add software versions to "kubectl get nodes -o wide" output.

jayunit100 pushed a commit to jayunit100/kubernetes that referenced this issue Jan 13, 2017
@xiangpengzhao
Contributor

This has been fixed in #38866. We can close it.

cc @thockin @dchen1107

@luxas luxas closed this as completed Jun 3, 2017
berryjam pushed a commit to berryjam/kubernetes that referenced this issue Aug 18, 2017
7 participants