
Don't show cluster addons by default #9356

Closed
bgrant0607 opened this issue Jun 6, 2015 · 88 comments

@bgrant0607
Member

We've discussed using labels or even namespaces so that cluster addons don't show up in kubectl output by default. Users appear to be confused by this when they get pods, RCs, and services.

Addons are labeled, so this filtering should be possible.
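
For illustration, a sketch of the selector-based filtering (assuming the addons carry the kubernetes.io/cluster-service=true label; the exact label key is an assumption here):

    # Hide the addons: "!=" also matches pods that lack the label entirely
    kubectl get pods --selector='kubernetes.io/cluster-service!=true'
    # The inverse: show only the addons
    kubectl get pods --selector='kubernetes.io/cluster-service=true'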

@bgrant0607 bgrant0607 added priority/backlog Higher priority than priority/awaiting-more-evidence. area/kubectl team/cluster sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. labels Jun 6, 2015
@davidopp
Member

davidopp commented Jun 6, 2015

But users are also confused when the node has fewer available resources than the machine's capacity. I think hiding this stuff may be more confusing than not.

@bgrant0607
Member Author

The problem is most severe when working through an example:

  • create one service and RC and see half a dozen of each
  • create 3 pods and see a dozen

Perhaps describe node should indicate what's running on it -- at least the number of pods and the amount of resources reserved. We would probably need to add that to node status, since users won't be able to get all the pods once authz is enabled.

@davidopp
Member

davidopp commented Jun 6, 2015

I really like the idea of "describe node" showing all the pods running on it, and their limits/requests/whatever. As you know, we have this in Borg, and it's definitely the most useful debugging page we have. I think that would make it more reasonable to hide the cluster addon pods.

But as a secondary issue, are you implying that when we have authz, you won't be able to see pods from other namespaces when you use kubectl and the API directly? If so, I really hope that behavior is optional (configured by the administrator). I think most organizations will want to prevent users/groups from mutating cluster state belonging to other users/groups, but seeing the cluster state is a different matter. Allowing users to see all the details of what other users are running is tremendously helpful for self-service problem debugging (i.e. I don't think saying just admins can see everything is sufficient). Just aggregating and displaying the limit/request of pods-not-belonging-to-you isn't sufficient either, IMO (though it's better than nothing), because users still may want to see everything to better understand resource interference effects from other pods on the same machine, etc.

@erictune
Member

erictune commented Jun 8, 2015

The two authorization policies that make the most sense to me are:

  • typical users can read node objects - Borg model.
  • typical users cannot read node objects - GCE model.

Brian seems to be implying some middle ground which I hadn't considered.

In either model, it makes sense to me to put the addons in a different namespace than is used by a typical user. This has nothing to do with authorization, and just has to do with ensuring that I don't have a name collision with objects I don't care about.

Hiding them with labels seems bad because you don't see them but you can still have a name collision. Unlikely, but it feels like the wrong solution.
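
To make the collision point concrete: with separate namespaces the names can't clash in the first place (file and namespace names below are placeholders, nothing has been agreed on):

    # The same pod name can exist in two namespaces without conflict, and a
    # plain "kubectl get pods" would only show the user's own namespace
    kubectl create -f redis-master.yaml --namespace=default
    kubectl create -f redis-master.yaml --namespace=cluster-addons   # placeholder
    kubectl get pods   # the addon copy does not appear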

@bgrant0607
Member Author

I prefer putting the addons in a different namespace. The main reason they aren't now is lack of convergence of previous rounds of discussion.

@bgrant0607
Member Author

One of the previous discussions (though I think main discussion was offline): #4417 (comment)

@bgrant0607
Member Author

cc @piosz

@erictune
Member

erictune commented Jun 8, 2015

cc @thockin

@davidopp
Member

davidopp commented Jun 8, 2015

I'm not really objecting to putting them in a single namespace, I'm just saying that the Borg experience suggests that it's absolutely crucial that the user be able to see everything that's running on the machine, and the associated resource request/usage. So I like Brian's suggestion of putting this in "describe node" (and, if we really must have some super-secretive mode, which honestly I don't see the value of, then we could do as Brian suggested and just aggregate the request/usage information of all of the pods that don't belong to the user, into one opaque bucket--but this should not be the default way the system works).

@dchen1107
Member

"kubectl describe nodes" already display which pods on it, but lack of namespace and resource related information.

@dchen1107
Member

But I don't think it is easy to retrieve resource-usage information for pods today. The node team did export such information a long time back, but it was removed in a later API refactoring. Also please note that there is no pod-level resource request in our API yet.

@bgrant0607
Member Author

@davidopp They don't all need to go into a single namespace, just not in the default namespace.

@bgrant0607 bgrant0607 added this to the v1.0-candidate milestone Jun 8, 2015
@bgrant0607
Member Author

@dchen1107 At the moment, we don't need actual resource usage. We only need resource limits, since that's the only info used in scheduling.

@derekwaynecarr
Member

@davidopp - the user can list namespaces, so if I understand this correctly, it's not that the user won't be able to find the pods; they just have to get namespaces, and then get pods in each namespace.

Is there a proposed name of the namespace that holds add-ons?
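
Whatever name we pick, the discovery workflow would be roughly this (the namespace name is a stand-in):

    kubectl get namespaces                        # see what namespaces exist
    kubectl get pods --namespace=cluster-addons   # then ask for the addon pods explicitly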

@dchen1107
Member

@bgrant0607 I agree that we don't need resource usage now. But resource limits exist only at the container level, not the pod level. Are we going to show all containers too when describing a node?

@davidopp
Member

davidopp commented Jun 8, 2015

@erictune pointed out that the "describe nodes" model where it only shows your own containers, and nothing about other containers (not even aggregate request or usage), is important for people who use k8s to run a hosting provider. So I could see an admin wanting to configure any of the three options we've discussed here (Borg model, GCE model, or GCE model but showing only aggregate stats about stuff that doesn't belong to you). @bgrant0607 pointed out that the right way to define "your own" in this context is "your namespace and all other namespaces you have access to" (and it needs to specifically include the namespace that we put the addons into, if we put addons in a separate namespace -- which I now realize means that each user's addons would need to be in a separate namespace, as opposed to putting all addons in one namespace).

We also discussed the main question, of whether the right way to avoid showing them by default is to put them in a separate namespace or just use labels and then by default don't show stuff with those labels. The former seems more principled and seems to be what people are leaning towards, but I think we need to think through the implications for the features that are namespace-based, like admission control, quota, LimitRange, billing (I assume we'll have that eventually), etc. None of the stuff the user sets up for their own pods (with respect to these features) would apply by default to the cluster addons if we put the cluster addons in a separate namespace. This might be a good or bad thing.

@dchen1107 I think it would be good to show all containers in "describe node" (for the same reasons it's useful that we show in-alloc tasks in Borg machine debug output, not just allocs).

@dchen1107
Member

Please don't put all container-related information in describe node. I believe very soon we are going to get people complaining that describe nodes is useless, like what we had for kubectl get pod. I spent a lot of time arguing with people that you can use describe pod to get detailed information about a pod, and that get pod shouldn't be made too verbose, but failed. Now we have reset everything back to the original state. I don't want to get into the same situation again with describe node.

But I do feel there are several missing pieces for the user to figure out what's going on with the node. #9356 is filed to include the namespace for each pod running on a node, so that the user can easily get / describe that pod for more information. #9435 is filed to check namespace-specific quota and other resources belonging to the same namespace.

There is no pod-level resource limit at the API level, but we could derive one by summing up all the containers' limits.
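
As a client-side sketch of that summation (the pod name and the jq dependency are assumptions, and quantities like "100m" would still need normalizing before adding):

    # List each container's limits for a pod; a pod-level figure is the sum
    kubectl get pod mypod -o json \
      | jq '.spec.containers[] | {name: .name, limits: .resources.limits}'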

So here is the complete proposal:

kubectl describe node id will include

  1. all information from NodeStatus, which includes resource capacity, system info like daemon versions, creation timestamp, NodeCondition, etc.
  2. pods and corresponding namespace, and pod-level resource request / limit
  3. node level events

kubectl describe pod id will include

  1. all information from PodStatus: Condition, Phase, namespace, etc.
  2. all user-defined containers (excluding the network container), including each container's resource request / limit
  3. pod- and container-level events.

With the fix, I am ok to put addons into a separate namespace then. :-)
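
Purely as illustration, item 2 of the describe-node output might render something like this (every name and number below is made up):

    Pods (3 in total):
      Namespace   Name           CPU(request/limit)   Memory(request/limit)
      default     redis-master   100m / 100m          64Mi / 128Mi
      kubernetes  fluentd        50m / 100m           32Mi / 64Mi
      kubernetes  kube-dns       100m / 200m          70Mi / 170Mi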

@davidopp
Member

davidopp commented Jun 8, 2015

@dchen1107 mentioned the more verbose output suggestion I made is related to #3500 and could be handled by a separate command as suggested there.

@davidopp
Member

davidopp commented Jun 8, 2015

sorry I meant the verbose output suggestion Brian made :-) (but I agree with it)

@bgrant0607
Member Author

Yes, kubernetes.

Mirror pods should take their namespace from their config files on Kubelet, and Kubelet should reject the config if the namespace is missing. We should update our static pods to put them into the kubernetes namespace. @yujuhong
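
A sketch of what that would look like in a static-pod manifest (the file path, pod name, and image are stand-ins):

    # Dropped into the Kubelet's manifest directory; the explicit namespace
    # keeps the mirror pod out of "default"
    cat <<'EOF' >/etc/kubernetes/manifests/addon-example.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: addon-example
      namespace: kubernetes   # the proposed addon namespace
    spec:
      containers:
      - name: main
        image: nginx   # stand-in image
    EOF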

@yujuhong
Contributor

Mirror pods should take their namespace from their config files on Kubelet

We already do that, but would use the default namespace if it's not specified.

Kubelet should reject the config if the namespace is missing.

Are we enforcing namespaces for all config files? What's wrong with using the "default" (or any other default) namespace if the user doesn't specify one? Isn't that the same as for regular pods created through the apiserver?

@thockin
Member

thockin commented Jun 24, 2015

Yeah, we could just audit all our own file-based configs to specify a namespace.


@satnam6502
Contributor

Brendan is already on it, so I have re-assigned.

@smarterclayton
Contributor

@derekwaynecarr @deads2k our admission controllers and kubelet ACL should change to reflect this in our next major release - we should only grant pod create to kubelet clients in the kubernetes namespace
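
Roughly, in the line-per-policy ABAC file style, that could look like the sketch below (the field names and path are assumptions, and ABAC alone can't distinguish create from other writes, so an admission controller would enforce the rest):

    cat <<'EOF' >>/srv/kubernetes/abac-policy.jsonl   # hypothetical path
    {"user": "kubelet", "resource": "pods", "namespace": "kubernetes"}
    EOF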


@bgrant0607
Member Author

@yujuhong Ok, fair enough, we could default to "default". If that's what we do now, that's fine.

@alex-mohr
Contributor

#10487 is the current PR, with discussion as to whether we're going to try to get this into v1 or whether we should push it to later.

@zmerlynn
Member

zmerlynn commented Jul 5, 2015

Re-requesting triage on this. We were unable to get PR #10487 merged, and unless it gets merged maybe early tomorrow (Sunday) morning, there's not really enough time to shake out whether it needs to be rolled back for a Monday release cut.

@goltermann
Contributor

@zmerlynn Triaged today - quorum was to push this in before today's release, and get test cycles in this week.

@zmerlynn
Member

zmerlynn commented Jul 6, 2015

This can't go in right before a release is cut. It needs 2-3 good Jenkins cycles before we can cut a release.

@zmerlynn
Member

zmerlynn commented Jul 6, 2015

(Or, y'know, at least one - and maybe a sidelong glance at scalability.)

@zmerlynn
Member

zmerlynn commented Jul 6, 2015

But let's get some leads on here to LGTM up this PR.

@thockin
Member

thockin commented Jul 6, 2015

Discussed this morning - I think we're OK to go - @brendandburns @bgrant0607


@zmerlynn
Member

zmerlynn commented Jul 6, 2015

If someone gives it another review, I'll happily warn the oncall.

@bgrant0607 bgrant0607 added priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Jul 6, 2015
@roberthbailey
Contributor

#10817 is still outstanding to finish this off.

@bgrant0607
Member Author

@satnam6502
Contributor

Argh! Sorry about that.
