Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RBAC authorization API group and authorizer #25634

Merged
merged 7 commits into from
May 26, 2016

Conversation

ericchiang
Copy link
Contributor

@ericchiang ericchiang commented May 15, 2016

Alpha version of "Role Based Access Control" API.

This PR updates #23396 by adding the necessary registry and client code for storage of the openshift authorization types added in #24900.

Still need to add restrictions to role bindings to prevent privilege escalation. See discussion [0] and openshift implementation[1]

EDIT: Added an implementation of rule escalation prevention similar to [2]

Adding an API group is extremely poorly documented, so there's a good chance I've missed something. For now I've been going off discussion in #25562, but please let me know if there's anything wrong.

TODO: lost of testing

cc @erictune @sym3tri @philips @bobbyrullo @liggitt @deads2k

[0] #24900 (comment)
[1] https://github.com/openshift/origin/blob/388478c40e751c4295dcb9a44dd69e5ac65d0e3b/pkg/authorization/registry/rolebinding/registry.go#L31-L34
[2] https://github.com/openshift/origin/blob/master/pkg/authorization/rulevalidation/user_covers.go

@k8s-github-robot k8s-github-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. release-note-label-needed labels May 16, 2016
@bprashanth
Copy link
Contributor

failure seems legit

./kube-apiserver
./e2e_node.test
panic: NegotiatedSerializer is required when initializing a RESTClient

goroutine 1 [running]:
panic(0x1ad3a20, 0xc820330b50)
    /usr/local/go/src/runtime/panic.go:464 +0x3e6
k8s.io/kubernetes/pkg/client/unversioned.NewOrDie(0xc82050d948, 0xc8200a4dc0)
    /var/lib/jenkins/workspace/node-pull-build-e2e-test/go/src/k8s.io/kubernetes/pkg/client/unversioned/helper.go:226 +0x68
k8s.io/kubernetes/test/e2e_node.NewDefaultFramework(0x1e6f020, 0xe, 0x4444ce)
    /var/lib/jenkins/workspace/node-pull-build-e2e-test/go/src/k8s.io/kubernetes/test/e2e_node/util.go:35 +0x75
k8s.io/kubernetes/test/e2e_node.glob.func5.1()
    /var/lib/jenkins/workspace/node-pull-build-e2e-test/go/src/k8s.io/kubernetes/test/e2e_node/kubelet_test.go:43 +0x33
k8s.io/kubernetes/vendor/github.com/onsi/ginkgo/internal/suite.(*Suite).PushContainerNode(0xc82038a780, 0x20a7e80, 0x2a, 0x2241b88, 0x0, 0x2afc489, 0x6a, 0x51, 0xc8200a5080, 0x28d)
    /var/lib/jenkins/workspace/node-pull-build-e2e-test/go/src/k8s.io/kubernetes/vendor/github.com/onsi/ginkgo/internal/suite/suite.go:132 +0x235
k8s.io/kubernetes/vendor/github.com/onsi/ginkgo.Context(0x20a7e80, 0x2a, 0x2241b88, 0xc820230af0)
    /var/lib/jenkins/workspace/node-pull-build-e2e-test/go/src/k8s.io/kubernetes/vendor/github.com/onsi/ginkgo/ginkgo_dsl.go:300 +0xaf
k8s.io/kubernetes/test/e2e_node.glob.func5()
    /var/lib/jenkins/workspace/node-pull-build-e2e-test/go/src/k8s.io/kubernetes/test/e2e_node/kubelet_test.go:81 +0x38
k8s.io/kubernetes/vendor/github.com/onsi/ginkgo/internal/suite.(*Suite).PushContainerNode(0xc82038a780, 0xc820330160, 0x10, 0x2241ba0, 0x0, 0x2af85a9, 0x6c, 0x1fc, 0xc820356f20, 0x15f)
    /var/lib/jenkins/workspace/node-pull-build-e2e-test/go/src/k8s.io/kubernetes/vendor/github.com/onsi/ginkgo/internal/suite/suite.go:132 +0x235
k8s.io/kubernetes/vendor/github.com/onsi/ginkgo.Describe(0xc820330160, 0x10, 0x2241ba0, 0x1e3e

@ericchiang
Copy link
Contributor Author

@batikanu thanks! Updated. The tests should (hopefully) be happier now.

@ericchiang
Copy link
Contributor Author

Working on the integration test failure but otherwise this is ready to review.

@ericchiang ericchiang changed the title Add RBAC authorization API group: wip Add RBAC authorization API group May 17, 2016
@erictune
Copy link
Member

So, can you or @liggitt say more about how ConfirmNoEscalation looks at multiple objects but doesn't need to multi-object transaction?

GetRoleReferenceRules(ctx api.Context, roleRef api.ObjectReference, namespace string) ([]rbac.PolicyRule, error)

// GetEffectivePolicyRules returns the list of rules that apply to a given user in a given namespace and error. If an error is returned, the slice of
// PolicyRules may not be complete, but it contains all retrievable rules. This is done because policy rules are purely additive and policy determinations
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liggitt may not be complete because of eventual consistency in the storage system? Or something else?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of a dead/missing/unobserved roleref

@ericchiang
Copy link
Contributor Author

@erictune is this possible yet without etcd v3 support?[0] Or is this already supported in the REST storage interface?

[0]#2904

@liggitt
Copy link
Member

liggitt commented May 17, 2016

in openshift, we have an authorizer cache that watches clusterroles/roles and bindings. Even authorization decisions can lag. The no escalation check just ensures the user had equivalent powers within the watch lag. If a user had a role removed and instantly attempted to grant a role with those removed permissions, it could succeed if the authz cache hadn't seen the role removal yet (just like any other API call trying to use those removed permissions could prior to the cache observing the update)

@@ -0,0 +1,59 @@
/*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To do complex validation for job, I put that in the Strategy's Validate function.
Why is this implemented as a Storage rather than a Strategy?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any other types that have their own storage.go, but I see many that define a custom Strategy. I think you should use that approach if possible. See pkg/registry/secret/strategy.go for example. Unless there is some reason it needs to be at the Storage layer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The AuthorizationRuleResolver needs a storage, so it's hard to initialize a storage without and existing one. That's the main logic for having this logic separated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also what openshift does (not to say that that makes it appropriate for kubernetes) https://github.com/openshift/origin/blob/388478c40e751c4295dcb9a44dd69e5ac65d0e3b/pkg/authorization/registry/role/policybased/virtual_storage.go

We created our storage long before strategies existed. Without virtual resources, I don't see a strong case for keeping them. I think that Validate and ValidateUpdate provide sufficient power to express the ConfirmNoEscalation case. The cyclical dependency problem due to a lack of separation between a read-only storage and a read-write storage is problematic though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, my mistake, s/validation/storage/

Copy link
Contributor

@smarterclayton smarterclayton May 23, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't been following this PR terribly closely, but I think one
significant disadvantage is that you can't be certain a role change takes
effect across the entire cluster easily (can't manage centrally). That's
certainly worrying for an administrator.

Multilevel security policy is not uncommon - I would hesitate to say that a
tested multilevel system is no more at risk than a tested flattened system.

Are there other advantages to propagation that inclines you to prefer it?

If we wanted to make it so that NamespacedRoleBindings could not reference a ClusterRole, then we could have a controller that watches for new namespace (like token controller does) and then automatically creates Roles in each new namespace when the namespace. Those roles would explicitly include the namespace that they are restricted to, rather than it being implied, which might somewhat reduce the chance of bugs. This would have about the same level of convenience. It would also have the property that the per-namespace default policy would not automatically update when the cluster is upgraded, which could be good, but could also be bad if it means that you have to update each namespace's policy when a new Kind is added to the system that you want to add for certain roles.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#25787 seems to have broken the ability to make the type assertions these checks rely on[0] since the runtime.Object is now a rest.UpdatedObjectInfo which hides the underlying object being updated to.

Would it be reasonable to pull the privilege escalation checks from this PR and revisit a more robust multilevel security system for beta? Based on the conversation here (and the above issue) it seems like there might be a better home for these checks anyway.

cc @erictune

[0] https://github.com/kubernetes/kubernetes/pull/25634/files#diff-0ddbb384845118d5b907e3498b76455cR66

Copy link
Member

@liggitt liggitt May 25, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The object is still available, it just has to be obtained by passing in the existing object (which should be done inside the innermost storage during the GuaranteedUpdate call).

Wanting to add checks/transformations/behavior on the new object seems like a common enough pattern that I opened #26240 to add a helper for it. If you want to use that, you could do something like this, which just moves your checks into a function that gets chained onto the end of the call that retrieves the new object:

func (s *Storage) Update(ctx api.Context, name string, objInfo rest.UpdatedObjectInfo) (runtime.Object, bool, error) {
    if user, ok := api.UserFrom(ctx); ok {
        if s.superUser != "" && user.GetName() == s.superUser {
            return s.StandardStorage.Update(ctx, name, objInfo)
        }
    }

    nonescalatingInfo := rest.WrapUpdatedObjectInfo(objInfo, func(ctx api.Context, obj, oldObj runtime.Object) (runtime.Object, error) {
        clusterRoleBinding, ok := obj.(*rbac.ClusterRoleBinding)
        if !ok {
            ... return internal cast error ...
        }
        rules, err := s.ruleResolver.GetRoleReferenceRules(ctx, clusterRoleBinding.RoleRef, clusterRoleBinding.Namespace)
        if err != nil {
            return nil, err
        }
        if err := validation.ConfirmNoEscalation(ctx, s.ruleResolver, rules); err != nil {
            return nil, err
        }
        return obj, nil
    })

    return s.StandardStorage.Update(ctx, name, nonescalatingInfo)
}

That also has the benefit of giving you the old object, so in the future, escalation checks could take into account whether you were actually adding any bindings, or just removing/annotating, etc.

Would it be reasonable to pull the privilege escalation checks from this PR and revisit a more robust multilevel security system for beta? Based on the conversation here (and the above issue) it seems like there might be a better home for these checks anyway.

I still think storage (or something connected to it, like the strategy) is the best place for it. I'd prefer we start with it here or in the strategy, not punted to admission or omitted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah that would work. Thanks, I'll update this in the morning.

@erictune
Copy link
Member

This looks good so far.

Things that are missing and should go in this PR or the next are:

  • tests for the Authorizer
  • finishing the authorizer
  • a flag to enable it
  • some e2es
  • some kind of initial policy
  • a story on how the initial policy is allowed to be created

I would like @kubernetes/sig-api-machinery to have a day to look at it, if they want to. I have reviewed everything to the limits of my knowledge of API machinery.

@liggitt
Copy link
Member

liggitt commented May 25, 2016

seeing a test failure:

/go/src/k8s.io/kubernetes/api/swagger-spec is out of date. Please run hack/update-swagger-spec.sh
FAILED   ./hack/../hack/verify-swagger-spec.sh  59s

@ericchiang
Copy link
Contributor Author

yeah, updated those and am running the verification suite locally

@ericchiang
Copy link
Contributor Author

Am going to do a squash while I'm at it.

@xiang90 xiang90 added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 25, 2016
@alex-mohr
Copy link
Contributor

Because submit queue is blocked, I'm going to kick off re-tests of the top 5 PRs in the queue, then merge them if they pass.
@k8s-bot test this please: issue #IGNORE

@k8s-bot
Copy link

k8s-bot commented May 26, 2016

GCE e2e build/test passed for commit 36ecec5.

@alex-mohr alex-mohr merged commit 6f919dc into kubernetes:master May 26, 2016
@kfox1111
Copy link

Is there any documentation to go along with this? I'm very interested in figuring out how this works, but there is a lot of code to try and read through.

@ericchiang
Copy link
Contributor Author

@kfox1111 not yet. I'll plan to following up in the next few days with lots.

@harryge00
Copy link
Contributor

harryge00 commented Jun 13, 2016

@kfox1111 There has been similar documentation from openshift? https://docs.openshift.com/enterprise/3.0/architecture/additional_concepts/authorization.html
https://github.com/openshift/origin/blob/master/docs/proposals/policy.md
I am also interested in this PR and need it in production...

@harryge00
Copy link
Contributor

harryge00 commented Jul 4, 2016

How to run in RBAC mode?
I tried running kube-apiserver with: --authorization-mode=RBAC --authorization-policy-file=/etc/kubernetes/rbac.json --authorization-rbac-super-user=admin.
And my rbac.json:

{
    "kind": "role",
    "name": "view",
    "namespace": "*",
    "rules": [
        { "verbs": ["get", "list", "watch"], "resources": ["resourcegroup:exposedopenshift", "resourcegroup:allkube"] }
    ]
}
{
    "kind": "role",
    "name": "edit",
    "namespace": "*",
    "rules": [
        { "verbs": ["get", "list", "watch", "create", "update", "delete"], "resources": ["resourcegroup:exposedopenshift", "resourcegroup:exposedkube"] }
        { "verbs": ["get", "list", "watch"], "resources": ["resourcegroup:allkube"] }
    ]
}
{
    "kind": "role",
    "name": "admin",
    "namespace": "*",
    "rules": [
        { "verbs": ["get", "list", "watch", "create", "update", "delete"], "resources": ["resourcegroup:exposedopenshift", "resourcegroup:granter", "resourcegroup:exposedkube"] }
        { "verbs": ["get", "list", "watch"], "resources": ["resourcegroup:policy", "resourcegroup:allkube"] }
    ]
}
{
    "kind": "role",
    "name": "cluster-admin",
    "namespace": "*",
    "rules": [
        { "verbs": ["*"], "resources": ["*"] }
    ],
    "userNames": ["admin"]
}
{
    "kind": "roleBinding",
    "name": "ProjectAdmins",
    "namespace": "*",
    "roleRef": {
        "namespace": "*",
        "name": "cluster-admin"
    }
    "userNames": ["admin"]
}
  1. I don't know how to bind users with roles...Should I add userNames to role or roleBinding?
  2. I want to have a "cluster-admin" who can do anything to cluster. How to define a user can access all resources in K8s? Is "resouces":["*"] right?

@liggitt
Copy link
Member

liggitt commented Jul 5, 2016

RBAC does not use a policy file, it uses objects stored in the API to define clusterroles and roles (defining permissions) and clusterrolebindings and rolebindings granting those roles within namespaces or cluster wide.

@jimmycuadra
Copy link
Contributor

Is there an issue I can subscribe to for follow up design on RBAC bootstrapping? The --authorization-rbac-super-user flag is problematic for my use case, where the pod manifests for the apiserver are immutable and removing that flag after bootstrapping is complete would require destroying and recreating servers.

@erictune
Copy link
Member

erictune commented Jul 8, 2016

@harryge00
it looks like you have to use CURL for now to create the resources, and you need to do one at a time. For example:

curl --request POST -i -H "Content-Type: application/json" -d '{"kind": "ClusterRole", "apiVersion": "rbac.authorization.k8s.io/v1alpha1", "metadata": {"name": "view"}, "rules": [{ "verbs": ["get", "list", "watch"], "resources": ["pod"] }]}' localhost:8001/apis/rbac.authorization.k8s.io/v1alpha1/clusterroles
HTTP/1.1 201 Created
Content-Length: 540
Content-Type: application/json
Date: Fri, 08 Jul 2016 20:37:03 GMT

{
  "kind": "ClusterRole",
  "apiVersion": "rbac.authorization.k8s.io/v1alpha1",
  "metadata": {
    "name": "view",
    "selfLink": "/apis/rbac.authorization.k8s.io/v1alpha1/clusterroles/view",
    "uid": "ba8613cc-454b-11e6-b2ea-42010a800002",
    "resourceVersion": "296840",
    "creationTimestamp": "2016-07-08T20:37:03Z"
  },
  "rules": [
    {
      "verbs": [
        "get",
        "list",
        "watch"
      ],
      "attributeRestrictions": null,
      "apiGroups": null,
      "resources": [
        "pod"
      ]
    }
  ]
}

works for me.

You cannot use the "resource groups". You need to list all resources. We didn't upstream the resource groups support from OpenShift.

@ericchiang can you document how to create basic viewer and writer roles for 1.3 somewhere?

@erictune
Copy link
Member

erictune commented Jul 8, 2016

@jimmycuadra how will you authenticate users? Can you use a credential for the bootstrap account that expires and is not renewed?

@jimmycuadra
Copy link
Contributor

@erictune

We authenticate using client certificates. We have a custom agent that runs on each server that populates and refreshes the latest CA/certs/keys from a central source of truth (etcd, we're on CoreOS). The systemd unit that starts the apiserver is baked into the master servers' cloud-config data, however, and is never changed without cycling the whole server. It wouldn't necessarily help to have a super user account that expires, because there'd be nothing to prevent a new cert/key pair for the same username from being generated using our CA. Perhaps another way bootstrapping could work is by the super user making an API call to set a flag (persisted in etcd) that disables their own access, and apiserver would ignore the --authorization-rbac-super-user flag after that point.

I don't want to steer this discussion off course since this PR has already been merged, so I was really just asking if there was an existing issue to continue talking about it. It'd been mentioned previously in the thread that --authorization-rbac-super-user was an initial approach and might change before stabilization. (I can start a new issue if there isn't one yet.)

@harryge00
Copy link
Contributor

harryge00 commented Jul 11, 2016

Another question: I bind a user with a clusterrole, but when accessing the api with this user, it says:

error: failed to negotiate an api version; server supports: map[], client supports: map[v1:{} metrics/v1alpha1:{} extensions/v1beta1:{} componentconfig/v1alpha1:{} batch/v1:{} autoscaling/v1:{} authorization.k8s.io/v1beta1:{}]

http://stackoverflow.com/questions/38299105/how-to-use-rbac-in-kubernetes.

I think the user is the same as in the token file token,user,uid,"group1,group2,group3", right?

And maybe in the later version, we can dynamically add users through api instead of writing new users to the static token file?

@ericchiang
Copy link
Contributor Author

ericchiang commented Jul 11, 2016

@harryge00 merged PRs are not a good place for general questions or debugging.

Please read the new authorization documentation[0], open a new issue if you feel you have a bug, or use the #kuberenetes-users or #sig-auth channels on the kubernetes slack[1], we're generally very responsive there.

[0] http://kubernetes.io/docs/admin/authorization/
[1] https://kubernetes.slack.com/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.