
Set controllerRef in RCs owned by DC #14322

Merged

Conversation

tnozicka
Contributor

@tnozicka tnozicka commented May 24, 2017

TODO:

Goals

  • The main goal of ControllerRef (controller reference) is to solve the problem
    of controllers that fight over controlled objects due to overlapping selectors.
    Fighting controllers can destabilize the apiserver,
    thrash objects back-and-forth,
    or cause controller operations to hang.

  • A secondary goal of ControllerRef is to provide back-links from a given object
    to the controller that manages it, which can be used for:

    • Efficient object->controller lookup, without having to list all controllers.
    • Generic object grouping (e.g. in a UI), without having to know about all
      third-party controller types in advance.
  • Because ControllerRef is a subtype of OwnerReference, this also fixes a bug where the underlying RCs are not removed when the DC is deleted through the API (a rough sketch of such a reference follows below).
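
For readers unfamiliar with the mechanism: a ControllerRef is just an OwnerReference whose Controller field is set to true. A rough sketch of the reference the DC controller stamps on an RC it owns (all values are illustrative, not taken from this PR's code):

```go
// Illustrative only: the shape of the controller reference the DC controller
// stamps on an RC it creates or adopts. All values below are made up.
package main

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
)

func main() {
	isController := true
	ref := metav1.OwnerReference{
		APIVersion: "v1", // group/version of the owning DC; illustrative
		Kind:       "DeploymentConfig",
		Name:       "busyapp",
		UID:        types.UID("00000000-0000-0000-0000-000000000000"),
		Controller: &isController, // this pointer field turns an OwnerReference into a ControllerRef
	}
	// The RC carries this entry in metadata.ownerReferences; the garbage
	// collector can then delete the RC once the owning DC is deleted.
	fmt.Printf("%+v\n", ref)
}
```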

CC @mfojtik @Kargakis

@@ -631,7 +631,7 @@ func GetOpenshiftBootstrapClusterRoles() []authorizationapi.ClusterRole {
},
},
Rules: []authorizationapi.PolicyRule{
authorizationapi.NewRule("get", "list", "watch", "update").Groups(kapiGroup).Resources("replicationcontrollers").RuleOrDie(),
authorizationapi.NewRule("get", "list", "watch", "update", "delete").Groups(kapiGroup).Resources("replicationcontrollers").RuleOrDie(),
Contributor

Why does the deployer need to delete RCs now?

Contributor Author

@Kargakis you can't set controller ref on a resource you aren't allowed to delete

Contributor

Why is the deployer setting owner references?

Contributor Author

doesn't it create RCs?

Contributor

No, the RCs are created by the DC controller.

Contributor Author

You are right; I forgot which SA I was talking about. This one needs it because otherwise I get:

oc logs po/busyapp-1-deploy
--> Scaling busyapp-1 to 5
error: couldn't scale busyapp-1 to 5: Scaling the resource failed with: replicationcontrollers "busyapp-1" is forbidden: cannot set an ownerRef on a resource you can't delete: User "system:serviceaccount:test:deployer" cannot delete replicationcontrollers in project "test", <nil>; Current resource version 848

Since it is not creating the resource and the RC already has the ownerRef set, it might be failing because it is trying to remove it. That could happen if it forgets to do a deep copy while updating, since this is a pointer field. I will investigate further.

Contributor Author

@tnozicka tnozicka May 24, 2017

@Kargakis what I've found out so far is that openshift-deploy tries to do a PUT on the RC without the ownerReferences and gets a 403

Contributor Author

@mfojtik actually I think that's a different one and that one needs the "delete" as well

Contributor

error: couldn't scale busyapp-1 to 5: Scaling the resource failed with: replicationcontrollers "busyapp-1" is forbidden: cannot set an ownerRef on a resource you can't delete: User "system:serviceaccount:test:deployer" cannot delete replicationcontrollers in project "test", <nil>; Current resource version 848

say whaaat?

@@ -64,7 +64,7 @@ func init() {
ObjectMeta: metav1.ObjectMeta{Name: saRolePrefix + InfraDeployerControllerServiceAccountName},
Rules: []rbac.PolicyRule{
rbac.NewRule("create", "get", "list", "watch", "patch", "delete").Groups(kapiGroup).Resources("pods").RuleOrDie(),
rbac.NewRule("get", "list", "watch", "update").Groups(kapiGroup).Resources("replicationcontrollers").RuleOrDie(),
rbac.NewRule("get", "list", "watch", "update", "delete").Groups(kapiGroup).Resources("replicationcontrollers").RuleOrDie(),
Contributor

Not sure that you need this permission. The owner references should be set by the DC controller.

Contributor

this is fine to investigate later, please create an issue (most likely you need delete to create an ownerRef)

@mfojtik
Contributor

mfojtik commented May 24, 2017

@tnozicka @Kargakis we need to ship this today or we will have to wait for DCUT to pass.

@tnozicka tnozicka force-pushed the set-owner-refs-in-rcs-owned-by-dc branch from 70f8e15 to 5e54a0d Compare May 24, 2017 17:00
@mfojtik
Contributor

mfojtik commented May 24, 2017

[test]

@tnozicka tnozicka changed the title WIP - Set owner refs in RCs owned by DC Set owner refs in RCs owned by DC May 24, 2017
@tnozicka tnozicka force-pushed the set-owner-refs-in-rcs-owned-by-dc branch from 5e54a0d to c29a4af Compare May 24, 2017 20:30
@tnozicka
Contributor Author

[test]

@tnozicka tnozicka force-pushed the set-owner-refs-in-rcs-owned-by-dc branch from c29a4af to f8c5f25 Compare May 24, 2017 20:45
@tnozicka
Contributor Author

[test]

@tnozicka
Contributor Author

@mfojtik @Kargakis any other comments besides the "delete" in RBAC, which I will address in a separate PR?

@tnozicka tnozicka changed the title Set owner refs in RCs owned by DC Set controllerRef in RCs owned by DC May 25, 2017
@tnozicka
Contributor Author

@Kargakis @mfojtik I believe I've found the issue with openshift-deploy. After I copied the newly built binary from my machine into the image, it worked, as does the openshift/origin-deployer:latest image. The issue is that we don't reference latest, but v3.6.0-alpha.1, which is 6 weeks old. Running with this image causes the problem because it was compiled before the kube rebase that taught the code about ownerReferences (https://github.com/openshift/origin/blame/master/vendor/k8s.io/kubernetes/staging/src/k8s.io/apimachinery/pkg/apis/meta/v1/types.go#L258). That means it is compiled without knowing about ownerReferences, and when it does an update the issue manifests itself:

  • it decodes the RC from the wire
  • skips fields it doesn't know about (controllerRef) while deserializing
  • updates what it needs on that struct (to scale)
  • serializes the object with those changes
  • sends a PUT for that RC

And the controllerRef is lost! Because the PUT is trying to modify the controllerRef (from a valid reference to nil, which means indirect deletion by the garbage collector), it is rightfully denied with a 403. Using PATCH would also mitigate the issue.

(It seemed to work with elevated permissions because the DC controller created the controllerRef, then the deployer deleted it, and then the DC controller adopted the RC again because it had no controllerRef and matched the selector.)
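
To make the failure mode concrete, here is a toy reproduction with plain encoding/json (not the actual deployer code): a binary whose structs predate ownerReferences silently drops the field on a decode/modify/re-encode round trip, so the resulting PUT body asks the server to clear it.

```go
// Toy reproduction of the GET -> modify -> PUT data loss. The struct below
// stands in for an old binary's idea of ObjectMeta that predates ownerReferences.
package main

import (
	"encoding/json"
	"fmt"
)

type oldRC struct {
	Metadata struct {
		Name string `json:"name"`
		// No OwnerReferences field here - the old binary doesn't know about it.
	} `json:"metadata"`
	Spec struct {
		Replicas int `json:"replicas"`
	} `json:"spec"`
}

func main() {
	// What the apiserver returns on GET: it includes ownerReferences.
	fromServer := []byte(`{"metadata":{"name":"busyapp-1","ownerReferences":[{"kind":"DeploymentConfig","name":"busyapp","controller":true}]},"spec":{"replicas":1}}`)

	var rc oldRC
	if err := json.Unmarshal(fromServer, &rc); err != nil {
		panic(err)
	}
	rc.Spec.Replicas = 5 // the only change the deployer wants to make

	toServer, _ := json.Marshal(rc)
	// The re-encoded object has no ownerReferences at all, so a PUT with this
	// body asks the apiserver to clear them - which is rejected for a user
	// without delete permission on the RC.
	fmt.Println(string(toServer))
}
```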

@0xmichalis
Contributor

@tnozicka nice. This means that if we don't provide the permissions in the deployer sa, we are going to break older images :(

@0xmichalis
Contributor

This lacks an extended test and you also need to set finalizers like in kubernetes/kubernetes#45764

@0xmichalis
Contributor

s/extended/integration/

@mfojtik
Contributor

mfojtik commented May 25, 2017

@Kargakis I'm fine with adding more permissions there with a reasonable explanation/TODO to remove them after all users are migrated. Having more escalated privileges is still better than having no privileges at all.

I would defer all tests as follow-ups, as we need to unblock the UI team and web console. I think this PR will qualify as a blocker bug fix.

@tnozicka
Contributor Author

tnozicka commented May 25, 2017

OK, I have looked into it and it seems like the deployer image version and name come from the master config. That kind of complicates things because we might not get rid of those old images any time soon.

But the real issue seems to be that the deployer images are not backwards compatible (or forwards compatible, depending on how you look at it). Meaning: if you make a forwards-compatible API change, they will delete the new data and act as if nothing happened. That is data loss. In this case the reference gets filled back in when the RC is re-adopted by the DC, but it might be more serious in other cases.

The reason is the PUT call. Whenever you do GET+PUT on a resource (and you don't intend to completely overwrite it) you break compatibility! You lose any fields you don't know about. This is especially true with deployer images, which are decoupled from the master binary, so different versions run side by side. But it also applies elsewhere: if you roll back the infra and use PUT on such resources, you lose the data that was added by the newer version.

My take: don't use PUT, but PATCH.
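
Not the actual deployer code, just a minimal sketch of what the PATCH idea could look like, written against a current client-go clientset (the function name and packaging are made up; the client API at the time of this PR had a slightly different signature without context/PatchOptions):

```go
// Sketch: scaling via a strategic merge patch instead of GET+PUT, so
// server-side fields the client does not know about (like ownerReferences)
// are left untouched.
package scalesketch

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
)

// ScaleRC patches only spec.replicas; everything else on the RC is preserved.
func ScaleRC(ctx context.Context, client kubernetes.Interface, namespace, name string, replicas int32) error {
	patch := []byte(fmt.Sprintf(`{"spec":{"replicas":%d}}`, replicas))
	_, err := client.CoreV1().
		ReplicationControllers(namespace).
		Patch(ctx, name, types.StrategicMergePatchType, patch, metav1.PatchOptions{})
	return err
}
```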

As we are already in that situation, here is a proposed path forward:

  1. Leave the elevated privileges here, with a TODO and a comment referencing this PR for an explanation. The issue of deleting ownerReferences on every rollout will still be there, but the deleted ref gets repaired by adoption almost immediately.
  2. Add finalizers (https://github.com/kubernetes/kubernetes/pull/45764/files)
  3. Get it merged. It fixes a bug where RCs are not deleted when the DC is, and it unblocks the UI team.
  4. Send a PR moving deployer to use PATCH (+ add patch to RBAC)
  5. Send PR with tests
  6. Create a card to remove unneeded privileges in a few releases

@mfojtik @Kargakis WDYT

@0xmichalis
Contributor

If all the deployer does is scale the RC, then it should use its scale endpoint. The fact that we are adding privileges because we need to keep old images working worries me, but I don't see any other solution now. cc: @smarterclayton

@mfojtik
Contributor

mfojtik commented May 25, 2017

@tnozicka agree, let's move this forward; it doesn't have to be perfect, but we need to unblock the UI team and have the owner refs. We can fiddle with the permissions and tests later.

@tnozicka tnozicka force-pushed the set-owner-refs-in-rcs-owned-by-dc branch from f8c5f25 to 6e11d12 Compare May 25, 2017 13:56
@tnozicka
Contributor Author

I added the comments and I've tried to set up the finalizer, but I got:

metadata.finalizers[0]: Invalid value: "foregroundDeletion": name is neither a standard finalizer name nor is it fully qualified

I tried prefixing it with kubernetes.io/ but that just doesn't do anything. We might be missing that finalizer in Origin, but I wasn't able to confirm that yet.

@tnozicka
Contributor Author

OK, I got to the bottom of this - it seems like upstream is broken: https://github.com/kubernetes/kubernetes/blob/58167fcfa10551cc092ee7484e8b41082d0ac223/pkg/api/helper/helpers.go#L224

They are missing that finalizer in standardFinalizers. It works perfectly if I add metav1.FinalizerDeleteDependents right after it.

I will send a PR and see.
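
For reference, the upstream fix boils down to adding the missing entry to that set; roughly (paraphrasing the linked helpers.go, not an exact copy; import paths as in the Kubernetes tree at that commit):

```go
// Roughly what the fixed standardFinalizers set looks like: the
// "foregroundDeletion" finalizer sits next to its sibling, so validation
// accepts it as a standard finalizer name.
package helper

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/sets"
	"k8s.io/kubernetes/pkg/api"
)

var standardFinalizers = sets.NewString(
	string(api.FinalizerKubernetes),
	metav1.FinalizerOrphanDependents,
	metav1.FinalizerDeleteDependents, // the missing entry; without it "foregroundDeletion" fails validation
)
```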

@tnozicka
Contributor Author

@mfojtik @Kargakis should we have admission control add metav1.FinalizerDeleteDependents to every DC so the objects are deleted in the correct order? Or do we not care?

Right now it deletes the DC, then the pods (which may take a long time - the default grace period is 30 s), and then the RCs. If, in the interim, a user does oc get all, they see just pods terminating and RCs waiting for them; no DC.
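
Not something this PR changes - just a sketch of the existing per-request alternative: asking for foreground propagation on delete, which is exactly what the foregroundDeletion finalizer implements (the owner stays visible until its dependents are gone). Shown with a current client-go clientset and on an RC for simplicity; the same DeleteOptions would apply when deleting the DC.

```go
// Sketch: requesting foreground deletion propagation per delete call.
package deletesketch

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

func DeleteForeground(ctx context.Context, client kubernetes.Interface, namespace, name string) error {
	foreground := metav1.DeletePropagationForeground
	return client.CoreV1().ReplicationControllers(namespace).Delete(ctx, name, metav1.DeleteOptions{
		// The garbage collector adds the foregroundDeletion finalizer and
		// removes dependents before the owner object itself goes away.
		PropagationPolicy: &foreground,
	})
}
```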

@tnozicka
Contributor Author

tnozicka commented May 26, 2017

I pushed literally just those 2 comments.
Flake #14367

@tnozicka
Contributor Author

[test]

@tnozicka
Contributor Author

flake #13980
[test]

// "delete" is required here for compatibility with older deployer images
// (see https://github.com/openshift/origin/pull/14322#issuecomment-303968976)
// TODO: remove "delete" few releases after 3.6
rbac.NewRule("get", "list", "watch", "update", "delete").Groups(kapiGroup).Resources("replicationcontrollers").RuleOrDie(),
Contributor

nit: I would make this an extra rule with just delete:

rbac.NewRule("delete").Groups(kapiGroup).Resources("replicationcontrollers").RuleOrDie()

// "delete" is required here for compatibility with older deployer images
// (see https://github.com/openshift/origin/pull/14322#issuecomment-303968976)
// TODO: remove "delete" few releases after 3.6
authorizationapi.NewRule("get", "list", "watch", "update", "delete").Groups(kapiGroup).Resources("replicationcontrollers").RuleOrDie(),
Contributor

same here, I would separate this from the rest of the RBAC rules

@tnozicka tnozicka force-pushed the set-owner-refs-in-rcs-owned-by-dc branch from 444b402 to 84cbbfa Compare May 30, 2017 14:06
@tnozicka tnozicka force-pushed the set-owner-refs-in-rcs-owned-by-dc branch from 84cbbfa to 13326de Compare May 30, 2017 14:10
@openshift-bot
Contributor

Evaluated for origin test up to 3ace21a

@openshift-bot
Contributor

continuous-integration/openshift-jenkins/test Running (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/1841/) (Base Commit: 39664e5)

@openshift-bot
Contributor

continuous-integration/openshift-jenkins/test SUCCESS

@tnozicka
Contributor Author

@mfojtik good to go?

For the record: the GC client seems to work fine, as tested with an upgrade from 1.4; the order of deletion will be decided on separately.

@mfojtik
Contributor

mfojtik commented May 31, 2017

@tnozicka yeah, please make sure you have a follow-up issue for the tests for this... I think this is a good base and we can build on it.

Mo removed tag to make bot happy

@enj
Contributor

enj commented May 31, 2017

[merge][severity: blocker]

See previous comments.

@mfojtik
Contributor

mfojtik commented May 31, 2017

[merge] against (seems like tagging screwup)

@tnozicka
Contributor Author

Flake #14122

@tnozicka
Contributor Author

tnozicka commented May 31, 2017

Flake #14418

@mfojtik
Contributor

mfojtik commented Jun 1, 2017

[merge]

@openshift-bot
Contributor

Evaluated for origin merge up to 3ace21a

@openshift-bot
Contributor

openshift-bot commented Jun 1, 2017

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/merge_pull_request_origin/866/) (Base Commit: b6fd9db) (Extended Tests: blocker) (Image: devenv-rhel7_6290)

@openshift-bot openshift-bot merged commit 1bf8a17 into openshift:master Jun 1, 2017
@@ -125,6 +148,15 @@ func (c *DeploymentConfigController) Handle(config *deployapi.DeploymentConfig)
if err != nil {
return err
}
// We need to make sure we own that RC or adopt it if possible
isOurs, err := cm.ClaimReplicationController(rc)
Contributor

You have already claimed those RCs above, no?

Contributor Author

@Kargakis the one we get with the Get could be a different one. What if it no longer matches the selector and was/should be released, or it was changed, or deleted and re-created, and we don't own it?

Contributor

Not sure about having the RC diverge just a couple of lines of Go after the initial adoption, and this is a cache lookup, but maybe it doesn't hurt as a last line of defense? What happens if the cache copy is not yet updated with the owner reference you set above? Will the 2nd patch simply succeed because there is no diff with the server copy, or is an actual error returned?

Now I am thinking that we never supported adopting RCs for DCs, before owner refs, so I am wondering if we are going to break users that have fiddled with RC labels. Maybe warn in the release notes since we are departing from the old way of selecting RCs.

Contributor

Now I am thinking that we never supported adopting RCs for DCs, before owner refs, so I am wondering if we are going to break users that have fiddled with RC labels. Maybe warn in the release notes since we are departing from the old way of selecting RCs.

Never mind this one; I rechecked and you merely add owner refs on top - should be fine.

Contributor Author

@Kargakis

What happens if the cache copy is not yet updated with the owner reference you set above?

This is in a RetryOnConflict block, so in the worst case the caches should propagate while we do the backoff, right? We could use an uncached client here.

What happens if the cache copy is not yet updated with the owner reference you set above?

I think that the second patch should result in a noop (and succeed).

Not sure about having the RC diverge just a couple of Go code after the initial adoption and this is a cache lookup but maybe this doesn't hurt as a last line of defense?

Well, not doing this and patching an RC we don't own due to a race condition would break the 2nd or 3rd controller rule.

But that also gets me thinking that we can break it under some conditions. There is still a race condition in the patches we use (taken from upstream). I think they should really use optimistic concurrency in the form of resourceVersion to be completely correct. Hope that works with PATCH. I'll take the discussion upstream to figure it out because they do the same.
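
To illustrate the resourceVersion idea (this is not the upstream patch code, just a sketch with made-up values): including the observed resourceVersion in the patch body should make a mismatched concurrent update surface as a conflict rather than a silent overwrite.

```go
// Sketch: carry the observed resourceVersion in the adoption patch so a
// concurrent change turns into a 409 Conflict instead of a silent overwrite.
package main

import "fmt"

func adoptionPatch(dcName, dcUID, observedResourceVersion string) []byte {
	return []byte(fmt.Sprintf(
		`{"metadata":{"resourceVersion":%q,"ownerReferences":[{"apiVersion":"v1","kind":"DeploymentConfig","name":%q,"uid":%q,"controller":true}]}}`,
		observedResourceVersion, dcName, dcUID))
}

func main() {
	// If the RC's resourceVersion moved on since we read it, the apiserver
	// rejects this patch with a conflict, which RetryOnConflict-style logic
	// can then handle by re-reading and re-deciding ownership.
	fmt.Println(string(adoptionPatch("busyapp", "1234-abcd", "848")))
}
```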

@@ -239,6 +287,18 @@ func (c *DeploymentConfigController) reconcileDeployments(existingDeployments []
if err != nil {
return err
}
// We need to make sure we own that RC or adopt it if possible
isOurs, err := cm.ClaimReplicationController(rc)
Contributor

Same comment as above

Contributor Author

same as well :)

// If the deployment was already created, just move on. The cache could be
// stale, or another process could have already handled this update.
return c.updateStatus(config, existingDeployments)
} else {
Contributor

No need for else

Contributor Author

I know, I changed it twice while writing it, but somehow I wanted to explicitly state that this is the case where it isn't ours :) and to be resilient to future changes.

k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this pull request Jun 2, 2017
Automatic merge from submit-queue (batch tested with PRs 46648, 46500, 46238, 46668, 46557)

Fix standardFinalizers - add missing metav1.FinalizerDeleteDependents

**What this PR does / why we need it**:
It adds [FinalizerDeleteDependents](https://github.com/kubernetes/kubernetes/blob/58167fcfa10551cc092ee7484e8b41082d0ac223/staging/src/k8s.io/apimachinery/pkg/apis/meta/v1/types.go#L77) to [standardFinalizers](https://github.com/kubernetes/kubernetes/blob/58167fcfa10551cc092ee7484e8b41082d0ac223/pkg/api/helper/helpers.go#L222); otherwise this finalizer is unusable because the apiserver fails validation since it is not a fully qualified name - even though it is a standard Kubernetes finalizer [used by the garbage collector](https://github.com/kubernetes/kubernetes/blob/58167fcfa10551cc092ee7484e8b41082d0ac223/pkg/controller/garbagecollector/garbagecollector.go#L389), it can't be set.

Its sibling [FinalizerOrphanDependents](https://github.com/kubernetes/kubernetes/blob/58167fcfa10551cc092ee7484e8b41082d0ac223/staging/src/k8s.io/apimachinery/pkg/apis/meta/v1/types.go#L76) is already [there](https://github.com/kubernetes/kubernetes/blob/58167fcfa10551cc092ee7484e8b41082d0ac223/pkg/api/helper/helpers.go#L224). I suppose this is a bug because otherwise `FinalizerDeleteDependents` is unusable.

Fixes openshift/origin#14322
Might fix #45764

**Note for the reviewer:**
[This same definition is also in staging.](https://github.com/kubernetes/kubernetes/blob/58167fcfa10551cc092ee7484e8b41082d0ac223/staging/src/k8s.io/client-go/pkg/api/helper/helpers.go#L222) Does it get propagated to staging automatically? Editing the same file twice doesn't seem like the intended option.
@tnozicka tnozicka deleted the set-owner-refs-in-rcs-owned-by-dc branch June 7, 2017 12:40