Refactor controller initialization (final round?) #14633

Merged 3 commits into openshift:master on Jun 21, 2017

Conversation

@mfojtik (Contributor) commented Jun 14, 2017

Follow-up to #14293 and #14317 with more controllers. This should be the final pull, after which all controllers should be switched to the new initialization.

This time:

  • unidling controller
  • resource quota / cluster quota
  • SDN controller
  • and more.

(This is based on #14317; the first 3 commits will go away after it merges.)

@mfojtik (Contributor Author) commented Jun 14, 2017

@deads2k can you check the cluster quota/quota stuff, I did some nasty things there.

@enj this should be the final pull \o/

addControllerRole(rbac.ClusterRole{
ObjectMeta: metav1.ObjectMeta{Name: saRolePrefix + InfraResourceQuotaControllerServiceAccountName},
Rules: []rbac.PolicyRule{
// TODO: Add rules
Contributor Author

@deads2k @enj will need help with these :-)

Contributor

Upstream has:

Rules: []rbac.PolicyRule{
	// quota can count quota on anything for reconcilation, so it needs full viewing powers
	rbac.NewRule("list", "watch").Groups("*").Resources("*").RuleOrDie(),
	rbac.NewRule("update").Groups(legacyGroup).Resources("resourcequotas/status").RuleOrDie(),
	eventsRule(),
}

We need our own version of k8s.io/kubernetes/plugin/pkg/auth/authorizer/rbac/bootstrappolicy/controller_policy_test.go:

// rolesWithAllowStar are the controller roles which are allowed to contain a *.  These are
// namespace lifecycle and GC which have to delete anything.  If you're adding to this list
// tag sig-auth
var rolesWithAllowStar = sets.NewString(
	saRolePrefix+"namespace-controller",
	saRolePrefix+"generic-garbage-collector",
	saRolePrefix+"resourcequota-controller",
)

// TestNoStarsForControllers confirms that no controller role has star verbs, groups,
// or resources.  There are two known exceptions, namespace lifecycle and GC which have to
// delete anything
func TestNoStarsForControllers(t *testing.T) {
	for _, role := range ControllerRoles() {
		if rolesWithAllowStar.Has(role.Name) {
			continue
		}

		for i, rule := range role.Rules {
			for j, verb := range rule.Verbs {
				if verb == "*" {
					t.Errorf("%s.Rule[%d].Verbs[%d] is star", role.Name, i, j)
				}
			}
			for j, group := range rule.APIGroups {
				if group == "*" {
					t.Errorf("%s.Rule[%d].APIGroups[%d] is star", role.Name, i, j)
				}
			}
			for j, resource := range rule.Resources {
				if resource == "*" {
					t.Errorf("%s.Rule[%d].Resources[%d] is star", role.Name, i, j)
				}
			}
		}
	}
}

func TestControllerRoleLabel(t *testing.T) {
	roles := ControllerRoles()
	for i := range roles {
		role := roles[i]
		accessor, err := meta.Accessor(&role)
		if err != nil {
			t.Fatalf("unexpected error: %v", err)
		}
		if got, want := accessor.GetLabels(), map[string]string{"kubernetes.io/bootstrapping": "rbac-defaults"}; !reflect.DeepEqual(got, want) {
			t.Errorf("ClusterRole: %s GetLabels() = %s, want %s", accessor.GetName(), got, want)
		}
	}

	rolebindings := ControllerRoleBindings()
	for i := range rolebindings {
		rolebinding := rolebindings[i]
		accessor, err := meta.Accessor(&rolebinding)
		if err != nil {
			t.Fatalf("unexpected error: %v", err)
		}
		if got, want := accessor.GetLabels(), map[string]string{"kubernetes.io/bootstrapping": "rbac-defaults"}; !reflect.DeepEqual(got, want) {
			t.Errorf("ClusterRoleBinding: %s GetLabels() = %s, want %s", accessor.GetName(), got, want)
		}
	}
}

Contributor Author

@enj ideally I would like to merge upstream rules with origin

Contributor

> @enj ideally I would like to merge upstream rules with origin

We aren't going to modify any upstream rules. If more permissions are required, we'll do that with a delta role in openshift and a separate binding.
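
A minimal sketch of that delta-role-plus-binding pattern, purely for illustration: the role name, verbs, group, and resource below are assumptions rather than the rules this PR actually adds; rbac.NewRule mirrors the style quoted above and rbac.NewClusterBinding is the companion helper for bindings.

// Hypothetical delta role carrying only the extra permissions upstream does not grant.
// Name, verbs, group, and resource are placeholders for illustration.
deltaRole := rbac.ClusterRole{
	ObjectMeta: metav1.ObjectMeta{Name: saRolePrefix + "resourcequota-controller-openshift-delta"},
	Rules: []rbac.PolicyRule{
		rbac.NewRule("get", "update").Groups(legacyGroup).Resources("clusterresourcequotas").RuleOrDie(),
	},
}

// Separate binding that attaches the delta role to the same controller service account,
// leaving the upstream role and its binding untouched.
deltaBinding := rbac.NewClusterBinding(deltaRole.Name).
	SAs("kube-system", "resourcequota-controller").
	BindingOrDie()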

@mfojtik force-pushed the controller-init-4 branch 2 times, most recently from 336c159 to 3d90e8d on June 14, 2017 10:20
@mfojtik (Contributor Author) commented Jun 14, 2017

@deads2k how about these:

kc.RunPersistentVolumeController
kc.RunPersistentVolumeAttachDetachController
kc.RunServiceLoadBalancerController

they are kube controllers, don't touch?

@mfojtik force-pushed the controller-init-4 branch from e4fd5bb to 969e05c on June 14, 2017 10:40
@deads2k (Contributor) commented Jun 14, 2017

> they are kube controllers, don't touch?

They need touching. We should have a method on cmd/server/kubernetes/masterconfig that produces init functions for all of kube by creating the kube ones, stomping these on top, and converting to the openshift style.
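
Roughly that shape as a self-contained sketch (the InitFunc signature is simplified and the "persistentvolume-binder" key is only illustrative of an upstream controller name):

package main

import "fmt"

// InitFunc stands in for the controller init signature discussed in this PR
// (ControllerContext in, started bool and error out); simplified for the sketch.
type InitFunc func() (bool, error)

// stitchControllerInitializers sketches the suggested shape: start from the upstream
// kube controller map and stomp the OpenShift overrides on top under the same keys,
// so a rebaser has a single map to look at and nothing runs twice.
func stitchControllerInitializers(upstream, overrides map[string]InitFunc) map[string]InitFunc {
	ret := map[string]InitFunc{}
	for name, fn := range upstream {
		ret[name] = fn
	}
	for name, fn := range overrides {
		ret[name] = fn // same key as upstream, so the override wins
	}
	return ret
}

func main() {
	upstream := map[string]InitFunc{
		"persistentvolume-binder": func() (bool, error) { return true, nil },
	}
	overrides := map[string]InitFunc{
		// the OpenShift-aware persistent volume controller replaces the upstream one
		"persistentvolume-binder": func() (bool, error) { return true, nil },
	}
	fmt.Println(len(stitchControllerInitializers(upstream, overrides))) // 1: stomped, not duplicated
}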

@mfojtik (Contributor Author) commented Jun 14, 2017

@deads2k rgr, thanks

@enj (Contributor) commented Jun 15, 2017

@mfojtik could you add a test like https://github.com/openshift/origin/pull/13579/files?w=1#diff-3052c7b8d594aa465f7a354d294a8e07R201 and see if it still flakes? I believe the race condition has been fixed (would happen one to ten percent of the time locally).

@mfojtik force-pushed the controller-init-4 branch 3 times, most recently from 0f90f5a to 42e8666 on June 15, 2017 12:11
addControllerRole(rbac.ClusterRole{
ObjectMeta: metav1.ObjectMeta{Name: saRolePrefix + InfraClusterQuotaReconciliationControllerServiceAccountName},
Rules: []rbac.PolicyRule{
// TODO: Add rules
Contributor

surely this doesn't work

Contributor Author

still WIP, i'm looking for suggestions :-)

Contributor

> still WIP, i'm looking for suggestions :-)

Is this currently broken in the pull or do you work without it?

Contributor Author

Works without it (the server starts), but I want to revisit; maybe @enj can help here.

Contributor Author

@deads2k caught 403 failures in the integration test and added rules.

recorder := eventcast.NewRecorder(kapi.Scheme, kclientv1.EventSource{Component: bootstrappolicy.InfraPersistentVolumeControllerServiceAccountName})
eventcast.StartRecordingToSink(
&kv1core.EventSinkImpl{
Interface: ctx.ClientBuilder.ClientGoClientOrDie(bootstrappolicy.InfraPersistentVolumeControllerServiceAccountName).CoreV1().Events(""),
Contributor

Confirm which namespace this is in. If it's in kube-system, then the role you've defined above may not be used at all.

Contributor

Also, once this merges, open an issue for our storage team. These cause rebase pain every time. I'd like some comments explaining why we are special snowflakes.

Interface: ctx.ClientBuilder.ClientGoClientOrDie(bootstrappolicy.InfraPersistentVolumeControllerServiceAccountName).CoreV1().Events(""),
},
)
recyclerSA := bootstrappolicy.InfraPersistentVolumeRecyclerControllerServiceAccountName
Contributor

Confirm which namespace this is in. If it's in kube-system, then the role you've defined above may not be used at all.

})

addControllerRole(rbac.ClusterRole{
ObjectMeta: metav1.ObjectMeta{Name: saRolePrefix + InfraPersistentVolumeControllerServiceAccountName},
Contributor

I would expect this to be the delta we need in openshift, not an entire role

})

// scheduler
addControllerRole(rbac.ClusterRole{
Contributor

I don't think we're going to use this in 3.6

Contributor Author

yeah, i will nuke it

})

addControllerRole(rbac.ClusterRole{
ObjectMeta: metav1.ObjectMeta{Name: saRolePrefix + InfraServiceLoadBalancerControllerServiceAccountName},
Contributor

same question about deltas here

"github.com/openshift/origin/pkg/cmd/server/bootstrappolicy"
)

type PersistentVolumeControllerConfig struct {
Contributor

We're already moving. Give us a controller_volume.go and segregate by functional area.

}

type SchedulerControllerConfig struct {
PrivilegedClient kclientset.Interface
Contributor

Let's get a TODO on this and an issue. I'd like to move closer to upstream.

Do we have a kubeorigin label yet? I think we need to start tracking all the various refactors we need.

Contributor Author

Todo added, spawned #14697

c.HardPodAffinitySymmetricWeight,
)

if _, err := os.Stat(c.SchedulerConfigFile); err == nil {
Contributor

Blech. Preexisting? We don't want to be reading files down at this level.

Contributor Author

Can move it to a higher level and pass it down here.

Contributor

> Can move it to a higher level and pass it down here.

It would be good. If this is pre-existing, I can live with an issue.

Contributor Author

I fixed this.

//configData, err = ioutil.ReadFile(c.SchedulerServer.PolicyConfigFile)
configData, err = ioutil.ReadFile(c.PolicyConfigFile)
if err != nil {
return false, fmt.Errorf("unable to read scheduler config: %v", err)
Contributor

This was true, error. They had a file, we can't read it, intent is not fulfilled. We should die early.

ctx.InformerFactory.Core().V1().Nodes(),
ctx.InformerFactory.Extensions().V1beta1().DaemonSets(),
c.CloudProvider,
// TODO: Do we need openshift service account here?
Contributor

Why would our service account be special? This one exists upstream, right?

"k8s.io/kubernetes/pkg/serviceaccount"

configapi "github.com/openshift/origin/pkg/cmd/server/api"
"github.com/openshift/origin/pkg/cmd/server/crypto"
"github.com/openshift/origin/pkg/cmd/server/origin/controller"
)

func (c *MasterConfig) NewKubernetesControllerInitalizers(kc *kubernetes.MasterConfig) (map[string]kubecontroller.InitFunc, error) {
Contributor

Yes, this is the shape I was looking for. More arguments than I may have liked, but exactly the right shape.

Contributor Author

@deads2k in upstream they pass some options via ControllerContext, maybe we can add stuff we need there and nuke this argument.

ret := map[string]kubecontroller.InitFunc{}

persistentVolumeController := kubernetes.PersistentVolumeControllerConfig{
OpenShiftInfrastructureNamespace: c.Options.PolicyConfig.OpenShiftInfrastructureNamespace,
Contributor

Open a 3.6 blocker issue. We agreed to make this non-configurable, we need to add a validation warning and sweep for usage to eliminate it.

Contributor Author

@deads2k what should I set this to? Hardcode "openshift-infra"?

// TODO: In 3.7 this is renamed to 'Cloud' and is part of kubernetes ControllerContext
CloudProvider: kc.CloudProvider,
}
ret["k8s.io/persistent-volume"] = persistentVolumeController.RunController
Contributor

I think these controllers perfectly parallel upstream in function, let's use their keys and actually stomp them.

Contributor Author

done.

"k8s.io/kubernetes/pkg/serviceaccount"

configapi "github.com/openshift/origin/pkg/cmd/server/api"
"github.com/openshift/origin/pkg/cmd/server/crypto"
"github.com/openshift/origin/pkg/cmd/server/origin/controller"
)

func (c *MasterConfig) NewKubernetesControllerInitalizers(kc *kubernetes.MasterConfig) (map[string]kubecontroller.InitFunc, error) {
ret := map[string]kubecontroller.InitFunc{}
Contributor

I think this function can do the stitching together of the upstream map with our overrides. Put yourself in the rebaser's shoes: you want the new controllers plus our existing overrides without having to do anything specific, but you want one map to look at for rebasing purposes.

Contributor Author

done, ptal

Contributor

That works, thanks

}
ret["openshift.io/sdn"] = sdnController.RunController

ret["openshift.io/resource-quota-manager"] = controller.RunResourceQuotaManager
Contributor

Run this under its kube name for now and open an issue to split upstream from downstream. Given shared informers, the watch cost is negligible if the update conflicts can be borne.

Contributor Author

@deads2k where I can find kube names for these?

Contributor Author

nvm, already sorted out.

}

func (c *SDNControllerConfig) RunController(ctx ControllerContext) (bool, error) {
// TODO: Switch SDN to use client.Interface
Contributor

let's get issues open for these. Individual issues will help us distribute the pain.

clusterQuotaMappingController := controller.ClusterQuotaMappingControllerConfig{
ClusterQuotaMappingController: c.ClusterQuotaMappingController,
}
ret["openshift.io/cluster-quota-mapping"] = clusterQuotaMappingController.RunController
Contributor

Does this result in us running twice? I think that we may have to accept it in the MasterConfig as an optional bit for the medium term.

Contributor Author

@deads2k no, I plumbed this via the ClusterQuotaMappingController option for now; deserves an issue as well.

Contributor Author

@deads2k spawned #14698 to track this and eventually fix this.

@smarterclayton (Contributor)

Still failing

TASK [Reconcile Cluster Roles] *************************************************
task path: /usr/share/ansible/openshift-ansible/playbooks/common/openshift-cluster/upgrades/upgrade_control_plane.yml:188
fatal: [localhost]: FAILED! => {
    "changed": false, 
    "cmd": [
        "oc", 
        "adm", 
        "--config=/etc/origin/master/admin.kubeconfig", 
        "policy", 
        "reconcile-cluster-roles", 
        "--additive-only=true", 
        "--confirm", 
        "-o", 
        "name"
    ], 
    "delta": "0:00:11.389966", 
    "end": "2017-06-20 11:37:34.235158", 
    "failed": true, 
    "generated_timestamp": "2017-06-20 11:37:34.257628", 
    "rc": 1, 
    "start": "2017-06-20 11:37:22.845192", 
    "stderr": [
        "The connection to the server ip-172-18-13-0.ec2.internal:8443 was refused - did you specify the right host or port?"
    ], 
    "stdout": []
}

@smarterclayton (Contributor)

Same problem-ish:

Jun 20 11:36:59 ip-172-18-13-0.ec2.internal systemd[1]: origin-master.service holdoff time over, scheduling restart.
Jun 20 11:36:59 ip-172-18-13-0.ec2.internal systemd[1]: Starting Origin Master Service...
Jun 20 11:37:01 ip-172-18-13-0.ec2.internal systemd[1]: Started Origin Master Service.
Jun 20 11:37:18 ip-172-18-13-0.ec2.internal origin-master[4575]: E0620 11:37:18.369419    4575 reflector.go:201] github.com/openshift/origin/vendor/k8s.io/kubernetes/pkg/controller/garbagecollector/graph_builder.go:214: Failed to list <nil>: User "system:serviceaccount:kube-syste…
Jun 20 11:37:18 ip-172-18-13-0.ec2.internal origin-master[4575]: E0620 11:37:18.590750    4575 reflector.go:201] github.com/openshift/origin/pkg/serviceaccounts/controllers/docker_registry_service.go:123: Failed to list *v1.Service: User "system:serviceaccount:openshift-infra:ser…
[the same two "Failed to list" errors repeat roughly every second through 11:37:33]
Jun 20 11:37:34 ip-172-18-13-0.ec2.internal origin-master[4575]: F0620 11:37:34.042104    4575 start_master.go:846] Error starting "openshift.io/sdn" (User "system:serviceaccount:openshift-infra:sdn-controller" cannot get clusternetworks at the cluster scope)

Are you positive you didn't break role setup?

@openshift deleted a comment from mfojtik Jun 20, 2017
@smarterclayton (Contributor)

I'm taking merge tag off this until I see it go green in testing.

@mfojtik force-pushed the controller-init-4 branch from ebe91a0 to 22bcd21 on June 20, 2017 16:37
@mfojtik (Contributor Author) commented Jun 20, 2017

@deads2k guess the sdn plugin should not block (it should be wrapped in a goroutine) and the origin-to-rbac convertor should run first?

@smarterclayton (Contributor) commented Jun 20, 2017 via email

@smarterclayton (Contributor) commented Jun 21, 2017 via email

@mfojtik (Contributor Author) commented Jun 21, 2017

@smarterclayton yeah... User "system:serviceaccount:openshift-infra:unidling-controller" cannot get deploymentconfigs.apps.openshift.io. I was sure I scraped all the client calls in unidling.
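
For context, the missing permission above would be addressed by a rule of roughly this shape in the unidling controller's role; the verb list, the scale subresource, and the appsGroup/legacyAppsGroup constant names are assumptions for illustration, not the exact change:

// hypothetical: grant the unidling controller access to deploymentconfigs in both the
// new apps.openshift.io group and the legacy group
rbac.NewRule("get", "update").Groups(appsGroup, legacyAppsGroup).Resources("deploymentconfigs", "deploymentconfigs/scale").RuleOrDie(),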

@mfojtik force-pushed the controller-init-4 branch from 22bcd21 to aa151e1 on June 21, 2017 08:55
@mfojtik force-pushed the controller-init-4 branch from aa151e1 to 2c0a1e0 on June 21, 2017 10:49
@openshift-bot (Contributor)

Evaluated for origin test up to 2c0a1e0

@openshift-bot (Contributor) commented Jun 21, 2017

continuous-integration/openshift-jenkins/merge Waiting: You are in the build queue at position: 1

@openshift-bot (Contributor)

Evaluated for origin merge up to 2c0a1e0

@openshift-bot (Contributor)

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/2458/) (Base Commit: a192b3e) (PR Branch Commit: 2c0a1e0)

@deads2k (Contributor) commented Jun 21, 2017

Tests are green, nothing merged after the test started. Head of the queue isn't conflicting with this. Pushing it for @mfojtik

@deads2k deads2k merged commit 2fd8eef into openshift:master Jun 21, 2017
@mfojtik (Contributor Author) commented Jun 21, 2017

[gif: happy dance]

Rules: []rbac.PolicyRule{
rbac.NewRule("get", "create", "update").Groups(networkGroup, legacyNetworkGroup).Resources("clusternetworks").RuleOrDie(),
rbac.NewRule("get", "list").Groups(networkGroup, legacyNetworkGroup).Resources("egressnetworkpolicies").RuleOrDie(),
rbac.NewRule("get", "list", "create", "delete").Groups(networkGroup, legacyNetworkGroup).Resources("hostsubnets").RuleOrDie(),
Contributor

This was not correct - the controller needed watch on hostsubnets. How did our tests not catch this @openshift/networking?

Contributor

The master only needs to watch on hostsubnets to handle the case where an admin manually creates a hostsubnet for F5 router integration, and there are no end-to-end tests of that.

(Looks like an error was getting logged about the watch failing, but that doesn't get returned to anywhere we'd see it.)

Contributor

> and there are no end-to-end tests of that.

though it would have been caught by QE before the release went out

Contributor

Fixed in #14968, added watch access to the other needed resources.
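
For illustration, the shape of that fix is the hostsubnets rule quoted above with "watch" added; the exact verb sets and the other resources touched are in #14968:

rbac.NewRule("get", "list", "watch", "create", "delete").Groups(networkGroup, legacyNetworkGroup).Resources("hostsubnets").RuleOrDie(),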

@@ -144,375 +127,6 @@ func init() {
InfraSAs.saToRole = map[string]authorizationapi.ClusterRole{}

err = InfraSAs.addServiceAccount(
InfraPersistentVolumeRecyclerControllerServiceAccountName,
Contributor

everything added here gets auto-reconciled on server start. moving these broke upgrades for controllers that cause server exits if they are missing permissions (because the ansible reconciliation can't run until the server is up). see https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/3303/testReport/junit/(root)/Reconcile%20Cluster%20Roles%20and%20Cluster%20Role%20Bindings%20and%20Security%20Context%20Constraints/_localhost__Reconcile_Cluster_Roles/

@deads2k (Contributor) commented Jul 20, 2017

> everything added here gets auto-reconciled on server start. moving these broke upgrades for controllers that cause server exits if they are missing permissions (because the ansible reconciliation can't run until the server is up).

The fix isn't too bad, but those controllers should be updated.

@mfojtik (Contributor Author) commented Jul 20, 2017

@deads2k do we have an issue to track what needs to be updated?

@deads2k (Contributor) commented Jul 20, 2017

> @deads2k do we have an issue to track what needs to be updated?

I didn't make one. You'll need to pick #15364 or something like it.

@mfojtik deleted the controller-init-4 branch September 5, 2018 21:07