Refactor controller initialization (final round?) #14633
Conversation
addControllerRole(rbac.ClusterRole{
	ObjectMeta: metav1.ObjectMeta{Name: saRolePrefix + InfraResourceQuotaControllerServiceAccountName},
	Rules: []rbac.PolicyRule{
		// TODO: Add rules
Upstream has:
Rules: []rbac.PolicyRule{
// quota can count quota on anything for reconciliation, so it needs full viewing powers
rbac.NewRule("list", "watch").Groups("*").Resources("*").RuleOrDie(),
rbac.NewRule("update").Groups(legacyGroup).Resources("resourcequotas/status").RuleOrDie(),
eventsRule(),
}
We need our own version of k8s.io/kubernetes/plugin/pkg/auth/authorizer/rbac/bootstrappolicy/controller_policy_test.go:
package bootstrappolicy

import (
	"reflect"
	"testing"

	// import paths assumed to match the upstream test's vendoring
	"k8s.io/apimachinery/pkg/api/meta"
	"k8s.io/apimachinery/pkg/util/sets"
)

// rolesWithAllowStar are the controller roles which are allowed to contain a *. These are
// namespace lifecycle and GC which have to delete anything. If you're adding to this list
// tag sig-auth
var rolesWithAllowStar = sets.NewString(
saRolePrefix+"namespace-controller",
saRolePrefix+"generic-garbage-collector",
saRolePrefix+"resourcequota-controller",
)
// TestNoStarsForControllers confirms that no controller role has star verbs, groups,
// or resources. There are two known exceptions, namespace lifecycle and GC which have to
// delete anything
func TestNoStarsForControllers(t *testing.T) {
for _, role := range ControllerRoles() {
if rolesWithAllowStar.Has(role.Name) {
continue
}
for i, rule := range role.Rules {
for j, verb := range rule.Verbs {
if verb == "*" {
t.Errorf("%s.Rule[%d].Verbs[%d] is star", role.Name, i, j)
}
}
for j, group := range rule.APIGroups {
if group == "*" {
t.Errorf("%s.Rule[%d].APIGroups[%d] is star", role.Name, i, j)
}
}
for j, resource := range rule.Resources {
if resource == "*" {
t.Errorf("%s.Rule[%d].Resources[%d] is star", role.Name, i, j)
}
}
}
}
}
func TestControllerRoleLabel(t *testing.T) {
roles := ControllerRoles()
for i := range roles {
role := roles[i]
accessor, err := meta.Accessor(&role)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if got, want := accessor.GetLabels(), map[string]string{"kubernetes.io/bootstrapping": "rbac-defaults"}; !reflect.DeepEqual(got, want) {
t.Errorf("ClusterRole: %s GetLabels() = %s, want %s", accessor.GetName(), got, want)
}
}
rolebindings := ControllerRoleBindings()
for i := range rolebindings {
rolebinding := rolebindings[i]
accessor, err := meta.Accessor(&rolebinding)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if got, want := accessor.GetLabels(), map[string]string{"kubernetes.io/bootstrapping": "rbac-defaults"}; !reflect.DeepEqual(got, want) {
t.Errorf("ClusterRoleBinding: %s GetLabels() = %s, want %s", accessor.GetName(), got, want)
}
}
}
@enj ideally I would like to merge upstream rules with origin
We aren't going to modify any upstream rules. If more permissions are required, we'll do that with a delta role in openshift and a separate binding.
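As a rough sketch of that delta approach (the role name and the extra rule here are hypothetical, not the merged code):
addControllerRole(rbac.ClusterRole{
	// hypothetical openshift-only delta; the upstream role stays untouched
	ObjectMeta: metav1.ObjectMeta{Name: saRolePrefix + "resourcequota-controller-origin"},
	Rules: []rbac.PolicyRule{
		// only the openshift-specific additions live here
		rbac.NewRule("update").Groups(quotaGroup).Resources("clusterresourcequotas/status").RuleOrDie(),
	},
})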
Force-pushed 336c159 to 3d90e8d
@deads2k how about these, e.g. kc.RunPersistentVolumeController? They are kube controllers, don't touch?
Force-pushed e4fd5bb to 969e05c
They need touching. We should have a method on the cmd/server/kubernetes masterconfig that produces init functions for all of kube: creating the kube ones, stomping these on top, and converting to the openshift style.
@deads2k rgr, thanks
@mfojtik could you add a test like https://github.com/openshift/origin/pull/13579/files?w=1#diff-3052c7b8d594aa465f7a354d294a8e07R201 and see if it still flakes? I believe the race condition has been fixed (it would happen one to ten percent of the time locally).
Force-pushed 0f90f5a to 42e8666
addControllerRole(rbac.ClusterRole{
	ObjectMeta: metav1.ObjectMeta{Name: saRolePrefix + InfraClusterQuotaReconciliationControllerServiceAccountName},
	Rules: []rbac.PolicyRule{
		// TODO: Add rules
surely this doesn't work
still WIP, i'm looking for suggestions :-)
Is this currently broken in the pull, or does it work without it?
It works without it (the server starts), but I want to revisit; maybe @enj can help here.
@deads2k caught 403 failures in the integration tests and added rules.
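The added rules presumably mirror the upstream quota shape; an illustrative sketch, not the merged rule set:
Rules: []rbac.PolicyRule{
	// reconciliation counts usage across everything, then writes status
	rbac.NewRule("list", "watch").Groups("*").Resources("*").RuleOrDie(),
	rbac.NewRule("update").Groups(quotaGroup).Resources("clusterresourcequotas/status").RuleOrDie(),
	eventsRule(),
},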
recorder := eventcast.NewRecorder(kapi.Scheme, kclientv1.EventSource{Component: bootstrappolicy.InfraPersistentVolumeControllerServiceAccountName})
eventcast.StartRecordingToSink(
	&kv1core.EventSinkImpl{
		Interface: ctx.ClientBuilder.ClientGoClientOrDie(bootstrappolicy.InfraPersistentVolumeControllerServiceAccountName).CoreV1().Events(""),
Confirm which namespace this is in. If it's in kube-system, then the role you've defined above may not be used at all.
Also, once this merges, open an issue for our storage team. These cause rebase pain every time. I'd like some comments explaining why we are special snowflakes.
		Interface: ctx.ClientBuilder.ClientGoClientOrDie(bootstrappolicy.InfraPersistentVolumeControllerServiceAccountName).CoreV1().Events(""),
	},
)
recyclerSA := bootstrappolicy.InfraPersistentVolumeRecyclerControllerServiceAccountName
Confirm which namespace this is in. If it's in kube-system, then the role you've defined above may not be used at all.
})

addControllerRole(rbac.ClusterRole{
	ObjectMeta: metav1.ObjectMeta{Name: saRolePrefix + InfraPersistentVolumeControllerServiceAccountName},
I would expect this to be the delta we need in openshift, not an entire role
})

// scheduler
addControllerRole(rbac.ClusterRole{
I don't think we're going to use this in 3.6
yeah, i will nuke it
})

addControllerRole(rbac.ClusterRole{
	ObjectMeta: metav1.ObjectMeta{Name: saRolePrefix + InfraServiceLoadBalancerControllerServiceAccountName},
same question about deltas here
"github.com/openshift/origin/pkg/cmd/server/bootstrappolicy" | ||
) | ||
|
||
type PersistentVolumeControllerConfig struct { |
We're already moving. Give us a controller_volume.go and segregate by functional area.
}

type SchedulerControllerConfig struct {
	PrivilegedClient kclientset.Interface
Let's get a TODO on this and an issue. I'd like to move closer to upstream. Do we have a kubeorigin label yet? I think we need to start tracking all the various refactors we need.
TODO added, spawned #14697.
	c.HardPodAffinitySymmetricWeight,
)

if _, err := os.Stat(c.SchedulerConfigFile); err == nil {
Blech. Pre-existing? We don't want to be reading files down at this level.
Can move it to a higher level and pass it down here.
It would be good. If this is pre-existing, I can live with an issue.
i fixed this
//configData, err = ioutil.ReadFile(c.SchedulerServer.PolicyConfigFile)
configData, err = ioutil.ReadFile(c.PolicyConfigFile)
if err != nil {
	return false, fmt.Errorf("unable to read scheduler config: %v", err)
This was true, error. They had a file, we can't read it, intent is not fulfilled. We should die early.
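A minimal sketch of the pass-it-down shape floated above, combined with dying early (SchedulerConfigData is a hypothetical field for the already-read bytes):
// at the higher level, while building the controller config
policyData, err := ioutil.ReadFile(configFile)
if err != nil {
	// a configured file we cannot read means intent is not fulfilled; die early
	return nil, fmt.Errorf("unable to read scheduler config: %v", err)
}
schedulerController := SchedulerControllerConfig{
	SchedulerConfigData: policyData, // hypothetical field name
}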
ctx.InformerFactory.Core().V1().Nodes(),
ctx.InformerFactory.Extensions().V1beta1().DaemonSets(),
c.CloudProvider,
// TODO: Do we need openshift service account here?
why would our service account be special? This one exists upstream, right?
"k8s.io/kubernetes/pkg/serviceaccount" | ||
|
||
configapi "github.com/openshift/origin/pkg/cmd/server/api" | ||
"github.com/openshift/origin/pkg/cmd/server/crypto" | ||
"github.com/openshift/origin/pkg/cmd/server/origin/controller" | ||
) | ||
|
||
func (c *MasterConfig) NewKubernetesControllerInitalizers(kc *kubernetes.MasterConfig) (map[string]kubecontroller.InitFunc, error) { |
Yes, this is the shape I was looking for. More arguments than I may have liked, but exactly the right shape.
@deads2k upstream passes some options via ControllerContext; maybe we can add the stuff we need there and nuke this argument.
pkg/cmd/server/origin/controller.go (outdated)
ret := map[string]kubecontroller.InitFunc{}

persistentVolumeController := kubernetes.PersistentVolumeControllerConfig{
	OpenShiftInfrastructureNamespace: c.Options.PolicyConfig.OpenShiftInfrastructureNamespace,
Open a 3.6 blocker issue. We agreed to make this non-configurable, we need to add a validation warning and sweep for usage to eliminate it.
@deads2k what should I set this to? Hardcode "openshift-infra"?
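If the answer is yes, presumably something as simple as:
// assuming the agreed-on non-configurable namespace
const openshiftInfraNamespace = "openshift-infra"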
pkg/cmd/server/origin/controller.go (outdated)
	// TODO: In 3.7 this is renamed to 'Cloud' and is part of kubernetes ControllerContext
	CloudProvider: kc.CloudProvider,
}
ret["k8s.io/persistent-volume"] = persistentVolumeController.RunController
I think these controllers perfectly parallel upstream in function; let's use their keys and actually stomp them.
done.
"k8s.io/kubernetes/pkg/serviceaccount" | ||
|
||
configapi "github.com/openshift/origin/pkg/cmd/server/api" | ||
"github.com/openshift/origin/pkg/cmd/server/crypto" | ||
"github.com/openshift/origin/pkg/cmd/server/origin/controller" | ||
) | ||
|
||
func (c *MasterConfig) NewKubernetesControllerInitalizers(kc *kubernetes.MasterConfig) (map[string]kubecontroller.InitFunc, error) { | ||
ret := map[string]kubecontroller.InitFunc{} |
I think this function can do the stitching together of the upstream map with our overrides. Put yourself in the rebaser's shoes: you want the new controllers plus our existing overrides without having to do anything specific, but you want one map to look at for rebasing purposes.
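A minimal sketch of that stitching, assuming upstreamInitializers() wraps whatever kube exposes for its default controller map:
func (c *MasterConfig) NewKubernetesControllerInitalizers(kc *kubernetes.MasterConfig) (map[string]kubecontroller.InitFunc, error) {
	// start from the upstream map so a rebase picks up new kube controllers for free
	ret := upstreamInitializers()
	// then stomp openshift-flavored implementations over the same keys
	ret["persistentvolume-binder"] = persistentVolumeController.RunController
	return ret, nil
}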
done, ptal
That works, thanks
pkg/cmd/server/origin/controller.go (outdated)
}
ret["openshift.io/sdn"] = sdnController.RunController

ret["openshift.io/resource-quota-manager"] = controller.RunResourceQuotaManager
Run this under its kube name for now and open an issue to split upstream from downstream. Given shared informers, the watch cost is negligible if the update conflicts can be borne.
@deads2k where can I find the kube names for these?
nvmd. already sorted out.
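For the record, the kube names are just the upstream map keys; assuming "resourcequota" is still the upstream key, the stomp looks like:
// replaces kube's quota manager under its own key
ret["resourcequota"] = controller.RunResourceQuotaManager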
}

func (c *SDNControllerConfig) RunController(ctx ControllerContext) (bool, error) {
	// TODO: Switch SDN to use client.Interface
let's get issues open for these. Individual issues will help us distribute the pain.
clusterQuotaMappingController := controller.ClusterQuotaMappingControllerConfig{
	ClusterQuotaMappingController: c.ClusterQuotaMappingController,
}
ret["openshift.io/cluster-quota-mapping"] = clusterQuotaMappingController.RunController
Does this result in us running twice? I think that we may have to accept it in the MasterConfig as an optional bit for the medium term.
@deads2k no, I plumbed this via the ClusterQuotaMappingController option for now; it deserves an issue as well.
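A minimal sketch of that plumbing; the worker count and Run signature are assumptions:
type ClusterQuotaMappingControllerConfig struct {
	// injected from MasterConfig so the shared instance is only started once
	ClusterQuotaMappingController *clusterquotamapping.ClusterQuotaMappingController
}

func (c *ClusterQuotaMappingControllerConfig) RunController(ctx ControllerContext) (bool, error) {
	go c.ClusterQuotaMappingController.Run(5, ctx.Stop)
	return true, nil
}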
Still failing.
Same problem-ish: are you positive you didn't break role setup?
I'm taking the merge tag off this until I see it go green in testing.
Force-pushed ebe91a0 to 22bcd21
@deads2k guess the sdn plugin should not block (should be wrapped in a goroutine) and the origin-to-rbac converter should run first?
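A minimal sketch of the non-blocking shape being suggested (makePlugin and Start are assumed names):
func (c *SDNControllerConfig) RunController(ctx ControllerContext) (bool, error) {
	// keep setup synchronous so wiring errors still fail startup
	plugin, err := c.makePlugin(ctx) // hypothetical helper
	if err != nil {
		return false, err
	}
	// the event loop blocks, so it must not run on the controller-init path
	go plugin.Start()
	return true, nil
}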
Looks like a flake, [test]
Looks like this broke the idling controller?
@smarterclayton yeah...
Force-pushed 22bcd21 to aa151e1
Force-pushed aa151e1 to 2c0a1e0
Evaluated for origin test up to 2c0a1e0
continuous-integration/openshift-jenkins/merge Waiting: You are in the build queue at position: 1
Evaluated for origin merge up to 2c0a1e0
continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/2458/) (Base Commit: a192b3e) (PR Branch Commit: 2c0a1e0)
Tests are green, nothing merged after the test started. Head of the queue isn't conflicting with this. Pushing it for @mfojtik
Rules: []rbac.PolicyRule{
	rbac.NewRule("get", "create", "update").Groups(networkGroup, legacyNetworkGroup).Resources("clusternetworks").RuleOrDie(),
	rbac.NewRule("get", "list").Groups(networkGroup, legacyNetworkGroup).Resources("egressnetworkpolicies").RuleOrDie(),
	rbac.NewRule("get", "list", "create", "delete").Groups(networkGroup, legacyNetworkGroup).Resources("hostsubnets").RuleOrDie(),
This was not correct - the controller needed watch on hostsubnets. How did our tests not catch this @openshift/networking?
The master only needs to watch on hostsubnets to handle the case where an admin manually creates a hostsubnet for F5 router integration, and there are no end-to-end tests of that.
(Looks like an error was getting logged about the watch failing, but that doesn't get returned to anywhere we'd see it.)
> and there are no end-to-end tests of that.
Though it would have been caught by QE before the release went out.
Fixed in #14968, added watch access to the other needed resources.
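Presumably the fix amounts to adding watch to rules like the one flagged above, e.g.:
rbac.NewRule("get", "list", "watch", "create", "delete").Groups(networkGroup, legacyNetworkGroup).Resources("hostsubnets").RuleOrDie(),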
@@ -144,375 +127,6 @@ func init() {
	InfraSAs.saToRole = map[string]authorizationapi.ClusterRole{}

	err = InfraSAs.addServiceAccount(
		InfraPersistentVolumeRecyclerControllerServiceAccountName,
Everything added here gets auto-reconciled on server start. Moving these broke upgrades for controllers that cause server exits if they are missing permissions (because the ansible reconciliation can't run until the server is up). See https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/3303/testReport/junit/(root)/Reconcile%20Cluster%20Roles%20and%20Cluster%20Role%20Bindings%20and%20Security%20Context%20Constraints/_localhost__Reconcile_Cluster_Roles/
The fix isn't too bad, but those controllers should be updated.
@deads2k do we have an issue to track what needs to be updated?
Follow-up for #14293 and #14317 with more controllers. This should be the final pull, after which all controllers will be switched to the new initialization.
This time:
(This is based on #14317; the first 3 commits will go away after it merges.)