Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disallow PriorityClass names with 'system-' prefix for user defined priority classes #59382

Merged
merged 2 commits into from
Feb 9, 2018

Conversation

bsalamat
Copy link
Member

@bsalamat bsalamat commented Feb 6, 2018

What this PR does / why we need it:
This PR changes our Priority admission controller to disallow PriorityClass names with 'system-' prefix for user defined priority classes. Please refer to #59381 for reasons why we need this.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #59381

Release note:

Disallow PriorityClass names with 'system-' prefix for user defined priority classes.

ref #57471
/sig scheduling
/assign @liggitt

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Feb 6, 2018
@dixudx
Copy link
Member

dixudx commented Feb 6, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 6, 2018
@liggitt
Copy link
Member

liggitt commented Feb 7, 2018

@kubernetes/sig-scheduling-api-reviews

@k8s-ci-robot k8s-ci-robot added the kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API label Feb 7, 2018
@@ -203,6 +207,9 @@ func (p *PriorityPlugin) validatePriorityClass(a admission.Attributes) error {
if pc.Value > HighestUserDefinablePriority {
return admission.NewForbidden(a, fmt.Errorf("maximum allowed value of a user defined priority is %v", HighestUserDefinablePriority))
}
if strings.HasPrefix(pc.Name, SystemPriorityClassPrefix) {
Copy link
Member

@liggitt liggitt Feb 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a priority class starting with "system-" is created while this plugin is not enabled, does this prevent update/deleting it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, validate in api server?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a priority class starting with "system-" is created while this plugin is not enabled, does this prevent update/deleting it?

It does not prevent deletion, but it does prevent updates. I think this should be fine. We don't allow updating names of priority classes anyway.
I move the new logic to the API side.

@@ -203,6 +207,9 @@ func (p *PriorityPlugin) validatePriorityClass(a admission.Attributes) error {
if pc.Value > HighestUserDefinablePriority {
return admission.NewForbidden(a, fmt.Errorf("maximum allowed value of a user defined priority is %v", HighestUserDefinablePriority))
}
if strings.HasPrefix(pc.Name, SystemPriorityClassPrefix) {
return admission.NewForbidden(a, fmt.Errorf("priority class names with '%v' prefix are reserved for system use only: %v", SystemPriorityClassPrefix, pc.Name))
}
if _, ok := SystemPriorityClasses[pc.Name]; ok {
return admission.NewForbidden(a, fmt.Errorf("the name of the priority class is a reserved name for system use only: %v", pc.Name))
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this is my first time in here. Below, this is attempting to prevent multiple priority class objects that say they are the default, but that's racy/best effort. What happens if multiple defaults end up in the system? Does this prevent those from being updated or deleted?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the invalidateCachedDefaultPriority call wouldn't handle HA apiserver scenarios

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if multiple defaults end up in the system? Does this prevent those from being updated or deleted?

It does not prevent deletion, but it does not allow updates, which is fine in my opinion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the invalidateCachedDefaultPriority call wouldn't handle HA apiserver scenarios

Is this the scenario that is not handled correctly:

  1. Server 1 is the leader. Admission controller populates its cache with default priority.
  2. Server 1 loses its leadership and server 2 becomes leader. Server 2 populates its admission controller cache.
  3. Global defalt priority class is deleted/updated.
  4. Server 2 loses its leadership and server 1 becomes leader again.
  5. Server 1's cache is stale!

This is certainly an issue we need to address. A possible solution is to not cache default priority, but it could cause performance regression. Is there a better solution?

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Feb 8, 2018
Copy link
Member Author

@bsalamat bsalamat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @liggitt. PTAL.

@@ -203,6 +207,9 @@ func (p *PriorityPlugin) validatePriorityClass(a admission.Attributes) error {
if pc.Value > HighestUserDefinablePriority {
return admission.NewForbidden(a, fmt.Errorf("maximum allowed value of a user defined priority is %v", HighestUserDefinablePriority))
}
if strings.HasPrefix(pc.Name, SystemPriorityClassPrefix) {
return admission.NewForbidden(a, fmt.Errorf("priority class names with '%v' prefix are reserved for system use only: %v", SystemPriorityClassPrefix, pc.Name))
}
if _, ok := SystemPriorityClasses[pc.Name]; ok {
return admission.NewForbidden(a, fmt.Errorf("the name of the priority class is a reserved name for system use only: %v", pc.Name))
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if multiple defaults end up in the system? Does this prevent those from being updated or deleted?

It does not prevent deletion, but it does not allow updates, which is fine in my opinion.

@@ -203,6 +207,9 @@ func (p *PriorityPlugin) validatePriorityClass(a admission.Attributes) error {
if pc.Value > HighestUserDefinablePriority {
return admission.NewForbidden(a, fmt.Errorf("maximum allowed value of a user defined priority is %v", HighestUserDefinablePriority))
}
if strings.HasPrefix(pc.Name, SystemPriorityClassPrefix) {
return admission.NewForbidden(a, fmt.Errorf("priority class names with '%v' prefix are reserved for system use only: %v", SystemPriorityClassPrefix, pc.Name))
}
if _, ok := SystemPriorityClasses[pc.Name]; ok {
return admission.NewForbidden(a, fmt.Errorf("the name of the priority class is a reserved name for system use only: %v", pc.Name))
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the invalidateCachedDefaultPriority call wouldn't handle HA apiserver scenarios

Is this the scenario that is not handled correctly:

  1. Server 1 is the leader. Admission controller populates its cache with default priority.
  2. Server 1 loses its leadership and server 2 becomes leader. Server 2 populates its admission controller cache.
  3. Global defalt priority class is deleted/updated.
  4. Server 2 loses its leadership and server 1 becomes leader again.
  5. Server 1's cache is stale!

This is certainly an issue we need to address. A possible solution is to not cache default priority, but it could cause performance regression. Is there a better solution?

@@ -203,6 +207,9 @@ func (p *PriorityPlugin) validatePriorityClass(a admission.Attributes) error {
if pc.Value > HighestUserDefinablePriority {
return admission.NewForbidden(a, fmt.Errorf("maximum allowed value of a user defined priority is %v", HighestUserDefinablePriority))
}
if strings.HasPrefix(pc.Name, SystemPriorityClassPrefix) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a priority class starting with "system-" is created while this plugin is not enabled, does this prevent update/deleting it?

It does not prevent deletion, but it does prevent updates. I think this should be fine. We don't allow updating names of priority classes anyway.
I move the new logic to the API side.

@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 8, 2018
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Feb 8, 2018
@k82cn
Copy link
Member

k82cn commented Feb 9, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 9, 2018
@liggitt
Copy link
Member

liggitt commented Feb 9, 2018

validation change looks reasonable (and is acceptable imo because priority feature was alpha in 1.9)... the error message from the admission plugin when an unknown system-... priority is used is misleading ("no PriorityClass with name %v was found") since no PriorityClass with that name would be allowed, but that can be improved as a follow up

/approve

@@ -44,6 +44,8 @@ const (
)

// SystemPriorityClasses defines special priority classes which are used by system critical pods that should not be preempted by workload pods.
// NOTE: In order to avoid conflict of names with user-defined priority classes, all the names must
// start with scheduling.SystemPriorityClassPrefix which is by default "system-".
var SystemPriorityClasses = map[string]int32{
"system-cluster-critical": SystemCriticalPriority,
"system-node-critical": SystemCriticalPriority + 1000,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we manage the magic numbers like 1000? At least we should indicate why it is 1000.
WDYT, Bobby?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wgliang, the exact value of priority does not have any significance, only the relative value of priority compared to other priorites matters. So, if system-node-critical value was SystemCriticalPriority + 1, it would have had the same effect as SystemCriticalPriority + 1000, but we choose to use a larger number so that we can add more priority classes between the two existing ones if there is a need in the future.

@bsalamat
Copy link
Member Author

bsalamat commented Feb 9, 2018

@deads2k Could you please approve the change under /admission? It is only a change in the comment.

@deads2k
Copy link
Contributor

deads2k commented Feb 9, 2018

Tightening validation sets of my spidey-sense, but it looks like api-approvers were already here.

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bsalamat, deads2k, dixudx, k82cn, liggitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these OWNERS Files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 9, 2018
@k8s-github-robot
Copy link

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Disallow PriorityClass names with 'system-' prefix for user defined priority classes
8 participants