
PodDeletionCost occasionally doesn't work #126138

Open
chymy opened this issue Jul 17, 2024 · 16 comments
Labels
area/controller-manager kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling.

Comments

@chymy
Contributor

chymy commented Jul 17, 2024

What happened?

I created a ReplicationController with 1 replica:
analysis-3-8hplt

Then I scaled the RC to 2 replicas using the scale command, which created a new pod: analysis-3-fbcdl

Next (at 2024-07-11T22:38:10.280375Z), I set the annotation controller.kubernetes.io/pod-deletion-cost: "-1" on the pod analysis-3-8hplt.

After the annotation was set successfully (at 2024-07-11T22:38:10.904066Z), I scaled the RC back down to 1 replica using the scale command, but analysis-3-fbcdl was the pod that was scaled down.

What did you expect to happen?

The analysis-3-8hplt pod should have been scaled down instead.

How can we reproduce it (as minimally and precisely as possible)?

Refer to the description above
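
A minimal reproduction sketch, assuming the ReplicationController is named analysis-3 (inferred from the pod names above) and that the scale-down follows the annotation almost immediately, as in the timestamps above:

$ kubectl scale rc analysis-3 --replicas=2
$ kubectl annotate pod analysis-3-8hplt controller.kubernetes.io/pod-deletion-cost="-1"
$ kubectl scale rc analysis-3 --replicas=1
$ kubectl get pods   # occasionally analysis-3-fbcdl is deleted instead of the annotated analysis-3-8hplt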

Anything else we need to know?

No response

Kubernetes version

$ kubectl version
1.22 

Cloud provider

no

OS version

# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here

Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

@chymy chymy added the kind/bug Categorizes issue or PR as related to a bug. label Jul 17, 2024
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 17, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@chymy
Contributor Author

chymy commented Jul 17, 2024

/area controller-manager

@k8s-ci-robot
Contributor

@chymy: The label(s) sig/appa cannot be applied, because the repository doesn't have them.

In response to this:

/sig appa

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@chymy
Contributor Author

chymy commented Jul 17, 2024

/sig apps

@k8s-ci-robot k8s-ci-robot added sig/apps Categorizes an issue or PR as relevant to SIG Apps. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jul 17, 2024
@github-project-automation github-project-automation bot moved this to Needs Triage in SIG Apps Jul 17, 2024
@chymy
Contributor Author

chymy commented Jul 17, 2024

/cc @ahg-g

@tamilselvan1102

/sig autoscaling

@k8s-ci-robot k8s-ci-robot added the sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. label Jul 17, 2024
@tamilselvan1102

Pod deletion cost does not offer any guarantees on pod deletion order.

@chymy
Contributor Author

chymy commented Jul 17, 2024

Pod deletion cost does not offer any guarantees on pod deletion order.

I don't understand why. Is it because the cache update is slow?

@Adarsh-verma-14
Contributor

Hi @chymy,
if you set the annotation controller.kubernetes.io/pod-deletion-cost with a positive value on the pod analysis-3-8hplt, then the unannotated pod (analysis-3-fbcdl) is preferred for deletion before the pod that carries the deletion cost (analysis-3-8hplt).
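
Restating that rule as a command sketch (the value 100 is only a hypothetical example; a pod without the annotation is treated as cost 0):

$ kubectl annotate pod analysis-3-8hplt controller.kubernetes.io/pod-deletion-cost="100"

With that in place, analysis-3-fbcdl (cost 0) would be preferred for deletion on the next scale-down.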

@Adarsh-verma-14
Contributor

Adarsh-verma-14 commented Jul 24, 2024

So @chymy, can you explain briefly why you expect the pod analysis-3-8hplt to be deleted?

@chymy
Contributor Author

chymy commented Jul 25, 2024

So @chymy, can you explain briefly why you expect the pod analysis-3-8hplt to be deleted?

Because I set the pod-deletion-cost annotation on analysis-3-8hplt, I expected it to be scaled down first. My environment has three master nodes. I think the reason might be that the interval between setting the annotation and scaling down was too short, so some apiserver caches had not fully synchronized.
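
One way to narrow this down (a hedged suggestion, not a confirmed diagnosis) would be to confirm the annotation is already visible on the pod object before issuing the scale-down:

$ kubectl get pod analysis-3-8hplt -o jsonpath='{.metadata.annotations.controller\.kubernetes\.io/pod-deletion-cost}'
-1
$ kubectl scale rc analysis-3 --replicas=1

Even then, the replication controller manager ranks pods using its own informer cache, which can lag slightly behind the apiserver.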

@chymy
Contributor Author

chymy commented Jul 25, 2024

I set the controller.kubernetes.io/pod-deletion-cost: "-1" annotation for analysis-3-8hplt

@cannonpalms

Pod deletion cost does not offer any guarantees on pod deletion order.

Taking "best effort basis" here to mean, "support may or may not be added to older controllers like ReplicationController," is quite the stretch. "Best effort" indicates that there may be situations where it is not possible to provide a guarantee.

If the RC controller is lacking support for pod deletion costs, then a best effort has not been made. There may indeed be a timing issue, as @chymy posited. The bottom line is that "best effort basis" still requires effort; hiding behind "best effort" instead of engaging with a bug report is not an effective long-term strategy.


So @chymy, can you explain briefly why you expect the pod analysis-3-8hplt to be deleted?

@chymy is correct. The default pod deletion cost is 0 when the annotation has not been set, and pods with a lower deletion cost are preferred for deletion over pods with a higher deletion cost. Of the two replicas with the following pod deletion costs, analysis-3-8hplt should have been deleted first, with analysis-3-fbcdl as the second preference:

| pod | pod deletion cost | deletion priority |
| --- | --- | --- |
| analysis-3-8hplt | -1 | 1 |
| analysis-3-fbcdl | 0 | 2 |
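
A quick way to compare the two costs before scaling down (a sketch; the label selector name=analysis-3 is an assumption, adjust it to the RC's actual selector):

$ kubectl get pods -l name=analysis-3 -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.controller\.kubernetes\.io/pod-deletion-cost}{"\n"}{end}'
analysis-3-8hplt	-1
analysis-3-fbcdl

An empty value means the annotation is unset and the cost defaults to 0, so analysis-3-8hplt has the lower cost and should be removed first.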

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 8, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 8, 2024