Deployment: label already present as selector on RC results in max number of 'Terminating' pods. #24039

Closed
jpprins1 opened this issue Apr 8, 2016 · 8 comments
Labels
kind/support Categorizes issue or PR as a support question.

Comments

@jpprins1

jpprins1 commented Apr 8, 2016

  1. Tried the deployment example at: http://kubernetes.io/docs/user-guide/deployments/
  2. RC was already present with label selector "nginx"
  3. Got lots of pods stuck in 'Terminating' status (and some 'Pending' pods).

Note: The easy solution, of course, is not to use the same label. However, after removing the deployment, the 'Terminating' pods were still present, and in my case manually removing them did not help the 'Pending' pods get to 'Running'. (In the end I needed to reboot the cluster.)
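
For anyone who ends up in the same state, a possible cleanup (just a sketch, assuming both the RC and the Deployment are named nginx as in the docs example) is to delete both controllers and then watch the pods drain:

$ kubectl delete deployment nginx
$ kubectl delete rc nginx
$ kubectl get pods --show-labels   # repeat until only the pods you expect remain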

@vishh
Contributor

vishh commented Apr 8, 2016

  1. RC was already present with label selector "nginx"

Are you referring to Replication Controllers or Replica Sets?

@bgrant0607 added the kind/support label on Apr 8, 2016
@bgrant0607
Member

cc @janetkuo @Kargakis

@bgrant0607
Member

@jpprins1 What release of K8s? What cloud provider? What OS distro?

@janetkuo
Member

janetkuo commented Apr 8, 2016

I can reproduce it on HEAD.

$ kubectl run nginx --image=nginx --generator="run/v1"
replicationcontroller "nginx" created
$ kubectl get pods 
NAME          READY     STATUS    RESTARTS   AGE
nginx-g9f7n   1/1       Running   0          6s
$ kubectl run nginx --image=nginx 
deployment "nginx" created
$ kubectl get pods 
NAME                     READY     STATUS        RESTARTS   AGE
nginx-2040093540-0v8ez   0/1       Terminating   0          2s
nginx-2040093540-1j5r2   0/1       Terminating   0          2s
nginx-2040093540-1yn41   0/1       Terminating   0          1s
nginx-2040093540-6ens4   0/1       Terminating   0          1s
nginx-2040093540-aqh5o   0/1       Terminating   0          2s
nginx-2040093540-dhfcn   0/1       Terminating   0          2s
nginx-2040093540-e8mxp   0/1       Pending       0          0s
nginx-2040093540-k1yap   0/1       Terminating   0          2s
nginx-2040093540-lzjt6   0/1       Terminating   0          1s
nginx-2040093540-npzqs   0/1       Terminating   0          2s
nginx-2040093540-pcprw   0/1       Terminating   0          1s
nginx-2040093540-qs07o   0/1       Terminating   0          2s
nginx-2040093540-uemc8   0/1       Terminating   0          1s
nginx-2040093540-vcio0   0/1       Terminating   0          2s
nginx-2040093540-w0bpq   0/1       Terminating   0          2s
nginx-g9f7n              1/1       Running       0          19s
...

This will happen when the deployment and replication controller (but not replica set) have the same label selectors.
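
One way to confirm the overlap (a sketch, assuming the names from the repro above) is to compare the two controllers' selectors against the pod labels:

$ kubectl describe rc nginx | grep -i selector
$ kubectl describe deployment nginx | grep -i selector
$ kubectl get pods --show-labels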

@jpprins1
Author

jpprins1 commented Apr 8, 2016

  1. RC = Replication Controller.

  2. Version info:

Client Version: version.Info{Major:"1", Minor:"2", GitVersion:"v1.2.1", GitCommit:"50809107cd47a1f62da362bccefdd9e6f7076145", GitTreeState:"clean"}
Server Version: version.Info{Major:"1", Minor:"2", GitVersion:"v1.2.0", GitCommit:"5cb86ee022267586db386f62781338b0483733b3", GitTreeState:"clean"}

  3. OS info:

Master: CoreOS beta (991.1.0)
Node 1 & 2: CoreOS beta (991.2.0)

Got similar output as Janet.

@janetkuo
Member

janetkuo commented Apr 8, 2016

It's because the RC controls one set of "nginx" pods (labels: {run=nginx}), while the Deployment generates an RS that controls another set of "nginx" pods (labels: {run=nginx, pod-template-hash=xxx}).

The RC can see and control the RS's pods, but the RS can't control (and doesn't see) the RC's pods. So the RC is fighting with the RS: whenever the RS scales up its own pods, the RC sees them and scales them down; the RS then finds it doesn't have enough replicas and creates more, and so on.
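
The tug-of-war shows up if you watch both controllers and the pod churn (a sketch, assuming the names above):

$ kubectl get rc,rs         # the RC and the Deployment's RS both claim the same pods
$ kubectl get pods --watch  # pods created by the RS keep getting terminated by the RC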

@janetkuo
Member

janetkuo commented Apr 9, 2016

Note: The easy solution, of course, is not to use the same label. However, after removing the deployment, the 'Terminating' pods were still present, and in my case manually removing them did not help the 'Pending' pods get to 'Running'. (In the end I needed to reboot the cluster.)

The pods are not hanging; they're actually terminating (by default the termination grace period is 30s). After deleting the Deployment, wait for a while and you should see the number of pods start decreasing (run something like kubectl get pods | wc -l to count them), and eventually you'll see only the pods controlled by the RC.
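
A quick way to watch this happen (a sketch):

$ kubectl delete deployment nginx
$ kubectl get pods | wc -l   # re-run every few seconds; the count should shrink as pods finish terminating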

@bgrant0607
Member

This will be mitigated by #24946. We also plan to eventually autogenerate unique labels/selectors, as we do for Job.
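
Until then, one workaround (a sketch; the label value here is only an example) is to start the Deployment with labels that the RC's selector doesn't match:

$ kubectl run nginx-deployment --image=nginx --labels="app=nginx-deployment"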
