Scheduler crash on 0.12.0 for rabbitmq example #5593

Closed
jvkersch opened this issue Mar 18, 2015 · 3 comments

@jvkersch

The rabbitmq example in the docs specifies a "cpu" keyword that doesn't seem to be listed in https://developers.google.com/compute/docs/containers/container_vms#container_manifest.

After creating the rabbitmq controller from the example, the rabbitmq pod is shown as "Pending" indefinitely. At the same time, creating the rabbitmq controller leads to the following stacktrace in /var/log/scheduler.log:

I0318 13:15:11.501964    4582 factory.go:156] About to try and schedule pod rabbitmq-controller-hh2gd
I0318 13:15:11.505716    4582 util.go:68] Recovered from panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
/go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/pkg/util/util.go:62
/go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/pkg/util/util.go:53
/usr/src/go/src/runtime/asm_amd64.s:401
/usr/src/go/src/runtime/panic.go:387
/usr/src/go/src/runtime/panic.go:42
/usr/src/go/src/runtime/sigpanic_unix.go:26
/go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/pkg/scheduler/predicates.go:112
/go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/pkg/scheduler/predicates.go:138
/go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/pkg/scheduler/generic_scheduler.go:110
/go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/pkg/scheduler/generic_scheduler.go:63
/go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/scheduler/scheduler.go:75
/go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/plugin/pkg/scheduler/scheduler.go:69
/go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/pkg/util/util.go:107
/go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/pkg/util/util.go:108
/go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/pkg/util/util.go:92
/usr/src/go/src/runtime/asm_amd64.s:2232

Deleting the "cpu" entry from the rabbitmq controller file and resubmitting results in the pod being scheduled without any trouble. My questions:

  1. Does the "cpu" keyword have any meaning in the context of the controller file, or should it be removed?
  2. The crash seems to be caused by the scheduler not being able to find a minion with the desired characteristics. I haven't looked into this in much detail, but it would make sense for the scheduler to log an error rather than crash.

I'm using Kubernetes 0.12.0 and I've deployed a 4-minion cluster to GCE. Our changes to cluster/gce/config-default.sh are the following:

ZONE=${KUBE_GCE_ZONE:-europe-west1-b}
MASTER_SIZE=g1-small
MINION_SIZE=n1-highmem-2
MINION_DISK_TYPE=pd-ssd
MINION_DISK_SIZE=210GB

@zmerlynn
Member

I believe this is fixed in #5544, which is in 0.13.1 (it's the only thing, besides some odd catch-up git work, that I cherry-picked into the release).

@zmerlynn
Member

(But please reopen if not!)

@jvkersch
Author

Thanks @zmerlynn, that looks like it will fix things!

Is there any value in removing the cpu keyword from the RabbitMQ example, since it doesn't look like it carries any significance? Please let me know if I've misunderstood this.
