Set timeout for docker image pull calls #26300

Closed
yujuhong opened this issue May 25, 2016 · 3 comments

yujuhong commented May 25, 2016

We recently removed the timeout for image pulls because the default was too low (2 minutes) and it is hard to gauge what value would work for users.

However, we still need some timeout on these requests to handle the case where a request hangs indefinitely.
Off the top of my head, there are two options:

  1. Add the timeout back. Set the default to something longer (e.g., 10 minutes) and make it configurable through kubelet flags.
  2. Detect whether the image pull is making progress by leveraging the pull progress reporter (#26145, "Kubelet: Periodically reporting image pulling progress in log") and cancel the request if progress has stalled for longer than a threshold.

/cc @kubernetes/sig-node

@yujuhong added the area/reliability and sig/node labels May 25, 2016
@yujuhong

/cc @Random-Liu

@Random-Liu

I'll try option 2 once #26145 gets merged.

ncdc commented May 25, 2016

+1 to option 2

@Random-Liu self-assigned this Jun 1, 2016
k8s-github-robot pushed a commit that referenced this issue Jun 3, 2016
Automatic merge from submit-queue

Add timeout for image pulling

Fix #26300.

With this PR, if image pulling makes no progress for *1 minute*, the operation will be cancelled. Docker reports progress for every 512kB block (see [here](https://github.com/docker/docker/blob/3d13fddd2bc4d679f0eaa68b0be877e5a816ad53/pkg/progress/progressreader.go#L32)), so a pull that trips the timeout is moving less than *512kB/min*, i.e. a throughput of *<= 8.5kB/s*, which is slow enough to be treated as abnormal.
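
As a quick check of that figure, assuming one progress report per 512kB block and a 60-second stall window, the throughput bound works out to:

```latex
\frac{512\ \text{kB}}{60\ \text{s}} \approx 8.53\ \text{kB/s}
```

A pull that reports no new block for a full minute is therefore running below roughly 8.5kB/s.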

It's a little hard to write a unit test for this, so I just tested it manually. If I set `defaultImagePullingStuckTimeout` to 0s and `defaultImagePullingProgressReportInterval` to 1s, image pulling is cancelled:
```
E0601 18:48:29.026003   46185 kube_docker_client.go:274] Cancel pulling image "nginx:latest" because of no progress for 0, latest progress: "89732b811e7f: Pulling fs layer "
E0601 18:48:29.026308   46185 manager.go:2110] container start failed: ErrImagePull: net/http: request canceled
```

/cc @kubernetes/sig-node 