
Image pull progress should be exposed #19077

Closed
stuartbassett opened this issue Dec 24, 2015 · 58 comments
Labels
area/kubectl, area/usability, lifecycle/rotten, priority/important-soon, sig/node

Comments

@stuartbassett

If a container is waiting for an image to be pulled before it can start, it would be nice to see the progress of that pull in kubectl, so that the user can know if they have time for another cup of coffee.

An API endpoint to give a progress update, possibly with a watch option, would be ideal.
It would also be helpful to include this in the container information in each pod.

For example, running kubectl describe pod/<pod> should return, in addition to all info currently returned, a field containing the % pulled (or number of bytes) of the image that each container uses.

Additionally, running kubectl pull-progress pod/<pod> should return a json encoded summary of each image being pulled in order to start a pod. This should also support a watch option, to notify the client of changes in the progress. There should be an equivalent HTTP API endpoint for this.

I'm interested in using this capability to provide loading bars on a UI.
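
For illustration, the JSON summary the requester describes might look something like this (the pull-progress subcommand and these field names do not exist; this is only a sketch of the request):

  {
    "pod": "default/my-pod",
    "images": [
      {
        "container": "app",
        "image": "registry.example.com/app:v1.2.3",
        "bytesPulled": 52428800,
        "bytesTotal": 209715200,
        "percentPulled": 25
      }
    ]
  }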

@maclof
Contributor

maclof commented Dec 26, 2015

I have a similar use case for this, so I would also like to see this added :)

@fgrzadkowski
Contributor

@kubernetes/goog-ux

@bgrant0607 added the area/kubectl, sig/node, and priority/important-soon labels on Jan 29, 2016
@bgrant0607
Member

See also #19695

cc @vishh

@smarterclayton
Contributor

Yeah, very common request. We've ended up adding heuristics like "if PodStatus is Pending and there are no containers, display the message 'Probably pulling, but we aren't sure the node is there'", etc.
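
To make that heuristic concrete, a rough client-side sketch might look like the following (illustrative only, using the standard k8s.io/api types; this is not code from kubectl or any particular UI):

  package podstatus

  import corev1 "k8s.io/api/core/v1"

  // probablyPulling guesses whether a Pending pod is still waiting on an image
  // pull: if the kubelet has not reported any container statuses yet, the pod
  // is most likely either unscheduled or still pulling images.
  func probablyPulling(pod *corev1.Pod) bool {
      return pod.Status.Phase == corev1.PodPending &&
          len(pod.Status.ContainerStatuses) == 0
  }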

@vishh
Contributor

vishh commented Feb 11, 2016

We generate events for this purpose. We currently have a pulling and a pulled event (https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/container/event.go#L29).

AFAIK, docker does not surface image pull progress. Did that change recently?


@smarterclayton
Contributor

Practically speaking, people use pod status to figure out what the pod is doing. The vast majority of the time between "user creates pod" and "user sees success" is going to be spent pulling. The fact that we have no discrete status indicating that appears to be a bug to people. The event is useful, but if we're updating the status anyway we should know whether we're pulling immediately or not when we write the pod status.


@vishh
Contributor

vishh commented Feb 11, 2016

By Status are you referring to the output of kubectl describe or PodStatus? If users have insight into when the pod was accepted by the kubelet and when it starts and completes (or fails) to pull an image, would that suffice?


@samsabed
Contributor

The requester seems to want a progress measure, e.g. 30% pulled, or x out of y bytes.

@smarterclayton
Contributor

smarterclayton commented Feb 11, 2016 via email

@stuartbassett
Author

#25032

@vishh
Contributor

vishh commented May 2, 2016

@smarterclayton

Right now there is no distinction between waiting for schedule and pull, which is a common state people get into.

The output of kubectl describe pod today includes events that provide the information you seek. Here is an example:

Events:
  FirstSeen LastSeen    Count   From                    SubobjectPath           Type        Reason      Message
  --------- --------    -----   ----                    -------------           --------    ------      -------
  8s        8s      1   {default-scheduler }                            Normal      Scheduled   Successfully assigned busybox-573201948-x4rg4 to kubernetes-minion-31zg
  8s        7s      2   {kubelet kubernetes-minion-31zg}    spec.containers{busybox}    Normal      Pulling     pulling image "busybox"
  7s        7s      1   {kubelet kubernetes-minion-31zg}    spec.containers{busybox}    Normal      Created     Created container with docker id 7ac36eac5dc5
  7s        7s      1   {kubelet kubernetes-minion-31zg}    spec.containers{busybox}    Normal      Started     Started container with docker id 7ac36eac5dc5
  7s        6s      2   {kubelet kubernetes-minion-31zg}    spec.containers{busybox}    Normal      Pulled      Successfully pulled image "busybox"
  6s        6s      1   {kubelet kubernetes-minion-31zg}    spec.containers{busybox}    Normal      Created     Created container with docker id 8a1d39975f1c
  6s        6s      1   {kubelet kubernetes-minion-31zg}    spec.containers{busybox}    Normal      Started     Started container with docker id 8a1d39975f1c

We can possibly add more events that include the progress, if an image pull were to take longer than expected.

@stuartbassett

I don't see why #25032 is needed yet.

@smarterclayton
Contributor

smarterclayton commented May 2, 2016 via email

@vishh closed this as completed on May 2, 2016
@vishh reopened this on May 2, 2016
@vishh
Contributor

vishh commented May 2, 2016

Wouldn't UIs include events as well, at least the critical ones?

@smarterclayton
Contributor

smarterclayton commented May 2, 2016 via email

@Random-Liu
Member

Random-Liu commented May 3, 2016

@smarterclayton It is hard to update the image pulling progress in pod status for now, because pod status is updated before each SyncPod, while the whole image pulling process happens during SyncPod.

The plan for now is to periodically (maybe every 5 or 10 seconds) send an event reporting the current image pull progress, which will at least tell the user whether the pull is stuck.
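
A minimal sketch of that periodic-event idea, assuming a client-go EventRecorder and a hypothetical progress callback that returns the percent pulled (neither the callback nor this wiring exists in the kubelet; it only illustrates the plan):

  package images

  import (
      "time"

      corev1 "k8s.io/api/core/v1"
      "k8s.io/client-go/tools/record"
  )

  // emitPullProgress posts a Pulling event on every tick until done is closed,
  // so a watcher can tell whether the pull is advancing or stuck.
  func emitPullProgress(recorder record.EventRecorder, pod *corev1.Pod, image string,
      progress func() int, interval time.Duration, done <-chan struct{}) {
      ticker := time.NewTicker(interval)
      defer ticker.Stop()
      for {
          select {
          case <-done:
              return
          case <-ticker.C:
              recorder.Eventf(pod, corev1.EventTypeNormal, "Pulling",
                  "pulling image %q: %d%% complete", image, progress())
          }
      }
  }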

@smarterclayton
Contributor

Doesn't pulling then effectively block all sync progress for other containers in the same pod? Or does it merely block parallel startup of the containers?


@vishh
Contributor

vishh commented May 3, 2016

@smarterclayton Assuming that exposing image pull progress is mainly for human consumption, would the following work?

  • If an image is being pulled for a container, then update ContainerStateWaiting to contain a reason PullingImage and a message "progress: x%". The progress will be made available on a best-effort basis.
  • If a pod is in Pending phase because images are being pulled, reflect that in PodStatus.Reason and PodStatus.Message. Message could be an aggregate across all containers.

This would essentially push the burden of generating human friendly pod and container status to Kubelet.
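
As an illustration of that proposal (hypothetical output, not the current API), the resulting status might look roughly like:

  status:
    phase: Pending
    reason: PullingImage              # proposed reason, does not exist today
    message: "pulling 1 of 2 images, ~40% complete"
    containerStatuses:
    - name: app
      state:
        waiting:
          reason: PullingImage        # proposed
          message: "progress: 40%"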

@smarterclayton
Contributor

I don't even think progress is required; I'd be happy with a single reason. But I'll take the message and say that it'll help users even on a best-effort basis.


@smarterclayton
Contributor

Basically, yes, I think that would make 90% of clients better.


@vishh
Contributor

vishh commented May 3, 2016

@stuartbassett Will you be able to re-purpose your PR (#25032) to do what's mentioned in #19077 (comment) ? I can provide more specific design details, if you have difficulty parsing #19077 (comment)

@Random-Liu
Member

Random-Liu commented May 3, 2016

@smarterclayton Yeah, for now it will block the start of all other containers, but we definitely do not want that and should improve it in the future. :)

@nphmuller

/remove-lifecycle stale

@k8s-ci-robot removed the lifecycle/stale label on May 14, 2019
@amadav

amadav commented May 17, 2019

Do we know if there exists any better way of achieving this?

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Aug 15, 2019
@nomcopter

/remove-lifecycle stale

@k8s-ci-robot removed the lifecycle/stale label on Aug 15, 2019
@saschagrunert
Member

We could write up a KEP for this and pitch it in SIG Node. I’d be happy to drive this topic forward, but we should get at least 3 people on board. Who is in? 🙃

@bboreham
Contributor

@saschagrunert I am interested - what sort of commitment do you need?

@saschagrunert
Member

@saschagrunert I am interested - what sort of commitment do you need?

I've never written a KEP, but I'd be thrilled to write one. I would just need relevant input, review, and maybe implementation support. We could create a small working group in Slack if you want. :)

@smarterclayton
Contributor

I commented on the cri-o issue, but I think we could separate this into two parts:

  1. Progress for end users of kube (the issue this was raised for)
  2. Better administrative / operational insight into pulls

The former is definitely Kube, since it would have to be exposed via an API. The second, however, is likely to be fairly specific to the container runtime implementation and the storage that backs it. Given the improvements in monitoring since this issue was opened, and that most deployments likely have Prometheus or something like it watching their container runtimes, the second might be the best place to start: it makes a concrete first step for admins now, while also teaching us more about how we might expose progress. I do not think the former item was intended to solve the latter, and the latter is probably more broadly applicable, since the vast majority of clusters are single-team owned.

@bboreham
Contributor

FWIW my interest comes from working on tools layered on top of Kubernetes, where the rollout is initiated by something other than kubectl (for instance a git commit), and we'd like that tooling to be able to feed back status and/or issues without guessing.

Status needs to be tied to a specific update, since multiple overlapping updates can be issued.

I can't immediately see how Prometheus solves this requirement.

@saschagrunert
Member

I commented on the cri-o issue, but I think we could separate this into two parts:

1. Progress for end users of kube (the issue this was raised for)
2. Better administrative / operational insight into pulls

I'd be happy to push both topics forward. From the runtime perspective: if we have the data at hand, then we can expose it via any interface.

I see four major points, of which the first is out of scope here:

  1. Getting the data inside the runtime (out of scope from here)
  2. Getting the data into the kubelet via the CRI API (KEP)
  3. Exposing the data via the API Server (KEP)
  4. Exposing the data via the CLI (KEP)

Anything else?
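
As one possible shape for point 2, the data carried over the CRI could be as small as the following (a purely hypothetical Go sketch; the real fields and RPCs would be settled in the KEP):

  // ImagePullProgress is a hypothetical per-pull report a runtime could expose
  // to the kubelet through a CRI extension.
  type ImagePullProgress struct {
      Image       string // image reference being pulled
      BytesPulled int64  // bytes downloaded so far
      BytesTotal  int64  // total bytes, 0 if unknown
      StartedAt   int64  // unix timestamp when the pull started
  }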

@saschagrunert
Member

I wrote an email to the SIG Node mailing list regarding the topic and the plan:
https://groups.google.com/forum/#!topic/kubernetes-sig-node/JHEus_TlZzA

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Jan 21, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Feb 20, 2020
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@metametadata

metametadata commented Mar 21, 2020

Is there a ticket where one could vote for disabling fejta-bot?

It pollutes long-standing discussions and eventually closes important issues where, I suspect, people simply got tired of interacting with the bot. I'm certainly annoyed by the notifications from it, both as an author of and a participant in a few issues.

@turowicz

turowicz commented Mar 26, 2021

Is this ever going to be a thing?

@smarterclayton

@aminmr

aminmr commented Jul 12, 2023

Is there no progress on this most-wanted feature request?

@dims
Member

dims commented Aug 9, 2023

For those of you interested in this issue, please coordinate your interest into something actionable, which in our community is a KEP:

Please feel free to use community resources (the sig-node mailing list, the agenda of SIG Node meetings, Google Docs to seed discussion, etc.) to figure out what needs to be in the KEP and how the feature progresses through the community process. Some good info can be found in:

You should probably read one of the earlier sig-node KEPs, for example in:

Looking forward to folks stepping up to help with this! Thanks in advance.

@saschagrunert
Member

Closing the loop, there is a KEP: kubernetes/enhancements#3542
