Portainer fails to pull images with authenticated registry and breaks deployments #9040

Open
figassis opened this issue Jun 5, 2023 · 13 comments

@figassis

figassis commented Jun 5, 2023

Bug description
When using private images, deployments via webhook fail with "pull access denied for mycontainer, repository does not exist or may require 'docker login': denied: requested access to the resource is denied"

Expected behavior
Redeployment via webhook should succeed if properly authenticated.

Portainer Logs
This is the response I receive

{
  "message": "Failed to update the stack",
  "details": "failed to deploy a docker compose stack 25: failed to pull images of the stack:  IMAGE_NAME Pulling  IMAGE_NAME Error Error response from daemon: pull access denied for USERNAME/IMAGE_NAME, repository does not exist or may require 'docker login': denied: requested access to the resource is denied"
}

Steps to reproduce the issue:

  1. Set up Portainer CE on host A
  2. Set up a Portainer environment on host B (Docker standalone)
  3. Add Docker Hub as an authenticated registry in Portainer (the expectation is that this will always be used to pull anything from Docker Hub)
  4. Log in to Docker Hub on both hosts
  5. Create a Git stack on the host B environment and set up a webhook
  6. Make changes to the repository and push, then call the webhook

Result: webhook will fail with message below:

{
  "message": "Failed to update the stack",
  "details": "failed to deploy a docker compose stack 25: failed to pull images of the stack:  IMAGE_NAME Pulling  IMAGE_NAME Error Error response from daemon: pull access denied for USERNAME/IMAGE_NAME, repository does not exist or may require 'docker login': denied: requested access to the resource is denied"
}

Technical details:

  • Portainer version: 2.18
  • Docker version: 20.10.12
  • Kubernetes version (managed by Portainer):
  • Platform: Ubuntu 20.04
  • Command used to start Portainer (docker run -p 9443:9443 portainer/portainer):
version: "3.5"
services:
  portainer:
    container_name: portainer
    image: portainer/portainer-ce:latest
    restart: always
    security_opt:
      - no-new-privileges:true
    tmpfs:
      - /tmp
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./.portainer:/data
  • Browser: chrome
  • Use Case (delete as appropriate): Using Portainer at Home.
  • Have you reviewed our technical documentation and knowledge base? Yes

Additional context
It looks like the Portainer server always tries to pull the image locally (which makes no sense if it is deploying remotely) and never uses the authenticated Docker Hub registry; it uses the anonymous one and fails. The solution would be to let the remote agent deal with this, but there is no option to get around it because the local pull is hardcoded. I tested the changes below with a custom image and it just works.

image
jamescarppe self-assigned this Jun 6, 2023
@jamescarppe
Member

I've just been attempting to replicate this myself on a fresh CE 2.18.3 installation but am not running into the same issue you are. The steps I took:

  1. Deploy a new Portainer CE server installation (using your example compose file)
  2. Add a new Agent (2.18.3) on a remote server from the new CE server
  3. Add my DockerHub credentials on the CE server
  4. Create a Git repository with a docker-compose.yml referencing a private image in my DockerHub account
  5. Connect to the remote server from within Portainer
  6. Create a new Git stack from the above-created Git repository, enabling Automatic Updates in webhook mode and copying the webhook
  7. Confirm the stack deployed successfully
  8. Make a change to the docker-compose.yml in the Git repository (in this case, changing the image tag but not the image, to force a pull)
  9. Trigger the webhook to update the stack
  10. Confirm the stack redeployed with the new image tag (which it did)

I also then made a different change to the docker-compose.yml file just to make sure (in this case, I published an additional port) and triggered the webhook again, and once again the redeployment was successful, no errors.

One thing to note about my steps above is that I did not do a docker login on either the server or the agent at any point - this shouldn't be necessary to do when using Portainer, so I left it out.

Are you able to give us some more information on your setup, on both the server and agent side, including any specifics that might be relevant (networking, firewall, etc.)? Also, if there's anything special about your private image that you're able to share, that might help.
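
For context on why a host-level docker login should not be needed: the Docker Engine API accepts registry credentials per request, in an X-Registry-Auth header that carries base64url-encoded JSON, while docker login stores credentials in the CLI's own config file. The sketch below only illustrates that encoding. The function names are hypothetical, the credentials are placeholders, and nothing here is taken from Portainer's code or confirms what Portainer sends on the webhook path.

```python
import base64
import json


def registry_auth_header(username: str, password: str, server: str) -> str:
    """Encode credentials the way the Docker Engine API expects them
    in the X-Registry-Auth header (base64url-encoded JSON)."""
    payload = {
        "username": username,
        "password": password,
        "serveraddress": server,
    }
    raw = json.dumps(payload).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_registry_auth(header: str) -> dict:
    """Reverse of registry_auth_header, handy when inspecting a captured request."""
    return json.loads(base64.urlsafe_b64decode(header))


if __name__ == "__main__":
    # Placeholder credentials; https://index.docker.io/v1/ is Docker Hub's server address.
    header = registry_auth_header("USERNAME", "ACCESS_TOKEN", "https://index.docker.io/v1/")
    print(decode_registry_auth(header)["serveraddress"])
```

If a pull request reaches the daemon without this header, the daemon pulls anonymously, which would be consistent with the "pull access denied" error reported above.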

@figassis
Author

figassis commented Jun 6, 2023

I ran docker login only after the issue started, so this happened without it as well.
The servers are part of a Tailscale network and can both reach each other.
There's nothing special about the image; it's a basic Go image built from the following Dockerfile:

FROM golang:alpine AS builder
ENV GO111MODULE=on

RUN mkdir -p /build

WORKDIR /build
# Copy the module files first so the dependency download layer is cached
COPY go.mod go.sum ./
RUN apk add git && go mod download
COPY . .

# Build a statically linked binary
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -ldflags '-extldflags "-static"' -o /app .

# FROM alpine
FROM gcr.io/distroless/base
COPY --from=builder /app /app

WORKDIR /

EXPOSE 8080

CMD ["./app"]

I'm wondering though, why do we need to pull an image on the server when we want to deploy somewhere else? It's not like we're shipping the image to the agent.

@jamescarppe
Member

I'm wondering though, why do we need to pull an image on the server when we want to deploy somewhere else? It's not like we're shipping the image to the agent.

We don't - I didn't during my attempted replication of your issue. I've just checked to confirm, the image I deployed on the remote environment is not on my Portainer Server environment at all.

@kamlad

kamlad commented Jun 13, 2023

@figassis Would you be able to provide your code changes via a pull request and Portainer server and agent log files? Thanks.

@figassis
Author

@kamlad Sorry for the late reply, I got sidetracked at work. I'll submit it as soon as I clean it up a bit.

@ncstc1

ncstc1 commented Aug 11, 2023

We were hit by this problem after not resisting the upgrade temptation.

Here is some information that would have helped us, in case it can be useful to others...

Our workaround was to go back to an earlier version (we used both 2.15.1 and 2.16.0).
Note that just downgrading was not enough and the webhooks had to be created again as the problem seems to persist in the database (BTW, we redeployed the impacted stacks from scratch too, just in case).

This being said, we saw that there were many other similar issues that did not help but we missed this one (#7792) before using the workaround... maybe this is a better solution. Please confirm if you follow that path 😃
We would prefer if we could use a more recent version!

@jonasfoyth

jonasfoyth commented Aug 17, 2023

After a fresh install of a 2.18.4 (latest) CE server, I have the same problem. Steps:

1- Set up the server (latest version)
2- Set up the client (Docker standalone / Edge Agent standard, latest version)
3- Set up the private registry

image

4- Create an edge group
5- Create an edge stack using an image from a private Docker Hub repository, e.g.:

services:
  smartgrid:
    image: "docker.io/jonasfoyfth/smartgrid:latest"

The deployment fails with:

smartgrid Pulling smartgrid Error Error response from daemon: pull access denied for jonasfoyfth/smartgrid, repository does not exist or may require 'docker login': denied: requested access to the resource is denied

On the other hand, if I pull directly (connect to the environment and go to Images), it works fine:

image

And the edge stack's compose works because the image was already deployed:

image
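
Note that the error above names jonasfoyfth/smartgrid although the compose file says docker.io/jonasfoyfth/smartgrid:latest. Docker treats Docker Hub as the default registry and drops the prefix in messages, so both spellings refer to the same repository and should match the same stored credentials. The following is a minimal sketch of that splitting rule, simplified from Docker's reference normalization; the function name is hypothetical and it does not validate the reference.

```python
def split_image_reference(ref: str) -> tuple:
    """Return (registry, repository) for an image reference.

    Simplified rule: the first path component is a registry host only if it
    contains '.' or ':' or equals 'localhost'; otherwise Docker Hub is assumed.
    """
    name = ref.split("@", 1)[0]  # drop a digest, if any
    first, sep, rest = name.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        registry, repository = first, rest
    else:
        registry, repository = "docker.io", name
    if registry == "index.docker.io":
        registry = "docker.io"
    # A tag is a ':' that appears after the last '/'
    colon = repository.find(":", repository.rfind("/") + 1)
    if colon != -1:
        repository = repository[:colon]
    # Official images live under the implicit "library/" namespace
    if registry == "docker.io" and "/" not in repository:
        repository = "library/" + repository
    return registry, repository


if __name__ == "__main__":
    print(split_image_reference("docker.io/jonasfoyfth/smartgrid:latest"))
    print(split_image_reference("jonasfoyfth/smartgrid"))
```

Both calls in the usage block resolve to the same registry and repository, so a difference in spelling alone should not explain the denied pull.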

@sondreslathia

I can confirm this on a clean server (just installed from scratch). Running 2.19.1 CE.

@foureight84

I can confirm this on a clean server (just installed from scratch). Running 2.19.1 CE.

Same for me on 2.19.4 BE, on a swarm with 5 nodes. It's been happening for a while. Sometimes updating the stack with the "pull image" toggle turned on will work, but most of the time I receive "Cannot find image x." This is pulling from a private ECR repo.

@ncstc1

ncstc1 commented Feb 15, 2024

As the problem has been there for a while without much visible progress, can someone from the team tell us if it has been identified and if there is a good chance it will be fixed in the next release?

Thanks in advance.

@JanBednarik

JanBednarik commented Jun 7, 2024

I can confirm the same problem after upgrading Portainer to 2.20.3 BE.

We have a GitLab server with a mix of public and private projects with Docker registries. Credentials (username and PAT) with access to all projects and registries are configured in the Portainer Registries UI. I can see and pull both public and private images through the Portainer Images UI. The same credentials are also saved via docker login on all Swarm nodes, where I can docker pull both public and private images.

When I try to Update a Service in Portainer UI with option "Re-pull image" checked, then it does not pull private images. The same issue happens when using Portainer API PUT /endpoints/{id}/forceupdateservice. Both methods worked fine in older Portainer versions. When it happens, syslog contains errors like:

swarm1 dockerd: level=info msg="Attempting next endpoint for pull after error: Head \"https://docker-registry.xxx.xx/v2/xx/xx/manifests/master\": unauthorized: HTTP Basic: Access denied. The provided password or token is incorrect or your account has 2FA enabled and you must use a personal access token instead of a password. See https://gitlab.xxx.xx/help/user/profile/account/two_factor_authentication#troubleshooting"
swarm1 dockerd: level=error msg="pulling image failed" error="Head \"https://docker-registry.xxx.xx/v2/xx/xx/manifests/master\": unauthorized: HTTP Basic: Access denied. The provided password or token is incorrect or your account has 2FA enabled and you must use a personal access token instead of a password. See https://gitlab.xxx.xx/help/user/profile/account/two_factor_authentication#troubleshooting" module=node/agent/taskmanager node.id=xx service.id=xx task.id=xx

In that case Portainer fails to start Tasks, with status Rejected and the error message "No such image: ...". Tasks keep getting rejected until they start on some Node that still has the image from a previous deployment. So instead of redeploying with new images, it just restarts the service using old images.

A partial workaround is to stop and start the Stack. In that case it pulls images fine, but only onto the Nodes where it just started new Tasks. Other Nodes are left with outdated images, which may be used if the Service for some reason migrates to those Nodes.

The only safe workaround is to manually pull images on all Swarm Nodes before updating a service with Portainer.
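
The manual pre-pull workaround could be scripted roughly as follows. This is a sketch under assumptions not stated above: each Swarm node is reachable over SSH from where the script runs, docker login has already been done on each node, and the node names other than swarm1 are hypothetical. The image name reuses the redacted placeholder from the logs.

```python
import subprocess


def build_prepull_commands(nodes, image):
    """One `ssh <node> docker pull <image>` argv list per node."""
    return [["ssh", node, "docker", "pull", image] for node in nodes]


def prepull(nodes, image, run=subprocess.run):
    """Pull the image on every node and return the nodes where the pull failed.

    `run` is injectable so the function can be exercised without real hosts.
    """
    failed = []
    for node, cmd in zip(nodes, build_prepull_commands(nodes, image)):
        result = run(cmd)
        if result.returncode != 0:
            failed.append(node)
    return failed


if __name__ == "__main__":
    nodes = ["swarm1", "swarm2"]  # swarm2 is a hypothetical node name
    failed = prepull(nodes, "docker-registry.xxx.xx/xx/xx:master")
    if failed:
        print("pull failed on:", ", ".join(failed))
```

Updating the service only after the returned list is empty avoids Tasks landing on a Node that still holds an outdated image.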

@JoseArellanoV

JoseArellanoV commented Dec 10, 2024

Any updates on this? We used 2.20.1 CE and we are facing the same issue.
For the registry image:

  • registry:2

@john8329

john8329 commented Jan 8, 2025

Same here: ECR images sometimes aren't downloaded because of expired tokens. Re-deploying the stack works, but updating a single service doesn't trigger the token renewal.
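
ECR authorization tokens are valid for 12 hours, so any update path that reuses a cached token has to check its age first. The sketch below shows one way such a check could look. The function name and the 30-minute safety margin are assumptions for illustration, not Portainer's implementation.

```python
from datetime import datetime, timedelta, timezone

# ECR authorization tokens expire 12 hours after they are issued.
ECR_TOKEN_LIFETIME = timedelta(hours=12)


def token_needs_refresh(issued_at, now=None, margin=timedelta(minutes=30)):
    """True if a token issued at `issued_at` is expired or within `margin` of expiring."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= issued_at + ECR_TOKEN_LIFETIME - margin


if __name__ == "__main__":
    issued = datetime.now(timezone.utc) - timedelta(hours=13)
    print(token_needs_refresh(issued))
```

Running the same check before a single-service update as before a full stack redeploy would cover the case described here.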
