Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pull image from a cross-subscription azure container registry does not work by MSI #67892

Closed
andyzhangx opened this issue Aug 27, 2018 · 10 comments · Fixed by #77245
Closed

pull image from a cross-subscription azure container registry does not work by MSI #67892

andyzhangx opened this issue Aug 27, 2018 · 10 comments · Fixed by #77245
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@andyzhangx
Copy link
Member

andyzhangx commented Aug 27, 2018

Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug

Uncomment only one, leave it on its own line:

/kind bug
/kind feature

What happened:
PR(#66429) fixed pull image error from a cross-subscription azure container registry by service principal, while for MSI, this issue is not fixed.
Need to find a way how to fix this issue for MSI, related code is here:

if a.config.UseManagedIdentityExtension {
glog.V(4).Infof("listing registries")
result, err := a.registryClient.List(ctx)
if err != nil {
glog.Errorf("Failed to list registries: %v", err)
return cfg
}
for ix := range result {
loginServer := getLoginServer(result[ix])
glog.V(2).Infof("loginServer: %s", loginServer)
cred, err := getACRDockerEntryFromARMToken(a, loginServer)
if err != nil {
continue
}
cfg[loginServer] = *cred
}

Here the difficulty is registryClient.List() could not list all ACR since it's based on one subscriptionID:

registryClient := containerregistry.NewRegistriesClient(subscriptionID)

So if we could not get loginServer in a different subscription, how could we set the DockerConfigEntry.

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):
error msg would be like following when pull a repository from an ACR in another subscription:

Events:
  Type     Reason     Age                From                               Message
  ----     ------     ----               ----                               -------
  Normal   Scheduled  31s                default-scheduler                  Successfully assigned default/nginx to k8s-agentpool-95620664-0
  Normal   Pulling    15s (x2 over 30s)  kubelet, k8s-agentpool-95620664-0  pulling image "andyacr.azurecr.io/nginx-server:1.0.0"
  Warning  Failed     14s (x2 over 28s)  kubelet, k8s-agentpool-95620664-0  Failed to pull image "xxx.azurecr.io/nginx-server:1.0.0": rpc error: code = Unknown desc = Error response from daemon: Get https://xxx.azurecr.io/v2/nginx-server/manifests/1.0.0: unauthorized: authentication required
  Warning  Failed     14s (x2 over 28s)  kubelet, k8s-agentpool-95620664-0  Error: ErrImagePull
  Normal   BackOff    1s (x2 over 27s)   kubelet, k8s-agentpool-95620664-0  Back-off pulling image "andyacr.azurecr.io/nginx-server:1.0.0"
  Warning  Failed     1s (x2 over 27s)   kubelet, k8s-agentpool-95620664-0  Error: ImagePullBackOff

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version): v.9 - v1.12
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

/sig azure
/assign

@andyzhangx
Copy link
Member Author

andyzhangx commented Sep 10, 2018

By further investigation, I think we should implement LazyProvider for azure acrProvider:

// DockerConfigProvider is the interface that registered extensions implement
// to materialize 'dockercfg' credentials.
type DockerConfigProvider interface {
	// Enabled returns true if the config provider is enabled.
	// Implementations can be blocking - e.g. metadata server unavailable.
	Enabled() bool
	// Provide returns docker configuration.
	// Implementations can be blocking - e.g. metadata server unavailable.
	Provide() DockerConfig
	// LazyProvide() gets called after URL matches have been performed, so the
	// location used as the key in DockerConfig would be redundant.
	LazyProvide() *DockerConfigEntry
}

Related code:

authConfig := credentialprovider.LazyProvide(currentCreds)

func LazyProvide(creds LazyAuthConfiguration) dockertypes.AuthConfig {
if creds.Provider != nil {
entry := *creds.Provider.LazyProvide()
return DockerConfigEntryToLazyAuthConfiguration(entry).AuthConfig
} else {
return creds.AuthConfig

@andyzhangx
Copy link
Member Author

Update:
It's hard to fix this since in the beginning we cannot list ACR in another subscription, registryClient.List(ctx) will always list acr under current subscription, and without target loginServer address of ACR, we cannot get registryRefreshToken:

directive, err := receiveChallengeFromLoginServer(loginServer)

And in LazyProvide func, we cannot get target repo, it's not passed on

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 17, 2018
@fejta-bot
Copy link

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 16, 2019
@justaugustus
Copy link
Member

@andyzhangx -- is this still an issue? If not, let's /close it out.

@andyzhangx
Copy link
Member Author

@justaugustus it's still an issue, and currently it should be low priority, let's keep it

@fejta-bot
Copy link

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Copy link
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@justaugustus
Copy link
Member

/remove-lifecycle rotten
/reopen

@k8s-ci-robot
Copy link
Contributor

@justaugustus: Reopened this issue.

In response to this:

/remove-lifecycle rotten
/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot reopened this Feb 19, 2019
@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Feb 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants