Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update cri-api dependency to v0.26.0-beta.0 #7656

Merged
merged 2 commits into from
Nov 18, 2022

Conversation

ruiwen-zhao
Copy link
Member

@ruiwen-zhao ruiwen-zhao commented Nov 11, 2022

Signed-off-by: ruiwen-zhao ruiwen@google.com

This PR includes:

  1. Copy cri-api v1alpha2 to containerd internal directory
  2. Update cri-api dependency to v0.26.0-beta.0

1 is needed because v0.26.0-beta.0 does not have support for v1alpha2 apis. Since containerd still has clients that might need v1alpha2 api, we need to preserve the code. Therefore, I copied the cri-api v1alpha2 to containerd internal directory. More specifically, I

a. Copied vendor/k8s.io/cri-api/pkg/apis/runtime/v1 with cri-api v0.25.34 to third_party/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/
b. Modified Makefile to exclude pkg/cri/v1alpha2 from protos and proto-fmt check
c. Modified Makefile to exclude third_party/k8s.io/cri-api/pkg/apis/runtime/v1alpha2 from PACKAGES, API_PACKAGES, and NON_API_PACKAGES, so that it can be excluded from bin/protoc-gen-go-fieldpath check
c. Added containerd file header to third_party/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/constants.go

2 was done by
a. Running go get k8s.io/cri-api@1cee3924915eebf375ed6a2dd13aef6378158862, and then go mod tidy and go mod sum.
b. Adding place-holder (Unimplemented-error-returning) implementations for ListPodSandboxMetrics and ListMetricDescriptors

@k8s-ci-robot
Copy link

Hi @ruiwen-zhao. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@@ -1,7 +1,7 @@
# Architecture of The CRI Plugin
This document describes the architecture of the `cri` plugin for `containerd`.

This plugin is an implementation of Kubernetes [container runtime interface (CRI)](https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/api.proto). Containerd operates on the same node as the [Kubelet](https://kubernetes.io/docs/reference/generated/kubelet/). The `cri` plugin inside containerd handles all CRI service requests from the Kubelet and uses containerd internals to manage containers and container images.
This plugin is an implementation of Kubernetes [container runtime interface (CRI)](https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1/api.proto). Containerd operates on the same node as the [Kubelet](https://kubernetes.io/docs/reference/generated/kubelet/). The `cri` plugin inside containerd handles all CRI service requests from the Kubelet and uses containerd internals to manage containers and container images.

The `cri` plugin uses containerd to manage the full container lifecycle and all container images. As also shown below, `cri` manages pod networking via [CNI](https://github.com/containernetworking/cni) (another CNCF project).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

@ruiwen-zhao ruiwen-zhao Nov 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I update the "Kubernetes Support" section like the following?

| Kubernetes Version | containerd Version | CRI Version  |
|--------------------|--------------------|--------------|
| 1.20               | 1.5.0+             | v1alpha2     |
| 1.21               | 1.5.0+             | v1alpha2     |
| 1.22               | 1.5.5+             | v1alpha2     |
| 1.23               | 1.6.0+, 1,5.8+     | v1, v1alpha2 |
| 1.24               | 1.6.4+, 1.5.11+    | v1, v1alpha2 |
| 1.25               | 1.6.4+, 1.5.11+    | v1, v1alpha2 |
| 1.26               | ???                | v1 |

I assume this change will go to 1.7 so the containerd version here should be 1.7.0+? But we do want to support k8s 1.26 with 1.6.x containerd right?

Copy link
Member

@AkihiroSuda AkihiroSuda Nov 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks good.
The containerd version will be v1.7.0+ .

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right but with 1.6 being the LTS version, I guess we want to include 1.6.x for 1.26 as well? If thats the case do we know which patch version of 1.6.x should we use here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and I guess 1.6.4+ is fine, assuming that the CRI v1 implementation of K v1.26 is compatible with K v1.25 and v1.24?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes that should be the case. I put

| 1.26               | 1.7.0+, 1.6.4+     | v1           |

here.

@AkihiroSuda AkihiroSuda added impact/changelog area/cri Container Runtime Interface (CRI) labels Nov 11, 2022
@samuelkarp
Copy link
Member

/ok-to-test

@estesp
Copy link
Member

estesp commented Nov 11, 2022

Needs a rebase

@estesp
Copy link
Member

estesp commented Nov 11, 2022

You need to commit the changes to the vendoring in this PR as the project checks tries to validate go.mod and go.sum are correct and are finding differences because the removal isn't committed in this PR.

@ruiwen-zhao
Copy link
Member Author

You need to commit the changes to the vendoring in this PR as the project checks tries to validate go.mod and go.sum are correct and are finding differences because the removal isn't committed in this PR.

oh so I need to run go mod tidy && go mod vendor to include the vendor changes in this PR? Makes sense. Will do

@ruiwen-zhao
Copy link
Member Author

ruiwen-zhao commented Nov 11, 2022

The windows failure seems to be related...

1 attempt "run_crictl"! Trying again in 1 seconds...
2 attempt "run_crictl"! Trying again in 2 seconds...
3 attempt "run_crictl"! Trying again in 3 seconds...
4 attempt "run_crictl"! Trying again in 4 seconds...
time="2022-11-11T19:01:07Z" level=fatal msg="getting status of runtime: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.RuntimeService"
[SC] DeleteService SUCCESS

SERVICE_NAME: containerd-test 
        TYPE               : 10  WIN32_OWN_PROCESS  
        STATE              : 4  RUNNING 
                                (STOPPABLE, NOT_PAUSABLE, ACCEPTS_SHUTDOWN)
        WIN32_EXIT_CODE    : 0  (0x0)
        SERVICE_EXIT_CODE  : 0  (0x0)
        CHECKPOINT         : 0x0
        WAIT_HINT          : 0x0
mingw32-make: *** [Makefile:223: cri-integration] Error 1
Error: Process completed with exit code 2.

Only windows tests are failing, but not the linux ones. So I am guessing windows tests are using their own config.toml

Copy link
Member

@mikebrow mikebrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

per offline discussion..

let's keep (or at least try to keep) cri v1alpha2 in containerd until containerd v2.. this because we have a number of older clients still using containerd and they may need to update to the latest containerd for certain scenarios.. without moving up k/k and all the other clients still using cri v1alpha2 .. we can and should deprecate / provide notice from instrumented..

plan would be to vendor the v1alpha2 apis from one version of cri-api and v1 can go ahead and move up

@ruiwen-zhao
Copy link
Member Author

per offline discussion..

let's keep (or at least try to keep) cri v1alpha2 in containerd until containerd v2.. this because we have a number of older clients still using containerd and they may need to update to the latest containerd for certain scenarios.. without moving up k/k and all the other clients still using cri v1alpha2 .. we can and should deprecate / provide notice from instrumented..

plan would be to vendor the v1alpha2 apis from one version of cri-api and v1 can go ahead and move up

I was doing some golang research but didn't really find a way to do this. To make sure I captured it correct, we would like

  1. containerd to depend on both cri-api v0.26.x (which has latest everything), and v0.25.3 (which has v1alpha2 support)
  2. k8s.io/cri-api/pkg/apis/runtime/v1 to use on v0.26.x, and k8s.io/cri-api/pkg/apis/runtime/v1alpha2 to use v0.25.3.

I tried using something like

replace k8s.io/cri-api/pkg/apis/runtime/v1alpha2 => k8s.io/cri-api v0.25.3

but i can't make it point to a directory under cri-api. Did i miss something here?

@mikebrow
Copy link
Member

mikebrow commented Nov 14, 2022

nod.. sans a go package trick.. since grpc only registers ServiceName: "runtime.v1alpha2.RuntimeService", let's just fork the v1alpha2 api for now through containerd v2.. maybe into containerd/containerd/pkg/cri/v1alpha2 do the renaming of the package import directory in a couple files and good to go I think!

@ruiwen-zhao ruiwen-zhao force-pushed the removeAlpha branch 2 times, most recently from b346115 to 6991a27 Compare November 14, 2022 19:57
@ruiwen-zhao ruiwen-zhao changed the title Remove runtime v1alpha2 Update cri-api dependency to v0.26.0-beta.0 Nov 14, 2022
@ruiwen-zhao ruiwen-zhao force-pushed the removeAlpha branch 4 times, most recently from 1557a73 to 1dbc1c0 Compare November 14, 2022 21:30
@ruiwen-zhao
Copy link
Member Author

protobuf check fails with

go build  -gcflags=-trimpath=/home/runner/work/containerd/containerd/src -buildmode=pie  -o bin/protoc-gen-go-fieldpath -ldflags '-X github.com/containerd/containerd/version.Version=b055e73 -X github.com/containerd/containerd/version.Revision=b055e736c2f16d4a8d444cab43c420ce6840e96e -X github.com/containerd/containerd/version.Package=github.com/containerd/containerd -s -w ' -tags "urfave_cli_no_docs"  ./cmd/protoc-gen-go-fieldpath
+ protos
github.com/gogo/protobuf/gogoproto/gogo.proto: File not found.
github.com/containerd/containerd/pkg/cri/v1alpha2/api.proto:23:1: Import "github.com/gogo/protobuf/gogoproto/gogo.proto" was not found or had errors.
2022/11/14 21:32:14 protoc -I./protobuf:/home/runner/work/containerd/containerd/src:/usr/local/include:/usr/include --go_out=/home/runner/work/containerd/containerd/src /home/runner/work/containerd/containerd/src/github.com/containerd/containerd/pkg/cri/v1alpha2/api.proto
make: *** [Makefile:179: protos] Error 1
Error: Process completed with exit code 2.

I am just copying the existing proto file instead of re-generating the proto, so I guess that caused some problem.

@ruiwen-zhao
Copy link
Member Author

Project checks fails with

Run echo "::group::📚 File headers"
  echo "::group::📚 File headers"
  /home/runner/work/_actions/containerd/project-checks/v1/script/validate/fileheader /home/runner/work/_actions/containerd/project-checks/v1/
  echo "::endgroup::"
  shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
  env:
    GO_VERSION: 1.19.3
    GOPATH: /home/runner/work/containerd/containerd
📚 File headers
  /home/runner/work/containerd/containerd/bin/ltag
  pkg/cri/v1alpha2/constants.go
  Error: Process completed with exit code 1.

Not too much info here...

@samuelkarp
Copy link
Member

samuelkarp commented Nov 15, 2022

📚 File headers
  /home/runner/work/containerd/containerd/bin/ltag
  pkg/cri/v1alpha2/constants.go

Not too much info here...

pkg/cri/v1alpha2/constants.go has the wrong file header. The right header is here, though since you copied that file from another project you'll likely want to keep both headers and note where the file came from in a comment too.

@ruiwen-zhao
Copy link
Member Author

ruiwen-zhao commented Nov 16, 2022

Thanks @samuelkarp for the pointers! I added containerd header to constant.go and removed gogoproto references from cri-api proto (some how I cannot verify if it works locally so has to push it to the PR and see)

One thing I want to call out here is that, by applying the manual fixes to the copied cri-api code, we are creating a divergence between the containerd-owned cri-api v1alpha2 code and the real cri-api code. I don't think this is a big problem because v1alpha2 is deprecated and no one will be changing it anymore. So the divergence will not increase.

If this kind of divergence is not acceptable in any way, an alternative of maintaining v1alpha2 dependency while picking up the latest v1 cri-api is to create a github fork repo pointing to 0.25.x cri-api, which still has v1alpha2 code. And make contaienrd's v1alpha2 dependency point to that fork.


Update:

So removing gogproto refs will make the proto check fail again because the generated go code is different than the copied-over one...

@ruiwen-zhao ruiwen-zhao force-pushed the removeAlpha branch 7 times, most recently from f3d813e to 4a4ec93 Compare November 17, 2022 19:29
@ruiwen-zhao
Copy link
Member Author

Ok cool, finally made protobuf check happy by excluding pkg/cri/v1alpha2 from multiple checks in Makefile. Updated the details in the PR description.

@ruiwen-zhao
Copy link
Member Author

Code scanning is failing because cri-api code returns Password in one of the methods:

func (this *AuthConfig) String() string {
	if this == nil {
		return "nil"
	}
	s := strings.Join([]string{`&AuthConfig{`,
		`Username:` + fmt.Sprintf("%v", this.Username) + `,`,
		`Password:` + fmt.Sprintf("%v", this.Password) + `,`,
		`Auth:` + fmt.Sprintf("%v", this.Auth) + `,`,
		`ServerAddress:` + fmt.Sprintf("%v", this.ServerAddress) + `,`,
		`IdentityToken:` + fmt.Sprintf("%v", this.IdentityToken) + `,`,
		`RegistryToken:` + fmt.Sprintf("%v", this.RegistryToken) + `,`,
		`}`,
	}, "")
	return s
}

Is Code scanning configured anywhere and I can exclude this file from the check, just like vendor?

@samuelkarp
Copy link
Member

Ok cool, finally made protobuf check happy by excluding pkg/cri/v1alpha2 from multiple checks in Makefile. Updated the details in the PR description.

I don't think we should do this. Rather than copying the generated api.pb.go file, can we copy the api.proto file, adjust the file header, and use the existing proto generation logic in containerd to generate the new api.pb.go file?

Is Code scanning configured anywhere and I can exclude this file from the check, just like vendor?

We should probably exclude this field from logging rather than adjust the check.

@ruiwen-zhao
Copy link
Member Author

ruiwen-zhao commented Nov 18, 2022

I don't think we should do this. Rather than copying the generated api.pb.go file, can we copy the api.proto file, adjust the file header, and use the existing proto generation logic in containerd to generate the new api.pb.go file?

Yes we can do that technically. But I think that the cri-api go code should be treated in the same way as vendor. Containerd has been using the cri-api go code built by k8s already.

We should probably exclude this field from logging rather than adjust the check.

Maybe I am looking at wrong places but I don't think instrumented_server logs Password. For example, in the failed run , it complains

Check failure on line 79 in pkg/cri/sbserver/instrumented_service.go
... Sensitive data returned by an access to Password flows to a logging call.

But looking at the code where the check fails, instrumented_service only logs PodSandboxMetadata, which doesn't contain AuthConfig.Password:

defer func() {
		if err != nil {
			log.G(ctx).WithError(err).Errorf("RunPodSandbox for %+v failed, error", r.GetConfig().GetMetadata())

Note that AuthConfig is part of pull image request, so it is not even part of RunPodSandbox API call, so it shouldn't even be in the error.

Similar goes to the other failure, it only logs the image name:

defer func() {
		if err != nil {
			log.G(ctx).WithError(err).Errorf("PullImage %q failed", r.GetImage().GetImage())

What also confuses me is that this PR does not change the cri-api v1alpha2 code, why this check only fails now?

@ruiwen-zhao
Copy link
Member Author

Regarding code scanning failures, sync'ed with @samuelkarp offline. For all the 10 scanning failures, the tool thinks the error message returned from PullImage is exposing AuthConfig.Password. And the tool thinks ref.String() at https://github.com/containerd/containerd/blob/main/reference/docker/reference.go#L632 is the root cause of leaking AuthConfig.Password, which doesn't seem to be true.

I dont know what throws off Code Scanning in this case. But I am gonna update my PR to run Code Scanning on only the first commit, and see if it solves the problem.

@ruiwen-zhao ruiwen-zhao force-pushed the removeAlpha branch 3 times, most recently from 933aa30 to e7d84d8 Compare November 18, 2022 20:09
@mikebrow
Copy link
Member

Regarding code scanning failures, sync'ed with @samuelkarp offline. For all the 10 scanning failures, the tool thinks the error message returned from PullImage is exposing AuthConfig.Password. And the tool thinks ref.String() at https://github.com/containerd/containerd/blob/main/reference/docker/reference.go#L632 is the root cause of leaking AuthConfig.Password, which doesn't seem to be true.

I dont know what throws off Code Scanning in this case. But I am gonna update my PR to run Code Scanning on only the first commit, and see if it solves the problem.

this all sounds familiar I may have had to do the override to approve this before..

Copy link
Member

@mikebrow mikebrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM on green

@ruiwen-zhao
Copy link
Member Author

So even with only one commit, Code Scanning still fails. I will add back the other commit now.

@samuelkarp
Copy link
Member

Yes we can do that technically. But I think that the cri-api go code should be treated in the same way as vendor. Containerd has been using the cri-api go code built by k8s already.

My preference would be to regenerate the Go code for consistency with the rest of the protos in containerd. However, if we must consume it exactly, let's move it into a clearly-marked "third party" folder and then apply the exception at that level.

Signed-off-by: ruiwen-zhao <ruiwen@google.com>
Signed-off-by: ruiwen-zhao <ruiwen@google.com>
@ruiwen-zhao
Copy link
Member Author

Yes we can do that technically. But I think that the cri-api go code should be treated in the same way as vendor. Containerd has been using the cri-api go code built by k8s already.

My preference would be to regenerate the Go code for consistency with the rest of the protos in containerd. However, if we must consume it exactly, let's move it into a clearly-marked "third party" folder and then apply the exception at that level.

Makes sense. I moved the v1alpha2 proto files to third_party/k8s.io/cri-api/pkg/apis/runtime/v1alpha2/ instead. Also updated PR description.

Copy link
Member

@mikebrow mikebrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM on green cleared the scanning checks.. we do pass passwords through for pull authentication, and did have to remove passwords from logs way back around the start of the cri plugin... but we are not logging passwords in the code path any more. That we are having an error on pull request is necessary to report to the caller.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/cri Container Runtime Interface (CRI) impact/changelog ok-to-test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants