vendor: Bump client-go and library-go to current 4.6 tips (2020-07-24) #420
Conversation
/lgtm
/retest

Please review the full test history for this PR and help us cut down flakes.
e2e:
From
From the CVO logs:
The Role type:

$ git --no-pager grep 'type Role struct'
vendor/k8s.io/api/rbac/v1/types.go:type Role struct {
vendor/k8s.io/api/rbac/v1alpha1/types.go:type Role struct {
vendor/k8s.io/api/rbac/v1beta1/types.go:type Role struct {

Which has not changed in this PR:

$ git --no-pager log -1 --oneline origin/master
e49df43a (wking/master, origin/release-4.7, origin/release-4.6, origin/master, origin/HEAD, master) Merge pull request #418 from wking/RetriveOnce-typo
$ git --no-pager grep 'type Role struct' e49df43a
e49df43a:vendor/k8s.io/api/rbac/v1/types.go:type Role struct {
e49df43a:vendor/k8s.io/api/rbac/v1alpha1/types.go:type Role struct {
e49df43a:vendor/k8s.io/api/rbac/v1beta1/types.go:type Role struct {

So not clear on what's going on here yet...
/retest

Please review the full test history for this PR and help us cut down flakes.
From kubernetes/kubernetes#49642:

> Promotes the rbac.authorization.k8s.io/v1beta1 API to v1 with no changes

But I can reproduce this in a new unit test, so I'll just stumble around until I figure out what magic we're missing to get automatic translation...
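For reference, a reproducer along these lines fails with the 0.19 vendoring. This is an assumed sketch, not the PR's actual unit test; the package name and fixture are illustrative:

```go
package lib_test

import (
	"testing"

	rbacv1 "k8s.io/api/rbac/v1"
	rbacv1beta1 "k8s.io/api/rbac/v1beta1"
	"k8s.io/client-go/kubernetes/scheme"
)

// The client-go scheme registers both rbac.authorization.k8s.io versions but no
// conversion functions between them, so Convert relies on the reflection fallback.
func TestRoleConversion(t *testing.T) {
	in := &rbacv1beta1.Role{Rules: []rbacv1beta1.PolicyRule{{Verbs: []string{"get"}}}}
	out := &rbacv1.Role{}
	if err := scheme.Scheme.Convert(in, out, nil); err != nil {
		// With the 0.19 libraries this fails with:
		//   converting (v1beta1.Role) to (v1.Role): unknown conversion
		t.Fatal(err)
	}
}
```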
/hold

No need to spend CI on this while I debug the Role business.
Ah:

$ git --no-pager show --oneline ba32622 vendor/k8s.io/apimachinery/pkg/conversion/converter.go
ba32622c (HEAD -> client-go-bump, wking/client-go-bump, origin/pr/420) vendor: Bump client-go and library-go to current 4.6 tips (2020-07-24)
diff --git a/vendor/k8s.io/apimachinery/pkg/conversion/converter.go b/vendor/k8s.io/apimachinery/pkg/conversion/converter.go
index 2d7c8bd1..838d5b0a 100644
--- a/vendor/k8s.io/apimachinery/pkg/conversion/converter.go
+++ b/vendor/k8s.io/apimachinery/pkg/conversion/converter.go
@@ -442,20 +442,20 @@ func (c *Converter) doConversion(src, dest interface{}, flags FieldMatchingFlags
if fn, ok := c.generatedConversionFuncs.untyped[pair]; ok {
return fn(src, dest, scope)
}
- // TODO: consider everything past this point deprecated - we want to support only point to point top level
- // conversions
dv, err := EnforcePtr(dest)
if err != nil {
return err
}
- if !dv.CanAddr() && !dv.CanSet() {
- return fmt.Errorf("can't write to dest")
- }
sv, err := EnforcePtr(src)
if err != nil {
return err
}
+ return fmt.Errorf("converting (%s) to (%s): unknown conversion", sv.Type(), dv.Type())
+
+ // TODO: Everything past this point is deprecated.
+ // Remove in 1.20 once we're sure it didn't break anything.
+
// Leave something on the stack, so that calls to struct tag getters never fail.
scope.srcStack.push(scopeStackElem{})
scope.destStack.push(scopeStackElem{})

So that's kubernetes/kubernetes#90018.
Protecting us from [1,2]:

converting (v1beta1.Role) to (v1.Role): unknown conversion

in unit tests. The manual conversion and wash through JSON seems like a terrible hack, but I haven't been able to figure out a more elegant approach yet. From [3]:

> Promotes the rbac.authorization.k8s.io/v1beta1 API to v1 with no changes

so all we really need is the apiVersion bump. Also add similar logic for apiregistration, since that's the other group where we register multiple schema versions.

[1]: kubernetes/kubernetes#90018
[2]: openshift#420 (comment)
[3]: kubernetes/kubernetes#49642
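For anyone following along, the "wash through JSON" idea amounts to something like the sketch below. The helper name is hypothetical, not necessarily what the commit adds; it only works because the v1beta1 and v1 Role fields are identical:

```go
package resourceread

import (
	"encoding/json"

	rbacv1 "k8s.io/api/rbac/v1"
	rbacv1beta1 "k8s.io/api/rbac/v1beta1"
)

// roleToV1 round-trips a v1beta1 Role through JSON to get a v1 Role,
// sidestepping the scheme's missing v1beta1 -> v1 conversion. Illustrative only.
func roleToV1(in *rbacv1beta1.Role) (*rbacv1.Role, error) {
	data, err := json.Marshal(in)
	if err != nil {
		return nil, err
	}
	out := &rbacv1.Role{}
	if err := json.Unmarshal(data, out); err != nil {
		return nil, err
	}
	// The only real difference between the two versions is the apiVersion string.
	out.APIVersion = "rbac.authorization.k8s.io/v1"
	return out, nil
}
```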
Ok, I think I have something that works in 6f247d4, although it's not particularly elegant.

/hold cancel
/retest
/test e2e

The failure did not seem related to the PR https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_cluster-version-operator/420/pull-ci-openshift-cluster-version-operator-master-e2e/1288770413639241728
e2e:

$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_cluster-version-operator/420/pull-ci-openshift-cluster-version-operator-master-e2e/1288770413639241728/artifacts/e2e/ipi-install-install/log-bundle-20200730102730.tar | tar -xOz log-bundle-20200730102730/bootstrap/containers/cluster-version-operator-3859ba2ba9e7592f8693b4ee362fda81b2c463b32f6970c7b1d2ea0f0a0226fd.log | grep 'Result of work' | tail -n2
I0730 10:23:46.632144 1 task_graph.go:546] Result of work: [Cluster operator etcd is still updating Cluster operator kube-apiserver is still updating Cluster operator kube-controller-manager is still updating Cluster operator kube-scheduler is still updating Cluster operator config-operator is still updating Cluster operator machine-api is still updating Cluster operator openshift-apiserver is still updating Cluster operator kube-storage-version-migrator is still updating Cluster operator cloud-credential is still updating Cluster operator authentication is still updating Could not update role "openshift-machine-api/cluster-autoscaler-operator" (211 of 606) Cluster operator csi-snapshot-controller is still updating Cluster operator image-registry is still updating Cluster operator ingress is still updating Cluster operator machine-approver is still updating Could not update clusterrole "cluster-node-tuning-operator" (294 of 606) Cluster operator openshift-controller-manager is still updating Could not update rolebinding "openshift/cluster-samples-operator-openshift-edit" (321 of 606): resource may have been deleted Cluster operator storage is still updating Could not update clusterrole "cluster-monitoring-operator" (361 of 606) Could not update oauthclient "console" (367 of 606): the server does not recognize this resource, check extension API servers Cluster operator insights is still updating Cluster operator operator-lifecycle-manager is still updating Could not update clusterrolebinding "marketplace-operator" (454 of 606) Cluster operator service-ca is still updating Could not update clusterrolebinding "default-account-cluster-network-operator" (477 of 606) Cluster operator dns is still updating Could not update clusterrolebinding "default-account-openshift-machine-config-operator" (503 of 606) Could not update role "openshift-authentication/prometheus-k8s" (514 of 606): resource may have been deleted]
I0730 10:25:16.724712 1 task_graph.go:546] Result of work: []

Dunno what's up with that yet. Ah, maybe somebody doesn't like our version-converted
Ah, version conversion still busted:

$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_cluster-version-operator/420/pull-ci-openshift-cluster-version-operator-master-e2e/1288770413639241728/artifacts/e2e/ipi-install-install/log-bundle-20200730102730.tar | tar -xOz log-bundle-20200730102730/bootstrap/containers/cluster-version-operator-3859ba2ba9e7592f8693b4ee362fda81b2c463b32f6970c7b1d2ea0f0a0226fd.log | grep 'role "' | tail -n1
E0730 10:27:31.386463 1 task.go:81] error running apply for clusterrole "cluster-node-tuning-operator" (294 of 606): resource name may not be empty
/hold

No need to retest while I'm taking another pass at the conversion failure.
Pulling in [1,2]. Generated with:

$ go get -u github.com/openshift/client-go@83993cebb5aee533bf2f2dded2d87b9e32835f4a
go: github.com/openshift/client-go 94e1065ab1f8a34cd8a0e1edb9ba8be0326e77b5 => v0.0.0-20200723130357-94e1065ab1f8
$ go get -u github.com/openshift/library-go@cc498c355c99983057e9e01f3abcceb40ca5c298
go: github.com/openshift/library-go 1ed21c4fa86c2bf6034823231b8e52df4f068d58 => v0.0.0-20200724192307-1ed21c4fa86c
$ go mod tidy
$ go mod vendor
$ git add -A go.* vendor

using:

$ go version
go version go1.14.4 linux/amd64

This gets us up to 1.19 for the Kubernetes-based dependencies, and should hopefully allow me to bump openshift/api to pick up [3] without hitting:

go: github.com/openshift/cluster-version-operator/lib imports k8s.io/client-go/kubernetes/scheme imports k8s.io/api/auditregistration/v1alpha1: package provided by k8s.io/api at latest version v0.18.6 but not at required version v0.19.0-rc.2

And the API bump in turn will allow us to start using the new Release type.

[1]: openshift/client-go@94e1065
[2]: openshift/library-go@1ed21c4
[3]: openshift/api@3ae6754
Protecting us from [1,2]:

converting (v1beta1.Role) to (v1.Role): unknown conversion

Because merging and application are also version-dependent, it's hard to paper over this in resourceread with automatic type conversion. Although from [3]:

> Promotes the rbac.authorization.k8s.io/v1beta1 API to v1 with no changes

so all we really need is the apiVersion bump. Anyhow, with this commit, I'm doubling down on the approach from 4ee7b07 (Add apiextensions.k8s.io/v1 support for CRDs, 2019-10-22, openshift#259) and collapsing the readers into two helpers that support all of our types and return runtime.Object. From 0a255ab (cvo: Use protobuf for sending events and other basic API commands, 2019-01-18, openshift#90), protobuf is more efficient, so we should use it where possible. And because all of this is very tedious to maintain by hand, there's now a Python generator to spit out all the boilerplate.

[1]: kubernetes/kubernetes#90018
[2]: openshift#420 (comment)
[3]: kubernetes/kubernetes#49642
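As a rough illustration of what "readers that return runtime.Object" means in practice, here's a minimal sketch. The ReadOrDie name, the scheme wiring, and the group list are assumptions; the real generated code covers more groups and may be shaped differently:

```go
package resourceread

import (
	"fmt"

	apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/runtime/serializer"
	"k8s.io/client-go/kubernetes/scheme"
)

var (
	localScheme = runtime.NewScheme()
	decoder     runtime.Decoder
)

func init() {
	// Register every group we need to read; illustrative subset only.
	if err := scheme.AddToScheme(localScheme); err != nil {
		panic(err)
	}
	if err := apiextensionsv1.AddToScheme(localScheme); err != nil {
		panic(err)
	}
	decoder = serializer.NewCodecFactory(localScheme).UniversalDeserializer()
}

// ReadOrDie decodes JSON manifest bytes into whichever registered type their
// apiVersion/kind names, and hands the result back as a runtime.Object.
func ReadOrDie(objBytes []byte) runtime.Object {
	obj, _, err := decoder.Decode(objBytes, nil, nil)
	if err != nil {
		panic(fmt.Sprintf("decode manifest: %v", err))
	}
	return obj
}
```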
Generated with:

$ go mod tidy
$ go mod vendor
$ git add -A go.* vendor

using:

$ go version
go version go1.14.4 linux/amd64
e2e failed on

/override ci/prow/e2e
@wking: Overrode contexts on behalf of wking: ci/prow/e2e

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/hold cancel
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: LalatenduMohanty, vrutkovs, wking

The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
/retest

Please review the full test history for this PR and help us cut down flakes.
When we moved to 0.19 Kube tooling in ef236c3 (vendor: Bump client-go and library-go to current 4.6 tips 2020-07-24, openshift#420), the tooling moved to klog v2. Port our local klog consumers so we don't have to worry about configuring multiple klogs [1]. Generated with:

$ sed -i 's|^go 1.13|go 1.15|' go.mod
$ sed -i 's|k8s.io/klog v1.0.0|k8s.io/klog/v2 v2.3.0|' go.mod
$ sed -i 's|"k8s.io/klog"|"k8s.io/klog/v2"|' $(git grep -l k8s.io/klog cmd lib pkg)
$ sed -i 's|klog.V(\([0-9]\)) |klog.V(\1).Enabled() |' $(git grep -l klog.V cmd lib pkg)
$ go mod tidy
$ go mod vendor
$ git add -A go.* vendor

using:

$ go version
go version go1.15.2 linux/amd64

The change to Go 1.15 catches us up with 1d9a319 (Dockerfile.rhel: Bump to Go 1.15, 2020-09-16, openshift#457).

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1883705
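For context, the klog.V sed covers the one call-site change between klog v1 and v2: V() used to return a bool-like Verbose, while v2 returns a struct, so boolean guards need an explicit Enabled(). A minimal sketch of the resulting call sites:

```go
package main

import "k8s.io/klog/v2"

func main() {
	// klog v1 accepted `if klog.V(4) { ... }` because klog.V returned a bool-like Verbose;
	// klog v2's Verbose is a struct, so guards need Enabled(), which the sed above adds.
	if klog.V(4).Enabled() {
		klog.Info("verbose-only detail")
	}
	// Chained forms like this were unaffected by the port; they work in both v1 and v2.
	klog.V(2).Infof("synced %d manifests", 0)
	klog.Flush()
}
```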
I'd accidentally broken this in e8ffccb (lib: Add autogeneration for some resource* functionality, 2020-07-29, openshift#420), when I moved the health checks out of the Do methods and began passing typedObject into them. typedObject is the release manifest, which lacks the status attributes we want the health checks to cover. This also catches the generation script up with some past manual changes:

* 0afb8a8 (Add a manifest annotation to be used for object deletion, 2020-08-17, openshift#438).
* 05e1af7 (Log resource diffs on update only in reconcile mode, 2021-07-13, openshift#628).

The always-fatal ApplyAPIServicev1 function gives us the required symmetry with 0afb8a8's DeleteAPIServicev1, although I'm a bit fuzzy on who would have been using an APIService manifest. I'm not excited about supporting a new manifest kind I know nothing about, and failing all attempts at non-delete reconciling should protect us from accidentally encouraging manifest authors to feed us this kind.
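To make the typedObject mix-up concrete: a health check like the hedged sketch below only does anything useful when it's handed the live object the API server returned, since the release-manifest copy has an empty status. Names and signature are illustrative, not the exact CVO functions:

```go
package resourcebuilder

import (
	"fmt"

	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
)

// checkDeploymentHealth inspects status, so callers must pass the object
// returned by the apply/get call, not typedObject (the manifest), whose
// Status.Conditions is always empty.
func checkDeploymentHealth(deployment *appsv1.Deployment) error {
	for _, condition := range deployment.Status.Conditions {
		if condition.Type == appsv1.DeploymentAvailable {
			if condition.Status == corev1.ConditionTrue {
				return nil
			}
			return fmt.Errorf("deployment %s/%s is not Available: %s: %s",
				deployment.Namespace, deployment.Name, condition.Reason, condition.Message)
		}
	}
	return fmt.Errorf("deployment %s/%s has no Available condition yet", deployment.Namespace, deployment.Name)
}
```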
I'd dropped this in cc9292a (lib/resourcebuilder: Replace wait-for with single-shot "is it alive now?", 2020-07-30, openshift#400), claiming:

> There's no object status for CRDs or DaemonSets that marks "we are really hurting". The v1.18.0 Kubernetes CRD and DaemonSet controllers do not set any conditions in their operand status (although the API for those conditions exists [2,3]). With this commit, we have very minimal wait logic for either. Sufficiently unhealthy DaemonSet should be reported on via their associated ClusterOperator, and sufficiently unhealthy CRD should be reported on when we fail to push any custom resources consuming them (Task.Run retries will give the API server time to ready itself after accepting a CRD update before the CVO fails its sync cycle).

But from [1]:

> It might take a few seconds for the endpoint to be created. You can watch the Established condition of your CustomResourceDefinition to be true or watch the discovery information of the API server for your resource to show up.

So I was correct that we will hear about CRD issues when we fail to push a dependent custom resource. But I was not correct in claiming that the CRD controller set no conditions. I was probably confused by e8ffccb (lib: Add autogeneration for some resource* functionality, 2020-07-29, openshift#420), which broke the health-check inputs as described in 002591d (lib/resourcebuilder: Use actual resource in check*Health calls, 2022-05-03, openshift#771). The code I removed in cc9292a was in fact looking at the Established condition already. This commit restores the Established check, but without the previous PollImmediateUntil wait.

[1]: https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#create-a-customresourcedefinition
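For concreteness, a single-shot Established check can look roughly like this; it is an assumed sketch of the restored logic, not a verbatim copy of the CVO code:

```go
package resourcebuilder

import (
	"fmt"

	apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
)

// checkCRDHealth reports an error if the CRD is not Established; it does not
// poll, matching the single-shot approach described above.
func checkCRDHealth(crd *apiextensionsv1.CustomResourceDefinition) error {
	for _, condition := range crd.Status.Conditions {
		if condition.Type == apiextensionsv1.Established {
			if condition.Status == apiextensionsv1.ConditionTrue {
				return nil
			}
			return fmt.Errorf("customresourcedefinition %s is not Established (%s): %s: %s",
				crd.Name, condition.Status, condition.Reason, condition.Message)
		}
	}
	return fmt.Errorf("customresourcedefinition %s has no Established condition yet", crd.Name)
}
```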
Pulling in openshift/client-go@94e1065ab1 and openshift/library-go@1ed21c4fa8. Generated with:

$ go get -u github.com/openshift/client-go@83993cebb5aee533bf2f2dded2d87b9e32835f4a
$ go get -u github.com/openshift/library-go@cc498c355c99983057e9e01f3abcceb40ca5c298
$ go mod tidy
$ go mod vendor
$ git add -A go.* vendor

using:

$ go version
go version go1.14.4 linux/amd64

This gets us up to 1.19 for the Kubernetes-based dependencies, and should hopefully allow me to bump openshift/api to pick up openshift/api@3ae6754513 without hitting:

go: github.com/openshift/cluster-version-operator/lib imports k8s.io/client-go/kubernetes/scheme imports k8s.io/api/auditregistration/v1alpha1: package provided by k8s.io/api at latest version v0.18.6 but not at required version v0.19.0-rc.2

And the API bump in turn will allow us to start using the new Release type (#419).