-
Notifications
You must be signed in to change notification settings - Fork 192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OTA-222: Add a manifest annotation to be used for object deletion #438
OTA-222: Add a manifest annotation to be used for object deletion #438
Conversation
bff3f3e
to
83ceaa0
Compare
83ceaa0
to
42c6d09
Compare
15720a8
to
cf015af
Compare
/retest |
f758f44
to
e73b06a
Compare
@jottofar: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Until we can remove that namespace. Some background in [1]. We cannot remove the namespace until we land [2]. [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1875900 [2]: openshift/cluster-version-operator#438
@jottofar: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle stale |
6f3cb50
to
8478432
Compare
/test unit |
/test e2e-agnostic-upgrade |
fd89f4d
to
37814ce
Compare
/retest Please review the full test history for this PR and help us cut down flakes. |
16 similar comments
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/cc @wking Can I get an override on |
/override ci/prow/e2e-agnostic-upgrade |
@jottofar: Overrode contexts on behalf of jottofar: ci/prow/e2e-agnostic-upgrade In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
…=False Because: Upgradeable=False Reason: MultipleReasons Message: Cluster cannot be upgraded between minor versions for multiple reasons: AdminAckRequired,IncompatibleOperatorsInstalled doesn't include all the useful information needed to resolve those issues. This pivots to using the same approach we use today when aggregating multiple Upgradeable=False ClusterOperators. Also shift the patch-blocking overrides check in front of the other minor blocking checks. It's more important (because it blocks more targets), and it also will not included a nested sub-list (which might be confusing, because today I'm not adding indents or any such markers to make it easy to distinguish levels in a nested list hierarchy). This cherry-picks 0c742e5 (pkg/cvo/upgradeable: Include messages for multiple-reason Upgradeable=False, 2021-10-07, openshift#671) back to 4.8, with a minor context change because 4.8 doesn't have 0afb8a8 (Add a manifest annotation to be used for object deletion, 2020-08-17, openshift#438).
…=False Because: Upgradeable=False Reason: MultipleReasons Message: Cluster cannot be upgraded between minor versions for multiple reasons: AdminAckRequired,IncompatibleOperatorsInstalled doesn't include all the useful information needed to resolve those issues. This pivots to using the same approach we use today when aggregating multiple Upgradeable=False ClusterOperators. Also shift the patch-blocking overrides check in front of the other minor blocking checks. It's more important (because it blocks more targets), and it also will not included a nested sub-list (which might be confusing, because today I'm not adding indents or any such markers to make it easy to distinguish levels in a nested list hierarchy). This cherry-picks 0c742e5 (pkg/cvo/upgradeable: Include messages for multiple-reason Upgradeable=False, 2021-10-07, openshift#671) back to 4.8, with a minor context change because 4.8 doesn't have 0afb8a8 (Add a manifest annotation to be used for object deletion, 2020-08-17, openshift#438).
I'd accidentally broken this in e8ffccb (lib: Add autogeneration for some resource* functionality, 2020-07-29, openshift#420), when I moved the health checks out of the Do methods and began passing typedObject into them. typedObject is the release manifest, which lacks the status attributes we want the health checks to cover. This also catches the generation script up with some past manual changes: * 0afb8a8 (Add a manifest annotation to be used for object deletion, 2020-08-17, openshift#438). * 05e1af7 (Log resource diffs on update only in reconcile mode, 2021-07-13, openshift#628). The always-fatal ApplyAPIServicev1 function gives us the required symmetry with 0afb8a8's DeleteAPIServicev1, although I'm a bit fuzzy on who would have been using an APIService manifest. I'm not excited about supporting a new manifest kind I know nothing about, and failing all attempts at non-delete reconciling should protect us from accidentally encouraging manifest authors to feed us this kind.
I'd accidentally broken this in e8ffccb (lib: Add autogeneration for some resource* functionality, 2020-07-29, openshift#420), when I moved the health checks out of the Do methods and began passing typedObject into them. typedObject is the release manifest, which lacks the status attributes we want the health checks to cover. This also catches the generation script up with some past manual changes: * 0afb8a8 (Add a manifest annotation to be used for object deletion, 2020-08-17, openshift#438). * 05e1af7 (Log resource diffs on update only in reconcile mode, 2021-07-13, openshift#628). The always-fatal ApplyAPIServicev1 function gives us the required symmetry with 0afb8a8's DeleteAPIServicev1, although I'm a bit fuzzy on who would have been using an APIService manifest. I'm not excited about supporting a new manifest kind I know nothing about, and failing all attempts at non-delete reconciling should protect us from accidentally encouraging manifest authors to feed us this kind.
I'd accidentally broken this in e8ffccb (lib: Add autogeneration for some resource* functionality, 2020-07-29, openshift#420), when I moved the health checks out of the Do methods and began passing typedObject into them. typedObject is the release manifest, which lacks the status attributes we want the health checks to cover. This also catches the generation script up with some past manual changes: * 0afb8a8 (Add a manifest annotation to be used for object deletion, 2020-08-17, openshift#438). * 05e1af7 (Log resource diffs on update only in reconcile mode, 2021-07-13, openshift#628). The always-fatal ApplyAPIServicev1 function gives us the required symmetry with 0afb8a8's DeleteAPIServicev1, although I'm a bit fuzzy on who would have been using an APIService manifest. I'm not excited about supporting a new manifest kind I know nothing about, and failing all attempts at non-delete reconciling should protect us from accidentally encouraging manifest authors to feed us this kind.
The original APIService reconciliation landed in 662e182 (handle APIService, Service objects, 2018-09-27, openshift#26). We definitely reconcile a bunch of Service manifests (e.g. for serving operator metrics, including the CVO's own Service), but it's not clear what the use case was for APIService. DeleteAPIServicev1 landed in 0afb8a8 (Add a manifest annotation to be used for object deletion, 2020-08-17, openshift#438), but was never consumed. We dropped APIService reconciliation support in 5681a70 (Drop APIService support, 2021-06-10, openshift#566). This commit drops the unused DeleteAPIServicev1. Auditing z-stream tips from 4.3 through 4.11: $ oc adm release extract --to 4.11 quay.io/openshift-release-dev/ocp-release-nightly@sha256:080e5cf5e3e043ac0877b8f545ba2b596016f3fd76f3f457d15060603b3615e1 $ oc adm release extract --to 4.10 quay.io/openshift-release-dev/ocp-release:4.10.13-x86_64 $ oc adm release extract --to 4.9 quay.io/openshift-release-dev/ocp-release:4.9.32-x86_64 $ oc adm release extract --to 4.8 quay.io/openshift-release-dev/ocp-release:4.8.39-x86_64 $ oc adm release extract --to 4.7 quay.io/openshift-release-dev/ocp-release:4.7.50-x86_64 $ oc adm release extract --to 4.6 quay.io/openshift-release-dev/ocp-release:4.6.57-x86_64 $ oc adm release extract --to 4.5 quay.io/openshift-release-dev/ocp-release:4.5.41-x86_64 $ oc adm release extract --to 4.4 quay.io/openshift-release-dev/ocp-release:4.4.33-x86_64 $ oc adm release extract --to 4.3 quay.io/openshift-release-dev/ocp-release:4.3.40-x86_64 $ grep -r 'kind: APIService' 4.* ...no hits...
I'd accidentally broken this in e8ffccb (lib: Add autogeneration for some resource* functionality, 2020-07-29, openshift#420), when I moved the health checks out of the Do methods and began passing typedObject into them. typedObject is the release manifest, which lacks the status attributes we want the health checks to cover. This also catches the generation script up with some past manual changes: * 0afb8a8 (Add a manifest annotation to be used for object deletion, 2020-08-17, openshift#438). * 05e1af7 (Log resource diffs on update only in reconcile mode, 2021-07-13, openshift#628).
The new manifest annotation will trigger new CVO delete logic, added by this PR, to delete the in-cluster object instead of creating/updating it. This will provide a more straightforward way for engineering to remove content.
See Enhancement for adding a manifest annotation for object removal
#419 for implementation details.
Per the enhancement whenever object deletions are pending, one or more deletions have been requested but object removal has not completed, the cluster is blocked from upgrading to the next minor release (
Upgradeable==false
).if an object previously removed reappears a warning is logged which meets what is specified by the enhancement. This does not result in
Upgradeable==false
and no attempt is made to remove the object again.Regarding implementation of enhancement section Upgrade / Downgrade Strategy, since an upgrade can result in >1 CVO restarts there is no straightforward way to distinguish the cause of an IsNotFound error. That is, discerning whether a given object being requested for deletion was ever present in the release or had been removed by a previous CVO pod. A warning is logged whenever an object specified for deletion is not found, and no indication exists in the current CVO pod that an object a delete request had been issued.
Since k8s.io v1beta1 APIs are no longer served In kuberentes v1.22 and are being removed via #545 delete is not implemented for these types. Users are expected to use v1 apiVersion in these delete manifests.