test: ensure node upgrades between k8s versions work #7914
Comments
/subscribe |
Just to be sure, this issue is about upgrading the API version, whereas #8082 is about upgrading the binary version? |
I was thinking this was about upgrading the binary version of the full cluster. That includes:
My impression was that #8082 was about the "master" specifically, whereas this is about the "nodes", because we want to make sure normal workloads / e2es pass. But because the master is upgraded first, before node upgrades happen, this necessarily also includes a master upgrade in its process, so it becomes a "whole cluster upgrade." Sorry if the wording is unclear (or if I'm confused; @roberthbailey can you confirm this interpretation is correct?). Please feel free to rename stuff. |
I concur with @mbforbes's assessment: We want to be able to test node upgrades. To do this, we either need to first upgrade the master, or somehow start with a cluster that has skew (there isn't currently a way to do this). So I'd consider #8082 to either be blocking this issue, or that they can be combined into a single test. |
/sub |
I'm happy to work on this, but we had explicitly split this off in case I'm holed up doing other upgrade work because it's pretty separable. (In other words, if you're reading this, it's unblocked, I haven't written another comment that I'm actively working on this, and you want to work on it, please feel free to self-assign.) |
OK, here is the plan: Create a specific test that will be skipped by default that goes through the following process (where "validate" means "ensure the resources exist and function correctly"):
Run this test, alone, on Jenkins:
We'll have to add more release validation jobs as we do more releases; this is a cost we can eat for a while. I'll own the PR to write this test, the GCE project creation and Jenkins setup/config to ensure this test is running. Follow-up extra credit involves testing more things (secrets, volumes, persistent volumes, ...+?). A non-goal for this issue is to upgrade the objects themselves (their serialized format on etcd). This is important but I think out of the scope of this issue. (Someone is likely addressing this elsewhere.) @roberthbailey @alex-mohr let me know if this doesn't sound good. |
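To make the shape of such a test concrete, here is a minimal sketch, not the project's actual e2e code. It assumes only what the thread states: the test is skipped by default, the master is upgraded before the nodes, and "validate" means checking that resources exist and function. The `Cluster` class, the function names, and the `RUN_UPGRADE_TEST` environment variable are all hypothetical, and the detailed step list of the plan is not reproduced here.

```python
import os
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Cluster:
    """Hypothetical in-memory stand-in for a real cluster."""
    master_version: str
    node_version: str
    log: List[str] = field(default_factory=list)


def validate(cluster: Cluster, stage: str) -> None:
    # Placeholder for "ensure the resources exist and function correctly".
    # A real test would create and exercise workloads here.
    cluster.log.append(f"validate:{stage}")


def upgrade_master(cluster: Cluster, target: str) -> None:
    cluster.master_version = target
    cluster.log.append("upgrade:master")


def upgrade_nodes(cluster: Cluster, target: str) -> None:
    # The thread notes the master is upgraded first, so refuse to move
    # the nodes to a version the master is not already running.
    if cluster.master_version != target:
        raise RuntimeError("upgrade the master before the nodes")
    cluster.node_version = target
    cluster.log.append("upgrade:nodes")


def run_upgrade_test(cluster: Cluster, target: str,
                     enabled: Optional[bool] = None) -> bool:
    """Run the upgrade sequence; return False if the test was skipped."""
    if enabled is None:
        # Skipped by default: only runs when explicitly switched on.
        enabled = os.environ.get("RUN_UPGRADE_TEST") == "1"
    if not enabled:
        cluster.log.append("skipped")
        return False
    validate(cluster, "before")
    upgrade_master(cluster, target)
    validate(cluster, "after-master")
    upgrade_nodes(cluster, target)
    validate(cluster, "after-nodes")
    return True
```

Validating after each step, rather than only at the end, means a failure can be attributed to the master upgrade or the node upgrade specifically.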
Sorry, wrong issue. I meant #8081. |
Thanks for the pointer—seems super related, not just vaguely! |
As an update, I'm going to do this for the GKE provider first, as |
Tiny update as I see this is now a P0: I've been working on the e2e test code today and will have it out for review today or tomorrow. Regarding Jenkins, we've been rethinking how to run upgrade tests there so that they involve just two or three Jenkins jobs that can run in parallel rather than six sequential ones; more details in this comment. The Jenkins tests that would close this issue would run in one of the slots outlined there. Regarding the test extensions I mentioned in #8081, my first PR for this will close out number 2 for sure, possibly more. |
Now that #9987 is merged, the remaining work for this issue is to
|
Both GCE and GKE upgrade (including node upgrade) builds are green. The last code thing is the final unfinished item from #8081 (comment):
We don't have any released (stable) versions that include these tests, but once we do, we'll need to continuously create new Jenkins jobs that verify supported release paths are upgrade-able. This should probably be in the GCE or GKE release instructions; I'll own getting those instructions written, but it's separate from this issue. |
This is blocked by using only MIG templates #7912 and a mechanism to do updates #6088.
Overview: