Multiple versions of addons running after upgrade. #37641
Comments
We could roll back #36008 to avoid the RC -> Deployment issue. To solve the two heapster deployments, we would have to either keep the version=v1.1.0 label in the v1.2.0 deployment or find a label combination that doesn't cause duplicate deployments to be created. We could address this in GKE with a post-upgrade cleanup script, and note the manual correction steps in the release notes. I marked this as p0 because my counting fix won't work in 1.3, the tests are still failing, and there's a bigger issue at play. I would be OK with short-term mitigations so we don't encounter this in 1.5, and proper fixes in 1.6.
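A post-upgrade cleanup of the kind mentioned above might reduce to something like the following sketch; the label and resource names here are assumptions, so verify against the live cluster before deleting anything:

```bash
# Inspect what is actually duplicated first (the k8s-app label is an assumption):
kubectl get rc,deployments --namespace=kube-system -l k8s-app=heapster

# Then delete the stale copies by name (names are illustrative):
kubectl delete rc --namespace=kube-system heapster-v1.1.0
kubectl delete deployment --namespace=kube-system heapster-v1.1.0
```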
cc @saad-ali
We did inject a mechanism to delete the old ReplicationControllers/Deployments after an upgrade in the Add-on Manager. This https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/addon-manager/kube-addons.sh#L191-L197 is for pruning the old ReplicationControllers. The old heapster Deployment, however, has a different name, so it will not be pruned. Is there any way to retrieve the Add-on Manager's log from the GKE master?
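For context, the linked pruning step is essentially a `kubectl apply --prune` call over the addons directory; a rough sketch of its shape (paths and label are a paraphrase of the script, not verbatim):

```bash
# Approximate shape of the Add-on Manager's reconcile-and-prune step:
# objects in the addons dir are applied, and live objects carrying the
# cluster-service label that are absent from the dir become prune candidates.
kubectl apply --namespace=kube-system --recursive \
  -f /etc/kubernetes/addons \
  --prune=true -l kubernetes.io/cluster-service=true
```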
cc @mikedanese
The old-resource pruning does have a one-minute delay, though; that delay exists to support zero downtime for kube-dns. But that doesn't seem to be what happened here.
Sorry, one mistake above. If the name of the heapster Deployment changed, the current Addon Manager will not prune the old one. This could be fixed by adding one more resource type in the same place (https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/addon-manager/kube-addons.sh#L191-L197). I'm taking a look at why the old RCs were not pruned.
@MrHohn yup, that's definitely it. Ping me on a PR and I can give you a quick review.
We also need to merge #37139 to get the --prune-whitelist in.
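A sketch of what the call looks like once the whitelist from #37139 is in place, so both the legacy and the new addon resource types become prune candidates (the group/version/kind strings are assumptions for the 1.5-era API):

```bash
# Whitelist both resource types so old RCs and renamed Deployments
# can be garbage-collected by the prune pass:
kubectl apply --namespace=kube-system --recursive \
  -f /etc/kubernetes/addons \
  --prune=true -l kubernetes.io/cluster-service=true \
  --prune-whitelist core/v1/ReplicationController \
  --prune-whitelist extensions/v1beta1/Deployment
```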
Referenced the wrong issue; ignore the above ^^^^^^
Yeah, but I think #37139 alone may not fix this issue, since Addon Manager v6.0-alpha should in theory be able to prune the old RCs. I'm working on a repro on my own cluster (upgrading from 1.3 -> 1.5). I'm also checking the GCE 1.4 -> 1.5 upgrade tests here, but the Addon Manager's log looks normal.
What is currently deployed doesn't have the prune whitelist, and there are no RCs in the addons folder anymore, so RCs aren't considered for pruning. I think we need both?
You are right. I used to think there was still one ReplicationController in the addons folder --- elasticsearch-logging-v1 --- but it turns out that one is not enabled on GKE. In that case, #37139 combined with the quick fix for Deployments should do the job. Will send that PR very soon.
Automatic merge from submit-queue

Fixes Addon Manager's pruning issue for old Deployments

Fixes #37641. Attaches the `last-applied` annotations to the existing Deployments for pruning. Below images are built and pushed:
- gcr.io/google-containers/kube-addon-manager:v6.1
- gcr.io/google-containers/kube-addon-manager-amd64:v6.1
- gcr.io/google-containers/kube-addon-manager-arm:v6.1
- gcr.io/google-containers/kube-addon-manager-arm64:v6.1
- gcr.io/google-containers/kube-addon-manager-ppc64le:v6.1

@mikedanese cc @saad-ali @krousey
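For reference, `kubectl apply --prune` only considers live objects that carry the `kubectl.kubernetes.io/last-applied-configuration` annotation. One hedged way to attach it to a pre-existing object is to round-trip the object through `kubectl apply` (the Deployment name below is illustrative, not necessarily what the PR does):

```bash
# Re-applying the live object makes kubectl record the last-applied
# annotation, so later --prune passes can see (and delete) it:
kubectl get deployment heapster-v1.2.0 --namespace=kube-system -o yaml \
  | kubectl apply -f -
```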
Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see http://kubernetes.io/docs/troubleshooting/.): No
What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.): None
Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT
Kubernetes version (use `kubectl version`):

Environment:
What happened: Upgrades to version 1.5 (from any previous version) change existing addons from ReplicationControllers to Deployments without deleting the old ReplicationControllers. This leads to multiple versions of the addons running at the same time. There also seem to be multiple heapster deployments.
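A quick diagnostic (not from the original report) to see the duplication described above:

```bash
# Old RCs and new Deployments listed side by side in kube-system:
kubectl get rc,deployments --namespace=kube-system
```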
I found this via a counting error in https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/kubernetes-e2e-gke-container_vm-1.3-container_vm-1.5-upgrade-cluster/337. At first I thought the test was simply counting incorrectly, and I attempted to fix that in #36924. That fix is still valid and an improvement, but the underlying problem of multiple versions of addons still running is the real concern.
We need a mechanism to delete the old ReplicationControllers/Deployments after an upgrade.