Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

addon updater should not retry too many times - specs may be invalid #10237

Merged

Conversation

marekbiskup
Copy link
Contributor

Fixes #10136 (error was found in production code, not in the test)

Risk of regression: small (e2e test passes; this is only retry on error; another retry is performed in 10 minutes, so cluster should recover even if these retries fail)

The patch is not so important for production code, but our e2e test fails because of a race condition (that may appear in production too)

@k8s-bot
Copy link

k8s-bot commented Jun 23, 2015

GCE e2e build/test failed for commit 68ffa0bd5792279f051c66a01897cf1282bceda2.

@marekbiskup marekbiskup force-pushed the addon-update-do-not-retry-too-long branch from 68ffa0b to 4980d87 Compare June 24, 2015 07:16
@@ -181,7 +181,7 @@ spec:
`

var addonTestPollInterval = 3 * time.Second
var addonTestPollTimeout = 3 * time.Minute
var addonTestPollTimeout = 1 * time.Minute
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rollback of my previous "fix".

@marekbiskup
Copy link
Contributor Author

@zmerlynn , you reviewed the add-on updater. Could you also be the first reviewer of this fix?

@k8s-bot
Copy link

k8s-bot commented Jun 24, 2015

GCE e2e build/test failed for commit 4980d8786b894462f20ab5682b5cf1e3a80d1e85.

@marekbiskup
Copy link
Contributor Author

@k8s-bot test this please

@zmerlynn
Copy link
Member

LGTM (edited by @marekbiskup)

@zmerlynn zmerlynn added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 26, 2015
@marekbiskup
Copy link
Contributor Author

@k8s-bot test this please

@k8s-bot
Copy link

k8s-bot commented Jun 29, 2015

GCE e2e build/test failed for commit 4980d8786b894462f20ab5682b5cf1e3a80d1e85.

@marekbiskup
Copy link
Contributor Author

Investigating errors - may be patch-related. Do not merge.

@marekbiskup marekbiskup assigned marekbiskup and unassigned zmerlynn Jun 29, 2015
@marekbiskup
Copy link
Contributor Author

Indeed, failures are patch-related. The API server may be down and creating addons may fail. Retrying after 10 minutes is too late then.

Conclusion - we'd better retry more often, But it doesn't make any sense to retry a single execution of kube-addon-updater.sh than the interval between consecutive invocations of this script.
I'll modify the production code only a bit and update the test instead.

@marekbiskup marekbiskup force-pushed the addon-update-do-not-retry-too-long branch from 4980d87 to ac74376 Compare June 30, 2015 08:50
@k8s-bot
Copy link

k8s-bot commented Jun 30, 2015

GCE e2e build/test passed for commit ac7437644d43e3a4fa39d0244d43532ff6bbe922.

@marekbiskup marekbiskup removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 1, 2015
@marekbiskup
Copy link
Contributor Author

Ready for review.
It blocks #10618, which blocks #10487, so it has become urgent.

@@ -266,6 +271,12 @@ function create-objects() {
local -r file_paths=$2
local file_path
for file_path in ${file_paths}; do
# Remember that the file may have disappear by now
# But we don't want to check it here because
# such race concition may always happen after
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*condition

@zmerlynn
Copy link
Member

zmerlynn commented Jul 1, 2015

LGTM modulo nit. Needs rebase?

@marekbiskup marekbiskup force-pushed the addon-update-do-not-retry-too-long branch from ac74376 to b59477b Compare July 1, 2015 14:19
@marekbiskup
Copy link
Contributor Author

rebased

@marekbiskup marekbiskup force-pushed the addon-update-do-not-retry-too-long branch from b59477b to 7873e25 Compare July 1, 2015 14:22
@marekbiskup
Copy link
Contributor Author

and typo fixed

@zmerlynn zmerlynn added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 1, 2015
@zmerlynn
Copy link
Member

zmerlynn commented Jul 1, 2015

LGTM

@zmerlynn
Copy link
Member

zmerlynn commented Jul 1, 2015

cc @alex-mohr @quinton-hoole for rubber stamps.

@k8s-bot
Copy link

k8s-bot commented Jul 1, 2015

GCE e2e build/test passed for commit b59477b8494f4fc2dd029a54283f8f810cc59084.

@zmerlynn
Copy link
Member

zmerlynn commented Jul 1, 2015

(In case not immediately obvious, stamps quickly would be appreciated. This is at a bottom of a chain that goes #10237 -> #10618 -> #10487, and I can only chase so many LGTM/merges today.)

@k8s-bot
Copy link

k8s-bot commented Jul 1, 2015

GCE e2e build/test failed for commit 7873e25.

@zmerlynn
Copy link
Member

zmerlynn commented Jul 1, 2015

@k8s-bot: retest this please

and kindly stop flaking on #10622, kthxbai.

@zmerlynn
Copy link
Member

zmerlynn commented Jul 1, 2015

@k8s-bot: retest this please

@zmerlynn
Copy link
Member

zmerlynn commented Jul 1, 2015

@k8s-bot: ok to test

@zmerlynn
Copy link
Member

zmerlynn commented Jul 1, 2015

ONCALL VIOLATING PROCESS: After careful consideration, this is at the bottom of a chain of 3 PRs related to a heavily triaged issue, and all the PRs need some incremental soak time today, in advance of the holiday weekend. If you wish to take my badge for violating process, you may. Merging "to be blessed".

zmerlynn added a commit that referenced this pull request Jul 1, 2015
…too-long

addon updater should not retry too many times - specs may be invalid
@zmerlynn zmerlynn merged commit bd12aef into kubernetes:master Jul 1, 2015
@k8s-bot
Copy link

k8s-bot commented Jul 1, 2015

GCE e2e build/test passed for commit 7873e25.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm "Looks good to me", indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants