
kubectl apply retry stale resource version #26557

Merged
merged 1 commit into from
Jun 13, 2016

Conversation

adohe-zz

@adohe-zz adohe-zz commented May 31, 2016

kubectl apply: retry applying a patch if a version conflict error is encountered


fixes #15493
@pwittrock I just got my original implementation back, ptal.

@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. release-note-label-needed labels May 31, 2016
@pwittrock pwittrock assigned pwittrock and unassigned brendandburns May 31, 2016
@pwittrock
Member

@adohe Thanks, looking.

return err
}

func patchResource(p *patcher, current runtime.Object, modified []byte, source, namespace, name string) error {
Member

Why not make this a function on patcher?
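The suggestion amounts to moving the explicit `*patcher` parameter into a method receiver. A minimal sketch of that refactor, with illustrative field and method names rather than the actual kubectl code:

```go
package main

import "fmt"

// Hypothetical trimmed-down patcher, showing the free function rewritten as
// a method per the review comment. Fields and names are illustrative only.
type patcher struct {
	namespace string
}

// Before: func patchResource(p *patcher, current runtime.Object, modified []byte, ...) error
// After:  the receiver replaces the explicit *patcher parameter.
func (p *patcher) patch(name string, modified []byte) error {
	fmt.Printf("patching %s/%s with %d bytes\n", p.namespace, name, len(modified))
	return nil
}

func main() {
	p := &patcher{namespace: "default"}
	if err := p.patch("my-rc", []byte(`{"spec":{}}`)); err != nil {
		fmt.Println("patch failed:", err)
	}
}
```

Callers then write `p.patch(...)` instead of threading the patcher through as the first argument.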

@pwittrock
Member

Review pass done. Would you add a test for this?

@adohe-zz
Author

adohe-zz commented Jun 2, 2016

Yes, I will update this today.

@pwittrock pwittrock added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Jun 3, 2016
@pwittrock
Member

Also add a release-note label, please.

@pwittrock
Member

@adohe Send a PTAL when you want me to take another look.

@pwittrock
Member

@adohe Do you plan to have this in before the 1.3 release branch is cut? We are working aggressively on test flakes and it would be good to get this in asap.

@pwittrock pwittrock added this to the v1.3 milestone Jun 7, 2016
@adohe-zz
Author

adohe-zz commented Jun 7, 2016

@pwittrock quite sorry about the delay. I have a short holiday starting tomorrow, so I can get away from my company work and will have more time for this; I will finish it tomorrow.

@pwittrock
Member

@adohe Thanks, much appreciated. I don't want to disrupt your holiday if you had other plans. If you don't have time to work on this and would prefer to have someone else pick it up and finish it, let me know. Otherwise I will try to be as responsive as possible while you are working on this tomorrow.

@adohe-zz
Author

adohe-zz commented Jun 8, 2016

@pwittrock I just changed the code, ptal. I think the test is a little bit ugly; do you have any better ideas? Also, should we apply the same to kubectl patch?

@pwittrock
Member

Looking. It would be good to do the same for patch, but let's do it as a follow-up.

var getErr error
patchBytes, err := p.patchSimple(current, modified, source, namespace, name)
for i := 0; i < maxPatchRetry && errors.IsConflict(err); i++ {
if p.backOff.IsInBackOffSinceUpdate(backOffKey, p.backOff.Clock.Now()) {
Member

@pwittrock pwittrock Jun 8, 2016

Wow, this is a cool utility. I think it may be overkill for what we are trying to do here, since it seems primarily for synchronizing rate limits across concurrent routines. At a minimum, we should sleep for the backoff duration (sleep p.backOff.Get(backOffKey)). While I really appreciate you showing me this utility, if we don't need the concurrency protection, I'd prefer to go with something simpler, such as:

Patcher {
...
backoff clockwork.Clock // = clockwork.NewRealClock()
}

...
if i > p.triesBeforeBackoff {
  p.backoff.Sleep(p.duration)
}
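The simpler shape sketched above can be made concrete as a retry-on-conflict loop with a plain sleep-based backoff. This is a hedged, self-contained illustration: the constants and the `errConflict` sentinel are made up for the example, whereas the real code checks conflicts via apimachinery's `errors.IsConflict`.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Illustrative sentinel standing in for an apiserver version-conflict error.
var errConflict = errors.New("conflict: resource version is stale")

// Illustrative tuning knobs, not the actual kubectl values.
const (
	maxPatchRetry      = 5
	triesBeforeBackoff = 1
	backoffPeriod      = 10 * time.Millisecond
)

// patchWithRetry retries apply() while it returns a conflict error,
// sleeping a fixed backoff once the first few quick tries are exhausted.
func patchWithRetry(apply func() error) error {
	var err error
	for i := 0; i < maxPatchRetry; i++ {
		if i > triesBeforeBackoff {
			time.Sleep(backoffPeriod)
		}
		if err = apply(); !errors.Is(err, errConflict) {
			return err // success, or a non-conflict error we should not retry
		}
	}
	return err
}

func main() {
	tries := 0
	err := patchWithRetry(func() error {
		tries++
		if tries < 3 {
			return errConflict // first two attempts hit a stale resource version
		}
		return nil
	})
	fmt.Println(err, tries) // succeeds on the third attempt
}
```

Each retry should re-GET the object so the patch is computed against the latest resource version, which is what the review asks the test to verify.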

@pwittrock
Member

pwittrock commented Jun 8, 2016

@adohe The test isn't beautiful, but I don't think it is much worse than the other tests in the same file. I'd leave it and consider cleaning things up a bit in a follow up if there is a clear way of doing so. Low priority.

Would love to see the same treatment to patch when you have time. I would do that as a separate PR since it is lower priority than this.

Minor comments, otherwise looking good:

  • Consider simplifying the backoff object used, I think what you have is overkill and more complex than needed
  • Don't actually sleep in the tests when backing off, use a fake backoff / clock impl or set the time to 0
  • Have the test also check that kubectl calls GET a second time and uses this result for the second patch (not the first GET result)
  • Fix the govet issue

@adohe-zz
Author

adohe-zz commented Jun 9, 2016

@pwittrock some minor thoughts here:

  • For simplicity, patcher uses a simple backoff object. Should we make backOffPeriod and triesBeforeBackoff fields of patcher? I wanted to make them so, even though they aren't now.

I don't want to delay this PR any further on shaving 1 second of the test time, but in a follow up I would:

  • Make the backoffPeriod a field of the ApplyCmd
  • Pass the backoffPeriod from ApplyCmd to the Patcher when creating it
  • In the test set the backoffPeriod on ApplyCmd to 0
  • It's a little bit complicated to return a different currentRC in the test, so I would keep the test here and update it as a follow up.

SGTM. Thanks.

  • If we want to use a fake clock in test, we need to move patcher out of apply.

Don't worry about this for now. Generally, I would like to get to a point where it is simple for us to actually test we are backing off correctly, and this is really best done with a fake clock. I am not sure if the clockwork FakeClock really gives us what we want in this area anyway and we may have to come up with something more tailored to our needs. Ideally calls to sleep would return immediately, but record the sleep call and interval, and then during the test we could query how many times sleep was called and for how long. Definitely not critical right now, but would be good to have.

@pwittrock pwittrock added lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-label-needed labels Jun 9, 2016
@j3ffml j3ffml added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 9, 2016
@pwittrock
Member

@k8s-bot unit test this issue: #26966


@adohe-zz
Author

@pwittrock thanks for detailed explanation.

@pwittrock
Member

@adohe My pleasure.

@pwittrock
Member

ERROR: Step ‘Publish JUnit test result report’ failed: no workspace for kubernetes-pull-build-test-e2e-gce #44180
[PostBuildScript] - Execution post build scripts.
ERROR: Build step failed with exception
java.lang.NullPointerException: no workspace from node hudson.slaves.DumbSlave[agent-pr-3] which is computer hudson.slaves.SlaveComputer@13a20a95 and has channel null

@adohe-zz
Author

Seems there is nothing I can do about this test failure. :(

@pwittrock
Member

@k8s-bot e2e test this issue: #IGNORE Jenkins NPE

@pwittrock
Member

@k8s-bot unit test this issue: #IGNORE never ran

1 similar comment
@adohe-zz
Author

@k8s-bot unit test this issue: #IGNORE never ran

@pwittrock
Member

@k8s-bot test this issue: #IGNORE Timeout

@k8s-bot

k8s-bot commented Jun 13, 2016

GCE e2e build/test passed for commit 69bcdc2.

@k8s-github-robot

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-bot

k8s-bot commented Jun 13, 2016

GCE e2e build/test passed for commit 69bcdc2.

@k8s-github-robot

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit 0d02f8c into kubernetes:master Jun 13, 2016
@adohe-zz adohe-zz deleted the patch_retry branch June 14, 2016 00:38
@adohe-zz adohe-zz restored the patch_retry branch June 15, 2016 06:34
@adohe-zz adohe-zz deleted the patch_retry branch June 15, 2016 06:37
Successfully merging this pull request may close these issues.

kubectl apply should retry stale resource version.
8 participants