Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Persist Bindings annotations alongside pod Host assignment #5005

Merged
merged 1 commit into from
Mar 16, 2015
Merged

Persist Bindings annotations alongside pod Host assignment #5005

merged 1 commit into from
Mar 16, 2015

Conversation

jdef
Copy link
Contributor

@jdef jdef commented Mar 3, 2015

as per @bgrant0607:

Status must be completely reconstructable based on observations. The proposed Pod.Status.Annotations would not be. Furthermore, I don't think we should introduce another annotations field to Pod. I'd prefer to add a podAnnotations field to Binding, and then merge them into the pod. As for clearing the annotations, I'd prefer to simply fail a pod that was rejected by a node so as not to have to deal with that.

In this PR, Bindings.Annotations are merged with those of the Pod, overwriting the values of any pre-existing keys. If the binding fails, the pod is flagged as failed.

Also added error checking in Binding creation, returning a more appropriate Status object on errors.

cc @smarterclayton @erictune

@googlebot
Copy link

Thanks for your pull request.

It looks like this may be your first contribution to a Google open source project, in which case you'll need to sign a Contributor License Agreement (CLA) at https://cla.developers.google.com/.

If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check the information on your CLA or see this help article on setting the email on your git commits.

Once you've done that, please reply here to let us know. If you signed the CLA as a corporation, please let us know the company's name.

@jdef
Copy link
Contributor Author

jdef commented Mar 3, 2015

@jdef
Copy link
Contributor Author

jdef commented Mar 3, 2015

FWIW I've already signed the CLA

@jdef
Copy link
Contributor Author

jdef commented Mar 3, 2015

@bgrant0607 bgrant0607 self-assigned this Mar 3, 2015
@jdef jdef changed the title Persist Bindings annotations alongside pod Host assignment WIP: Persist Bindings annotations alongside pod Host assignment Mar 4, 2015
@smarterclayton
Copy link
Contributor

I can do the review on this

@jdef
Copy link
Contributor Author

jdef commented Mar 4, 2015

I applied this patch to the k8s code that I'm running in kubernetes-mesos and the annotations are still not showing up in the pod: they're missing in the pod json returned by the apiserver and are also not present in etcd. Digging in...

Found the problem: PEBKAC. It's actually working as expected.

@jdef jdef changed the title WIP: Persist Bindings annotations alongside pod Host assignment Persist Bindings annotations alongside pod Host assignment Mar 4, 2015
// can't really be helped, since there's not really a way to do atomic
// multi-object changes in etcd.
if _, err2 := r.setPodHostTo(ctx, podID, machine, ""); err2 != nil {
glog.V(1).Infof("failing pod %v because constraint checks failed", podID)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't make this v1. It will happen too frequently. I suggest v4.

@bgrant0607
Copy link
Member

Gratefully punting review to @smarterclayton, who is also touching /bindings right now.

This behavior needs to be documented, ideally somewhere that would appear in the swagger (e.g., in the description on the ObjectMeta field for Binding), as well as in a .md doc. Note that descriptions on fields for v1beta3 are in flight.

@jdef
Copy link
Contributor Author

jdef commented Mar 6, 2015

related work in #5054

@smarterclayton
Copy link
Contributor

Sorry for the delay, will get to this very soon.

err2 = r.store.Helper.AtomicUpdate(podKey, &api.Pod{}, false, func(obj runtime.Object) (runtime.Object, error) {
if pod, ok := obj.(*api.Pod); !ok {
return nil, fmt.Errorf("unexpected object: %#v", obj)
} else if pod.Status.Host != machine {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove the else's here? Since you're returning they're not necessary and it fits our style more (if the if block returns, never use else).

@smarterclayton
Copy link
Contributor

Just my comment and Brian's I think. Sorry again for the delay.

@bgrant0607
Copy link
Member

Fails gofmt:

!! gofmt needs to be run on the following files: 
./pkg/registry/pod/etcd/etcd.go

}
pod.Status.Host = "" //TODO(jdef) get rid of this once BoundPods goes away
pod.Status.Phase = api.PodFailed
pod.Status.Message = fmt.Sprintf("Assignment to host %v failed", machine)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I was too terse last time. Please also populate Reason with something like HostAssignmentFailed. The more detailed message is useful, but the concise form (in particular, without the hostname) is useful for stats collection.

@jdef
Copy link
Contributor Author

jdef commented Mar 9, 2015

@bgrant0607 It looks like reason codes are pretty specific and should closely track HTTP codes. So I changed the pod.status.message to what you suggested, and set the api.status.reason to Conflict. The api.status.message will include a more detailed description of what went wrong.

Sound ok?

err = etcderr.InterpretCreateError(err, "binding", "")
out = &api.Status{Status: api.StatusSuccess}
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm, this is somewhat wrong, because InterpretCreateError may already create an appropriate api.Status. If assignPod needs to return special errors, that should happen in that method (using errors.NewConflict)

@jdef
Copy link
Contributor Author

jdef commented Mar 9, 2015

@smarterclayton just re-read your comment after I pushed the "clean up error handling" change. PTAL and let me know if you still think I should create NewConflict errors in assignPod.

err = etcderr.InterpretCreateError(err, "binding", "")
err = r.assignPod(ctx, binding.PodID, binding.Host, binding.Annotations)
if err = etcderr.InterpretCreateError(err, "binding", ""); err != nil {
if _, ok := err.(*errors.StatusError); !ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine for now. Bindings internal storage will be cleanup up over successive iterations so we don't need the perfect solution now.

@smarterclayton
Copy link
Contributor

This looks good to me - can you go ahead and squash?

@jdef
Copy link
Contributor Author

jdef commented Mar 9, 2015

squashed

On Mon, Mar 9, 2015 at 11:30 AM, Clayton Coleman notifications@github.com
wrote:

This looks good to me - can you go ahead and squash?


Reply to this email directly or view it on GitHub
#5005 (comment)
.

James DeFelice
585.241.9488 (voice)
650.649.6071 (fax)

@smarterclayton
Copy link
Contributor

Will merge after tests pass again.

@smarterclayton
Copy link
Contributor

Hrm, seeing a transient failure that looks related to this pull (not sure if it's coincidence or enemy action):

--- FAIL: TestEtcdCreateBindingNoPod (0.00s)
    <autogenerated>:29: returning /registry/pods/default/foo: &etcd.Response{Action:"", Node:(*etcd.Node)(nil), PrevNode:(*etcd.Node)(nil), EtcdIndex:0x0, RaftIndex:0x0, RaftTerm:0x0} &etcd.EtcdError{ErrorCode:100, Message:"", Cause:"", Index:0x0}
    etcd_test.go:873: Unexpected error returned: &errors.StatusError{ErrStatus:api.Status{TypeMeta:api.TypeMeta{Kind:"", APIVersion:""}, ListMeta:api.ListMeta{SelfLink:"", ResourceVersion:""}, Status:"Failure", Message:"binding \"\" cannot be updated: 100:  () [0]", Reason:"Conflict", Details:(*api.StatusDetails)(0xc20817eec0), Code:409}}
FAIL
coverage: 87.1% of statements

@jdef
Copy link
Contributor Author

jdef commented Mar 9, 2015

Funny, I think it's related to another PR of mine that was just merged. I'll take a look

@jdef
Copy link
Contributor Author

jdef commented Mar 10, 2015

@smarterclayton I'm not sure what's holding up the coveralls test. From what I can see the tests pass.

@smarterclayton
Copy link
Contributor

K, will look in a bit

On Mar 10, 2015, at 3:58 PM, James DeFelice notifications@github.com wrote:

@smarterclayton I'm not sure what's holding up the coveralls test. From what I can see the tests pass.


Reply to this email directly or view it on GitHub.

}
pod.Status.Host = "" //TODO(jdef) get rid of this once BoundPods goes away
pod.Status.Phase = api.PodFailed
pod.Status.Message = "HostAssignmentFailed"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pod status message should be more informative, such as including the host. Search the code for examples.

@bgrant0607
Copy link
Member

Don't worry about coveralls. It appears to be wedged.

@jdef
Copy link
Contributor Author

jdef commented Mar 11, 2015

Thanks for the feedback @bgrant0607. I've pushed some changes to address your concerns.

@jdef
Copy link
Contributor Author

jdef commented Mar 11, 2015

PTAL @bgrant0607 @smarterclayton

@@ -1025,6 +1025,13 @@ const (
// Status code 409
StatusReasonConflict StatusReason = "Conflict"

// StatusReasonResourceConflict means the requested update operation cannot be completed
// due to a resource conflict in the operation. The client may need to alter the request.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about "due to conflicting usage of underlying system/infrastructure resources".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

On Wed, Mar 11, 2015 at 6:29 PM, Brian Grant notifications@github.com
wrote:

In pkg/api/types.go
#5005 (comment)
:

@@ -1025,6 +1025,13 @@ const (
// Status code 409
StatusReasonConflict StatusReason = "Conflict"

  • // StatusReasonResourceConflict means the requested update operation cannot be completed
  • // due to a resource conflict in the operation. The client may need to alter the request.

How about "due to conflicting usage of underlying system/infrastructure
resources".


Reply to this email directly or view it on GitHub
https://github.com/GoogleCloudPlatform/kubernetes/pull/5005/files#r26262653
.

James DeFelice
585.241.9488 (voice)
650.649.6071 (fax)

@jdef
Copy link
Contributor Author

jdef commented Mar 12, 2015

addressed your concerns @bgrant0607 PTAL

@bgrant0607
Copy link
Member

Code LGTM.

Are you able to run the e2e test? The change to write Spec.Host in setPodHostTo broke it, due to somewhat fragile assumptions.

We just added documentation of status codes to docs/api-conventions.md. Could you please add documentation on ResourceConflict there?

@bgrant0607
Copy link
Member

Alternatively, we could remove ResourceConflict and change back to Conflict. Host mismatch or etcd collision are the only reasons for failure now, since we removed hostport conflict detection.

We'll likely want to add back some best-effort constraint checking at some point. See:
#5207 (comment)

But we don't have it now.

I'll leave it up to you.

cc @jszczepkowski

clarify resource conflict status, rebase to master
remove ResourceConflict, replace usage with Conflict
@jdef
Copy link
Contributor Author

jdef commented Mar 13, 2015

Removed ResourceConflict. I don't want to muddy the water with additional
constraint checking here.

On Thu, Mar 12, 2015 at 9:43 PM, Brian Grant notifications@github.com
wrote:

Alternatively, we could remove ResourceConflict and change back to
Conflict. Host mismatch or etcd collision are the only reasons for failure
now, since we removed hostport conflict detection.

We'll likely want to add back some best-effort constraint checking at some
point. See:
#5207 (comment)
#5207 (comment)

But we don't have it now.

I'll leave it up to you.

cc @jszczepkowski https://github.com/jszczepkowski


Reply to this email directly or view it on GitHub
#5005 (comment)
.

James DeFelice
585.241.9488 (voice)
650.649.6071 (fax)

@jdef
Copy link
Contributor Author

jdef commented Mar 13, 2015

@bgrant0607 any final thoughts?

@jdef
Copy link
Contributor Author

jdef commented Mar 16, 2015

PTAL @bgrant0607 @smarterclayton

@bgrant0607
Copy link
Member

LGTM

bgrant0607 added a commit that referenced this pull request Mar 16, 2015
Persist Bindings annotations alongside pod Host assignment
@bgrant0607 bgrant0607 merged commit 4075891 into kubernetes:master Mar 16, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Binding annotations should be preserved alongside Host assignment
4 participants