-
Notifications
You must be signed in to change notification settings - Fork 40.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Job resource - controller logic #13259
Job resource - controller logic #13259
Conversation
GCE e2e build/test passed for commit 0a2f62e0cfcb78f5e361b681fbdab2f3cd7b6e84. |
Labelling this PR as size/XXL |
0a2f62e
to
001c9c5
Compare
GCE e2e build/test passed for commit 001c9c56e6736f3bf63a68d76c8ce7b983d7f6b8. |
001c9c5
to
0224b7e
Compare
Rebased against latest master to pick up the client changes, esp. the experimental part needed for testing. |
GCE e2e build/test passed for commit 0224b7e1257bd701414b6cec2d946c887cb2531d. |
0224b7e
to
6c8eb65
Compare
The logic itself, looks finished. All is left is integration tests for the controller. |
GCE e2e build/test passed for commit 6c8eb655bf77375078e003be433cf331ae66f937. |
@bgrant0607 told me that he likes how you are computing successes and failures by listing pods. So that can stay. |
@soltysh
Anything else you can think of? |
I currently working on:
It would be great if you could write the user docs, I do have some examples, so I can create a separate PR with that. |
Okay, will work on user docs. |
6c8eb65
to
137ba8f
Compare
@erictune this is rebased controller part (I've also added new testclient action to simplify testing client, the other commit in this PR). I'm currently working on e2e tests, reaper and scaler and examples. I'll create separate PRs for those, to make the review easier, unless you want me to include them in here? |
137ba8f
to
5e522b7
Compare
GCE e2e build/test failed for commit 137ba8f3f924ec860be60387576e42f2359b9362. |
GCE e2e build/test failed for commit 5e522b7be264046bd9dcd69eadd9e8da5d177624. |
|
||
func (jm *JobManager) manageJob(activePods []*api.Pod, successful, unsuccessful int, job *experimental.Job) int { | ||
active := len(activePods) | ||
parallelism := *job.Spec.Parallelism |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dereferencing without a nil check worries me. Is everyone else satisfied with knowing that if the object is created via conversion from a serialized type and if no one ever clears the field, this works? It worries me that this code depends on a particular path of building its objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems okay to me to skip nil checks in private methods.
do the e2e pass when you run them yourself, e.g. on vagrant? |
}, | ||
&experimental.Job{}, | ||
replicationcontroller.FullControllerResyncPeriod, | ||
framework.ResourceEventHandlerFuncs{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a follow-up PR, consider factoring below inline function to a named function.
LGTM. |
the separation of commits is fine. |
diff := *job.Spec.Completions - successful | ||
// for RestartPolicyNever we need to count unsuccessful pods as well | ||
if job.Spec.Template.Spec.RestartPolicy == api.RestartPolicyNever { | ||
diff -= unsuccessful |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this differs from the API docs. Fix the API docs in a separate PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or make this match the doc and the design, and we can revisit adding failed completion after further design discussion.
@k8s-bot test this [submit-queue is verifying that this PR is safe to merge] |
GCE e2e build/test passed for commit 8cefa2e. |
Automatic merge from SubmitQueue |
Auto commit by PR queue bot
"k8s.io/kubernetes/pkg/watch" | ||
) | ||
|
||
type JobManager struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be called JobController
@mikedanese I'll be addressing your and @erictune's comments next week in a followup PR. |
This is 3rd part of the job resource trilogy. Depends on #12910 and #13254.
@sdminonne @akram this is providing basic running mechanism for jobs, I'm still working on it, but it's getting better 😉
@smarterclayton @derekwaynecarr fyi