Define a set of methods (a Go interface) that shapes the default behaviors of a job. In addition, provide a full controller and one example implementation (based on the Kubernetes Job) for developers to follow when building their own controllers. This will help the community integrate with Kueue more easily.
From day 0, Kueue has natively supported the Kubernetes Job by leveraging its suspend capability. This lets us build a multi-tenant job queueing system in Kubernetes, which is attractive to other job-like applications, such as MPIJob, that lack queueing capabilities.
The good news is that Kueue is extensible and simple to integrate with through an intermediate object, which Kueue calls a Workload. All we need to do is build a controller that reconciles the workload and the job itself.
The difficulty is that developers who are familiar with job-like applications may know little about Kueue's implementation details and may not know where to start when building the controller. If we provide an interface that defines the default behaviors of a job, and offer the Kubernetes Job as a standard template, it will help them considerably.
- Define an interface which shapes the default behaviors of Job
- Provide a full controller implementation which can be reused for different jobs
- Make Kubernetes Job an implementation template of the interface
- Integrate any job-like applications
We collected feedback from the community about how to fully integrate Kueue with MPIJob, see #499.
- The Job interface defined here is a hint for developers building their own controllers. It is a hard constraint only if they wish to use the provided controller; they can always write their controllers from scratch.
- This will increase code complexity by wrapping the original Jobs.
Providing a full controller may cause the interface to change more frequently. We can keep the interface as small as possible to mitigate this.
We will define a new interface named GenericJob, which should be implemented by custom job-like applications:
type GenericJob interface {
	// Object returns the job instance.
	Object() client.Object
	// IsSuspended returns whether the job is suspended or not.
	IsSuspended() bool
	// Suspend will suspend the job.
	Suspend() error
	// UnSuspend will unsuspend the job.
	UnSuspend() error
	// InjectNodeSelectors will inject the node selectors extracted from the workload into the job.
	InjectNodeSelectors(nodeSelectors []map[string]string) error
	// RestoreNodeSelectors will restore the original node selectors of the job.
	RestoreNodeSelectors(nodeSelectors []map[string]string) error
	// Finished returns whether the job has completed or failed.
	Finished() (finished bool)
	// PodSets returns the podSets corresponding to the job.
	PodSets() []kueue.PodSet
	// EquivalentToWorkload validates whether the workload is semantically equal to the job.
	EquivalentToWorkload(wl kueue.Workload) bool
	// PriorityClass returns the job's priority class name.
	PriorityClass() string
	// QueueName returns the name of the queue the job is enqueued in.
	QueueName() string
	// IgnoreFromQueueing returns whether the job should be ignored in queueing, e.g. because it lacks a queueName.
	IgnoreFromQueueing() bool
	// PodsReady reports whether all pods derived from the job are ready.
	PodsReady() bool
}
We'll wrap batchv1.Job in a BatchJob type, which implements the GenericJob interface.
type BatchJob struct {
	batchv1.Job
}
var _ GenericJob = &BatchJob{}
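As a rough illustration of what a few of the BatchJob methods could look like, the sketch below replaces batchv1.Job with a hypothetical, stripped-down `fakeJob` type so that it runs with the standard library alone. The method bodies are assumptions rather than the final implementation: the sketch assumes a batch job maps to exactly one pod set, and it models restoring selectors by overwriting them with the supplied original values. The `zone` label is made up for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeJobSpec and fakeJob are hypothetical stand-ins for the parts of
// batchv1.Job that this sketch touches.
type fakeJobSpec struct {
	Suspend      *bool
	NodeSelector map[string]string
}

type fakeJob struct {
	Spec fakeJobSpec
}

// BatchJob wraps the job, mirroring the wrapper proposed above.
type BatchJob struct {
	fakeJob
}

// errPodSetCount is returned when the caller passes selectors for more
// or fewer than one pod set (an assumption of this sketch).
var errPodSetCount = errors.New("expected node selectors for exactly one pod set")

// IsSuspended reports whether the suspend field is set to true.
func (b *BatchJob) IsSuspended() bool {
	return b.Spec.Suspend != nil && *b.Spec.Suspend
}

// Suspend sets the suspend field to true.
func (b *BatchJob) Suspend() error {
	suspend := true
	b.Spec.Suspend = &suspend
	return nil
}

// UnSuspend sets the suspend field to false.
func (b *BatchJob) UnSuspend() error {
	suspend := false
	b.Spec.Suspend = &suspend
	return nil
}

// InjectNodeSelectors merges the given selectors into the job's own.
func (b *BatchJob) InjectNodeSelectors(nodeSelectors []map[string]string) error {
	if len(nodeSelectors) != 1 {
		return errPodSetCount
	}
	if b.Spec.NodeSelector == nil {
		b.Spec.NodeSelector = map[string]string{}
	}
	for k, v := range nodeSelectors[0] {
		b.Spec.NodeSelector[k] = v
	}
	return nil
}

// RestoreNodeSelectors replaces the job's selectors with a copy of the
// supplied original values.
func (b *BatchJob) RestoreNodeSelectors(nodeSelectors []map[string]string) error {
	if len(nodeSelectors) != 1 {
		return errPodSetCount
	}
	restored := make(map[string]string, len(nodeSelectors[0]))
	for k, v := range nodeSelectors[0] {
		restored[k] = v
	}
	b.Spec.NodeSelector = restored
	return nil
}

func main() {
	job := &BatchJob{}
	_ = job.Suspend()
	_ = job.InjectNodeSelectors([]map[string]string{{"zone": "a"}})
	fmt.Println(job.IsSuspended(), job.Spec.NodeSelector)
}
```

Nesting the suspend flag under `Spec`, as batchv1.Job does, also keeps the field from clashing with the `Suspend()` method on the wrapper.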
In addition, we'll provide a full controller for developers to follow; all they need to do is implement the GenericJob interface.
type reconcileOptions struct {
	client                     client.Client
	scheme                     *runtime.Scheme
	record                     record.EventRecorder
	manageJobsWithoutQueueName bool
	waitForPodsReady           bool
}
func GenericReconcile(ctx context.Context, req ctrl.Request, job GenericJob, opts reconcileOptions) (ctrl.Result, error) {
	// generic logic here
}
// Taking batchv1.Job as an example, all we need to do is call GenericReconcile().
func (r *JobReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	var batchJob BatchJob
	var opts reconcileOptions // populated from the reconciler's configuration
	return GenericReconcile(ctx, req, &batchJob, opts)
}
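To show why this delegation makes the controller reusable, here is a minimal sketch in which two job kinds share one reconcile function. Everything in it is an assumption made for illustration: the interface is cut down to two methods, `genericReconcile` only suspends a running job, and `otherJob` is a hypothetical second job kind that stores its state differently.

```go
package main

import "fmt"

// GenericJob is a reduced version of the proposed interface, kept to two
// methods so the sketch stays short.
type GenericJob interface {
	IsSuspended() bool
	Suspend() error
}

// genericReconcile is a hypothetical, heavily reduced stand-in for
// GenericReconcile. It suspends a running job and reports whether it
// changed anything.
func genericReconcile(job GenericJob) (bool, error) {
	if job.IsSuspended() {
		return false, nil
	}
	if err := job.Suspend(); err != nil {
		return false, err
	}
	return true, nil
}

// batchJob tracks suspension with a boolean.
type batchJob struct{ suspended bool }

func (j *batchJob) IsSuspended() bool { return j.suspended }
func (j *batchJob) Suspend() error    { j.suspended = true; return nil }

// otherJob is a hypothetical second job kind that tracks suspension
// through a state string.
type otherJob struct{ state string }

func (j *otherJob) IsSuspended() bool { return j.state == "suspended" }
func (j *otherJob) Suspend() error    { j.state = "suspended"; return nil }

// Compile-time checks that both kinds satisfy the interface.
var (
	_ GenericJob = &batchJob{}
	_ GenericJob = &otherJob{}
)

func main() {
	jobs := []GenericJob{&batchJob{}, &otherJob{state: "running"}}
	for _, job := range jobs {
		changed, err := genericReconcile(job)
		fmt.Println(changed, err)
	}
}
```

The shared function never inspects the concrete type, so adding a new job kind only requires implementing the interface.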
GenericReconcile:
    // Ignore unmanaged jobs, e.g. those lacking a queueName.
    if job.IgnoreFromQueueing():
        return
    // Ensure there's only one corresponding workload and
    // return the matched workload; it could be nil.
    workload = EnsureOneWorkload()
    // Handle the case where the job is finished.
    if job.Finished():
        // Mark the workload as finished if it isn't already.
        SetWorkloadCondition()
        return
    // Handle the case where the workload is nil.
    if workload == nil:
        // If the workload is nil, the job should be suspended.
        if !job.IsSuspended():
            // When stopping the job, we'll call Suspend(), RestoreNodeSelectors() etc.,
            // and update the job with the client.
            StopJob()
        // When creating the workload, we'll call PodSets(), QueueName(), PriorityClass() etc.
        // to fill in the workload.
        workload = CreateWorkload()
        // Create the constructed workload with the client.
        // ...
    // Handle the case where the job is suspended.
    if job.IsSuspended():
        // If the job is suspended but the workload is admitted,
        // we should start the job.
        if workload.Spec.Admission != nil:
            // When starting the job, we'll call UnSuspend(), InjectNodeSelectors() etc.
            StartJob()
            return
        // If the job is suspended but its queueName has changed,
        // we should update the workload's queueName.
        // ...
    // Handle the case where the job is unsuspended.
    // If the job is unsuspended but the workload is not admitted,
    // we should suspend the job.
    if workload.Spec.Admission == nil:
        StopJob()
        return
    // Process other logic, like all-or-nothing scheduling.
    // ...
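The branch order above can be sketched as a small, runnable decision function. This is only an approximation under stated assumptions: `job`, `workload`, and the `action` values are hypothetical in-memory stand-ins, no Kubernetes client is involved, the "stop the job if running, then create the workload" step is collapsed into a single action, and the queue name update and all-or-nothing logic are left out.

```go
package main

import "fmt"

// job and workload are hypothetical in-memory stand-ins for the real
// objects; only the fields the decision needs are modeled.
type job struct {
	ignored   bool
	finished  bool
	suspended bool
}

type workload struct {
	admitted bool
	finished bool
}

// action names the single step the reconciler would take next.
type action string

const (
	actionNone           action = "none"
	actionMarkFinished   action = "mark-workload-finished"
	actionCreateWorkload action = "stop-job-if-running-and-create-workload"
	actionStartJob       action = "start-job"
	actionStopJob        action = "stop-job"
)

// nextAction mirrors the branch order of the pseudocode: ignored jobs,
// finished jobs, missing workload, suspended job, then running job.
func nextAction(j job, wl *workload) action {
	if j.ignored {
		return actionNone
	}
	if j.finished {
		if wl != nil && !wl.finished {
			return actionMarkFinished
		}
		return actionNone
	}
	if wl == nil {
		return actionCreateWorkload
	}
	if j.suspended {
		if wl.admitted {
			return actionStartJob
		}
		return actionNone
	}
	if !wl.admitted {
		return actionStopJob
	}
	return actionNone
}

func main() {
	// A suspended job whose workload has been admitted should be started.
	fmt.Println(nextAction(job{suspended: true}, &workload{admitted: true}))
	// A running job whose workload is not admitted should be stopped.
	fmt.Println(nextAction(job{}, &workload{}))
}
```

Returning an action instead of mutating objects keeps the ordering easy to test in isolation; the real controller would perform the corresponding client updates at each branch.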
[x] I/we understand the owners of the involved components may require updates to existing tests to make this code solid enough prior to committing the changes necessary to implement this enhancement.
No.
- pkg/controller/workload/job: 2023.01.30 - 5.5%
(This is output via the go tool)
This is mostly a refactor of the current implementation, so in theory no additional integration tests are needed.
- 2023.01.09: KEP proposed for review, including motivation, proposal, risks, test plan.
- It will increase maintenance costs, for example whenever the interface changes, so we should minimize such changes.
Each job integration implements its controller from scratch.