TAS: Assign only hostname values to TopologyAssignment #3677

Merged
19 changes: 18 additions & 1 deletion apis/kueue/v1beta1/workload_types.go
@@ -146,7 +146,9 @@ type PodSetAssignment struct {
// domain and specifies the node selectors for each topology domain, in the
// following way: the node selector keys are specified by the levels field
// (same for all domains), and the corresponding node selector value is
// specified by the domains.values subfield.
// specified by the domains.values subfield. If the TopologySpec.Levels field contains
// the "kubernetes.io/hostname" label, the topologyAssignment contains data only for
// this label, omitting higher levels in the topology.
//
// Example:
//
@@ -168,6 +170,21 @@ type PodSetAssignment struct {
// cloud.provider.com/topology-block: block-1
// cloud.provider.com/topology-rack: rack-2
//
// Example:
// Below is an equivalent of the above example, assuming the Topology
// object defines kubernetes.io/hostname as the lowest level in the topology.
// Hence, the higher topology levels are omitted, since the hostname label
// is sufficient to uniquely identify a node.
//
// topologyAssignment:
// levels:
// - kubernetes.io/hostname
// domains:
// - values: [hostname-1]
// count: 4
// - values: [hostname-2]
// count: 2
//
// +optional
TopologyAssignment *TopologyAssignment `json:"topologyAssignment,omitempty"`
}
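The levels/values correspondence described by this doc comment can be sketched in isolation. Below is a minimal, self-contained Go example (the types are simplified stand-ins for the real ones in apis/kueue/v1beta1/workload_types.go, not the actual API) showing how each topology domain expands into a node selector by zipping the shared levels with that domain's values:

```go
package main

import "fmt"

// Simplified stand-ins for the kueue API types; the real definitions
// live in apis/kueue/v1beta1/workload_types.go.
type TopologyDomainAssignment struct {
	Values []string
	Count  int32
}

type TopologyAssignment struct {
	Levels  []string
	Domains []TopologyDomainAssignment
}

// nodeSelectors pairs each level key with the domain value at the same
// index, yielding one node selector map per topology domain.
func nodeSelectors(ta TopologyAssignment) []map[string]string {
	out := make([]map[string]string, 0, len(ta.Domains))
	for _, d := range ta.Domains {
		sel := map[string]string{}
		for i, key := range ta.Levels {
			sel[key] = d.Values[i]
		}
		out = append(out, sel)
	}
	return out
}

func main() {
	// Hostname-only assignment, as in the second example above.
	ta := TopologyAssignment{
		Levels: []string{"kubernetes.io/hostname"},
		Domains: []TopologyDomainAssignment{
			{Values: []string{"hostname-1"}, Count: 4},
			{Values: []string{"hostname-2"}, Count: 2},
		},
	}
	for i, sel := range nodeSelectors(ta) {
		fmt.Printf("%d pods on %v\n", ta.Domains[i].Count, sel)
	}
}
```

With a multi-level assignment (block and rack), the same zipping produces two-key selectors per domain; with hostname-only levels, each selector collapses to a single label, which is the point of this change.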
19 changes: 18 additions & 1 deletion charts/kueue/templates/crd/kueue.x-k8s.io_workloads.yaml
@@ -8340,7 +8340,9 @@ spec:
domain and specifies the node selectors for each topology domain, in the
following way: the node selector keys are specified by the levels field
(same for all domains), and the corresponding node selector value is
specified by the domains.values subfield.
specified by the domains.values subfield. If the TopologySpec.Levels field contains
the "kubernetes.io/hostname" label, the topologyAssignment contains data only for
this label, omitting higher levels in the topology.

Example:

@@ -8361,6 +8363,21 @@ spec:
- 2 Pods are to be scheduled on nodes matching the node selector:
cloud.provider.com/topology-block: block-1
cloud.provider.com/topology-rack: rack-2

Example:
Below is an equivalent of the above example, assuming the Topology
object defines kubernetes.io/hostname as the lowest level in the topology.
Hence, the higher topology levels are omitted, since the hostname label
is sufficient to uniquely identify a node.

topologyAssignment:
levels:
- kubernetes.io/hostname
domains:
- values: [hostname-1]
count: 4
- values: [hostname-2]
count: 2
properties:
domains:
description: |-
19 changes: 18 additions & 1 deletion config/components/crd/bases/kueue.x-k8s.io_workloads.yaml
@@ -8325,7 +8325,9 @@ spec:
domain and specifies the node selectors for each topology domain, in the
following way: the node selector keys are specified by the levels field
(same for all domains), and the corresponding node selector value is
specified by the domains.values subfield.
specified by the domains.values subfield. If the TopologySpec.Levels field contains
the "kubernetes.io/hostname" label, the topologyAssignment contains data only for
this label, omitting higher levels in the topology.

Example:

@@ -8346,6 +8348,21 @@ spec:
- 2 Pods are to be scheduled on nodes matching the node selector:
cloud.provider.com/topology-block: block-1
cloud.provider.com/topology-rack: rack-2

Example:
Below is an equivalent of the above example, assuming the Topology
object defines kubernetes.io/hostname as the lowest level in the topology.
Hence, the higher topology levels are omitted, since the hostname label
is sufficient to uniquely identify a node.

topologyAssignment:
levels:
- kubernetes.io/hostname
domains:
- values: [hostname-1]
count: 4
- values: [hostname-2]
count: 2
properties:
domains:
description: |-
31 changes: 24 additions & 7 deletions keps/2724-topology-aware-scheduling/README.md
@@ -332,12 +332,12 @@ the same "rack" label, but in different "blocks".

For example, this is a representation of the dataset hierarchy;

| node | cloud.provider.com/topology-block | cloud.provider.com/topology-rack |
|:------:|:----------------------------------:|:--------------------------------:|
| node-1 | block-1 | rack-1 |
| node-2 | block-1 | rack-2 |
| node-3 | block-2 | rack-1 |
| node-4 | block-2 | rack-3 |
| node | cloud.provider.com/topology-block | cloud.provider.com/topology-rack |
| :----: | :-------------------------------: | :------------------------------: |
| node-1 | block-1 | rack-1 |
| node-2 | block-1 | rack-2 |
| node-3 | block-2 | rack-1 |
| node-4 | block-2 | rack-3 |

Note that there is a pair of nodes, node-1 and node-3, with the same value of
the "cloud.provider.com/topology-rack" label, but in different blocks.
@@ -492,7 +492,9 @@ type PodSetAssignment struct {
// domain and specifies the node selectors for each topology domain, in the
// following way: the node selector keys are specified by the levels field
// (same for all domains), and the corresponding node selector value is
// specified by the domains.values subfield.
// specified by the domains.values subfield. If the TopologySpec.Levels field contains
// the "kubernetes.io/hostname" label, the topologyAssignment contains data only for
// this label, omitting higher levels in the topology.
//
// Example:
//
@@ -514,6 +516,21 @@ type PodSetAssignment struct {
// cloud.provider.com/topology-block: block-1
// cloud.provider.com/topology-rack: rack-2
//
// Example:
// Below is an equivalent of the above example, assuming the Topology
// object defines kubernetes.io/hostname as the lowest level in the topology.
// Hence, the higher topology levels are omitted, since the hostname label
// is sufficient to uniquely identify a node.
//
// topologyAssignment:
// levels:
// - kubernetes.io/hostname
// domains:
// - values: [hostname-1]
// count: 4
// - values: [hostname-2]
// count: 2
//
// +optional
TopologyAssignment *TopologyAssignment `json:"topologyAssignment,omitempty"`
}
24 changes: 3 additions & 21 deletions pkg/cache/tas_cache_test.go
@@ -537,37 +537,29 @@ func TestFindTopologyAssignment(t *testing.T) {
},
count: 4,
wantAssignment: &kueue.TopologyAssignment{
Levels: defaultThreeLevels,
Levels: defaultOneLevel,
Domains: []kueue.TopologyDomainAssignment{
{
Count: 1,
Values: []string{
"b1",
"r3",
"x3",
},
},
{
Count: 1,
Values: []string{
"b1",
"r3",
"x4",
},
},
{
Count: 1,
Values: []string{
"b1",
"r3",
"x5",
},
},
{
Count: 1,
Values: []string{
"b1",
"r3",
"x6",
},
},
@@ -585,37 +577,29 @@ func TestFindTopologyAssignment(t *testing.T) {
},
count: 4,
wantAssignment: &kueue.TopologyAssignment{
Levels: defaultThreeLevels,
Levels: defaultOneLevel,
Domains: []kueue.TopologyDomainAssignment{
{
Count: 1,
Values: []string{
"b1",
"r1",
"x1",
},
},
{
Count: 1,
Values: []string{
"b1",
"r1",
"x2",
},
},
{
Count: 1,
Values: []string{
"b1",
"r2",
"x3",
},
},
{
Count: 1,
Values: []string{
"b1",
"r2",
"x4",
},
},
@@ -633,13 +617,11 @@ func TestFindTopologyAssignment(t *testing.T) {
},
count: 1,
wantAssignment: &kueue.TopologyAssignment{
Levels: defaultThreeLevels,
Levels: defaultOneLevel,
Domains: []kueue.TopologyDomainAssignment{
{
Count: 1,
Values: []string{
"b2",
"r2",
"x6",
},
},
30 changes: 20 additions & 10 deletions pkg/cache/tas_flavor_snapshot.go
@@ -321,22 +321,32 @@ func (s *TASFlavorSnapshot) updateCountsToMinimum(domains []*domain, count int32
return nil
}

func (s *TASFlavorSnapshot) buildAssignment(domains []*domain) *kueue.TopologyAssignment {
assignment := kueue.TopologyAssignment{
Levels: s.levelKeys,
Domains: make([]kueue.TopologyDomainAssignment, 0),
// buildTopologyAssignmentForLevels builds a TopologyAssignment for the levels starting from levelIdx
func (s *TASFlavorSnapshot) buildTopologyAssignmentForLevels(domains []*domain, levelIdx int) *kueue.TopologyAssignment {
assignment := &kueue.TopologyAssignment{
Domains: make([]kueue.TopologyDomainAssignment, len(domains)),
}
assignment.Levels = s.levelKeys[levelIdx:]
for i, domain := range domains {
assignment.Domains[i] = kueue.TopologyDomainAssignment{
Values: domain.levelValues[levelIdx:],
Count: domain.state,
}
}
return assignment
}

func (s *TASFlavorSnapshot) buildAssignment(domains []*domain) *kueue.TopologyAssignment {
	// sort domains lexicographically by their IDs
slices.SortFunc(domains, func(a, b *domain) int {
return cmp.Compare(a.id, b.id)
})
for _, domain := range domains {
assignment.Domains = append(assignment.Domains, kueue.TopologyDomainAssignment{
Values: domain.levelValues,
Count: domain.state,
})
levelIdx := 0
	// assign only hostname values if the topology defines hostname as its lowest level
if s.isLowestLevelNode() {
levelIdx = len(s.levelKeys) - 1
}
return &assignment
return s.buildTopologyAssignmentForLevels(domains, levelIdx)
}

func (s *TASFlavorSnapshot) lowerLevelDomains(domains []*domain) []*domain {
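The slicing that buildTopologyAssignmentForLevels performs can be shown in isolation. The sketch below uses plain slices instead of the snapshot's internal types (which are not reproduced here): when the lowest level is kubernetes.io/hostname, levelIdx points at the last level, so both the level keys and each domain's values are trimmed down to the hostname entry.

```go
package main

import "fmt"

// sliceAssignment mirrors the core of the new helper: both the level keys
// and a domain's values are sliced starting from levelIdx.
func sliceAssignment(levelKeys, levelValues []string, lowestIsHostname bool) ([]string, []string) {
	levelIdx := 0
	if lowestIsHostname {
		// Keep only the last level when the topology bottoms out at hostname.
		levelIdx = len(levelKeys) - 1
	}
	return levelKeys[levelIdx:], levelValues[levelIdx:]
}

func main() {
	keys := []string{
		"cloud.provider.com/topology-block",
		"cloud.provider.com/topology-rack",
		"kubernetes.io/hostname",
	}
	values := []string{"block-1", "rack-2", "hostname-1"}

	gotKeys, gotValues := sliceAssignment(keys, values, true)
	fmt.Println(gotKeys, gotValues) // [kubernetes.io/hostname] [hostname-1]
}
```

This is exactly why the test fixtures above switch from defaultThreeLevels to defaultOneLevel and drop the "b1"/"r1" prefixes from the expected domain values.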
19 changes: 18 additions & 1 deletion site/content/en/docs/reference/kueue.v1beta1.md
@@ -1668,7 +1668,9 @@ The assignment specifies the number of Pods to be scheduled per topology
domain and specifies the node selectors for each topology domain, in the
following way: the node selector keys are specified by the levels field
(same for all domains), and the corresponding node selector value is
specified by the domains.values subfield.</p>
specified by the domains.values subfield. If the TopologySpec.Levels field contains
the &quot;kubernetes.io/hostname&quot; label, the topologyAssignment contains data only for
this label, omitting higher levels in the topology.</p>
<p>Example:</p>
<p>topologyAssignment:
levels:</p>
@@ -1690,6 +1692,21 @@ cloud.provider.com/topology-rack: rack-1</li>
cloud.provider.com/topology-block: block-1
cloud.provider.com/topology-rack: rack-2</li>
</ul>
<p>Example:
Below is an equivalent of the above example, assuming the Topology
object defines kubernetes.io/hostname as the lowest level in the topology.
Hence, the higher topology levels are omitted, since the hostname label
is sufficient to uniquely identify a node.</p>
<p>topologyAssignment:
levels:</p>
<ul>
<li>kubernetes.io/hostname
domains:</li>
<li>values: [hostname-1]
count: 4</li>
<li>values: [hostname-2]
count: 2</li>
</ul>
</td>
</tr>
</tbody>
56 changes: 32 additions & 24 deletions test/e2e/tas/job_test.go
@@ -149,21 +149,25 @@ var _ = ginkgo.Describe("TopologyAwareScheduling for Job", func() {
gomega.Expect(createdWorkload.Status.Admission.PodSetAssignments).Should(gomega.HaveLen(1))
gomega.Expect(createdWorkload.Status.Admission.PodSetAssignments[0].TopologyAssignment.Levels).Should(gomega.BeComparableTo(
[]string{
topologyLevelBlock,
topologyLevelRack,
corev1.LabelHostname,
},
))
podCountPerBlock := map[string]int32{}
for _, d := range createdWorkload.Status.Admission.PodSetAssignments[0].TopologyAssignment.Domains {
podCountPerBlock[d.Values[0]] += d.Count
}
// both pod assignments are in the same block
gomega.Expect(podCountPerBlock).Should(gomega.HaveLen(1))
// pod assignment count equals job parallelism
for _, pd := range podCountPerBlock {
gomega.Expect(pd).Should(gomega.Equal(ptr.Deref[int32](sampleJob.Spec.Parallelism, 0)))
}
gomega.Expect(createdWorkload.Status.Admission.PodSetAssignments[0].TopologyAssignment.Domains).Should(gomega.BeComparableTo(
[]kueue.TopologyDomainAssignment{
{
Values: []string{"kind-worker"},
Count: 1,
},
{
Values: []string{"kind-worker2"},
Count: 1,
},
{
Values: []string{"kind-worker3"},
Count: 1,
},
},
))
})
ginkgo.By(fmt.Sprintf("verify the workload %q gets finished", wlLookupKey), func() {
gomega.Eventually(func(g gomega.Gomega) {
@@ -198,21 +202,25 @@ var _ = ginkgo.Describe("TopologyAwareScheduling for Job", func() {
gomega.Expect(createdWorkload.Status.Admission.PodSetAssignments).Should(gomega.HaveLen(1))
gomega.Expect(createdWorkload.Status.Admission.PodSetAssignments[0].TopologyAssignment.Levels).Should(gomega.BeComparableTo(
[]string{
topologyLevelBlock,
topologyLevelRack,
corev1.LabelHostname,
},
))
podCountPerBlock := map[string]int32{}
for _, d := range createdWorkload.Status.Admission.PodSetAssignments[0].TopologyAssignment.Domains {
podCountPerBlock[d.Values[0]] += d.Count
}
// both pod assignments are in the same block
gomega.Expect(podCountPerBlock).Should(gomega.HaveLen(1))
// pod assignment count equals job parallelism
for _, pd := range podCountPerBlock {
gomega.Expect(pd).Should(gomega.Equal(ptr.Deref[int32](sampleJob.Spec.Parallelism, 0)))
}
gomega.Expect(createdWorkload.Status.Admission.PodSetAssignments[0].TopologyAssignment.Domains).Should(gomega.BeComparableTo(
[]kueue.TopologyDomainAssignment{
{
Values: []string{"kind-worker"},
Count: 1,
},
{
Values: []string{"kind-worker2"},
Count: 1,
},
{
Values: []string{"kind-worker3"},
Count: 1,
},
},
))
})

ginkgo.By(fmt.Sprintf("verify the workload %q gets finished", wlLookupKey), func() {
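The e2e change above replaces aggregate per-block counting with exact per-hostname domains. One invariant the old assertions checked still holds for the new shape: the per-domain counts must sum to the job's parallelism. A minimal standalone sketch of that check (the type is a stand-in for kueue.TopologyDomainAssignment, and the kind-worker names assume the default kind cluster used by the e2e suite):

```go
package main

import "fmt"

// Stand-in for kueue.TopologyDomainAssignment.
type TopologyDomainAssignment struct {
	Values []string
	Count  int32
}

// totalCount sums pods across all topology domains; in the test above this
// total must equal the job's parallelism.
func totalCount(domains []TopologyDomainAssignment) int32 {
	var total int32
	for _, d := range domains {
		total += d.Count
	}
	return total
}

func main() {
	domains := []TopologyDomainAssignment{
		{Values: []string{"kind-worker"}, Count: 1},
		{Values: []string{"kind-worker2"}, Count: 1},
		{Values: []string{"kind-worker3"}, Count: 1},
	}
	fmt.Println(totalCount(domains)) // 3
}
```

Pinning the exact domains, as the updated test does, is stricter than the sum check alone, since it also fixes which node each pod lands on.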