scheduler: preallocation for NodeToStatusMap #124714

Merged: 1 commit, May 7, 2024

Conversation

sanposhiho
Member

@sanposhiho sanposhiho commented May 7, 2024

What type of PR is this?

/kind bug
/kind regression

What this PR does / why we need it:

Improve scheduling throughput by preallocating NodeToStatusMap.
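
For readers unfamiliar with the idea, here is a minimal Go sketch of preallocating a map with a size hint. It is not the scheduler's code: `Status`, the local `NodeToStatusMap` definition, `recordFailures`, and the reason string are all hypothetical stand-ins chosen for illustration.

```go
package main

import "fmt"

// Status is a hypothetical stand-in for the scheduler's per-node status type.
type Status struct {
	Reason string
}

// NodeToStatusMap mirrors the general shape discussed in this PR
// (node name -> status). This local definition is an assumption.
type NodeToStatusMap map[string]*Status

// recordFailures builds a map with a size hint equal to the number of nodes,
// so the runtime can size the map up front instead of growing it
// incrementally while entries are inserted.
func recordFailures(nodes []string) NodeToStatusMap {
	m := make(NodeToStatusMap, len(nodes)) // size hint, not a hard limit
	for _, n := range nodes {
		m[n] = &Status{Reason: "example: node did not fit"}
	}
	return m
}

func main() {
	nodes := make([]string, 5000)
	for i := range nodes {
		nodes[i] = fmt.Sprintf("node-%d", i)
	}
	m := recordFailures(nodes)
	fmt.Println(len(m))
}
```

The second argument to `make` is only a hint; the map still grows if more entries are added than the hint allows for.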

Which issue(s) this PR fixes:

Part of (and hopefully fixes) #124709

Edit: This was reverted in #125197 prior to release of 1.31.0

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Fix a performance regression in 1.30.0 for scheduling daemonset pods to reach 300 pods/s, if the configured qps allows it.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. kind/bug Categorizes issue or PR as related to a bug. kind/regression Categorizes issue or PR as related to a regression from a prior release. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 7, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label May 7, 2024
@sanposhiho
Member Author

/cc @alculquicondor

@k8s-ci-robot k8s-ci-robot added sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels May 7, 2024
@k8s-ci-robot k8s-ci-robot requested review from chendave and kerthcet May 7, 2024 00:04
@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 7, 2024

 	allNodes, err := sched.nodeInfoSnapshot.NodeInfos().List()
 	if err != nil {
-		return nil, diagnosis, err
+		return nil, framework.Diagnosis{
+			NodeToStatusMap: make(framework.NodeToStatusMap),
Member

I don't think .List would fail for the snapshot, but couldn't it be left nil?

Member

+1, and maybe we can defer the allocation until a failure actually happens.

Member Author

> couldn't it be left nil?

Ah, yes we can just leave it nil.

@kerthcet Can you elaborate on your proposal?
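
On leaving the map nil in the error path: in Go, reading from a nil map is safe (length is 0 and lookups return the zero value), while writing to one panics. The sketch below illustrates this under assumed types; `Status`, `NodeToStatusMap`, `Diagnosis`, and `writePanics` are simplified, hypothetical stand-ins and not the framework's definitions.

```go
package main

import "fmt"

// Status, NodeToStatusMap and Diagnosis are hypothetical, simplified
// stand-ins for the types discussed in this thread.
type Status struct{ Reason string }

type NodeToStatusMap map[string]*Status

type Diagnosis struct {
	NodeToStatusMap NodeToStatusMap
}

// writePanics reports whether assigning into d's map panics.
// Assigning to an entry of a nil map is a runtime panic in Go.
func writePanics(d Diagnosis) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	d.NodeToStatusMap["node-a"] = &Status{Reason: "example"}
	return false
}

func main() {
	var d Diagnosis // NodeToStatusMap is nil here

	// Reads on a nil map are safe.
	fmt.Println(len(d.NodeToStatusMap))
	_, ok := d.NodeToStatusMap["node-a"]
	fmt.Println(ok)

	// Writes are not.
	fmt.Println(writePanics(d))
}
```

So a nil map is fine as long as callers on the error path only read from it.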

Member

@kerthcet kerthcet May 10, 2024

I meant that we could make the map (allocate the memory) only when a pod actually turns out to be unschedulable, rather than preallocating it, which matters especially with 5k nodes. However, scheduling runs serially, so the benefit would be small and the code would become more complicated. So this is not a good suggestion. 😢
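
For context, the deferred (lazy) allocation described in this comment could look roughly like the sketch below. This is not what the PR does; `Diagnosis`, its `sizeHint` field, and `setStatus` are hypothetical names used only to show the pattern and the extra branching it brings.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the types discussed in this thread.
type Status struct{ Reason string }

type NodeToStatusMap map[string]*Status

type Diagnosis struct {
	NodeToStatusMap NodeToStatusMap
	sizeHint        int // assumed: number of nodes, used when the map is created
}

// setStatus allocates the map on first use, so scheduling attempts that
// never record a failure never pay for the allocation.
func (d *Diagnosis) setStatus(node string, s *Status) {
	if d.NodeToStatusMap == nil {
		d.NodeToStatusMap = make(NodeToStatusMap, d.sizeHint)
	}
	d.NodeToStatusMap[node] = s
}

func main() {
	d := &Diagnosis{sizeHint: 5000}
	fmt.Println(d.NodeToStatusMap == nil) // no allocation yet

	d.setStatus("node-a", &Status{Reason: "example"})
	fmt.Println(len(d.NodeToStatusMap))
}
```

Every write has to go through a helper like `setStatus`, which is the added complexity the comment refers to.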

@alculquicondor
Member

alculquicondor commented May 7, 2024

This patch allows us to reach 300 pods/s again

@alculquicondor
Member

/lgtm
/approve

Can you open a cherry-pick for release-1.30?

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 7, 2024
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: cabde4b4d4acb424ef24fafe1fce6f24db0deda6

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, sanposhiho

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit e798b9c into kubernetes:master May 7, 2024
14 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.31 milestone May 7, 2024
@saschagrunert
Member

Is it possible that we regressed something timing related here?

See #124743

@sanposhiho
Member Author

sanposhiho commented May 8, 2024

@saschagrunert Well... this PR only changes a field of an internal data structure so that it is preallocated.
I'm not very familiar with the failing test, but I can't think of any scenario in which this PR could affect the behavior of sig-node components (or even of the scheduler itself, apart from the performance improvement).

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels May 8, 2024
@sanposhiho
Member Author

sanposhiho commented May 8, 2024

> Can you open a cherry-pick for release-1.30?

Done; #124753

@alculquicondor
Member

/release-note-edit

Fix throughput when scheduling daemonset pods to reach 300 pods/s, if the configured qps allows it.

k8s-ci-robot added a commit that referenced this pull request May 10, 2024
…124714-upstream-release-1.30

Automated cherry pick of #124714: scheduler: preallocation for NodeToStatusMap