Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add specific errors for pod affinity predicates #51889

Merged
merged 1 commit into from
Sep 23, 2017

Conversation

guangxuli
Copy link
Contributor

What this PR does / why we need it:

Add specific error for pod affinity predicates

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #

fix #51655

Special notes for your reviewer:
none
Release note:
none

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 4, 2017
@k8s-ci-robot
Copy link
Contributor

Hi @guangxuli. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-github-robot k8s-github-robot added the do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. label Sep 4, 2017
@guangxuli
Copy link
Contributor Author

/cc @bsalamat

@guangxuli
Copy link
Contributor Author

/sig scheduling

@k8s-ci-robot k8s-ci-robot added the sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. label Sep 4, 2017
@xiangpengzhao
Copy link
Contributor

/release-note-none
/ok-to-test

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Sep 4, 2017
@guangxuli
Copy link
Contributor Author

/retest

@guangxuli
Copy link
Contributor Author

/test pull-kubernetes-e2e-kops-aws

if !c.satisfiesExistingPodsAntiAffinity(pod, meta, nodeInfo) {
return false, []algorithm.PredicateFailureReason{ErrPodAffinityNotMatch}, nil
if result, err := c.satisfiesExistingPodsAntiAffinity(pod, meta, nodeInfo); !result {
return false, []algorithm.PredicateFailureReason{ErrPodAffinityNotMatch}, err
Copy link
Member

@bsalamat bsalamat Sep 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not return these as errors. We should define three new predicate failure reasons and return them in []algorithm.PredicateFailureReason in addition to ErrPodAffinityNotMatch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, you are right. I make a misunderstanding. I have made another modification.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Sep 8, 2017
@guangxuli
Copy link
Contributor Author

@bsalamat ptal, thanks.

@xiangpengzhao
Copy link
Contributor

/retest


tests := []struct {
pod *v1.Pod
pods []*v1.Pod
nodes []v1.Node
nodes []NodeTest
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional: instead of a new type ("NodeTest"), you could have added another slice that would list the expected failure reasons in the same order as the nodes array.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reasonable,done

@@ -2564,7 +2572,7 @@ func TestInterPodAffinity(t *testing.T) {
test: "verify that PodAntiAffinity from existing pod is respected when pod has no AntiAffinity constraints. satisfy PodAntiAffinity symmetry with the existing pod",
},
}
expectedFailureReasons := []algorithm.PredicateFailureReason{ErrPodAffinityNotMatch}
//expectedFailureReasons := []algorithm.PredicateFailureReason{ErrPodAffinityNotMatch}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please delete the line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -1115,10 +1115,10 @@ func (c *PodAffinityChecker) getMatchingAntiAffinityTerms(pod *v1.Pod, allPods [

// Checks if scheduling the pod onto this node would break any anti-affinity
// rules indicated by the existing pods.
func (c *PodAffinityChecker) satisfiesExistingPodsAntiAffinity(pod *v1.Pod, meta interface{}, nodeInfo *schedulercache.NodeInfo) bool {
func (c *PodAffinityChecker) satisfiesExistingPodsAntiAffinity(pod *v1.Pod, meta interface{}, nodeInfo *schedulercache.NodeInfo) (bool, algorithm.PredicateFailureReason) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see why you are returning algorithm.PredicateFailureReason in this function, but the failure reason is always ErrExistingPodsAntiAffinityRulesNotMatch, unless there is an error. It would be better to return an error instead of algorithm.PredicateFailureReason and then in InterPodAffinityMatches function you check the error. If it is nil, you add ErrExistingPodsAntiAffinityRulesNotMatch to the failure reasons. If it is not nil, you return the error. You should do this for satisfiesPodsAffinityAntiAffinity as well. This would be more compatible with the existing behavior and would also address a problem in the current code that errors occurred in satisfiesPodsAffinityAntiAffinity and satisfiesExistingPodsAntiAffinity are not trickled up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't return algorithm.PredicateFailureReason in function satisfiesPodsAffinityAntiAffinity , we have to check the specific error just as your comment, two cases:
a. check the error: PodAffinityRulesNotMatch
b. check the error: PodAntiAffinityRulesNotMatch
if error is a or b, we add ErrPodsAffinityRulesNotMatch or ErrPodsAntiAffinityRulesNotMatch to algorithm.PredicateFailureReason in addition to ErrPodAffinityNotMatc and return nil. If got other errors we should return the error directly, right?

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 9, 2017
@guangxuli guangxuli force-pushed the pod_affinity_error branch 2 times, most recently from 9643627 to e928999 Compare September 9, 2017 08:23
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 9, 2017
@guangxuli
Copy link
Contributor Author

@bsalamat ptal.

@guangxuli
Copy link
Contributor Author

@bsalamat I have added the specific returning error. IMO, i think there is no need to check the err again which done in original test case. It is enough just to check predicates failure reason.

Copy link
Member

@bsalamat bsalamat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks Good. Thanks! Just a minor comment.

@@ -1156,25 +1158,28 @@ func (c *PodAffinityChecker) satisfiesExistingPodsAntiAffinity(pod *v1.Pod, meta
// present in nodeInfo. Pods on other nodes pass the filter.
filteredPods, err := c.podLister.FilteredList(nodeInfo.Filter, labels.Everything())
if err != nil {
glog.Errorf("Failed to get all pods, %+v", err)
return false
format := "Failed to get all pods, %+v"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be cleaner (and a bit more efficient) if we did:

errMessage := fmt.sprintf("Failed to get all pods, %+v", err)
glog.Error(errMessage)
return ErrExistingPodsAntiAffinityRulesNotMatch, errors.New(errMessage)

If you agree, please change it for the cases below as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that's more cleaner than what i have done. :) thanks. Will do.

@guangxuli
Copy link
Contributor Author

@bsalamat All comments have addressed. PTAL.

@bsalamat
Copy link
Member

Thanks, @guangxuli!

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 13, 2017
@guangxuli
Copy link
Contributor Author

/assign @k82cn

@guangxuli
Copy link
Contributor Author

/test pull-kubernetes-e2e-gce-bazel
/test pull-kubernetes-e2e-kops-aws

@guangxuli
Copy link
Contributor Author

/test pull-kubernetes-e2e-gce-bazel

1 similar comment
@guangxuli
Copy link
Contributor Author

/test pull-kubernetes-e2e-gce-bazel

@guangxuli
Copy link
Contributor Author

@k82cn ptal.

@guangxuli
Copy link
Contributor Author

/test pull-kubernetes-e2e-gce-bazel

@k82cn
Copy link
Member

k82cn commented Sep 15, 2017

sure, I'll review it today :).

@k82cn
Copy link
Member

k82cn commented Sep 15, 2017

/approve

@guangxuli
Copy link
Contributor Author

/test pull-kubernetes-e2e-gce-bazel

@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bsalamat, guangxuli, k82cn

Associated issue: 51655

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 15, 2017
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to @fejta).

Review the full test history for this PR.

@guangxuli
Copy link
Contributor Author

/retest

@k8s-github-robot
Copy link

/test all

Tests are more than 96 hours old. Re-running tests.

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 52168, 48939, 51889, 52051, 50396). If you want to cherry-pick this change to another branch, please follow the instructions here..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Scheduler should return specific errors for affinity predicates
8 participants