Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

send bookmark right now after sending all items in watchCache store #127012

Merged
merged 1 commit into from
Sep 26, 2024

Conversation

Chaunceyctx
Copy link
Contributor

@Chaunceyctx Chaunceyctx commented Aug 30, 2024

What type of PR is this?

What this PR does / why we need it:

/kind bug

Which issue(s) this PR fixes:

Fixes # #126958

Special notes for your reviewer:

Does this PR introduce a user-facing change?

send bookmark right now after sending all items in watchCache store

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

// before this pr:
// Bookmark event is only generated by Cacher.dispatchEvents goroutine. And it will re-calculate the next time when bookmark should be sent.
// DispatchEvents goroutine will set bookmark event nextTime from immediately to defaultBookmarkFrequency(1min) after sending initial events end bookmark event.
// So reflector in client will not receive another bookmark event within 1 minute after receiving the initial events end bookmark event.
// And client can receive only non-bookmark event within 1 minute.

// after this pr:
// Initial events end bookmark event is generated by cacheWatcher.processInterval goroutine. And ordinary bookmark event is still generated by Cacher.dispatchEvents goroutine.
// After sending all events in cacherInterval, cacheWatcher.processInterval goroutine tries to send initial events end bookmark event. Before cacheWatcher.state is set to cacheWatcherBookmarkSent,
// Cacher.dispatchEvents goroutine may re-calculate the next time when bookmark should be sent. And it will still believe that bookmark should be sent immediately.
// Within the 1s-1.25s bookmark tick time, current unit test creates pods defined in scenario.podsAfterEstablishingWatch field.
// When bookmark tick is triggerred in 1s-1.25s , apiserver may send a non-initial bookmark event(resourceVersion can be equal to any pods' resourceVersion defined in scenario.podsAfterEstablishingWatch) to
// client. So we can not make sure that client will not receive another bookmark event within 1 minute after receiving the initial events end bookmark event.

// In a word, this phenomenon doesn't affect the function of watchlist feature. It's just that we need to pay attention to this phenomenon when writing ut cases

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Aug 30, 2024
Copy link

linux-foundation-easycla bot commented Aug 30, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: Chaunceyctx / name: Chauncey (7239202)

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 30, 2024
@k8s-ci-robot
Copy link
Contributor

Hi @Chaunceyctx. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/apiserver sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 30, 2024
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Aug 30, 2024
@Chaunceyctx
Copy link
Contributor Author

gentle /ping @wojtek-t @p0lyn0mial @liggitt

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 30, 2024
@xuzhenglun
Copy link
Contributor

/ok-to-test
/triage accepted
/priority important-longterm

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Aug 30, 2024
@Chaunceyctx
Copy link
Contributor Author

gentle /ping @wojtek-t @p0lyn0mial :)

Copy link
Member

@wojtek-t wojtek-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for delay

if err != nil {
t.Fatalf("Failed to create watch: %v", err)
}
verifyWatchEvents(t, w, expectedWatchEvents)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't create a new function - simply use testCheckResultsInStrictOrder

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping

@@ -193,7 +193,13 @@ func testCheckResultsInStrictOrder(t *testing.T, w watch.Interface, expectedEven
}
}

func testCheckNoMoreResults(t *testing.T, w watch.Interface) {
func testCheckResultsWithDesiredAndActual(t *testing.T, expectedEvents []watch.Event, actualEvents []watch.Event) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to introduce such function - there already is testCheckResultsInStrictOrder

testCheckResultsInStrictOrder(t, w, scenario.expectedEventsAfterEstablishingWatch(createdPods))
testCheckNoMoreResults(t, w)

// before this pr:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we generally don't describe state before the PR - it's useless for future readers.

State before the PR can be documented in PR or commit description.


// In a word, this phenomenon doesn't affect the function of watchlist feature. It's just that we need to pay attention to this phenomenon when writing ut cases

// get corresponding pod watch events
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially started thinking if making bookmark period a configurable option and switching it off wouldn't be a better option for this test, but in the end, I wouldn't go this way.

That said, I would approach this slightly differently. In particular:

  • let's introduce testCheckResultWithIgnoreFunc function
    it's supposed to be somewhat similar to testCheckResultFunc, but additionally takes ignore func(watch.Event) bool parameter [and it has a loop for getting items from watch and just ignores one for which the ignore function returns true)
  • let's simply use this new helper function here, passing function that ignores events with type Bookmark as the ignore function

@Chaunceyctx
Copy link
Contributor Author

Sorry for delay

Don’t Worry about it. Thanks for your review. I will address all comments. :)

@Chaunceyctx
Copy link
Contributor Author

Chaunceyctx commented Sep 11, 2024

/retest

some ci jobs are so weird. I have change testCheckNoMoreResults to TestCheckNoMoreResults in order to expose this method and reuse it. But some ci jobs can still detect testCheckNoMoreResults method in my code.

@Chaunceyctx
Copy link
Contributor Author

/retest

some ci jobs are so weird. I have change testCheckNoMoreResults to TestCheckNoMoreResults in order to expose this method and reuse it. But some ci jobs can still detect testCheckNoMoreResults method in my code.

fine. I don't rebase master branch. lol

}

// make cacher.dispatchEvents goroutine exit and bookmark event will only be created by processInterval method
cacher.stopCh <- struct{}{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

close stopCh instead.

Or you can probably even call cacher.Stop()

Copy link
Contributor Author

@Chaunceyctx Chaunceyctx Sep 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I apologize for this mistake. In fact, we cannot close stopCh because watchCache will become not ready, affecting watch request. For sendInitialEvents=false, AllowBookmarks=true watch request, the bookmark sending frequency is 1 minute, and I control the watchevent receive timeout to 5 seconds, which is equivalent to ensuring that bookmarks are only generated by the processInterval method. So I just delete this code. : )

/gentle ping @wojtek-t

if err != nil {
t.Fatalf("Failed to create watch: %v", err)
}
verifyWatchEvents(t, w, expectedWatchEvents)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping

@Chaunceyctx Chaunceyctx force-pushed the new-send-bookmark branch 3 times, most recently from bcebd56 to 25113e8 Compare September 20, 2024 15:22
@Chaunceyctx
Copy link
Contributor Author

/retest

Copy link
Member

@wojtek-t wojtek-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just few more nits - other than that LGTM.

}

for i := range scenarios {
expectedWatchEvents := expectedPodEvents
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for _, tc := range scenarios {
  t.Run(tc.name, func(t *testing.T) {
     ...
  }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get it!

}
storagetesting.TestCheckResultsInStrictOrder(t, w, expectedWatchEvents)
storagetesting.TestCheckNoMoreResults(t, w)
cancel()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with the comment I mentioned above, make it a defer cancel() next to where context is created

} else {
t.Fatalf("cannot receive correct event, expect no event, but get a event: %+v", event)
}
case <-time.After(2 * time.Second):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still didn't fully address the problem of long-running tests and exceeding the timeout for single package tests.

So let's for now use something shorter - please do the same thing as in testCheckNoMoreResults

Copy link
Contributor Author

@Chaunceyctx Chaunceyctx Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that the timeout used in the testCheckNoMoreResults function is 100ms, but we cannot guarantee that the normal bookmark event (not the initial bookmark event) will be received within 100ms. It is possible that it will be received at 101ms, 102ms, or 103ms ..., which could cause the subsequent testCheckNoMoreResults function to fail. Therefore, I still use 100ms in the testCheckResultWithIgnoreFunc function to avoid package test taking too long, and I add an ignore function argument to testCheckNoMoreResults and rename it to testCheckNoMoreResultsWithIgnoreFunc to ignore the normal bookmark event. :)

@@ -1512,8 +1512,10 @@ func RunWatchSemantics(ctx context.Context, t *testing.T, store storage.Interfac
require.NoError(t, err, "failed to add a pod: %v")
createdPods = append(createdPods, out)
}
testCheckResultsInStrictOrder(t, w, scenario.expectedEventsAfterEstablishingWatch(createdPods))
testCheckNoMoreResults(t, w)
testCheckResultWithIgnoreFunc(t, w, scenario.expectedEventsAfterEstablishingWatch(createdPods), func(watchEvent watch.Event) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: define

var ignoreEventsFn := func(event watch.Event) bool { return event.Type == watch.Bookmark }

and use it here

@Chaunceyctx
Copy link
Contributor Author

/retest

@Chaunceyctx Chaunceyctx force-pushed the new-send-bookmark branch 2 times, most recently from 59b0e4f to f476c78 Compare September 26, 2024 05:20
@wojtek-t
Copy link
Member

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 26, 2024
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 850b4e6861b8d406b79824ab024c7a1698fa3667

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Chaunceyctx, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 26, 2024
@k8s-ci-robot k8s-ci-robot merged commit 81ebfb3 into kubernetes:master Sep 26, 2024
14 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.32 milestone Sep 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants