-
Notifications
You must be signed in to change notification settings - Fork 40k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Advance epoch #105469
Advance epoch #105469
Conversation
4f1bb71
to
6839bc3
Compare
staging/src/k8s.io/apiserver/pkg/util/flowcontrol/fairqueuing/queueset/queueset.go
Outdated
Show resolved
Hide resolved
cb40446
to
3708369
Compare
/triage accepted |
klog.ErrorS(errors.New("progress meter wrapped around"), "Wrap", "QS", qs.qCfg.Name, "prevR", prevR, "currentR", qs.currentR) | ||
incrR := SeatsTimesDuration(qs.getVirtualTimeRatioLocked(), timeSinceLast) | ||
if incrR > RDecrement/2 { | ||
klog.ErrorS(errors.New("unusably large R increment"), "Can not do it", "QS", qs.qCfg.Name, "when", realNow.Format(nsTimeFmt), "incrR", incrR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove this - it's not an actionable error really..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actionable by whom? If it happens, I would like to know because that means we have more work to do.
} | ||
metrics.SetCurrentR(qs.qCfg.Name, qs.currentR.ToFloat()) | ||
} | ||
|
||
// advanceEpoch subtracts RDecrement from the global progress meter R | ||
// and all the readings that have been taked from that meter. | ||
func (qs *queueSet) advanceEpoch() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As an admin - I would really like to be able to monitor this.
I think we need a metric that shows
a) when advancing epoch happened
b) any errors asociated with that (in particular I'm seeing the underflow error below).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am most concerned by the fact that you are reporting seeing an underflow. That is far beyond what I expect. Can you give more details and/or describe how to reproduce?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems I wasn't clear - I didn't see underflow myself. I see that as error below.
But I would like to be able to monitor for that (i.e. have metrics - not just logs).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the metrics.
t.Errorf("SeatSeonds(%d) formatted as %q rather than expected %q", uint64(testCase.ss), actualStr, testCase.expect) | ||
actualStrLen := len(actualStr) | ||
actualSuffix := actualStr[actualStrLen-2:] | ||
if actualSuffix != "ss" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm against these types of types.
The tests are supposed to be as simple as possible to avoid bugs in tests themselfes.
Given that this in particular is testing if the returned string looks as it should, what we should do is:
- have an
expectString
field in the testCase - simply compare that with with what ss.String() returns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to have a test that does not require editing when ssScale
changes. It was helpful when I experimented with different settings for that. But since you insist, I have sorrowfully made the test use literal strings.
@@ -149,6 +150,14 @@ const MaxSeatSeconds = SeatSeconds(math.MaxUint64) | |||
// MinSeatSeconds is the lowest representable value of SeatSeconds | |||
const MinSeatSeconds = SeatSeconds(0) | |||
|
|||
// RDecrement is the amount by which the progress meter R is wound backwards | |||
// when needed to avoid overflow. | |||
const RDecrement = MaxSeatSeconds / 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an implementation detail - let's move these constants to queueset.go and make them private.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, good idea.
3708369
to
c4c3d89
Compare
The force-push to c4c3d89f0f8 makes the changes suggested in review above. |
c4c3d89
to
5b3e05d
Compare
The force-push to 5b3e05d07c4 adds a bit of info to the info log message in |
@@ -207,8 +208,8 @@ var ( | |||
Help: "Number of requests currently pending in queues of the API Priority and Fairness system", | |||
StabilityLevel: compbasemetrics.ALPHA, | |||
}, | |||
[]string{priorityLevel, flowSchema}, | |||
) | |||
[]string{priorityLevel, flowSchema}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you changing those?
[I personally prefer what we had previously]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They weren't all using the same style. I picked a style that some used and involved blank lines, which I think is a good separator when the things being separated are big.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed them all to separate with close paren.
staging/src/k8s.io/apiserver/pkg/util/flowcontrol/metrics/metrics.go
Outdated
Show resolved
Hide resolved
Namespace: namespace, | ||
Subsystem: subsystem, | ||
Name: "epoch_advance_total", | ||
Help: "Number of times the queueset's progress meter zero-point was advanced", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove "zero-point" from the description.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would make it something like "Number of times the queueset's progress meter was jumped bakwards".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
staging/src/k8s.io/apiserver/pkg/util/flowcontrol/metrics/metrics.go
Outdated
Show resolved
Hide resolved
5b3e05d
to
7abd330
Compare
The force-push to 7abd330961e makes the changes discussed in the latest review. |
7abd330
to
49cecbe
Compare
The force-push to 49cecbe715f makes one overlooked change. |
Also add test for that situation.
49cecbe
to
a797fbd
Compare
The force-push to a797fbd corrected the corresponding test. |
/lgtm Thanks! |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: MikeSpreitzer, wojtek-t The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
This PR enhances the queueset implementation in API Priority and Fairness to cope with the possibility of overflow of work accumulators.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
This is built on #105412 and will be rebased once that PR merges.
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: