runtime: lots of time spent in sweepone #9265
Comments
Should definitely do (1) first.
I can't reproduce this after https://go-review.googlesource.com/2349 was applied. However, I think we're seeing this issue in production on one of the services at my job. I'll report back after I've grabbed some data.
I just tried @dvyukov's test program on the latest master and it runs in 1.3s and has the same profile with and without the "y = ..." line. @DanielMorsing, are you still seeing this issue?
No, the issue in production turned out to be a case of some spurious allocations being made by application code. I still think we're doing excessive sweeping, but it runs concurrently with mutator code, so the effects aren't as apparent. It should be easier to figure out what's happening after https://go-review.googlesource.com/#/c/9506/ has been applied.
@aclements I guess you now need a different program to trigger this issue. The GC probably now collects the large span before the program allocates a sufficient number of small spans.
No repro program, so no issue. |
Such an approach to bug handling is kind of discouraging.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"time"
)

var z []byte
var x [][]byte

func main() {
	t0 := time.Now()
	if len(os.Args) > 1 {
		for i := 0; i < 1<<20; i++ {
			x = append(x, make([]byte, 1<<13))
		}
		x = nil
		runtime.GC()
	}
	fmt.Println(time.Since(t0))
	t0 = time.Now()
	for i := 0; i < 1e5; i++ {
		z = make([]byte, 1<<13)
	}
	fmt.Println(time.Since(t0))
}
```
That's a 34x slowdown.
This didn't happen in time.
Missed again. |
Just to expand on the original report a bit: when the heap is full of dead spans, like in this test case, half of the time in sweepone is spent just incrementing the sweepidx and the other half is spent in cache misses just to check the span state (and determine that it's dead). This particular test amplifies the effect because the second phase drives so many GC cycles and, since the number of spans never goes down from its peak, each of those closely spaced cycles has to sweep all of the dead spans. This isn't going to happen for 1.7, but my hope for 1.8 is that we can eliminate proportional sweeping entirely, which will fix this problem at its root.
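To make the shape of that cost concrete, here is a minimal Go sketch of the behavior being described. The types and names (span, allspans, sweepIdx) are made up for illustration; the real runtime uses mspan and different synchronization, so treat this as a model, not the implementation:

```go
package main

import "sync/atomic"

// Illustrative stand-in for the runtime's span descriptor; in the real
// runtime this is an mspan, and "dead" corresponds to a freed span state.
type span struct {
	dead bool
}

var (
	allspans []*span // every span ever created, live or dead
	sweepIdx uint32  // shared cursor into allspans, advanced atomically
)

// sweepOne mimics the pattern described above: each call pays an atomic
// increment on the shared index plus a likely cache miss on the span header,
// even when the span turns out to be dead and there is nothing to reclaim.
func sweepOne() bool {
	for {
		i := atomic.AddUint32(&sweepIdx, 1) - 1
		if int(i) >= len(allspans) {
			return false // every span, dead or alive, has been visited
		}
		if allspans[i].dead {
			continue // dead span: cost paid, nothing swept
		}
		// ... sweep the in-use span ...
		return true
	}
}

func main() {}
```

With a heap that once held many spans but is now mostly dead, almost every iteration of that loop is pure overhead, which is why the per-cycle cost tracks the peak span count rather than the live heap.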
ping @aclements, is this still going to make 1.8?
CL https://golang.org/cl/30536 mentions this issue. |
CL https://golang.org/cl/30535 mentions this issue. |
Currently sweeping walks the list of all spans, which means the work in sweeping is proportional to the maximum number of spans ever used. If the heap was once large but is now small, this causes an amortization failure: on a small heap, GCs happen frequently, but a full sweep still has to happen in each GC cycle, which means we spend a lot of time in sweeping.

Fix this by creating a separate list consisting of just the in-use spans to be swept, so sweeping is proportional to the number of in-use spans (which is proportional to the live heap).

Specifically, we create two lists: a list of unswept in-use spans and a list of swept in-use spans. At the start of the sweep cycle, the swept list becomes the unswept list and the new swept list is empty. Allocating a new in-use span adds it to the swept list. Sweeping moves spans from the unswept list to the swept list.

This fixes the amortization problem because a shrinking heap moves spans off the unswept list without adding them to the swept list, reducing the time required by the next sweep cycle.

Updates #9265.

This fix eliminates almost all of the time spent in sweepone; however, markrootSpans has essentially the same bug, so now the test program from this issue spends all of its time in markrootSpans.

No significant effect on other benchmarks.

Change-Id: Ib382e82790aad907da1c127e62b3ab45d7a4ac1e
Reviewed-on: https://go-review.googlesource.com/30535
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
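A rough sketch of the two-list scheme that commit message describes, using hypothetical types (span, sweeper) and plain slices in place of the runtime's actual lock-free span sets:

```go
package main

// Hypothetical types for illustration; the real runtime uses mspan and
// lock-free span sets rather than plain slices.
type span struct {
	inUse bool // cleared when sweeping finds the span has no live objects
}

type sweeper struct {
	unswept []*span // in-use spans not yet swept this cycle
	swept   []*span // in-use spans already swept, plus newly allocated spans
}

// startCycle runs at the start of a sweep cycle: last cycle's swept spans
// become this cycle's unswept work, and the swept list starts empty.
func (sw *sweeper) startCycle() {
	sw.unswept, sw.swept = sw.swept, nil
}

// allocSpan records a newly allocated in-use span; it needs no sweeping
// this cycle, so it goes straight onto the swept list.
func (sw *sweeper) allocSpan(s *span) {
	s.inUse = true
	sw.swept = append(sw.swept, s)
}

// sweepOne sweeps a single unswept span. Spans still in use move to the
// swept list; spans found dead are simply dropped, so a shrinking heap
// automatically shrinks the next cycle's work.
func (sw *sweeper) sweepOne() bool {
	n := len(sw.unswept)
	if n == 0 {
		return false // nothing left to sweep this cycle
	}
	s := sw.unswept[n-1]
	sw.unswept = sw.unswept[:n-1]
	// ... sweep s: free dead objects, decide whether it stays in use ...
	if s.inUse {
		sw.swept = append(sw.swept, s)
	}
	return true
}

func main() {}
```

The key property is in sweepOne: a span that turns out to be dead never reaches the swept list, so it never becomes unswept work in a later cycle.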
Here is a repro program:
http://play.golang.org/p/giKY6IcWjU
If you run it as-is, it runs for 1.357s and the profile looks like:

```
21.94% test test [.] runtime.memclr
12.24% test test [.] runtime.MSpanList_IsEmpty
11.02% test test [.] runtime.xchg
 7.94% test test [.] MHeap_AllocLocked
 5.19% test test [.] runtime.MSpan_Sweep
```

If you uncomment the "y = make([]byte, 1<<30)" line, it runs for 26.696s and the profile looks like:

```
37.52% test test [.] runtime.sweepone
27.15% test test [.] runtime.xadd
24.96% test test [.] markroot
 1.96% test test [.] runtime.memclr
 0.75% test test [.] runtime.xchg
```
The problem is that the program creates 131221 spans when the live set is 1GB and the heap grows to 2GB. Later the live set drops to 1MB and GCs become very frequent, but each GC has to iterate over all 131221 spans in sweepone and markroot. The total time spent in span handling becomes 90%.
This was reported on golang-nuts:
https://groups.google.com/forum/#!topic/golang-nuts/-Dq4WkivD2E
We need to either (1) re-partition allspans so that we don't iterate over the dead part, or (2) remove and deallocate dead spans from allspans after some time of inactivity.
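For illustration, a rough sketch of what option (1) could look like, with made-up names and a plain slice standing in for allspans; the point is only that freeing a span swaps it into a dead tail so that sweep and markroot walk just the in-use prefix:

```go
package main

// Hypothetical span descriptor; the real runtime's mspan is not organized
// exactly like this.
type span struct {
	index int // current position of this span within allspans
}

var (
	allspans []*span // in-use spans occupy [0:numInUse), dead spans the rest
	numInUse int
)

// freeSpan swaps a freed span into the dead tail so that sweep and markroot
// only ever walk the in-use prefix, keeping their cost proportional to the
// current heap rather than to its historical peak.
func freeSpan(s *span) {
	numInUse--
	last := allspans[numInUse]
	allspans[s.index], allspans[numInUse] = last, s
	last.index, s.index = s.index, numInUse
}

// sweepAll visits only the in-use prefix of allspans.
func sweepAll() {
	for _, s := range allspans[:numInUse] {
		_ = s // ... sweep s ...
	}
}

func main() {}
```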