
runtime: lots of time spent in sweepone #9265

Closed
dvyukov opened this issue Dec 11, 2014 · 13 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.

dvyukov (Member) commented Dec 11, 2014

Here is a repro program:
http://play.golang.org/p/giKY6IcWjU

If you run it as-is, it runs for 1.357s and the profile looks like:
21.94% test test [.] runtime.memclr
12.24% test test [.] runtime.MSpanList_IsEmpty
11.02% test test [.] runtime.xchg
7.94% test test [.] MHeap_AllocLocked
5.19% test test [.] runtime.MSpan_Sweep

If you uncomment the "y = make([]byte, 1<<30)" line, it runs for 26.696s and the profile looks like:
37.52% test test [.] runtime.sweepone
27.15% test test [.] runtime.xadd
24.96% test test [.] markroot
1.96% test test [.] runtime.memclr
0.75% test test [.] runtime.xchg

The problem is that the program creates 131221 spans while the live set is 1GB and the heap grows to 2GB. Later the live set drops to 1MB and GCs become very frequent, but each GC still has to iterate over all 131221 spans in sweepone and markroot. The total time spent in span handling reaches 90%.

This was reported on golang-nuts:
https://groups.google.com/forum/#!topic/golang-nuts/-Dq4WkivD2E

We need to either (1) re-partition allspans so that we don't iterate over the dead part, or (2) remove and deallocate dead spans from allspans after some time of inactivity.

rsc (Contributor) commented Dec 11, 2014

Should definitely do (1) first.
(2) can happen later if needed.

DanielMorsing (Contributor):
I can't reproduce this after https://go-review.googlesource.com/2349 was applied.

However, I think we're seeing this issue in production on one of the services at my job. I'll report back after I've grabbed some data.

aclements (Member):
I just tried @dvyukov's test program on latest master and it runs in 1.3s and has the same profile with and without the "y = ..." line.

@DanielMorsing, are you still seeing this issue?

DanielMorsing (Contributor):
No, the issue in production turned out to be a case of some spurious allocations being made by application code.

I still think we're doing excessive sweeping, but it runs concurrently with mutator code, so the effects aren't as apparent. It should be easier to figure out what's happening once https://go-review.googlesource.com/#/c/9506/ is applied.

dvyukov (Member, Author) commented May 18, 2015

@aclements I guess you now need a different program to trigger this issue. Probably the GC now collects the large span before the program allocates a sufficient number of small spans.

@rsc rsc modified the milestones: Go1.6Early, Go1.5 Jul 21, 2015
rsc (Contributor) commented Nov 4, 2015

No repro program, so no issue.

@rsc rsc closed this as completed Nov 4, 2015
dvyukov (Member, Author) commented Nov 4, 2015

This approach to bug handling is kind of discouraging.
Here is a new repro program:

package main

import (
    "fmt"
    "os"
    "runtime"
    "time"
)

var z []byte
var x [][]byte

func main() {
    t0 := time.Now()
    if len(os.Args) > 1 {
        for i := 0; i < 1<<20; i++ {
            x = append(x, make([]byte, 1<<13))
        }
        x = nil
        runtime.GC()
    }
    fmt.Println(time.Since(t0))
    t0 = time.Now()
    for i := 0; i < 1e5; i++ {
        z = make([]byte, 1<<13)
    }
    fmt.Println(time.Since(t0))
}
$ ./test
2.245µs
582.776723ms

$ ./test 1
3.470712076s
19.764293403s

That's 34x slowdown.

@dvyukov dvyukov reopened this Nov 4, 2015
rsc (Contributor) commented Nov 24, 2015

This didn't happen in time.

@rsc rsc modified the milestones: Go1.7Early, Go1.6Early Nov 24, 2015
@adg adg modified the milestones: Go1.8Early, Go1.7Early May 4, 2016
adg (Contributor) commented May 4, 2016

Missed again.

aclements (Member):
Just to expand on the original report a bit: when the heap is full of dead spans, as in this test case, half of the time in sweepone is spent just incrementing the sweepidx, and the other half is spent on cache misses just to check each span's state (and determine that it's dead). This particular test amplifies the cost because it drives so many GC cycles in its second phase; since the number of spans never drops from the peak, each of those closely spaced GC cycles has to sweep all of the dead spans again.

This isn't going to happen for 1.7, but my hope for 1.8 is that we can eliminate proportional sweeping entirely, which will fix this problem at its root.

@quentinmit quentinmit added the NeedsFix The path to resolution is known, but the work has not been done. label Sep 30, 2016
quentinmit (Contributor):
ping @aclements, is this still going to make 1.8?

gopherbot (Contributor):
CL https://golang.org/cl/30536 mentions this issue.

gopherbot (Contributor):
CL https://golang.org/cl/30535 mentions this issue.

@rsc rsc modified the milestones: Go1.8, Go1.8Early Oct 20, 2016
gopherbot pushed a commit that referenced this issue Oct 25, 2016
Currently sweeping walks the list of all spans, which means the work
in sweeping is proportional to the maximum number of spans ever used.
If the heap was once large but is now small, this causes an
amortization failure: on a small heap, GCs happen frequently, but a
full sweep still has to happen in each GC cycle, which means we spent
a lot of time in sweeping.

Fix this by creating a separate list consisting of just the in-use
spans to be swept, so sweeping is proportional to the number of in-use
spans (which is proportional to the live heap). Specifically, we
create two lists: a list of unswept in-use spans and a list of swept
in-use spans. At the start of the sweep cycle, the swept list becomes
the unswept list and the new swept list is empty. Allocating a new
in-use span adds it to the swept list. Sweeping moves spans from the
unswept list to the swept list.

This fixes the amortization problem because a shrinking heap moves
spans off the unswept list without adding them to the swept list,
reducing the time required by the next sweep cycle.

Updates #9265. This fix eliminates almost all of the time spent in
sweepone; however, markrootSpans has essentially the same bug, so now
the test program from this issue spends all of its time in
markrootSpans.

No significant effect on other benchmarks.

Change-Id: Ib382e82790aad907da1c127e62b3ab45d7a4ac1e
Reviewed-on: https://go-review.googlesource.com/30535
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
@golang golang locked and limited conversation to collaborators Oct 25, 2017