SD-232 Recycle classloaders to be anti-hostile to JIT #2754
Conversation
The compiler interface subclasses `scala.tools.nsc.Global`, and loading this new subclass before each `compile` task forces the HotSpot JIT to deoptimize large swathes of compiled code. It's a bit like SBT has rigged the dice to always descend the longest ladder in a game of Snakes and Ladders.

The slowdown seems to be larger with Scala 2.12. There are a number of variables at play, but I think the main factor here is that we now rely on the JIT to devirtualize calls to final methods in traits, whereas we used to emit static calls. The JIT does a good job at this, so long as classloading doesn't undo that good work.

This commit extends the existing `ClassLoaderCache` to encompass the classloader that includes the compiler interface JAR. I've resorted to adding a var to `AnalyzingCompiler` to inject the dependency, to get the cache to the spot I need it without binary-incompatible changes to the intervening method signatures.
Note that the 2.12 numbers are based on a branch that includes scala/scala#5417, a performance regression in Scala 2.12's code generation.
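The cache itself isn't shown in this thread, but the idea can be sketched roughly as follows (a minimal sketch with hypothetical names, not sbt's actual `ClassLoaderCache`): key the cache on the classpath entries and their timestamps, and hold the constructed loader softly so that repeated `compile` invocations reuse the same loader, and with it the code the JIT has already compiled for the classes it defines.

```scala
import java.io.File
import java.lang.ref.SoftReference
import java.net.URLClassLoader

// Minimal sketch of a classloader cache along the lines described above
// (hypothetical names; the real sbt implementation differs). A changed JAR
// (e.g. a different compiler-interface) yields a fresh loader, while repeated
// `compile` tasks against the same JARs get the previously created loader back.
final class InterfaceLoaderCache(parent: ClassLoader) {
  private type Key = Seq[(File, Long)]
  private val cache = new java.util.HashMap[Key, SoftReference[ClassLoader]]

  def cachedLoader(classpath: Seq[File]): ClassLoader = synchronized {
    val key: Key = classpath.map(f => f -> f.lastModified())
    val cached = Option(cache.get(key)).flatMap(ref => Option(ref.get))
    cached.getOrElse {
      val loader = new URLClassLoader(classpath.map(_.toURI.toURL).toArray, parent)
      cache.put(key, new SoftReference[ClassLoader](loader))
      loader
    }
  }
}
```

The actual change also leaves an escape hatch, the `sbt.disable.interface.classloader.cache` system property mentioned later in the thread.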
I'm not sure what game of Snakes and Ladders you played, because you're meant to ascend with the ladders and descend with the snakes.
I'm a little bit on the fence here about whether we should do the necessary song-and-dance to burrow `State`'s `classLoaderCache` through to `AnalyzingCompiler`.
So I'll LGTM, and see what @eed3si9n says.
We need some notes here, both to highlight how much better it performs and to document `sbt.disable.interface.classloader.cache`.
Must have been Co-Snakes-and-Ladders.
I just tried this out on SBT itself. Steady state performance of
@retronym I removed the var and setter in dwijnand@b6a4789. Have a look and see if you want to cherry-pick it.
Interesting, do you mean that it's deoptimizing code unrelated to
@smarter Take a look at the diagnostic flags and output in scala/scala-dev#232 for the trail of clues. In general, loading classes can deoptimize speculative optimizations based on Class Hierarchy Analysis. Similarly, loading and using a new

Here's the JDK code that prints out the deoptimization messages: http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/tip/src/share/vm/runtime/deoptimization.cpp
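To make that mechanism a little more concrete, here is a minimal, self-contained sketch (not from the PR; all names are made up) of the kind of speculation Class Hierarchy Analysis enables and how classloading undoes it. Running something like this with `-XX:+PrintCompilation` shows the "made not entrant" lines when the compiled code is invalidated.

```scala
// Minimal sketch of CHA-based speculation being undone (made-up names, not sbt code).
// While Impl is the only loaded subclass of Base, the call in `hot` is effectively
// monomorphic and HotSpot can devirtualize and inline it; loading another subclass
// later (for example through a fresh classloader, which is what happened with the
// compiler interface) invalidates that assumption and deoptimizes the call sites.
abstract class Base { def work(i: Int): Int }
final class Impl extends Base { def work(i: Int): Int = i + 1 }

object ChaDemo {
  def hot(b: Base, n: Int): Int = {
    var sum = 0
    var i = 0
    while (i < n) { sum += b.work(i); i += 1 }
    sum
  }

  def main(args: Array[String]): Unit = {
    val b = new Impl
    // Warm-up: after enough iterations `hot` is JIT-compiled with `work` inlined.
    println(hot(b, 10000000))
    // Defining and instantiating a second Base subclass here (for instance one
    // loaded via a new URLClassLoader) would invalidate the single-implementer
    // assumption and force HotSpot to deoptimize the compiled version of `hot`.
    println(hot(b, 10000000))
  }
}
```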
A theory for why the improvement in peak performance in SBT's own build was only modest is that when multiple sub-projects are compiled in parallel, the single copy of
Overall the change looks good, but I'm a bit concerned about how this `SoftReference` behaves against several other uses of `SoftReference`, like the Analysis cache in a many-subproject build. If we're putting this into 0.13.13, then the RC should be bumped back.
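The concern can be illustrated with a small sketch (hypothetical, not sbt code): every `SoftReference`-backed cache answers to the same heap-pressure heuristics, so the classloader cache, the Analysis cache, and anything else held softly can all be cleared together in a large build and then have to be rebuilt.

```scala
import java.lang.ref.SoftReference

// Hypothetical illustration of the concern above: under heap pressure the JVM
// may clear soft references, so anything cached that way (a classloader, an
// Analysis) can silently vanish and must be rebuilt. Run with a small -Xmx to
// see it happen quickly.
object SoftCacheDemo {
  def main(args: Array[String]): Unit = {
    val cached = new SoftReference(new Array[Byte](32 * 1024 * 1024)) // stand-in for a cached value
    val hoard = scala.collection.mutable.ArrayBuffer.empty[Array[Byte]]
    try {
      // Allocate until the soft reference is cleared (or we run out of memory;
      // the JVM clears all soft references before throwing OutOfMemoryError).
      while (cached.get != null) hoard += new Array[Byte](8 * 1024 * 1024)
    } catch { case _: OutOfMemoryError => hoard.clear() }
    println(s"cached value still present: ${cached.get != null}")
  }
}
```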
@eed3si9n I've run a test under this patch in which I constrained the heap and metaspace, and then compiled a project with a sequence of different Scala versions within a single SBT session. https://gist.github.com/retronym/d621926d6f9ea9d3c3c992ac44fa87a7
@retronym Thanks for the additional testing.
I've just verified this again on the scala/scala 2.12.x branch now that we've upgraded to SBT 0.13.13. Below are the times to perform a clean compile of our sources as the JVM warms up, with the classloader caching in this PR disabled and then enabled.
I've repeated the experiment on our 2.11.x branch (with one local change to bump from SBT 0.13.12 to 0.13.13). Similar numbers:
That's good news; I wasn't sure whether the gain would be the same using 2.11 or 2.12 as the compiler.
Nice
@eed3si9n Do you know if/how this problem relates to Maven/Gradle/IntelliJ using zinc without SBT?
Given that this setting is passed in via Defaults.scala, I would assume that none of the other build tools would be aware of this change unless we publish typesafehub/zinc and let them know about it specifically.
@eed3si9n Okay, tracking that task as typesafehub/zinc#105.
https://github.com/pantsbuild/pants will happily pick this up assuming it is available in https://github.com/sbt/zinc, which we consume directly. Thanks!
Some interesting results from one of the Scala projects at x.ai: the code base has 209 files and 70k LoC (excluding blanks and comments). There are 174 traits, 193 classes (93 of them implicit classes), 514 case classes, and 1029 objects. There is a small abuse of Shapeless that produces coproducts for most of our domain objects; when fully compiled it adds a significant slowdown. The same can be said of our centralized serialization source.

On sbt 0.13.12: first try on a fresh sbt session, after
Same SBT session, after another

On SBT 0.13.13: first try on a fresh sbt session, after
Same SBT session, after another
Thanks for sharing those numbers, @pedrofurla!
As SBT itself does: sbt/sbt#2754
As SBT itself does: sbt/sbt#2754 Fixes typesafehub#105
Related PR in dotty's compiler bridge: scala/scala3#2262
@adriaanm I don't think there's anything that needs to be done in the scalac bridge. The dotty bridge needs to create its own classloader (because the sbt phases are part of the compiler and not part of the bridge, and the classloader constructed by sbt doesn't support that), but the scalac bridge just uses the classloader that should already be cached by sbt thanks to this PR.
For the record, see https://github.com/lampepfl/dotty/blob/9e45ad16d012e6a2ff3be411c2fe101b1c74b831/sbt-bridge/src/xsbt/CompilerClassLoader.scala#L45-L72 for the gory details.
Ok, thanks for explaining!
So far, Mill was caching ScalaInstance, which contains a classloader, but this is not enough: Zinc creates its own classloader by combining the ScalaInstance classloader with the path to the compiler-bridge. Zinc takes care of caching this classloader in each instance of ScalaCompiler. We can take advantage of this by caching an instance of Compilers, since it contains everything we want to cache (ScalaCompiler and ScalaInstance).

This significantly reduces the amount of classloading that happens at each compilation step, as measured by running:

export _JAVA_OPTIONS="-XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading -XX:+TraceClassUnloading"
mill -i foo.compile

and then looking at the output of `out/mill-worker-1/logs` while adding a new line in a source file in `foo` and running `mill -i foo.compile` again.

The speedup is going to depend on the project, but I measured a ~2x improvement when running `time(core.compile())` and `rm -rf out/core/compile/` in a loop on the Mill build until results stabilized. See also the results that were obtained when a very similar issue was fixed in sbt itself: sbt/sbt#2754
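As a rough sketch of the caching pattern this commit describes (hypothetical types, not Mill's or Zinc's actual API), the idea is to key a long-lived cache on everything that determines the compiler setup, so the setup, and with it the combined ScalaInstance-plus-bridge classloader inside it, is reused across compilations:

```scala
import java.io.File

// Hypothetical sketch of the pattern: build the compiler setup once per
// (Scala version, bridge jar, Scala jars) key and hand back the same instance
// on subsequent compilations, so its internal classloaders are reused as well.
final case class SetupKey(scalaVersion: String, compilerBridge: File, scalaJars: Seq[File])

final class SetupCache[Setup](mkSetup: SetupKey => Setup) {
  private val cache = scala.collection.mutable.Map.empty[SetupKey, Setup]

  def getOrCreate(key: SetupKey): Setup = synchronized {
    cache.getOrElseUpdate(key, mkSetup(key))
  }
}
```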