Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default ClassfileManager doesn't recover from partially successful compilation results #958

Closed
gkossakowski opened this issue Nov 6, 2013 · 10 comments
Assignees
Labels

Comments

@gkossakowski
Copy link
Contributor

Let's assume that incOptions := sbt.inc.IncOptions.Default which means we are using default compiler options. Let's start with these sources:

// A.scala
class A {
  def initialized: Boolean = false
}

// B.scala
class B {
  def foo(a: A): Boolean = a.initialized
}

and compile them. Then let's comment out initialized method:

// A.scala
class A {
//  def initialized: Boolean = false
}

and try to compile. The incremental compiler will compile A.scala first, then it will try to recompile B.scala and it will fail. The error will get reported as expected. Now, if we change A.scala back to its original content:

// A.scala
class A {
  def initialized: Boolean = false
}

and we try to compile again we expect it should succeed. However, it will report the same error as with second compile attempt.

Let's see what happened exactly. After commenting out initialized in A.scala incremental compiler invoked Scala compiler that compiled A.scala and produced a new class file for A. It failed to compile B.scala so class file for B got removed. However, since the whole incremental compilation session failed the Analysis object rolled back so it has a hash of original A.scala file.

Now, once we changed A.scala back to original file and we asked to recompile again. Since Analysis object contains the hash of original file the incremental compiler doesn't notice the change to A.scala. However, since class file for B.scala has been removed (and that wasn't recorded) it tries to recompile B.scala and uses stale class file for A. Here's the relevant log output:

> compile
[debug] 
[debug] Initial source changes: 
[debug]     removed:Set()
[debug]     added: Set()
[debug]     modified: Set()
[debug] Removed products: Set(/Users/grek/scala/xsbt/sbt/src/sbt-test/compiler-project/error-in-invalidated/target/scala-2.10/classes/B.class)

The solution to this problem is to make ClassfileManager transactional so it commits changes to class files only when entire session of incremental compilation succeeds. This way Analysis object and class files stay in sync.

@ghost ghost assigned gkossakowski Nov 6, 2013
gkossakowski added a commit to gkossakowski/sbt that referenced this issue Nov 6, 2013
The ticket contains detailed description of the problem. The test case
just shows that if we set `incOptions := sbt.inc.IncOptions.Default`
then the incremental compiler doesn't recover from compiler errors
properly.
gkossakowski added a commit to gkossakowski/sbt that referenced this issue Nov 6, 2013
The ticket contains detailed description of the problem. The test case
just shows that if we set `incOptions := sbt.inc.IncOptions.Default`
then the incremental compiler doesn't recover from compiler errors
properly.
@harrah
Copy link
Member

harrah commented Nov 6, 2013

It seems to me that the problem is that A's class file does not correspond to A.scala's hash. I will have to look at the code, but if the check is currentClassFile.lastModified != cachedLastModified, it would detect that the class file is different. I believe the check is actually currentClassFile.lastModified >= cachedLastModified, though. Changing it to != would fix the problem.

@gkossakowski
Copy link
Contributor Author

Are you talking about this line:

val removedProducts = previous.allProducts.filter( p => !equivS.equiv( previous.product(p), current.product(p) ) ).toSet

I believe it uses this Equiv instance:

implicit val equivStamp: Equiv[Stamp] = new Equiv[Stamp] {
        def equiv(a: Stamp, b: Stamp) = (a,b) match {
                case (h1: Hash, h2: Hash) => h1.value sameElements h2.value
                case (e1: Exists, e2: Exists) => e1.value == e2.value
                case (lm1: LastModified, lm2: LastModified) => lm1.value == lm2.value
                case _ => false
        }
}

I don't know why this logic do not invalidate those products then.

@harrah
Copy link
Member

harrah commented Nov 6, 2013

Yes, that's what I mean. There are two possibilities that I can think of. One is that the resolution of last modified is 1 s on linux. Doing two quick compiles can mean that the last modified time is the same. This would typically only be the case in an automated test case. You can put a sleep in the scripted test to check or change the Stamp implementation to be Hash based. The other possibility is that class files only use an Exists Stamp.

@gkossakowski
Copy link
Contributor Author

You are right that class files use Exists Stamp, see:

scala> val analysis = (compile in Compile).eval
analysis: sbt.inc.Analysis = Analysis: 2 Scala sources, 2 classes, 2 binary dependencies

scala> analysis.stamps.products
res6: Map[java.io.File,sbt.inc.Stamp] = Map(/Users/grek/scala/xsbt/sbt/src/sbt-test/compiler-project/error-in-invalidated/target/scala-2.10/classes/B.class -> exists, /Users/grek/scala/xsbt/sbt/src/sbt-test/compiler-project/error-in-invalidated/target/scala-2.10/classes/A.class -> exists)

Do you think it would be safe to switch to last-modified stamps?

@harrah
Copy link
Member

harrah commented Nov 7, 2013

Yes and no. I think Exists < LastModified < Hash in terms of accuracy and you can always go to the more accurate one. I don't know whether an existence check is less expensive than a last modified check (and that could matter given the large number of class files). If users are overwriting class files as part of their build, a last modified check will cause things to be recompiled every time. This is not the way sbt users should be doing things (they should be writing to a new file), but it might be common in other tools or people might do it anyway. sbt of course uses the transactional class file manager, but it should still work without it.

@gkossakowski
Copy link
Contributor Author

I just tried

diff --git a/compile/inc/src/main/scala/sbt/inc/Compile.scala b/compile/inc/src/main/scala/sbt/inc/Compile.scala
index ce1803c..b46ebf9 100644
--- a/compile/inc/src/main/scala/sbt/inc/Compile.scala
+++ b/compile/inc/src/main/scala/sbt/inc/Compile.scala
@@ -19,7 +19,7 @@ object IncrementalCompile
            output: Output, log: Logger,
            options: IncOptions): (Boolean, Analysis) =
        {
-               val current = Stamps.initial(Stamp.exists, Stamp.hash, Stamp.lastModified)
+               val current = Stamps.initial(Stamp.hash, Stamp.hash, Stamp.lastModified)

All tests still pass and this issue is being fixed. I agree that both performance might be a concern. As far as I can see products Stamps are not cached. Could it be that they are recalculated for every iteration of incremental compiler?

Also, I agree that LastModified might have unfortunate effect of introducing more recompilations triggered by external tools (e.g. IDE) overwriting class files. For that reasons I'd be in favor of hashing which is always correct.

@harrah
Copy link
Member

harrah commented Nov 7, 2013

If last modified might be a performance concern, hashing is definitely one. Hashing would have the same problem- if you overwrite the class file, the hash would change. I think we can fall back on our ideal inc. compiler definition where the outputs should be the same as if you had done a full compilation. A full compilation would overwrite those class files.

gkossakowski referenced this issue Nov 7, 2013
This is useful when an error occurs in a later incremental step that
requires a fix in the originally changed files.

CC @gkossakowski
@gkossakowski
Copy link
Contributor Author

I was thinking that hashing would help us with scenario where you use sbt and IDE at the same time. You hit save in IDE and sources are recompiled. Then you switch to sbt and run compile. All products have the right hash (corresponding to source file) but I forgot that since Analysis is not shared sbt's Analysis won't "know" about updated hashes and recompile anyway.

Therefore I agree that LastModified is enough. When it comes to performance I just tested it:

scala -cp commons-io-2.4/commons-io-2.4.jar:Thyme.jar
Welcome to Scala version 2.10.2 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_65).
Type in expressions to have them evaluated.
Type :help for more information.

scala> import org.apache.commons.io.FileUtils;  import org.apache.commons.io.filefilter._;
import org.apache.commons.io.FileUtils
import org.apache.commons.io.filefilter._

scala> val classFiles = FileUtils.listFiles(new java.io.File("/Users/grek/scala/scala-master/build/quick/classes"), new RegexFileFilter("^(.*?).class"), DirectoryFileFilter.DIRECTORY)
classFiles: java.util.Collection[java.io.File] = [/Users/grek/scala/scala-master/build/quick/classes/actors/scala/actors/$bang$.class, /Users/grek/scala/scala-master/build/quick/classes/actors/scala/actors/$bang.class, /Users/grek/scala/scala-master/build/quick/classes/actors/scala/actors/AbstractActor$class.class, /Users/grek/scala/scala-master/build/quick/classes/actors/scala/actors/AbstractActor.class, /Users/grek/scala/scala-master/build/quick/classes/actors/scala/actors/Actor$$anon$1.class, /Users/grek/scala/scala-master/build/quick/classes/actors/scala/actors/Actor$$anon$2.class, /Users/grek/scala/scala-master/build/quick/classes/actors/scala/actors/Actor$$anon$4.class, /Users/grek/scala/scala-master/build/quick/classes/actors/scala/actors/Actor$$anonfun$respondOn$1$$anon$3.class,...

scala> def checkExists = classFiles.foreach(_.exists)
checkExists: Unit

scala> def checkLastModified = classFiles.foreach(_.lastModified)
checkLastModified: Unit

scala> val th = new ichi.bench.Thyme
th: ichi.bench.Thyme = ichi.bench.Thyme@3a70f2db

scala> th.pbench(checkExists)
Benchmark (20 calls in 1.221 s)
  Time:    60.37 ms   95% CI 57.31 ms - 63.43 ms   (n=20)
  Garbage: 650.0 us   (n=16 sweeps measured)

scala> th.pbench(checkLastModified)
Benchmark (20 calls in 1.197 s)
  Time:    58.56 ms   95% CI 55.89 ms - 61.24 ms   (n=19)
  Garbage: 700.0 us   (n=16 sweeps measured)

As you can see, there's no noticeable difference.

@harrah
Copy link
Member

harrah commented Nov 7, 2013

Great! It looks like using last modified is the correct, immediate fix for this issue.

@gkossakowski
Copy link
Contributor Author

@harrah: agreed! Does this warrant another RC of sbt 0.13.1?

I can submit the fix tomorrow.

gkossakowski added a commit that referenced this issue Nov 19, 2013
The ticket contains detailed description of the problem. The test case
just shows that if we set `incOptions := sbt.inc.IncOptions.Default`
then the incremental compiler doesn't recover from compiler errors
properly.
gkossakowski added a commit that referenced this issue Nov 19, 2013
The #958 describes a scenario where partially successful results are
produced in form of class files written to disk. However, if compilation
fails down the road we do not record any new compilation results (products)
in Analysis object. This leads to Analysis object and disk contents to get
out of sync.

One way to solve this problem is to use transactional ClassfileManager that
commits changes to class files on disk only when entire incremental
compilation session is successful. Otherwise, new class files are rolled
back to previous state.

The other way to solve this problem is to record time stamps of class files
in Analysis object. This way, incremental compiler can detect that class
files and Analysis object got out of sync and recover from that by
recompiling corresponding sources.

This commit uses latter solution which enables simpler (non-transactional)
ClassfileManager to handle scenario from #958.

Fixes #958
gkossakowski added a commit that referenced this issue Mar 21, 2014
The ticket contains detailed description of the problem. The test case
just shows that if we set `incOptions := sbt.inc.IncOptions.Default`
then the incremental compiler doesn't recover from compiler errors
properly.
gkossakowski added a commit that referenced this issue Mar 21, 2014
The #958 describes a scenario where partially successful results are
produced in form of class files written to disk. However, if compilation
fails down the road we do not record any new compilation results (products)
in Analysis object. This leads to Analysis object and disk contents to get
out of sync.

One way to solve this problem is to use transactional ClassfileManager that
commits changes to class files on disk only when entire incremental
compilation session is successful. Otherwise, new class files are rolled
back to previous state.

The other way to solve this problem is to record time stamps of class files
in Analysis object. This way, incremental compiler can detect that class
files and Analysis object got out of sync and recover from that by
recompiling corresponding sources.

This commit uses latter solution which enables simpler (non-transactional)
ClassfileManager to handle scenario from #958.

Fixes #958
eed3si9n added a commit to eed3si9n/sbt that referenced this issue Apr 18, 2016
Fixes sbt#2546. Ref sbt#958
scripted compiler-project/error-in-invalidated has been failing
frequently on Travis CI. It seems like incremental compiler is not
catching the change in source occasionally for `changes/A2.scala`.
dwijnand pushed a commit to dwijnand/sbt that referenced this issue Jan 3, 2017
Fixes sbt#2546. Ref sbt#958
scripted compiler-project/error-in-invalidated has been failing
frequently on Travis CI. It seems like incremental compiler is not
catching the change in source occasionally for `changes/A2.scala`.

Forward-port of sbt#2565
dwijnand pushed a commit to dwijnand/sbt that referenced this issue Jan 5, 2017
Fixes sbt#2546. Ref sbt#958
scripted compiler-project/error-in-invalidated has been failing
frequently on Travis CI. It seems like incremental compiler is not
catching the change in source occasionally for `changes/A2.scala`.

Forward-port of sbt#2565
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 28, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

2 participants