Understanding incremental recompilation

Compiling Scala code is slow, and SBT often makes it faster. By understanding how, you can learn to make compilation faster still. Modifying source files with many dependencies might require recompiling only those source files (which might take, say, 5 seconds) instead of all the dependencies (which might take, say, 2 minutes). Often you can control which case applies and make development much faster through a few simple coding practices.

In fact, improving Scala compilation times is one major goal of SBT, and conversely the speedups it gives are one of the major motivations to use it. A significant portion of SBT sources and development efforts deals with strategies for speeding up compilation.

To reduce compile times, SBT uses two strategies:

Reduce the overhead of restarting Scalac.
Implement smart and transparent strategies for incremental recompilation, so that only modified files and the needed dependencies are recompiled.

In more detail:

SBT always runs Scalac in the same virtual machine. If one compiles source code using SBT, keeps SBT alive, modifies source code and triggers a new compilation, this compilation will be faster because (part of) Scalac will have already been JIT-compiled. In the future, SBT will reintroduce support for reusing the same compiler instance, similarly to FSC.
When a source file A.scala is modified, SBT goes to great effort to recompile other source files depending on A.scala only if required, that is, only if the interface of A.scala was modified. With other build management tools (especially for Java, like ant), when a developer changes a source file in a non-binary-compatible way, they need to ensure manually that dependencies are also recompiled, often by running the clean command to remove existing compilation output; otherwise compilation might succeed even though dependent class files need to be recompiled. Worse, the change to one source might make dependencies incorrect without this being discovered automatically: one might get a successful compilation with incorrect source code. Since Scala compile times are so high, running clean is particularly undesirable.

By organizing your source code appropriately, you can minimize the amount of code affected by a change. SBT cannot determine precisely which dependencies have to be recompiled; the goal is to compute a conservative approximation, so that whenever a file must be recompiled, it will be, even if some extra files are recompiled as well.

The basic idea is that for each class, SBT tracks the classes which depend on it directly; if the interface of a class changes, all classes depending on it are recompiled. In particular, this currently includes all transitive dependencies, that is, also dependencies of dependencies, dependencies of these and so on to arbitrary depth. This applies not only to classes but also to objects and traits.
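The transitive invalidation described above can be illustrated with a small sketch. This is not SBT's actual implementation: the object `Invalidation`, the function `invalidated`, and the representation of the dependency graph as a `Map` from a class name to its direct dependents are all assumptions made for illustration.

```scala
import scala.annotation.tailrec

// Hypothetical sketch, not SBT's real code: computes everything that would be
// recompiled when the interfaces of the classes in `changed` are modified.
object Invalidation {
  // directDependents(c) is assumed to hold the classes that depend directly on c.
  def invalidated(
      changed: Set[String],
      directDependents: Map[String, Set[String]]
  ): Set[String] = {
    @tailrec
    def loop(frontier: Set[String], seen: Set[String]): Set[String] =
      if (frontier.isEmpty) seen
      else {
        // Dependents of the current frontier that have not been visited yet.
        val next =
          frontier.flatMap(c => directDependents.getOrElse(c, Set.empty[String])) -- seen
        loop(next, seen ++ next)
      }
    // The changed classes themselves are recompiled too.
    loop(changed, changed)
  }
}
```

With a graph where B depends on A and C depends on B, a change to the interface of A yields A, B and C, while unrelated classes are left alone. Tracking the visited set also keeps the traversal from looping on cyclic dependencies.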

Moreover, different sources are recompiled together; hence a compile error in one source implies that no bytecode is generated for any of them. When a lot of files need to be recompiled and the fix for the compile error is not clear, it might be best to comment out the offending location (if possible) to allow the other sources to be compiled, and then try to figure out how to fix the offending location. This way, trying out possible solutions to the compile error will take less time, say 5 seconds instead of 2 minutes.

The heuristics used by SBT imply the following user-visible consequences.

XXX Please note that this part of the documentation is a first draft; part of the strategy might be unsound, part of it might be not yet implemented.

Adding, removing, or modifying private methods does not require recompilation of client classes. Therefore, suppose you add a method to a class with a lot of dependencies, and this method is only used in the declaring class; marking it private will prevent recompilation of clients. However, this only applies to methods which are not accessible to other classes, hence methods marked with private or private[this]; methods which are private to a package, marked with private[name], are part of the API.
Modifying the interface of a non-private method requires recompiling all clients, even if the method is not used.
Modifying one class does not require recompiling dependencies of other classes defined in the same file (XXX does it not?).
Adding a method which did not exist requires recompiling all clients, counterintuitively, due to complex scenarios with implicit conversions. Hence in some cases you might want to start implementing a new method in a separate, new class, complete the implementation, and then cut-n-paste the complete implementation back into the original source.
Changing the implementation of a method should not affect its clients, unless the return type is inferred and the new implementation leads to a slightly different type being inferred. Hence, annotating the return type of a non-private method explicitly, if it is more general than the type actually returned, can reduce the code to be recompiled when the implementation of such a method changes.

All the above discussion about methods also applies to fields and members in general; similarly, references to classes also extend to objects and traits.
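As a rough illustration of the access-modifier rules listed above, the following sketch marks which members would count as part of the interface seen by clients. All names (`billing`, `Account`, `normalize`, `rawCents`, `audit`) are hypothetical, and an enclosing object stands in for a package in `private[name]` only so that the snippet stays self-contained.

```scala
// Hypothetical example; `billing` plays the role of a package here.
object billing {
  class Account(initial: Long) {
    // private: invisible to other classes, so according to the rules above,
    // changing or removing it should not force clients to be recompiled.
    private var cents: Long = initial
    private def normalize(amount: Long): Long = math.max(0L, amount)

    // private[billing]: accessible from elsewhere inside `billing`,
    // so it is treated as part of the API.
    private[billing] def rawCents: Long = cents

    // Non-private members with explicit types: the interface seen by clients.
    def deposit(amount: Long): Account = { cents += normalize(amount); this }
    def balance: Long = cents
  }

  // Lives in the same enclosing scope, so it may use the qualified-private member.
  def audit(account: Account): Long = account.rawCents
}
```

A helper that is only used inside `Account`, such as `normalize`, is the kind of method the text suggests marking private when the class has many clients.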

Why changing the implementation of a method might affect clients, and why type annotations help

Changing the return type of a method might be source-compatible, for instance if the new type is more specific, or if it is less specific but still more specific than the type required by clients. (Note, however, that making the type more specific might still invalidate clients in non-trivial scenarios involving, for instance, type inference or implicit conversions: for a more specific type, too many implicit conversions might be available, leading to ambiguity.) However, the bytecode for a method call includes the return type of the invoked method, hence the client code needs to be recompiled.

Suppose, for instance, that the implementation of method A.openFiles returns List[java.io.FileWriter], while the intended stable interface is probably that A.openFiles returns Seq[java.io.Writer]. You can choose to make that explicit through a return type annotation. This might be a good idea simply to hide some implementation details from clients of A.openFiles, namely the specific implementations chosen for Seq and java.io.Writer. Suppose now that you later modify these implementation details by changing the return type to Vector[java.io.BufferedWriter]. If the return type of A.openFiles was not annotated explicitly, this change modifies the binary interface and requires clients to be recompiled.

Hence, adding explicit return types on classes with many dependencies might reduce the occasions where client code needs to be recompiled. Moreover, this is in general a good development practice when interfaces between different modules become important: specifying such an interface documents the intended behavior and helps ensure binary compatibility, which is especially important when the exposed interface is used by other software components.
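The A.openFiles scenario can be sketched as follows. The parameter `paths` and the variants `openFilesInferred` and `openFilesLater` are hypothetical names added for illustration; they are placed side by side only so that the inferred and the annotated signatures can be compared in one snippet.

```scala
import java.io.{BufferedWriter, FileWriter, Writer}

object A {
  // No annotation: the inferred return type is List[FileWriter], so the
  // implementation detail leaks into the method's interface.
  def openFilesInferred(paths: List[String]) =
    paths.map(p => new FileWriter(p))

  // Explicit, more general return type: clients only see Seq[Writer].
  def openFiles(paths: List[String]): Seq[Writer] =
    paths.map(p => new FileWriter(p))

  // A later implementation based on Vector and BufferedWriter still fits
  // the same annotated signature, so the declared interface is unchanged.
  def openFilesLater(paths: List[String]): Seq[Writer] =
    paths.toVector.map(p => new BufferedWriter(new FileWriter(p)))
}
```

Switching the body of `openFilesInferred` to the `Vector`-based version would change its inferred type to `Vector[BufferedWriter]`, which is the kind of interface change the paragraph above warns about.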

Why adding a member requires recompiling existing clients

In Java, adding a member does not require recompiling existing valid source code. The same should seemingly hold in Scala, but it does not: through the pimp-my-library pattern, implicit conversions might enrich class Foo with method bar without modifying class Foo itself (see the discussion in issue #288; XXX integrate more). However, if another method bar is then introduced in class Foo, this method should be used in preference to the one added through implicit conversions. Therefore any class depending on Foo should be recompiled. One can imagine more fine-grained tracking of dependencies, but this is currently not implemented.
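A minimal sketch of this situation follows. To keep it compilable as one snippet, the "before" and "after" states of Foo are modelled as two hypothetical classes, `FooV1` and `FooV2`; the wrapper names `FooV1Ops` and `FooV2Ops` and the object `Enrichment` are likewise assumptions.

```scala
object Enrichment {
  // Foo before the change: no member named bar.
  class FooV1
  // Foo after the change: bar has been added as a real member.
  class FooV2 { def bar: String = "member" }

  // Enrichment through implicit conversions: adds bar from the outside.
  implicit class FooV1Ops(val self: FooV1) { def bar: String = "extension" }
  implicit class FooV2Ops(val self: FooV2) { def bar: String = "extension" }

  // The client code is textually the same in both cases: `foo.bar`.
  def callV1: String = (new FooV1).bar // resolved through the implicit conversion
  def callV2: String = (new FooV2).bar // resolved to the member; no conversion is tried
}
```

The same client expression compiles to a different call once the member exists, so class files compiled against the old Foo would keep invoking the enrichment unless the client is recompiled.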

Further references

The incremental compilation logic is implemented in https://github.com/harrah/xsbt/blob/0.13/compile/inc/Incremental.scala. Some related documentation for SBT 0.7 is available at https://code.google.com/p/simple-build-tool/wiki/ChangeDetectionAndTesting. Some discussion of the incremental recompilation policies is available in issues #322 and #288.
