Gradle Incremental Builds

From NovaOrdis Knowledge Base
Revision as of 05:39, 9 November 2020 by Ovidiu (talk | contribs) (→‎@Internal)

Incremental Build

An incremental build is a build that avoids running tasks whose inputs and outputs did not change since the previous build, because executing them would be unnecessary. The build determines whether a task needs to run by performing up-to-date checks. Incremental builds save time by avoiding unnecessary and sometimes lengthy work.

The central element that enables this behavior is Gradle's capacity to analyze a task's inputs and outputs relative to the last build, detect that nothing that would require the task to run has changed, and decide that the task does not need to run. A task must have at least one output to be considered part of an incremental build. If it does not, it will be executed every time the build runs.

An even more sophisticated behavior consists of identifying which of the input elements have changed and performing work only for those elements. An incremental build is not required to support this behavior; it only has to avoid running tasks unnecessarily. However, Gradle supports the behavior described above with incremental tasks.

All builds in Gradle are incremental builds by default. Build caching does not have to be enabled to allow them.

Task Inputs and Outputs

In most cases, a task takes some inputs and generates some outputs. For Java compilation, the inputs are the source files, the target JDK version and whether debugging information is wanted; the outputs are the generated class files. The amount of memory allocated to the compiler is not an input, but an internal task property, as changing it does not influence the task's outputs.

Up-to-Date Check

When it is about to execute a task, during the execution phase, Gradle performs an "up-to-date" check: it tests externally to the task whether any of the following changed from the last build:

  • any of the task inputs changed.
  • any of the task outputs changed or was removed.
  • the task code changed.

If none of the above elements changed, Gradle considers the task to be UP-TO-DATE and skips executing its actions. If Gradle is executed with the -i command line option, the task outcome and the reason for it will be displayed. To be considered as part of an incremental build, a task must have at least one output.
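
The check described above can be pictured with a small, Gradle-independent sketch. It is only an illustration under simplifying assumptions: the Snapshot class, the string fingerprints and the reason messages are hypothetical and do not reproduce Gradle's actual implementation or API.

```java
import java.util.Collections;
import java.util.Map;

final class UpToDateCheckSketch {

    /** What a task looked like at one point in time; all values are fingerprints. */
    static final class Snapshot {
        final Map<String, String> inputs;   // input property name -> value fingerprint
        final Map<String, String> outputs;  // declared output path -> content fingerprint
        final String taskCode;              // fingerprint of the task implementation

        Snapshot(Map<String, String> inputs, Map<String, String> outputs, String taskCode) {
            this.inputs = inputs;
            this.outputs = outputs;
            this.taskCode = taskCode;
        }
    }

    /** Returns "UP-TO-DATE", or a short reason why the task has to run. */
    static String check(Snapshot previous, Snapshot current) {
        if (current.outputs.isEmpty()) {
            return "no outputs declared";
        }
        if (previous == null) {
            return "no history available";
        }
        if (!previous.taskCode.equals(current.taskCode)) {
            return "task code changed";
        }
        if (!previous.inputs.equals(current.inputs)) {
            return "inputs changed";
        }
        if (!previous.outputs.equals(current.outputs)) {
            return "outputs changed or removed";
        }
        return "UP-TO-DATE";
    }

    public static void main(String[] args) {
        Map<String, String> outputs = Collections.singletonMap("build/out.txt", "hash-1");
        Snapshot lastBuild = new Snapshot(Collections.singletonMap("customVersion", "1.0"), outputs, "impl-1");
        Snapshot unchanged = new Snapshot(Collections.singletonMap("customVersion", "1.0"), outputs, "impl-1");
        Snapshot newInput = new Snapshot(Collections.singletonMap("customVersion", "2.0"), outputs, "impl-1");

        System.out.println(check(lastBuild, unchanged)); // prints UP-TO-DATE
        System.out.println(check(lastBuild, newInput));  // prints inputs changed
    }
}
```

The first branch mirrors the rule that a task without outputs can never be skipped.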

The cleanest way to designate task inputs and outputs is by using annotations when implementing the task as a custom enhanced task: the getter method of each relevant property should be marked with the corresponding annotation, as shown below. ⚠️ If the task is written in Java, the annotation must be attached to the getter method. Annotations on setters or on the field will be ignored. Groovy allows annotating properties as well.

Alternatively, a Runtime API can be used to achieve the same results. The Runtime API can be used with simple tasks.

Inputs

The essential characteristic of an input is whether it affects one or more outputs in any way. In the above example, source files, target JDK version, enabling debugging are all task inputs. The amount of memory allocated to the compiler is not an input, but an internal task property.

If a task property affects the output of the task, it must be registered as an input. Otherwise, when the property changes, the task will still be considered UP-TO-DATE even though it is not, and it will be skipped incorrectly. Conversely, properties that do not affect the output must not be registered as inputs, otherwise the task may execute when it does not need to.

Simple Values as @Input

@Input

Strings or numbers, and in general, any type that implements Serializable, can be handled as a simple value. Enums implement Serializable automatically, so they can be used here. Java example:

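import org.gradle.api.DefaultTask;
import org.gradle.api.tasks.Input;

// ExampleTask is an illustrative class name
public class ExampleTask extends DefaultTask {

  private String customVersion;

  // The annotation goes on the getter, not on the field or the setter
  @Input
  public String getCustomVersion() {
    return customVersion;
  }

  public void setCustomVersion(String s) {
    this.customVersion = s;
  }
}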

Note that the result of the @Input evaluation is displayed when running with the -i command line option:

> Task :example
Task ':example' is not up-to-date because:
  Value of input property 'customVersion' has changed for task ':example'
...

Playground Example:

Simple @Input property

File Types

Gradle detects whether at least one file or directory changed and, if so, treats the task as not UP-TO-DATE and executes it. It is possible to optimize the behavior even further and handle only the files or directories that changed, but that is the responsibility of the task, not Gradle. A task that is capable of handling these situations efficiently is called an incremental task. Gradle does help task implementers via its incremental task input feature.
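
To make the difference concrete, the following Gradle-independent sketch computes which files changed between two builds, which is the information an incremental task would act on. The class, the method and the "path -> fingerprint" maps are hypothetical. The ADDED, MODIFIED and REMOVED labels are chosen to resemble the change types Gradle reports to incremental tasks, but this is not Gradle code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

final class InputChangesSketch {

    /**
     * Compares two "path -> content fingerprint" maps and returns
     * "path -> ADDED | MODIFIED | REMOVED" for the files that differ, sorted by path.
     */
    static Map<String, String> diff(Map<String, String> previous, Map<String, String> current) {
        Map<String, String> changes = new TreeMap<>();
        for (Map.Entry<String, String> entry : current.entrySet()) {
            String oldFingerprint = previous.get(entry.getKey());
            if (oldFingerprint == null) {
                changes.put(entry.getKey(), "ADDED");
            } else if (!oldFingerprint.equals(entry.getValue())) {
                changes.put(entry.getKey(), "MODIFIED");
            }
        }
        for (String path : previous.keySet()) {
            if (!current.containsKey(path)) {
                changes.put(path, "REMOVED");
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        Map<String, String> previous = new HashMap<>();
        previous.put("src/A.txt", "h1");
        previous.put("src/B.txt", "h2");

        Map<String, String> current = new HashMap<>();
        current.put("src/A.txt", "h1");        // unchanged
        current.put("src/B.txt", "h2-edited"); // modified
        current.put("src/C.txt", "h3");        // added

        // An incremental task would process only B and C and leave A alone.
        System.out.println(diff(previous, current)); // prints {src/B.txt=MODIFIED, src/C.txt=ADDED}
    }
}
```

A non-incremental task would redo the work for all three files as soon as any one of them changed.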

@InputFile

@InputFile can be used to annotate a single input file (not directory). Gradle will consider the task out-of-date when the file path or contents have changed.
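
A minimal sketch of a custom task that uses @InputFile follows. The UppercaseTask class and its source and target properties are hypothetical names chosen for illustration. An @OutputFile property (described in the Outputs section) is included because a task needs at least one output to take part in up-to-date checks. The class compiles only against the Gradle API, so it is meant to live in buildSrc or in a plugin project.

```java
import org.gradle.api.DefaultTask;
import org.gradle.api.tasks.InputFile;
import org.gradle.api.tasks.OutputFile;
import org.gradle.api.tasks.TaskAction;

import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Locale;

// Hypothetical task: the class and property names are illustrative only.
public class UppercaseTask extends DefaultTask {

    private File source;
    private File target;

    @InputFile   // the task is out-of-date when the path or the contents of this file change
    public File getSource() {
        return source;
    }

    public void setSource(File source) {
        this.source = source;
    }

    @OutputFile  // at least one output is needed for up-to-date checks
    public File getTarget() {
        return target;
    }

    public void setTarget(File target) {
        this.target = target;
    }

    @TaskAction
    public void convert() throws IOException {
        // Read the input file, upper-case its text and write it to the output file.
        String text = new String(Files.readAllBytes(source.toPath()), StandardCharsets.UTF_8);
        Files.write(target.toPath(), text.toUpperCase(Locale.ROOT).getBytes(StandardCharsets.UTF_8));
    }
}
```

Running such a task twice without touching the source file should report it as UP-TO-DATE the second time.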

Playground Example:

@InputFile property

@InputDirectory

@InputDirectory can be used to annotate a single input directory (not file). Gradle will consider the task out-of-date when the directory location or contents have changed. To make the task dependent on the directory's location but not its contents, expose the path of the directory as an @Input property instead.

@InputFiles

@InputFiles can be used to annotate an iterable of input files and directories. The properties that can be annotated with the file annotations are standard Java File instances, but also derivatives of Gradle's org.gradle.api.file.FileCollection type and anything else that can be passed to either the Project.file(Object) method for single file/directory properties or the Project.files(Object...) method. This will cause the task to be considered out-of-date when the file paths or contents have changed.

Classpath Types

@Classpath

An iterable of input files and directories that represent a Java classpath. This allows the task to ignore irrelevant changes to the property, such as different names for the same files. It is similar to annotating the property @PathSensitive(RELATIVE) but it will ignore the names of JAR files directly added to the classpath, and it will consider changes in the order of the files as a change in the classpath. Gradle will inspect the contents of jar files on the classpath and ignore changes that do not affect the semantics of the classpath (such as file dates and entry order).
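
The two rules about names and order can be illustrated with a deliberately simplified, Gradle-independent sketch. It assumes a classpath is modeled as an ordered map from JAR file name to a content hash. The class and method names are hypothetical, and real classpath normalization also inspects the contents of the JAR files.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ClasspathFingerprintSketch {

    /**
     * Reduces a classpath (JAR file name -> content hash, iterated in classpath order)
     * to the ordered list of content hashes. JAR names are dropped on purpose.
     * Callers should pass an order-preserving map such as LinkedHashMap.
     */
    static List<String> fingerprint(Map<String, String> classpath) {
        return new ArrayList<>(classpath.values());
    }

    public static void main(String[] args) {
        Map<String, String> original = new LinkedHashMap<>();
        original.put("lib-a-1.0.jar", "hash-a");
        original.put("lib-b-1.0.jar", "hash-b");

        Map<String, String> renamed = new LinkedHashMap<>();
        renamed.put("a.jar", "hash-a"); // same contents, different file names
        renamed.put("b.jar", "hash-b");

        Map<String, String> reordered = new LinkedHashMap<>();
        reordered.put("lib-b-1.0.jar", "hash-b"); // same JARs, different order
        reordered.put("lib-a-1.0.jar", "hash-a");

        System.out.println(fingerprint(original).equals(fingerprint(renamed)));   // prints true
        System.out.println(fingerprint(original).equals(fingerprint(reordered))); // prints false
    }
}
```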

@CompileClasspath

An iterable of input files and directories that represent a Java compile classpath. This allows the task to ignore irrelevant changes that do not affect the API of the classes on the classpath. The following kinds of changes to the classpath will be ignored:

  • Changes to the path of jar or top level directories.
  • Changes to timestamps and the order of entries in Jars.
  • Changes to resources and Jar manifests, including adding or removing resources.
  • Changes to private class elements, such as private fields, methods and inner classes.
  • Changes to code, such as method bodies, static initializers and field initializers (except for constants).
  • Changes to debug information, for example when a change to a comment affects the line numbers in class debug information.
  • Changes to directories, including directory entries in Jars.

Nested Values and @Nested

@Nested

@Nested can be used to annotate custom types that do not fit the other two categories (simple values and file types) but have their own properties that are inputs or outputs: the task inputs or outputs are nested inside these custom types. The custom type does not need to implement Serializable, but it must have at least one field or property marked with one of the input annotations. @Nested can be specified recursively.
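
A minimal sketch of a nested input follows. ServerConfigTask, Endpoint and all property names are hypothetical and chosen for illustration only. As with any custom task type, the class compiles only against the Gradle API.

```java
import org.gradle.api.DefaultTask;
import org.gradle.api.tasks.Input;
import org.gradle.api.tasks.Nested;
import org.gradle.api.tasks.OutputFile;
import org.gradle.api.tasks.TaskAction;

import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical task: the class and property names are illustrative only.
public class ServerConfigTask extends DefaultTask {

    /** Custom type that is not Serializable but carries its own annotated inputs. */
    public static class Endpoint {
        private String host = "localhost";
        private int port = 8080;

        @Input
        public String getHost() { return host; }
        public void setHost(String host) { this.host = host; }

        @Input
        public int getPort() { return port; }
        public void setPort(int port) { this.port = port; }
    }

    private final Endpoint endpoint = new Endpoint();
    private File configFile;

    @Nested  // Gradle inspects Endpoint and tracks its host and port as task inputs
    public Endpoint getEndpoint() { return endpoint; }

    @OutputFile
    public File getConfigFile() { return configFile; }
    public void setConfigFile(File configFile) { this.configFile = configFile; }

    @TaskAction
    public void write() throws IOException {
        // Write "host:port" to the configured output file.
        String line = endpoint.getHost() + ":" + endpoint.getPort() + "\n";
        Files.write(configFile.toPath(), line.getBytes(StandardCharsets.UTF_8));
    }
}
```

Changing either the host or the port in the build script should make the task out-of-date, even though neither is a direct property of the task.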

Internal Task Properties

Any task property that is neither an input nor an output, meaning that it does not influence the outputs, is an internal task property.

@Internal

Indicates that the property is used internally but is neither an input nor an output.

Outputs

To be considered as part of an incremental build, a task must have at least one output and zero or more inputs.

If a task declares outputs but those are not configured in the configuration phase, the build will fail with a message similar to "No value has been specified for property '...'".

Output Annotations

@OutputFile

A single output file (not a directory).

@OutputDirectory

A single output directory (not a file).

@OutputFiles

An iterable or map of output files. Using a file tree turns caching off for the task.

@OutputDirectories

An iterable or map of output directories. Using a file tree turns caching off for the task.

Outputs Used as Inputs for Other Tasks

Expand on this. The palantir docker task accepts tasks.jar.outputs as "files".

Declaring Inputs and Outputs with Runtime API

Annotations in custom task types are the cleanest way to declare inputs and outputs. However, simple tasks can be configured to participate in incremental builds by using the Runtime API.
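
A possible use of the Runtime API from a Java plugin is sketched below. The StampPlugin class, the stamp task, the stampVersion property and the stamp.txt file are hypothetical names. The task action is written as an anonymous class because some Gradle versions cannot track Java lambdas used as task actions and then always treat the task as out-of-date; whether that applies depends on the Gradle version in use.

```java
import org.gradle.api.Action;
import org.gradle.api.Plugin;
import org.gradle.api.Project;
import org.gradle.api.Task;

import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical plugin: the plugin, task and property names are illustrative only.
public class StampPlugin implements Plugin<Project> {

    @Override
    public void apply(Project project) {
        project.getTasks().register("stamp", task -> {
            final String version = String.valueOf(project.getVersion());
            final File stampFile =
                project.getLayout().getBuildDirectory().file("stamp.txt").get().getAsFile();

            // Runtime API equivalents of @Input and @OutputFile.
            task.getInputs().property("stampVersion", version);
            task.getOutputs().file(stampFile);

            // Anonymous class instead of a lambda, see the note above.
            task.doLast(new Action<Task>() {
                @Override
                public void execute(Task t) {
                    try {
                        stampFile.getParentFile().mkdirs();
                        Files.write(stampFile.toPath(), version.getBytes(StandardCharsets.UTF_8));
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                }
            });
        });
    }
}
```

With this registration, the stamp task should be skipped as UP-TO-DATE until the project version changes or build/stamp.txt is modified or removed.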

Examples

@Destroys and Destroyables

@Destroys

Specifies one or more files that are removed by this task. Note that a task can define either inputs/outputs or destroyables, but not both.

Task Local State

@LocalState

@LocalState specifies one or more files that represent the local state of the task. These files are removed when the task is loaded from cache.

@Console

The annotation indicates that the annotated property is neither an input nor an output. It simply affects the console output of the task in some way, such as increasing or decreasing the verbosity of the task.

Non-Deterministic Tasks

If a task generates different output for exactly the same inputs, it should not be configured for incremental builds by declaring inputs and outputs: the up-to-date checks will not work and the task will not be executed when it should be.

Incremental Task

Incremental Task

Build Cache

Gradle allows caching task output in build caches, if the tasks producing those outputs are cacheable tasks, hence enabling sharing of outputs between builds. Build caching does not have to be enabled to allow incremental builds.

Builds Cache

TO PROCESS

An input can be a property, a directory or one or more files. An output is a directory or one or more files. Inputs and outputs are fields of the DefaultTask class. Task inputs and outputs are obtained with Task.getInputs() and Task.getOutputs(), which return TaskInputs and TaskOutputs instances, respectively.

Task inputs and outputs are evaluated during the configuration phase to wire up the task dependencies. That is why they need to be defined in a configuration block. To avoid unexpected behavior, make sure that the value you assign to inputs and outputs is accessible at configuration time. If you need to implement programmatic output evaluation, the method upToDateWhen(Closure) on TaskOutputs comes in handy. In contrast to regular inputs/outputs evaluation, this method is evaluated at execution time. If the closure returns false, the task is considered out of date. If it returns true, the task is considered up to date, provided that the regular up-to-date checks also pass.
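
A possible use of upToDateWhen from Java is sketched below, using the Spec overload of the method rather than the Closure one. The RefreshPlugin class, the refresh task and the force-refresh.marker and refresh.txt files are hypothetical. The idea is that the presence of a marker file forces the task to run even when its declared output is unchanged.

```java
import org.gradle.api.Action;
import org.gradle.api.Plugin;
import org.gradle.api.Project;
import org.gradle.api.Task;

import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical plugin: the plugin, task and file names are illustrative only.
public class RefreshPlugin implements Plugin<Project> {

    @Override
    public void apply(Project project) {
        project.getTasks().register("refresh", task -> {
            final File marker = project.file("force-refresh.marker");
            final File result =
                project.getLayout().getBuildDirectory().file("refresh.txt").get().getAsFile();

            task.getOutputs().file(result);

            // Evaluated at execution time: returning false forces the task to run.
            task.getOutputs().upToDateWhen(t -> !marker.exists());

            task.doLast(new Action<Task>() {
                @Override
                public void execute(Task t) {
                    try {
                        result.getParentFile().mkdirs();
                        Files.write(result.toPath(), "refreshed".getBytes(StandardCharsets.UTF_8));
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                }
            });
        });
    }
}
```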

Debug a Java compilation task to see how input/output works.