Caching Strategy Reminder for Maven-Based Docker Builds
My local development feedback loop between code change and runnable container was annoyingly long on a Maven-based project I was recently working on. I wanted to speed things up.
The scenario was something like this:
- touch/change some source code
- docker build
  - maven downloads the world
  - maven compiles my project
- docker run
- touch/change some source code
- docker build
  - maven downloads the world
  - maven compiles my project
- docker run
- touch/change some source code
- docker build
  - maven downloads the world
  - maven compiles my project
- docker run
- …
I didn’t really enjoy the “maven downloads the world” steps, and wanted to minimize the number of times it needed to run.
Let’s follow along as I make my situation a little better. For illustration, we’ll start off with this generic archetype-created skeleton project:
App.java:

package com.keyholesoftware.blog;

public class App
{
    public static void main( String[] args )
    {
        System.out.println( "Hello World!" );
    }
}

AppTest.java:

package com.keyholesoftware.blog;

import junit.framework.*;

public class AppTest extends TestCase
{
    public void testApp()
    {
        assertTrue( true );
    }
}

Dockerfile:

FROM maven:3.2.5-jdk-8u40
RUN mkdir --parents /usr/src/app
WORKDIR /usr/src/app
ADD . /usr/src/app
RUN mvn verify

pom.xml:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.keyholesoftware.blog</groupId>
  <artifactId>khs-docker-caching-blog</artifactId>
  <version>1.0-SNAPSHOT</version>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>

Things aren’t that bad when I am building back-to-back, e.g.
$ docker build .
...
$ docker build .
...
Notice that the second build is fast as everything is cached up. But what about when we do something like this:
$ docker build .
...
$ touch src/main/java/com/keyholesoftware/blog/App.java
...
$ docker build .
...
Notice that the second build is unnecessarily slowed down by the redownload portion.
I sat around and despaired for a while until I remembered the tricks I’ve seen with selective caching:
FROM maven:3.2.5-jdk-8u40
RUN mkdir --parents /usr/src/app
WORKDIR /usr/src/app

# selectively add the POM file
ADD pom.xml /usr/src/app/

# get all the downloads out of the way
RUN mvn verify clean --fail-never

ADD . /usr/src/app
RUN mvn verify

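To see why adding the POM on its own helps, here is a rough sketch of the idea in shell. It is a simplified model, not Docker's actual cache algorithm: it assumes a layer's cache key is just a checksum of the files that were added. The function names (context_key, pom_key) and the throwaway project are made up for illustration. Note also that recent Docker versions are documented to ignore file modification times when checksumming, so the sketch edits the file's contents rather than just touching it.

```shell
#!/bin/sh
# Simplified model of layer caching for ADD: the cache key is a checksum of
# the added files' contents. NOT Docker's real algorithm, only an illustration.

# context_key DIR: checksum over every file under DIR (models "ADD .").
context_key() {
    (cd "$1" && find . -type f | LC_ALL=C sort | xargs cat | cksum)
}

# pom_key DIR: checksum over pom.xml only (models "ADD pom.xml").
pom_key() {
    cksum < "$1/pom.xml"
}

# Hypothetical throwaway project, created only for this demo.
demo=$(mktemp -d)
mkdir -p "$demo/src"
echo '<project/>' > "$demo/pom.xml"
echo 'class App {}' > "$demo/src/App.java"

pom_before=$(pom_key "$demo")
ctx_before=$(context_key "$demo")

# Change some source code, but leave the POM alone.
echo '// edited' >> "$demo/src/App.java"

pom_after=$(pom_key "$demo")
ctx_after=$(context_key "$demo")

if [ "$pom_before" = "$pom_after" ]; then
    echo "pom.xml layer: cache hit"
fi
if [ "$ctx_before" != "$ctx_after" ]; then
    echo "full-context layer: cache miss"
fi

rm -rf "$demo"
```

In this model the POM-only key survives the source edit while the whole-context key does not, which is the property the Dockerfile relies on: the expensive download step sits behind the key that rarely changes.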
Let’s try that sequence again.
$ docker build .
...
$ touch src/main/java/com/keyholesoftware/blog/App.java
...
$ docker build .
...

Getting better, but there were still a few downloads going on during the second build. They are related to the Surefire test plugin, which selects its test provider dynamically at build time. A useful side effect of this process is that it exposes downloads that are chosen dynamically, so we can lock them down. In this case, we lock down our Surefire provider:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.keyholesoftware.blog</groupId>
  <artifactId>khs-docker-caching-blog</artifactId>
  <version>1.0-SNAPSHOT</version>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
  <properties>
    <surefire.version>2.8.1</surefire.version>
  </properties>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-surefire-plugin</artifactId>
        <version>${surefire.version}</version>
        <!-- lock down our surefire provider -->
        <dependencies>
          <dependency>
            <groupId>org.apache.maven.surefire</groupId>
            <artifactId>surefire-junit3</artifactId>
            <version>${surefire.version}</version>
          </dependency>
        </dependencies>
      </plugin>
    </plugins>
  </build>
</project>

Let’s try that sequence again.
$ docker build .
...
$ touch src/main/java/com/keyholesoftware/blog/App.java
...
$ docker build .
...
So now, unless we change the POM, we don’t have to redownload anything. Nice.
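One way to check that claim without eyeballing the build output is to count the download lines. The helper below is a small sketch: the name count_downloads is made up, and matching on the word "Downloading" is an assumption about Maven's console output, which differs a little between Maven versions.

```shell
#!/bin/sh
# count_downloads: read build output on stdin and print how many lines
# mention a Maven download. The "Downloading" pattern is an assumption
# about Maven's console output, not a documented contract.
count_downloads() {
    # grep -c exits non-zero when nothing matches; "|| true" keeps the
    # helper usable in scripts that run with "set -e".
    grep -c 'Downloading' || true
}

# Hypothetical usage (needs Docker, so it is left commented out):
#   docker build . 2>&1 | count_downloads
```

If the caching is working, the count for a build that follows a source-only change should drop to zero, since the layer that ran the downloads is reused rather than re-executed.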
Now the scenario is something like this:
- touch/change some source code
- docker build
  - maven downloads the world
  - maven compiles my project
- docker run
- touch/change some source code
- docker build
  - maven compiles my project
- docker run
- touch/change some source code
- docker build
  - maven compiles my project
- docker run
- …
Notice the “maven downloads the world” step only happens once (unless I actually change the POM, of course).
Final Thoughts
There might be better ways to handle some of this (e.g. dependency:resolve and dependency:resolve-plugins, though those don’t seem to work as thoroughly, and probably something with fig), but I mainly wanted to highlight a possible use of selective adding/caching.
Other Notes:
- For you Ruby+Rakefile, Python+requirements.txt, Node+package.json, Go+GoDeps.json etc. folks: Maven doesn’t have an explicit ‘install dependencies’ step. See Introduction to the Build Lifecycle if you’re bored.
- For you Gradle folks, I haven’t used Gradle much. What are your thoughts?
- The source code for this post is at: https://github.com/in-the-keyhole/khs-docker-caching-blog
Thanks for reading!
Reference: Caching Strategy Reminder for Maven-Based Docker Builds from our JCG partner Luke Patterson at the Keyhole Software blog.











