Optimize Docker Image via Multi-Stage Builds

Mehmet Ali Baykara
Apr 26, 2020

Docker is a great tool for running your application in an isolated container. It also lets you build, run, and test the application independently of your host machine. For more, see the official Docker documentation. Here we will focus on optimizing your Dockerfile so that the resulting image is as lightweight as possible. There are many ways to optimize a Dockerfile, but in this post I will focus on multi-stage builds.

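Multi-stage builds have been part of Docker since version 17.05. They let you keep your image small and easy to maintain: every FROM instruction starts a new stage, and only the files you explicitly copy into the last stage end up in the final image.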

Let’s begin and see how to do that. In one of my previous posts, I set up Clang 10 in a Docker container. Clang is a C/C++ compiler. We will use that image and add another tool, ninja-build. Ninja is a small build system, used mainly for C/C++ projects, that focuses on speed.

Let’s go through the Dockerfile stage by stage. I called the first stage clang-stage, but you may name it whatever you want. In it, we download the prebuilt Clang 10 binaries and extract them into the clang-10 directory that we create.

FROM ubuntu:18.04 as clang-stage
# Install the tools needed to download and extract the prebuilt binaries
RUN apt-get update && apt-get install -y xz-utils curl
# Get the prebuilt binaries from LLVM
RUN cd /tmp/ \
 && curl -SL https://github.com/llvm/llvm-project/releases/download/llvmorg-10.0.0/clang+llvm-10.0.0-x86_64-linux-gnu-ubuntu-18.04.tar.xz -# -o llvm.tar.xz \
 && mkdir clang-10/ \
 && tar xf ./llvm.tar.xz -C clang-10 --strip-components=1
# Ninja stage: clone the sources and bootstrap the build with clang++
FROM ubuntu:18.04 as ninja-stage
RUN apt-get update \
 && apt-get install -y git clang python3
RUN mkdir -p /tmp/ && cd /tmp/ \
 && git clone https://github.com/ninja-build/ninja.git \
 && cd /tmp/ninja/ \
 && CXX=clang++ ./configure.py --bootstrap
# Main stage: copy only the built artifacts from the previous stages
FROM ubuntu:18.04
COPY --from=clang-stage /tmp/clang-10/bin/ /usr/bin/
COPY --from=clang-stage /tmp/clang-10/lib/ /usr/lib/
COPY --from=ninja-stage /tmp/ninja/ninja /usr/bin/
CMD ["/bin/bash"]

I called the second stage ninja-stage. In it, we install the dependencies required to compile Ninja, which we clone from GitHub. Again, we work in the /tmp directory.

The last stage is the main stage, where we use ubuntu:18.04 as the base image. If you need another Ubuntu version, see the tags of the official ubuntu image on Docker Hub.

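In the main stage, we copy the Clang binaries from clang-stage and the Ninja binary from ninja-stage with COPY --from. The build tools installed in the earlier stages (curl, xz-utils, git, python3) are left behind and never reach the final image. As you can see, the syntax is straightforward, with no complex structure.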
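Stage names are only labels. As a minimal sketch (the directory and file name below are hypothetical and not part of the Dockerfile above), a stage without a name can also be referenced by its zero-based index:

```dockerfile
# Stage 0: unnamed, so later stages refer to it by index.
FROM ubuntu:18.04
# Produce a hypothetical artifact to stand in for real build output.
RUN mkdir -p /tmp/out && echo "built in stage 0" > /tmp/out/artifact.txt

# Stage 1: the final image, which receives only the copied artifact.
FROM ubuntu:18.04
COPY --from=0 /tmp/out/artifact.txt /opt/artifact.txt
CMD ["cat", "/opt/artifact.txt"]
```

Named stages are usually the better choice in a longer Dockerfile, because an index silently points at a different stage if the stages are reordered.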

Now we can build our image as shown below. Note that it will take 1–2 minutes.

$ docker build -t my_multi .
Sending build context to Docker daemon 2.56kB
Step 1/11 : FROM ubuntu:18.04 as clang-stage
---> c3c304cb4f22
Step 2/11 : RUN apt-get update && apt-get install -y xz-utils curl
---> Using cache
---> 2cf8b3c79238
Step 3/11 : RUN cd /tmp/ && curl -SL https://github.com/llvm/llvm-project/releases/download/llvmorg-10.0.0/clang+llvm-10.0.0-x86_64-linux-gnu-ubuntu-18.04.tar.xz -# -o llvm.tar.xz && mkdir clang-10/ && tar xf ./llvm.tar.xz -C clang-10 --strip-components=1
---> Using cache
---> dfe53fd3c0fc
Step 4/11 : FROM ubuntu:18.04 as ninja-stage
---> c3c304cb4f22
Step 5/11 : RUN apt-get update && apt-get install -y git clang python3
---> Using cache
---> 9986530ae466
Step 6/11 : RUN mkdir -p /tmp/ && cd /tmp/ && git clone https://github.com/ninja-build/ninja.git && cd /tmp/ninja/ && CXX=clang++ ./configure.py --bootstrap
---> Using cache
---> 3744e8e66dac
Step 7/11 : FROM ubuntu:18.04
---> c3c304cb4f22
Step 8/11 : COPY --from=clang-stage /tmp/clang-10/bin/ /usr/bin/
---> e34fb4fe7805
Step 9/11 : COPY --from=clang-stage /tmp/clang-10/lib/ /usr/lib/
---> 486da7f7baa6
Step 10/11 : COPY --from=ninja-stage /tmp/ninja/ninja /usr/bin/
---> ab80675661d2
Step 11/11 : CMD ["/bin/bash"]
---> Running in e56dd9e7532d
Removing intermediate container e56dd9e7532d
---> 6390bf0db8e6
Successfully built 6390bf0db8e6
Successfully tagged my_multi:latest

We can easily verify that the tools are installed by starting a container and checking their versions.

debian@debian:~/codebase/medium$ docker run --rm -it my_multi bash
root@d608578a4c51:/# clang --version
clang version 10.0.0
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
root@d608578a4c51:/# ninja --version
1.10.0.git
root@d608578a4c51:/#

Since this post is about optimization, let’s see how much we managed to shrink our image. For comparison, I also built an image without a multi-stage build, installing Clang 10 and Ninja in a single stage. I named it clang_ninja.

$ docker images
clang_ninja latest e74403b3fe79 5 seconds ago 2.96GB
multi latest 6390bf0db8e6 11 minutes ago 2.28GB

As you can see, the optimized image is 0.68 GB (680 MB) smaller than the non-optimized one (2.96 GB versus 2.28 GB). Still, an image of well over 1 GB is really big. The difference can be much larger for other applications. For instance, if your application consists only of a static binary, the savings can be dramatic. You could certainly optimize the Dockerfile further, but that is not the focus of this post. So I wish you a happy Sunday!
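To illustrate the static-binary case mentioned above, here is a rough sketch rather than a tested recipe. It assumes a hypothetical hello.c in the build context; the stage name build-stage and the paths are also made up for this example. Because a statically linked binary needs no shared libraries, the final stage can start from the empty scratch image:

```dockerfile
# Build stage: compile a statically linked binary.
# hello.c is a hypothetical source file in the build context.
FROM ubuntu:18.04 as build-stage
RUN apt-get update && apt-get install -y gcc libc6-dev
COPY hello.c /tmp/hello.c
RUN gcc -static -O2 -o /tmp/hello /tmp/hello.c

# Final stage: an empty base image that contains only the binary.
FROM scratch
COPY --from=build-stage /tmp/hello /hello
CMD ["/hello"]
```

In such a setup the final image is roughly the size of the binary itself, since neither the compiler nor the Ubuntu base layer is carried over.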

Resources
* https://docs.docker.com/develop/develop-images/multistage-build/
* https://www.docker.com/company/newsroom/media-resources
* https://ninja-build.org/
