Undocking - Containers without Docker
Faster, smaller builds
In our last post, we managed to use Buildah inside of Podman to create new images, but it’s a slow process. We want to be able to iterate on images quickly, so we need to make our build process fast.
How slow is slow?
My conclusion that our build is slow is based on how it felt when I was iterating on builds while writing the last blog post. But something feeling like it’s slow doesn’t mean it really is slow, so let’s measure just how slow it is.1
$ time podman run --device /dev/fuse \
-v ~/.local/share/containers:/var/lib/containers \
-v ./buildah.sh:/buildah.sh buildah/stable:v1.14.8 \
buildah unshare /buildah.sh
Fedora 32 openh264 (From Cisco) - x86_64 4.8 kB/s | 5.1 kB 00:01
Fedora Modular 32 - x86_64 2.5 MB/s | 4.9 MB 00:02
...
Storing signatures
2ee36c681e3c767fa62107b423a16feff3e94d953b7ee6b2a08cd0fd954d32dd
real 1m48.225s
user 0m0.258s
sys 0m0.509s
That took 108 seconds on my (reasonably powerful) machine. That’s pretty slow for something so simple, but it’s also only one data point. It could have been a fluke. Let’s run it a few more times and see what we get.
$ time podman run --device /dev/fuse \
-v ~/.local/share/containers:/var/lib/containers \
-v ./buildah.sh:/buildah.sh buildah/stable:v1.14.8 \
buildah unshare /buildah.sh
...
real 1m50.388s
user 0m0.254s
sys 0m0.429s
$ time podman run --device /dev/fuse \
-v ~/.local/share/containers:/var/lib/containers \
-v ./buildah.sh:/buildah.sh buildah/stable:v1.14.8 \
buildah unshare /buildah.sh
...
real 1m54.995s
user 0m0.215s
sys 0m0.461s
It looks like we’re pretty consistently somewhere around 110 seconds per build.
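Rather than rerunning the command by hand, the repeated measurement could be scripted. The following is a minimal sketch, not the method used for the numbers above: `average_runs` is a hypothetical helper, and it times each run with `date +%s`, so it only resolves whole seconds. That is coarse, but adequate for builds that take a minute or more.

```shell
#!/bin/sh
# average_runs COUNT COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND COUNT times, printing the wall-clock
# time of each run and the mean, in whole seconds.
# The body runs in a subshell ( ... ) so its variables do not leak out.
average_runs() (
    runs=$1
    shift
    total=0
    i=1
    while [ "$i" -le "$runs" ]; do
        start=$(date +%s)
        "$@" > /dev/null 2>&1
        end=$(date +%s)
        elapsed=$((end - start))
        echo "run $i: ${elapsed}s"
        total=$((total + elapsed))
        i=$((i + 1))
    done
    # Integer division: the mean is rounded down to a whole second.
    echo "mean: $((total / runs))s"
)

# Stand-in command so the sketch runs anywhere; replace "sleep 1" with the
# podman invocation shown above to measure the real build.
average_runs 3 sleep 1
```

The command's own output is discarded, so only the timings are printed.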
Once again, Dan Walsh solves (some of) our problems for us
When I was first deciding whether to use a Containerfile or a build script, Dan Walsh had written an article that heavily influenced my decision. This article, which is all about making Buildah build faster, focused on improving the performance of package installation via DNF.
Let’s start by mounting the DNF cache so we can get a speed improvement on repeat builds. Because we’re specifying the install root, DNF is going to use the /var/cache/dnf folder relative to the install root for its cache by default. We want it to use the one we mounted, so we need to specify the cache directory via options, too. This means we need to make a small change to our build script.
--- buildah.sh.orig 2020-07-09 21:13:13.332341700 -0700
+++ buildah.sh 2020-07-09 21:14:30.582341700 -0700
@@ -3,7 +3,8 @@
container=$(buildah from scratch) # 1
mount=$(buildah mount ${container}) # 2
-dnf --installroot ${mount} --releasever 32 install --assumeyes nginx # 3
+dnf --installroot ${mount} --releasever 32 \
+ --setopt cachedir=/var/cache/dnf install --assumeyes nginx # 3
buildah config --entrypoint '["nginx"]' ${container} # 4
buildah config --cmd '-g "daemon off;"' ${container} # 5
$ mkdir -p ~/.cache/dnf
$ time podman run --device /dev/fuse -v ~/.cache/dnf:/var/cache/dnf \
-v ~/.local/share/containers:/var/lib/containers \
-v ./buildah.sh:/buildah.sh buildah/stable:v1.14.8 \
buildah unshare /buildah.sh
...
real 2m22.036s
user 0m0.340s
sys 0m0.525s
$ time podman run --device /dev/fuse -v ~/.cache/dnf:/var/cache/dnf \
-v ~/.local/share/containers:/var/lib/containers \
-v ./buildah.sh:/buildah.sh buildah/stable:v1.14.8 \
buildah unshare /buildah.sh
...
real 1m2.463s
user 0m0.357s
sys 0m0.436s
It looks like our first run was almost 30 seconds slower.2 But even with that degradation, we still shaved off almost 50 seconds from our original build time by using a persistent cache, once it was populated.
Without some mechanism to keep the cache up-to-date, it will eventually get stale. Once it does, the next build we run will take the full time again, as it needs to rebuild the cache. If we build frequently enough, whether because we’re iterating or because we’re running as part of a frequent CI process, this will be an occasional penalty that is largely outweighed by the gain in performance for subsequent builds.
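One way to know ahead of time whether the next build is likely to pay that penalty is to check how recently the cache was written. The sketch below is an illustration under assumptions, not something DNF provides: `cache_is_stale` is a hypothetical helper, the 48-hour threshold is an arbitrary choice, and it relies on the `-mmin` test of `find`, which GNU and BSD `find` support but POSIX does not require.

```shell
#!/bin/sh
# cache_is_stale DIR MINUTES
# Hypothetical helper: succeeds (exit 0) when no file under DIR has been
# modified within the last MINUTES minutes, or when DIR does not exist.
cache_is_stale() {
    dir=$1
    minutes=$2
    [ -d "$dir" ] || return 0
    # -mmin -N matches files modified less than N minutes ago.
    recent=$(find "$dir" -type f -mmin "-$minutes" | head -n 1)
    [ -z "$recent" ]
}

# Same host path that the builds above mount at /var/cache/dnf.
# 2880 minutes = 48 hours (an arbitrary threshold for illustration).
if cache_is_stale "$HOME/.cache/dnf" 2880; then
    echo "DNF cache looks stale; expect a slower build"
else
    echo "DNF cache was written recently"
fi
```

File modification time is only a rough proxy. DNF decides for itself when repository metadata has expired, so this check can warn about a slow build but cannot prevent one.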
Shrinking the image
If we take a look at our image, we’ll see that it’s almost half a gigabyte in size.
$ podman images
REPOSITORY TAG IMAGE ID CREATED SIZE
localhost/buildah/nginx latest cb595df4e499 30 seconds ago 473 MB
quay.io/buildah/stable v1.14.8 0e58e549e48c 2 months ago 281 MB
That’s pretty big to move around between repositories, takes up a fair bit of disk space, and takes a while to install. If we can make the image smaller, we can possibly also make the install faster.
We can see where we’re starting from by looking at the output from our DNF command.
Install 179 Packages
Total download size: 94 M
Installed size: 489 M
That’s 179 packages, requiring a 94 M download that unpacks to 489 M. That’s a pretty sizeable installation. Looking through what we’re installing, we see some output that will point us to our first step.
Installing weak dependencies:
Weak dependencies are packages that are not required to function, but are recommended. They either provide extra behavior, or, more commonly, provide tools that may be beneficial. In our image, we probably don’t need or want those tools, at least, not without explicitly specifying them, so let’s figure out how to avoid installing weak dependencies.
From the DNF documentation, we find that there’s a configuration option for controlling the installation of weak dependencies.
install_weak_deps
boolean
When this option is set to True and a new package is about to be installed, all packages linked by weak dependency relation (Recommends or Supplements flags) with this package will be pulled into the transaction. Default is True.
We also find that there’s a way to specify configuration options on the command line.
--setopt=<option>=<value>
Override a configuration option from the configuration file. To override configuration options for repositories, use repoid.option for the <option>. Values for configuration options like excludepkgs, includepkgs, installonlypkgs and tsflags are appended to the original value, they do not override it. However, specifying an empty value (e.g. --setopt=tsflags=) will clear the option.
Combining those two pieces of information, we arrive at another change to our build script.
--- buildah.sh.orig 2020-07-09 22:00:09.542341700 -0700
+++ buildah.sh 2020-07-09 22:00:13.872341700 -0700
@@ -4,7 +4,8 @@ container=$(buildah from scratch)
mount=$(buildah mount ${container}) # 2
dnf --installroot ${mount} --releasever 32 \
- --setopt cachedir=/var/cache/dnf install --assumeyes nginx # 3
+ --setopt cachedir=/var/cache/dnf --setopt install_weak_deps=False \
+ install --assumeyes nginx # 3
buildah config --entrypoint '["nginx"]' ${container} # 4
buildah config --cmd '-g "daemon off;"' ${container} # 5
We can test this with our tried-and-true invocation line.
$ time podman run --device /dev/fuse -v ~/.cache/dnf:/var/cache/dnf \
-v ~/.local/share/containers:/var/lib/containers \
-v ./buildah.sh:/buildah.sh buildah/stable:v1.14.8 \
buildah unshare /buildah.sh
...
Install 107 Packages
Total download size: 57 M
Installed size: 347 M
...
real 0m40.247s
user 0m0.371s
sys 0m0.434s
$ podman images
REPOSITORY TAG IMAGE ID CREATED SIZE
localhost/buildah/nginx latest 2d43248f112d 2 minutes ago 337 MB
Skipping weak dependencies has saved us 72 packages, 37 M of download, and 142 M of unpacked install size. It also shrank our final image size by 136 MB. And best of all, it shaved another 20 seconds off of our build time, making our build take less than a minute.
While reading through the documentation, we also see
--nodocs
Do not install documentation. Sets the rpm flag ‘RPMTRANS_FLAG_NODOCS’.
Documentation isn’t likely to be a huge part of the installation, but it’s worth a shot to see if it has an impact.
--- buildah.sh.orig 2020-07-09 22:14:02.812341700 -0700
+++ buildah.sh 2020-07-09 22:13:52.572341700 -0700
@@ -3,7 +3,7 @@
container=$(buildah from scratch) # 1
mount=$(buildah mount ${container}) # 2
-dnf --installroot ${mount} --releasever 32 \
+dnf --installroot ${mount} --releasever 32 --nodocs \
--setopt cachedir=/var/cache/dnf --setopt install_weak_deps=False \
install --assumeyes nginx # 3
$ time podman run --device /dev/fuse -v ~/.cache/dnf:/var/cache/dnf \
-v ~/.local/share/containers:/var/lib/containers \
-v ./buildah.sh:/buildah.sh buildah/stable:v1.14.8 \
buildah unshare /buildah.sh
...
Install 107 Packages
Total download size: 57 M
Installed size: 347 M
...
real 0m41.940s
user 0m0.308s
sys 0m0.491s
$ podman images
REPOSITORY TAG IMAGE ID CREATED SIZE
localhost/buildah/nginx latest 88e26a0b7647 About a minute ago 337 MB
quay.io/buildah/stable v1.14.8 0e58e549e48c 2 months ago 281 MB
Well, drat. No real change from removing documentation.
We still have one more large, DNF-related trick up our sleeves for making our image smaller. When an RPM specifies its dependencies, it doesn’t specify exact packages, but something more akin to labels. Other RPMs then add metadata indicating which labels they satisfy. This creates a set of potential alternatives. Frequently, these alternatives are smaller, but lack functionality or features, or have other reasons they aren’t the package selected by default. Some of them are good candidates for us, since our focus is on specific functional requirements.
We could write a tool to investigate the alternatives and list them, but for now, we’re just going to use the DNF --exclude option to go through each installed dependency one by one. Excluding a package will either cause the installation to fail dependency solving, in which case there is no appropriate alternative, or cause an alternative to be selected. By comparing the list of selected packages without the --exclude option to the one with the option, we can see what the alternative was, and what impact it will have. Doing so resulted in 3 packages I think are worth replacing.
coreutils -> coreutils-single: The coreutils package provides many common utilities. It unfortunately depends on coreutils-common which installs a bunch of unneeded documentation that takes up space. Fortunately, we can replace it with coreutils-single which provides a single multi-call binary that provides all of the required functionality.
- Packages: -1
- Total Size: -3 M
- Installed Size: -15 M
fedora-release -> fedora-release-container: The fedora-release package provides basic information about a Fedora release. It comes in many variants, and we want to ensure we’re using the container variant, which is not the default selection.
- Packages: 0
- Total Size: 0 M
- Installed Size: 0 M
glibc-all-langpacks -> glibc-langpack-en: By default, glibc will pull in all language packs. While this is useful for generic installs, I work in an English3 speaking environment where minimum size is desired. All of the language packs take up about half of the image’s size, so we explicitly require the English language pack.
- Packages: 0
- Total Size: -18 M
- Installed Size: -205 M
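The comparison step described above, finding which packages appear only when a dependency is excluded, can be sketched with `comm`. This is a rough illustration under assumptions: `only_in_second` is a hypothetical helper, and the two package lists are made-up stand-ins for lists you would copy out of DNF’s transaction summary with and without --exclude.

```shell
#!/bin/sh
# only_in_second FILE_A FILE_B
# Hypothetical helper: prints lines that appear in FILE_B but not in FILE_A.
# Each file holds one package name per line, in any order.
only_in_second() {
    a=$(mktemp)
    b=$(mktemp)
    # comm requires sorted input, so sort copies of both lists first.
    sort -u "$1" > "$a"
    sort -u "$2" > "$b"
    # -13 suppresses columns 1 and 3, leaving lines unique to the second file.
    comm -13 "$a" "$b"
    rm -f "$a" "$b"
}

# Made-up sample data: a baseline install and one with coreutils excluded.
workdir=$(mktemp -d)
printf '%s\n' nginx coreutils coreutils-common glibc > "$workdir/baseline.txt"
printf '%s\n' nginx coreutils-single glibc > "$workdir/excluded.txt"

echo "Selected instead:"
only_in_second "$workdir/baseline.txt" "$workdir/excluded.txt"
echo "No longer installed:"
only_in_second "$workdir/excluded.txt" "$workdir/baseline.txt"
rm -rf "$workdir"
```

Running the comparison in both directions shows the alternative that was picked as well as any packages that dropped out with the excluded one.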
Once the alternatives have been selected, they should be specified as packages for installation. This forces them to be installed, which also causes them to be selected to satisfy any requirements they can. Applying these three alternatives gives us
--- buildah.sh.orig 2020-07-10 06:04:08.299843100 -0700
+++ buildah.sh 2020-07-10 06:04:44.079843100 -0700
@@ -5,7 +5,8 @@ mount=$(buildah mount ${container})
dnf --installroot ${mount} --releasever 32 --nodocs \
--setopt cachedir=/var/cache/dnf --setopt install_weak_deps=False \
- install --assumeyes nginx # 3
+ install --assumeyes nginx \
+ coreutils-single fedora-release-container glibc-langpack-en # 3
buildah config --entrypoint '["nginx"]' ${container} # 4
buildah config --cmd '-g "daemon off;"' ${container} # 5
$ time podman run --device /dev/fuse -v ~/.cache/dnf:/var/cache/dnf \
-v ~/.local/share/containers:/var/lib/containers \
-v ./buildah.sh:/buildah.sh buildah/stable:v1.14.8 \
buildah unshare /buildah.sh
...
Install 105 Packages
Total download size: 36 M
Installed size: 125 M
...
real 0m35.403s
user 0m0.173s
sys 0m0.498s
$ podman images
REPOSITORY TAG IMAGE ID CREATED SIZE
localhost/buildah/nginx latest a8ca74d177cf 19 seconds ago 115 MB
quay.io/buildah/stable v1.14.8 0e58e549e48c 2 months ago 281 MB
Once again, we see some significant savings. We’ve shaved off 2 packages, cut down the download by 21 M, cut the installed size by 222 M, reduced the final image size by 222 MB, and built 5 seconds faster.
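The savings quoted above are plain differences between the previous run and this one. As a quick check, here is a small sketch that recomputes them from the figures in the two transcripts. `saving` is a hypothetical helper, and the percentage uses integer division, so it rounds down.

```shell
#!/bin/sh
# saving LABEL BEFORE AFTER UNIT
# Hypothetical helper: prints the absolute and percentage reduction.
saving() {
    diff=$(($2 - $3))
    pct=$((diff * 100 / $2))    # integer division, rounds down
    echo "$1: $2 -> $3 $4 (saved $diff $4, ${pct}%)"
}

# Figures taken from the previous build's output and this one's.
saving "packages"       107 105 pkgs
saving "download size"   57  36 M
saving "installed size" 347 125 M
saving "image size"     337 115 MB
```

The percentages make the point more sharply than the raw numbers: the image is roughly a third of its previous size.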
Caching downloaded packages
At this point, our build is likely dominated by the time to download the packages needed for installation rather than the time to actually install them. I’m not sure what we can do about the installation time, but I know what we can do about the download time. Downloaded packages are not being cached because DNF cleans up packages on successful installation. Fortunately, there’s a configuration option to control this, so let’s run another build (2 actually, since the first won’t be able to benefit from the cached packages) where we keep the downloaded packages in the cache.
--- buildah.sh.orig 2020-07-10 06:13:21.849843100 -0700
+++ buildah.sh 2020-07-10 06:13:35.109843100 -0700
@@ -4,6 +4,7 @@ container=$(buildah from scratch)
mount=$(buildah mount ${container}) # 2
dnf --installroot ${mount} --releasever 32 --nodocs \
+ --setopt keepcache=True \
--setopt cachedir=/var/cache/dnf --setopt install_weak_deps=False \
install --assumeyes nginx \
coreutils-single fedora-release-container glibc-langpack-en # 3
$ time podman run --device /dev/fuse -v ~/.cache/dnf:/var/cache/dnf \
-v ~/.local/share/containers:/var/lib/containers \
-v ./buildah.sh:/buildah.sh buildah/stable:v1.14.8 \
buildah unshare /buildah.sh
...
real 0m23.935s
user 0m0.129s
sys 0m0.527s
By keeping the packages downloaded, we’ve saved another 11 seconds, and are now below a 30 second build time. While definitely not instantaneous, this is fast enough to not feel overly burdensome. It’s definitely something we can work with during day-to-day development.
Final results
By shrinking the size of our installation and using aggressive caching, we managed to reduce our build time in the expected case from ≈110 seconds to ≈25 seconds. That’s a major improvement. Now that our iteration times aren’t over a minute, we’re ready to look at migrating our build to Kubernetes in the next post.
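For reference, applying every diff in this post to the original build script would give something like the following. It is assembled from the diffs above rather than copied from a file, so treat it as a sketch. The `dnf_opts` variable, the `build_image` function, and the `command -v buildah` guard are additions for illustration, so that the script does nothing harmful when run outside the Buildah container.

```shell
#!/bin/bash -ue
# Consolidated build script, assembled from the diffs in this post.

# DNF options accumulated over the post, gathered in one place
# (the dnf_opts name is an illustrative addition).
dnf_opts="--nodocs --setopt keepcache=True \
--setopt cachedir=/var/cache/dnf --setopt install_weak_deps=False"

build_image() {
    container=$(buildah from scratch)                            # 1
    mount=$(buildah mount "${container}")                        # 2
    # ${dnf_opts} is deliberately unquoted so it splits into separate options.
    dnf --installroot "${mount}" --releasever 32 ${dnf_opts} \
        install --assumeyes nginx \
        coreutils-single fedora-release-container glibc-langpack-en  # 3
    buildah config --entrypoint '["nginx"]' "${container}"       # 4
    buildah config --cmd '-g "daemon off;"' "${container}"       # 5
    buildah config --port 80 "${container}"                      # 6
    buildah commit "${container}" buildah/nginx:latest           # 7
}

# Guard added for illustration: only build where buildah is available.
if command -v buildah > /dev/null 2>&1; then
    build_image
else
    echo "buildah not found; run this inside the buildah container" >&2
fi
```

It is meant to be run the same way as before, through `buildah unshare` inside the container with the DNF cache and container storage mounted.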
1. We’re reusing our build script from our last post, which, so you don’t need to go back to find it, I’m replicating here.
#!/bin/bash -ue
container=$(buildah from scratch) # 1
mount=$(buildah mount ${container}) # 2
dnf --installroot ${mount} --releasever 32 install --assumeyes nginx # 3
buildah config --entrypoint '["nginx"]' ${container} # 4
buildah config --cmd '-g "daemon off;"' ${container} # 5
buildah config --port 80 ${container} # 6
buildah commit ${container} buildah/nginx:latest # 7
2. It’s possible this degradation is partly caused by my having put some streaming video on in the background, slowing down the rate the build can download the repository metadata and packages.
3. If English isn’t your preferred language, there are language packs available for almost 200 languages. Any specific language pack should have similar benefits. The goal is to avoid installing language packs for languages we don’t need.