Undocking - Containers without Docker
Creating images locally
I have spent a few weeks figuring out how to change my container development environment from one that uses Docker to one that (almost) completely avoids it.1
Why avoid Docker
It all started with Docker splitting into free and paid variants. While this made sense to me, it started me worrying about the future of Docker for the free and open community. That worry grew as the project split further and was renamed.
It grew again when Docker Desktop began asking for ratings and whether I would recommend it. These are the sorts of intrusive, annoying, metrics-for-money behaviors we’ve come to loathe in the app space on our phones, and I was not happy to see them come to the developer tools I use.
I finally reached my breaking point when downloads of Docker Desktop were placed behind an email wall.2 It wasn’t very difficult to circumvent the wall, but it was a meaningless nuisance seemingly driven by Docker starting to care more about the business than about the developer.
On top of that, Dan Walsh’s points about Docker being a bloated daemon are absolutely accurate. Docker is a massive, invasive install, and comes fraught with potential security implications.
Requirements
For the migration to count as a success, I wanted the result to closely resemble how I intend to deploy containerized applications to production. I strongly believe that local development should closely mirror production (but not require resources outside the local machine).
I also didn’t want to install VM software or manage my own VMs just to have somewhere to run programs during development. I prefer my developer environments to be native experiences.
This leads to a complete set of requirements:
Must
- ✅ Cross platform (Windows, macOS, and Linux)
- ✅ Not require installing Docker into the native OS
- ✅ Use native VM hypervisor if VMs are necessary
  - ✅ Hyper-V and/or WSL 2 on Windows
  - ✅ hypervisor.framework on macOS
  - ❓ No VM in Linux
- ✅ Provide a means to run containers
- ✅ Provide a means for creating container images
- ✅ Be easy and consistent for developers to set up
- ✅ Be (reasonably) fast
Should
- 🚧 Not use Docker at all
Dan Walsh is an inspiration to us all
Dan Walsh of RedHat has been railing against “big, fat daemons” for a while. This has led to the creation of several good alternatives to Docker for various pieces of the pipeline.
One of the alternatives is the CRI-O runtime for Kubernetes, which makes Kubernetes the first piece of our new container stack. While not fantastic for one-off things, Kubernetes has become the go-to standard for deploying container-based applications. Kubernetes also supports hypervisor.framework via the hyperkit driver, Hyper-V via the hyperv driver, and native Linux via the (experimental) podman driver. Additionally, with Minikube, Kubernetes is easy to install locally for development. This meets several of our requirements and gets us well on our way.
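To make that concrete (these are my illustrations, not commands from the original setup), picking the hypervisor is a single flag when starting Minikube:
$ minikube start --driver=hyperkit   # macOS, uses hypervisor.framework via hyperkit
$ minikube start --driver=hyperv     # Windows, uses Hyper-V
$ minikube start --driver=podman     # Linux, experimental, no VM required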
Unfortunately, Kubernetes does not provide a way to create images, so we need another tool. Dan Walsh helps us out again, with Buildah.
Getting started with Buildah
Buildah is only available for Linux, so we can’t use a local build process on all of our target platforms. Instead, we’ll need to run our build process inside of Kubernetes. But for now, we need to get familiar with Buildah and figure out how to make it build our images as quickly as we can.
Since the end goal is to run Buildah in Kubernetes, we don’t want to just run it locally, as that won’t replicate anything like the final environment. Instead, we want to run Buildah in a container. Fortunately, in addition to Buildah for building containers, there’s also Podman for running them.
Podman, like Buildah, is designed around Linux containers (at least for now). It requires a working Linux OS, whether in a VM, natively, or on a remote machine. To get started, I’m going to use WSL 2 (with Ubuntu 20.04, but it should work with any distribution, as well as with Linux as the host OS or in a VM). After installing Podman, we’re ready to run our first Buildah container.
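If Podman isn’t installed yet, this is roughly what the install looked like on Ubuntu 20.04 at the time of writing. The packages came from the Kubic repository, so treat this as a sketch and check the current Podman install documentation, since the repository location may have changed:
$ . /etc/os-release
$ echo "deb https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable/xUbuntu_${VERSION_ID}/ /" | sudo tee /etc/apt/sources.list.d/devel:kubic:libcontainers:stable.list
$ curl -L "https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable/xUbuntu_${VERSION_ID}/Release.key" | sudo apt-key add -
$ sudo apt-get update
$ sudo apt-get install -y podman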
$ podman run quay.io/buildah/stable:v1.14.8 echo "Hello world!"
Trying to pull quay.io/buildah/stable:v1.14.8...
Getting image source signatures
Copying blob d15499bd65d4 done
Copying blob 5796af10c83e done
Copying blob b70dbda2c312 done
Copying blob 7c43a36ba5ed done
Copying blob f40340f463b1 done
Copying config 0e58e549e4 done
Writing manifest to image destination
Storing signatures
Hello world!
Using a bash script to build
Buildah supports building from Containerfiles (Dockerfiles) using the bud command, but it also supports running individual commands to modify the image it’s building. After doing some reading, I chose to use scripts to build images. It seems a lot less complicated, and more flexible. The only downsides seem to be the loss of portability back to Docker for builds and the loss of caching and layers. Portability back to Docker is not something we’re interested in, as we’re moving away from Docker. The loss of caching and layers is unfortunate, but is being worked on.
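For comparison only (my sketch, not the route we take here), a Containerfile equivalent of the image we’re about to build would look roughly like this; note that it starts from a Fedora base image rather than from scratch:
FROM registry.fedoraproject.org/fedora:32
RUN dnf install --assumeyes nginx && dnf clean all
ENTRYPOINT ["nginx"]
CMD ["-g", "daemon off;"]
EXPOSE 80
It would then be built with buildah bud --tag buildah/nginx:latest . instead of the script below.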
To test out our new workflow, we’re going to build a custom Nginx image. This will let us exercise the primary actions performed when building an image: creating the image, installing software, and configuring that software. We get started with a very simple script, which we save as buildah.sh:
#!/bin/bash -ue
container=$(buildah from scratch) # 1
mount=$(buildah mount ${container}) # 2
dnf --installroot ${mount} install --assumeyes nginx # 3
buildah config --entrypoint '["nginx"]' ${container} # 4
buildah config --cmd '-g "daemon off;"' ${container} # 5
buildah config --port 80 ${container} # 6
buildah commit ${container} buildah/nginx:latest # 7
First we create a container to serve as the basis for our new image. We make it from scratch to start with an empty filesystem, which allows us to make a minimally-sized image (1). Then we mount the container’s file system into our own, which will allow us to easily run commands on it (2). With the container’s file system mounted, we can run dnf install to install Nginx (3). We tell DNF to install to the container’s file system by setting --installroot to the recorded mount point. Then we configure the entrypoint and command for the image (4 and 5). We expose port 80 so it can be forwarded easily when we run the resulting image (6). Finally, we commit the container to turn it into an image (7).
To run our build, we need to mount our build script3 into the Buildah container and run it. Using Podman, we run the build.
$ podman run -v ./buildah.sh:/buildah.sh buildah/stable:v1.14.8 /buildah.sh
mount /var/lib/containers/storage/overlay:/var/lib/containers/storage/overlay, flags: 0x1000: operation not permitted
That “operation not permitted” error came from trying to mount the container’s file system. A quick search will find us a blog post on using Buildah with Podman which tells us
Because both the container and the container within the container will be using fuse-overlayfs, they won’t be happy trying to mount their respective directories over each other. So, the first step is to create a directory for the container within the container to use, and I’ve named it /var/lib/mycontainer"
…
mounts the host’s mycontainer to the container’s containers directory
So let’s make that directory, but we’ll put it in ~/.local/share/containers,4 and then mount it into our Buildah container as /var/lib/containers.
$ mkdir -p ~/.local/share/containers
$ podman run -v ~/.local/share/containers:/var/lib/containers \
-v ./buildah.sh:/buildah.sh buildah/stable:v1.14.8 /buildah.sh
mount /var/lib/containers/storage/overlay:/var/lib/containers/storage/overlay, flags: 0x1000: operation not permitted
Huh. We get the same error. If we reread our reference blog posts carefully, we find something that seems related.
One thing to remember is in rootless mode all commands have to be done in the user namespace of the user. You can enter the user namespace using the buildah unshare command. If you don’t do this, the buildah mount, command will fail. After entering the user namespace the user is allowed access to the containers root file system as a non-root user. To execute the script as a non root user, you can execute buildah unshare build_buildah_upstream.sh.
While the name of the user account we’re running as inside the container is root, we’re sharing a kernel with the host system, and the root inside the container is not the root account for the kernel. So, from the kernel’s perspective, we are running rootless.
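If you’re curious, you can see this mapping directly (a quick check of my own, not part of the original walkthrough). The exact IDs will differ on your machine, but container UID 0 maps to our unprivileged host user rather than to the host’s real root:
$ podman run --rm quay.io/buildah/stable:v1.14.8 cat /proc/self/uid_map
         0       1000          1
         1     100000      65536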
Let’s try using buildah unshare to run our script and see if that helps.
$ mkdir -p ~/.local/share/containers
$ podman run -v ~/.local/share/containers:/var/lib/containers \
-v ./buildah.sh:/buildah.sh buildah/stable:v1.14.8 \
buildah unshare /buildah.sh
level=error msg="error unmounting /var/lib/containers/storage/overlay/87a46046716b780de38e3b8346a6cbc1a15bb2b0f315738921e0dfdeb16aed94/merged: invalid argument"
error mounting "working-container-5" container "working-container-5": error mounting build container "29f3587b5f53fdf0d996b9bedf4b962bc96349cc5774972d156816db3e990ed1": error creating overlay mount to /var/lib/containers/storage/overlay/87a46046716b780de38e3b8346a6cbc1a15bb2b0f315738921e0dfdeb16aed94/merged: using mount program /usr/bin/fuse-overlayfs: fuse: device not found, try 'modprobe fuse' first
fuse-overlayfs: cannot mount: No such file or directory
: exit status 1
level=error msg="exit status 1"
level=error msg="exit status 1"
Progress! We have a new error. A bit of searching leads to another helpful blog post
Note that using Fuse requires people running the Buildah container to provide the /dev/fuse device.
Passing the /dev/fuse device along to our Buildah container gives us
$ mkdir -p ~/.local/share/containers
$ podman run --device /dev/fuse \
-v ~/.local/share/containers:/var/lib/containers \
-v ./buildah.sh:/buildah.sh buildah/stable:v1.14.8 \
buildah unshare /buildah.sh
Unable to detect release version (use '--releasever' to specify release version)
Fedora $releasever openh264 (From Cisco) - x86_ 27 kB/s | 63 kB 00:02
Errors during downloading metadata for repository 'fedora-cisco-openh264':
- Status code: 404 for https://mirrors.fedoraproject.org/metalink?repo=fedora-cisco-openh264-$releasever&arch=x86_64 (IP: 152.19.134.198)
Error: Failed to download metadata for repo 'fedora-cisco-openh264': Cannot prepare internal mirrorlist: Status code: 404 for https://mirrors.fedoraproject.org/metalink?repo=fedora-cisco-openh264-$releasever&arch=x86_64 (IP: 152.19.134.198)
Fedora Modular $releasever - x86_64 29 kB/s | 63 kB 00:02
Errors during downloading metadata for repository 'fedora-modular':
- Status code: 404 for https://mirrors.fedoraproject.org/metalink?repo=fedora-modular-$releasever&arch=x86_64&countme=1 (IP: 152.19.134.198)
- Status code: 404 for https://mirrors.fedoraproject.org/metalink?repo=fedora-modular-$releasever&arch=x86_64 (IP: 152.19.134.198)
Error: Failed to download metadata for repo 'fedora-modular': Cannot prepare internal mirrorlist: Status code: 404 for https://mirrors.fedoraproject.org/metalink?repo=fedora-modular-$releasever&arch=x86_64 (IP: 152.19.134.198)
level=error msg="exit status 1"
level=error msg="exit status 1"
More progress, with another new error! This error is from DNF. DNF needs to know which release of the OS (distribution) we’re using so it can find and use the correct RPM repository. The error message tells us it can’t do that automatically, which makes sense because the empty file system we’re installing into doesn’t have the files DNF reads to figure it out. The error message also tells us we can provide the release with --releasever. Fedora 32 is the latest Fedora release at the time of this writing, so let’s use that release version. We’ll need to modify the dnf line of our build script.
--- buildah.sh.orig 2020-07-08 20:30:47.777000000 -0700
+++ buildah.sh 2020-07-08 20:55:26.839106100 -0700
@@ -3,7 +3,7 @@
container=$(buildah from scratch) # 1
mount=$(buildah mount ${container}) # 2
-dnf --installroot ${mount} install --assumeyes nginx # 3
+dnf --installroot ${mount} --releasever 32 install --assumeyes nginx # 3
buildah config --entrypoint '["nginx"]' ${container} # 4
buildah config --cmd '-g "daemon off;"' ${container} # 5
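As an aside (my suggestion, not something the original script does), the release could be derived from the Buildah image itself instead of being hardcoded, assuming the image ships the usual Fedora dist macros. That way the script keeps working when the stable image moves to a newer Fedora:
dnf --installroot ${mount} --releasever "$(rpm --eval '%fedora')" install --assumeyes nginx # 3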
Rerunning our build yields
$ mkdir -p ~/.local/share/containers
$ podman run --device /dev/fuse \
-v ~/.local/share/containers:/var/lib/containers \
-v ./buildah.sh:/buildah.sh buildah/stable:v1.14.8 \
buildah unshare /buildah.sh
Fedora 32 openh264 (From Cisco) - x86_64 4.8 kB/s | 5.1 kB 00:01
Fedora Modular 32 - x86_64 1.9 MB/s | 4.9 MB 00:02
Fedora Modular 32 - x86_64 - Updates 2.2 MB/s | 3.5 MB 00:01
Fedora 32 - x86_64 - Updates 2.4 MB/s | 18 MB 00:07
Fedora 32 - x86_64 8.1 MB/s | 70 MB 00:08
Dependencies resolved.
======================================================================================
Package Arch Version Repo Size
======================================================================================
Installing:
nginx x86_64 1:1.18.0-1.fc32 updates 571 k
...
Complete!
Getting image source signatures
Copying blob sha256:2a871e5a9e7da9e807ac0b16daa365b5a6248dc8566abb25e0808eaa92674bfb
Copying config sha256:e24833ba33cab60c450b27b4a078c22a4a1b7a10d246276a56d615fbcabc0ab8
Writing manifest to image destination
Storing signatures
e24833ba33cab60c450b27b4a078c22a4a1b7a10d246276a56d615fbcabc0ab8
Excellent! We successfully built our image. Let’s try to run it to ensure everything works.
We run with -ti so we can see the output, with --rm to clean up the container when we’re done, and with --pull never because we just built the image locally, so it shouldn’t need pulling; we also don’t want to accidentally pull it from a remote registry and end up running who knows what. We use the --publish-all option to have the OS pick a free port and forward it to port 80. This makes the command more portable and avoids conflicts with any other servers that may be running.
$ podman run -ti --rm --pull never --publish-all buildah/nginx:latest
Error: unable to find a name and tag match for buildah/nginx:latest in repotags: no such image
Welp, that’s not what we wanted5. Podman can’t find the image we just built, which we know we built successfully. Let’s see if we can find our newly built image in the file system and see what’s going wrong.
$ ls ~/.local/share/containers/
cache storage
$ ls ~/.local/share/containers/storage/
cache mounts overlay-containers overlay-layers tmp vfs vfs-images
libpod overlay overlay-images storage.lock userns.lock vfs-containers vfs-layers
Interesting. It looks like we have storage layouts for both OverlayFS and VFS storage drivers. We know our Buildah container is using OverlayFS, so maybe Podman is using VFS. That might explain why Podman can’t see our new container. Let’s try using OverlayFS with Podman to find our image and see if that helps.
$ podman --storage-driver overlay images
Error: database storage graph driver "vfs" does not match our storage graph driver "overlay": database configuration mismatch
I think that means that Podman is not happy that we’re changing from the VFS driver we had been using to OverlayFS. To the internet to see what we can do about this! Searching for the error message turns up an issue on GitHub, where one of the comments says
The current solution is to delete ~/.share/containers to wipe the libpod DB and start fresh.
So let’s give that a shot.6
$ sudo rm -rf ~/.local/share/containers/
$ podman --storage-driver overlay images
Error: kernel does not support overlay fs: 'overlay' is not supported over extfs at "/home/zeffron/.local/share/containers/storage/overlay": backing file system is unsupported for this graph driver
That’s an interesting error. We know it can’t be entirely correct, because our Buildah container used OverlayFS just fine, and it was using the same kernel. Again, a quick search for the error finds us a useful GitHub issue. While this issue doesn’t cover our exact situation, and is resolved, it does shed some light on things. The issue repeatedly mentions fuse-overlayfs and fuse3, which Ubuntu 20.04 does not install by default (at least, not in WSL 2). So let’s install them and see if that fixes anything.
$ sudo apt install fuse-overlayfs
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
fuse3 libfuse3-3
...
Processing triggers for initramfs-tools (0.136ubuntu6.2) ...
$ podman images
REPOSITORY TAG IMAGE ID CREATED SIZE
$ ls ~/.local/share/containers/storage/
libpod mounts overlay overlay-containers overlay-images overlay-layers storage.lock tmp userns.lock
That looks to have switched the storage driver used by Podman to OverlayFS, which is what we want. Unfortunately, we have no images, because we wiped everything out. We need to re-run our build script and then try running the resulting container.
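Before rebuilding, an optional sanity check (my addition, not one of the original steps) is to ask Podman which graph driver it now reports; the output should mention overlay rather than vfs:
$ podman info | grep -i graphdriver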
$ podman run --device /dev/fuse \
-v ~/.local/share/containers:/var/lib/containers \
-v ./buildah.sh:/buildah.sh buildah/stable:v1.14.8 \
buildah unshare /buildah.sh
...
Complete!
Getting image source signatures
Copying blob sha256:a91104507a0d54fa8d76e73ab7a8168327475ded5928debf661bf7014a7e0606
Copying config sha256:c00695ae86d811e66f04238bc953532f559d2849a8a1e0ded5fcfd1aae21fafc
Writing manifest to image destination
Storing signatures
c00695ae86d811e66f04238bc953532f559d2849a8a1e0ded5fcfd1aae21fafc
$ podman run -ti --rm --pull never --publish-all buildah/nginx:latest
This time the container starts and stays running, with no output. To confirm it is working, we can (in a new terminal) look at which port was forwarded, and then use our web browser to connect to localhost on that port.
$ podman ps --format '{{ .Ports }}'
0.0.0.0:37345->80/tcp
For me, port 37345 has been forwarded, so I need to connect to localhost:37345 in my browser.7 When navigating there, I see a successful test page.
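If you prefer checking from the command line (my addition), a quick cURL against the forwarded port works too; substitute whatever port podman ps reported for you:
$ curl -sI http://localhost:37345 | head -n 1
HTTP/1.1 200 OK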
Now that we have a working build, we should make it smaller and not take so long to build, but we’ll explore that next time.
1. So far, some of the features required for a functional environment require Kubernetes to use the docker container runtime. I am working to identify the remaining issues with cri-o and/or containerd, and then solve them or raise them with the appropriate projects. Once either of them works (with a preference for containerd), Docker can be removed entirely. ↩︎
2. It seems like Docker has recently undergone some changes and they no longer have an email signup wall. They may have also stopped the other behaviors I find objectionable, but I’ve been working to replace them for long enough that I don’t see a reason to switch back just because they’re no longer doing objectionable things. Also, it’s still a big, bulky daemon. ↩︎
3. The build script also needs to be executable. If it’s not already, you can make it so by running chmod a+x buildah.sh. ↩︎
4. This seems like a magic directory, but it comes from the documentation for the container storage configuration. The documentation has a rootless_storage_path key, which is the path to container storage when running rootless (as a non-root user), which, as we’re about to cover, we are. The default value for that key is $XDG_DATA_HOME/containers/storage, if XDG_DATA_HOME is set. Otherwise $HOME/.local/share/containers/storage is used. According to the specification, "If $XDG_DATA_HOME is either not set or empty, a default equal to $HOME/.local/share should be used." These two defaults make the effective default location for the containers directory ~/.local/share/containers. If this does not work for you, please use $XDG_DATA_HOME/containers or check where your rootless_storage_path is configured to place the storage. ↩︎
5. It’s entirely possible that you may not have encountered an error. If you keep reading, you’ll see the error is caused by Ubuntu not installing fuse-overlayfs by default, resulting in a fallback to the VFS storage driver, while the Buildah container uses the OverlayFS storage driver. If your Linux environment already has fuse-overlayfs, you should not experience this issue. ↩︎
6. The use of sudo to delete the containers directory is required because we used different namespaces for building containers. This gives the files different UIDs, which means we can’t delete them as our user. ↩︎
7. For some reason I don’t understand, in order for a connection from a Windows-native web browser to reach my Nginx container running in WSL 2, I need to use "localhost" as the host. Using "127.0.0.1" will not work. If I’m making the request from within WSL 2 (i.e. using a web browser or cURL installed in WSL 2), then "127.0.0.1" works. ↩︎