How I Learned to Stop Worrying and Love Alpine

The "there is an XKCD for everything" meme is so true.

Right now, it's: [image: XKCD comic]

Here’s the scoop...

I need to use my new dev machine for Linux work, Windows work, and some mild gaming. So, instead of taking 15 seconds to restart the dual-booted computer to switch between Linux and Windows like... once every week or two (and then reboot back a day or two later)... I spent about 22 hours over 5 weeks to set up, script, and test a headless Alpine QEMU/KVM host, and create the associated virt-install scripts that would build some Linux VMs and a Windows VM with GPU passthrough.
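For the record, the break-even math is grim. A back-of-the-envelope sketch in Python, assuming each switch costs two 15-second reboots (there and back) and happens once every one or two weeks. The function name and the 52-weeks-per-year rounding are just illustrative choices:

```python
def break_even_weeks(hours_invested, seconds_per_reboot, reboots_per_switch, weeks_between_switches):
    """Weeks of dual-booting needed before the setup time pays for itself."""
    seconds_invested = hours_invested * 3600
    seconds_saved_per_switch = seconds_per_reboot * reboots_per_switch
    switches_needed = seconds_invested / seconds_saved_per_switch
    return switches_needed * weeks_between_switches


if __name__ == "__main__":
    # 22 hours invested, 15 seconds per reboot, 2 reboots per switch (assumed).
    for weeks_between in (1, 2):
        weeks = break_even_weeks(22, 15, 2, weeks_between)
        print(f"switching every {weeks_between} week(s): {weeks:.0f} weeks (~{weeks / 52:.0f} years)")
```

Under those assumptions, the payback period lands somewhere between roughly 50 and 100 years.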

But... Why?

There are so many "why"s here.

What was wrong with dual-booting?

Almost nothing.

It was the principle of the matter. Needing to restart my machine to select another OS at the boot menu, when I knew I could avoid that (eventually)... I just couldn’t help myself.

Why not use X?

My first attempt a couple of years back was to use Proxmox, but seriously, I hated it. I don’t know why, because it seems to be the darling of the homelab virtualization world (and it's essentially a modded Debian distro, and I'm a Debian fan), but something about it just felt bloated and stale from the first use. I used it for a few weeks, and then went back to dual-booting.

More recently, I gave Fedora a shot as a virtual machine host using Cockpit. Unsurprisingly, this "felt" a lot better, since (as I've briefly mentioned before) I like Fedora on my personal machine. Installing the virtualization tools was trivial, and the management software looks and feels more modern and seamless. Still, something was off about my setup and I wasn’t really happy (back to dual-booting).

Eventually I figured it out.

I have some intrinsic aversion to what I consider to be “wasted” resources, even when resources are abundant. You want me to install a full distro with a graphical user interface that hogs 10GB of disk, some CPU, and two gigs of RAM? Leaving me with only… 62GB of RAM? No dice, homeslice, not on my watch.

Note: It's not just computers - I'm tilted watching this kid waste water on Sesame Street.

It's always the GUI

In my examples above, it seemed silly to me that I'd need a somewhat hefty Debian install (systemd and whatnot), or hell, a full-blown desktop environment - for what should amount to a few applications and a set of kernel modules running seamlessly in the background.

I knew my ideal VM setup would be a minimal host that did almost nothing, had as few packages as possible, and was headless.

Alpine's back, from outer space

My original idea was to build a custom, stripped-down Debian distro - but I quickly remembered the pain and suffering this caused me in the past while building Debian for a resource-constrained, embedded ARM device.

This memory brought back my time building custom Linux kernels, custom bootloaders, and custom root filesystems using Yocto and then the far superior Buildroot... You know what, maybe my solution doesn't need to be THAT minimal after all.

Arch felt like the obvious answer, but same as before, I didn't want to be in a situation where I might be forced into fixing something at an inopportune time (which might even cause a dreaded restart... The horror...).

I had even considered going with NixOS for the immutability aspect, but this felt like I was getting away from the "minimal" part of the equation.

On an unrelated project, I was trying to learn more about some Podman container (I think it was nginx's Alpine container) which led me back to Alpine.

All these years, I had always assumed Alpine was ONLY a container-based OS. Turns out I was wayyyy off. It comes pre-built in a slew of releases and architectures, including a dedicated Raspberry Pi release and a generic ARM release. This was wild, since I've been looking to standardize all of my headless embedded devices - some of which are Pis.

Why Alpine?

Ripping off Alpine's about page:

Alpine Linux is a security-oriented, lightweight Linux distribution based on musl libc and busybox.

and

Small ... Simple ... Secure ...

Going further, what this amounted to was:

Small, fast package management (Alpine uses APK)
No heavyweight service manager like systemd (Alpine uses OpenRC)
A minimal install (as little as 130MB vs Debian's > 500MB minimal-ish install)

This is NOT an apples-to-apples comparison, because what each distro considers "minimal" or "necessary" is different, but I did find these container sizes amusing:

alpine:3.19.1: 3.25MB compressed
debian:bookworm-20230904-slim: 27.78MB compressed (8.5x larger)
debian:bookworm-20230904: 47.26MB compressed (14.5x larger)
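The multipliers are just each image's compressed size divided by Alpine's. A small Python sketch of that arithmetic, using only the sizes listed above (the `size_ratios` helper is a made-up name for illustration):

```python
# Compressed image sizes in MB, copied from the list above.
sizes_mb = {
    "alpine:3.19.1": 3.25,
    "debian:bookworm-20230904-slim": 27.78,
    "debian:bookworm-20230904": 47.26,
}


def size_ratios(sizes, baseline):
    """Return each image's size as a multiple of the baseline image."""
    base = sizes[baseline]
    return {name: size / base for name, size in sizes.items()}


if __name__ == "__main__":
    for name, ratio in size_ratios(sizes_mb, "alpine:3.19.1").items():
        print(f"{name}: {ratio:.1f}x")
```

Rounded to one decimal place, this reproduces the 8.5x and 14.5x figures.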

That's not to say I don't still use debian-slim as a base when I need to, but I kinda just laugh whenever I see alpine next to something called "slim".

Anyways, settled. Alpine Linux as my KVM host.

Why did it take so long?

It didn’t need to, but when I get this vision of what I see as the “ideal” system, I can’t walk away from it, even if walking away would save me a ton of time.

22 hours over 5 weeks

Okay, so those numbers are a bit more nuanced. The "5 weeks" in this case means from "I conceptualized this third attempt at a KVM host" to "publishing this post and associated scripts on GitHub". There were also about 2-3 weeks in the middle where I had a solution, installed a bunch of VMs, and used them as a testing period. Then, a couple of days ago, I destroyed everything to rebuild from scratch using my scripts.

The "22 hours" includes all of the introductory reading, learning about Alpine, learning about APK, and reading/re-reading every piece of content about QEMU, KVM, virt-install, and GPU passthrough I could find.

The "actual" time spent installing/re-installing Alpine, configuring KVM, and fiddling around with virt-install scripts was closer to 8 hours over 2 weeks. But the other 14 hours of research were still absolutely necessary to reach the end goal.

Why all the reading?

Whelp, for starters, when it comes to Linux, the Linux kernel, drivers, insanely complicated (but often updated) software like QEMU, and ridiculous edge cases... wikis and articles on these topics mostly fall out of date within a year.

Annoyingly, some people write about these topics like they’re evergreen content and then don’t even have a date or versions anywhere. Seriously... Look at all the QEMU, KVM, virt-install samples and documentation online and let me know how many of them have version numbers or dates, I'll wait.

Less annoying is the insane amount of cargo culting in the QEMU and KVM worlds. I don't blame anyone for that, because as alluded to before, there is just so much goddamn complexity that it's really hard to know exactly why each permutation of flags and options is required - but your computer crashes without them. Oops.

A great example of some of these problems is the Alpine wiki on KVM:

No mention of latest tested Alpine version (or Linux kernel version)
No mention of QEMU version
No mention of libvirt version
Last edit date (November 2023) isn't exactly reflective of overall freshness
Reference to kernel module that no longer exists
References to unnecessary installed packages
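For what it's worth, stamping a write-up with versions is cheap. A minimal Python sketch, assuming `qemu-system-x86_64` and `virsh` are the binaries in use and that `/etc/alpine-release` exists on the host. Those are assumptions about the setup, and anything missing is reported as "not found" rather than guessed:

```python
import platform
import shutil
import subprocess
from pathlib import Path


def tool_version(command):
    """Return the first line of a tool's version output, or None if unavailable."""
    if shutil.which(command[0]) is None:
        return None
    try:
        result = subprocess.run(command, capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        return None
    lines = (result.stdout or result.stderr).strip().splitlines()
    return lines[0] if lines else None


def version_stamp():
    """Collect the versions a KVM write-up should mention."""
    release_file = Path("/etc/alpine-release")  # assumed location on Alpine hosts
    return {
        "alpine": release_file.read_text().strip() if release_file.exists() else None,
        "kernel": platform.release(),
        "qemu": tool_version(["qemu-system-x86_64", "--version"]),
        "libvirt": tool_version(["virsh", "--version"]),
    }


if __name__ == "__main__":
    for name, version in version_stamp().items():
        print(f"{name}: {version or 'not found'}")
```

Pasting that output (plus a date) at the top of a wiki page or post would cover the first three complaints above.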

This was an absolutely critical resource for me, and thanks to the Alpine team for having it - but if the Alpine wiki itself could have issues like these, what chance do we have with Reddit and random articles scattered across the internet?

What's next?

A breakdown of the Alpine configuration script
An explanation of the headless Linux virt-install scripts
An explanation of the Windows (with GPU passthrough) virt-install script

I've put all of the associated scripts on GitHub here: https://github.com/sureshjoshi/alpine-kvm

The scripts are the "what" and "how", and my associated posts are the "why" for those of you who will need to modify them.