Provisioning Linux virtual machines with virt-install

Continuing on from my previous post about configuring an Alpine Linux installation with a QEMU/KVM setup script, this post will explain the script(s) I use when creating headless Linux virtual machines using virt-install.

The virt-install scripts are located at: https://github.com/sureshjoshi/alpine-kvm/tree/main/guests.

This configuration assumes no GUI or desktop environment - it's, again, all from the terminal.

Prerequisites

A configured Alpine Linux host
A working internet connection

What is virt-install?

virt-install is the command-line counterpart to virt-manager: a tool for provisioning virtual machines.

After the initial provisioning, I use the virsh command that ships with libvirt to start up, tear down, and otherwise manage the VMs at runtime.
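As a rough sketch of that day-to-day virsh usage (the function names and the LIBVIRT_URI variable are hypothetical, not taken from the linked scripts), the lifecycle commands could be wrapped like this:

```shell
#!/bin/sh
# Hypothetical helpers around common virsh lifecycle commands.
# LIBVIRT_URI defaults to the same system connection used by virt-install.
LIBVIRT_URI="${LIBVIRT_URI:-qemu:///system}"

# Boot a defined (but stopped) VM.
vm_start() {
  virsh --connect "$LIBVIRT_URI" start "$1"
}

# Ask the guest to shut down cleanly.
vm_stop() {
  virsh --connect "$LIBVIRT_URI" shutdown "$1"
}

# Force the VM off if it is running, then remove its definition and disks.
# Caution: --remove-all-storage deletes every storage volume attached to the VM.
vm_teardown() {
  virsh --connect "$LIBVIRT_URI" destroy "$1" || true
  virsh --connect "$LIBVIRT_URI" undefine "$1" --remove-all-storage
}
```

Once a VM is running, virsh console {my-vm} attaches to its serial console, and Ctrl+] detaches again.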

This raises the question: why use virt-install at all? Why not just use the qemu command line to provision machines? For me, it comes down to not wanting to keep up with qemu's provisioning arguments, along with their deprecations, changing defaults, and new options.

virt-install is kept pretty well updated, and has sane defaults for any options I've missed. It can also reduce the number of args needed to generate the same VM compared to calling qemu directly, though in my case there are some defaults I set explicitly anyway, in case they ever change from under me.

I still recommend reviewing the generated XML and reading the libvirt and qemu docs to understand what each configuration item means, but virt-install can be a good-enough shortcut (and it sometimes tells you when you've generated an invalid config).

virt-install scripts

The virt-install portions of the scripts are pretty short, and the Alpine and Debian versions are very similar. The option explanations beneath these scripts apply to both.

Alpine

# Provision a headless Alpine guest from a local ISO (variables are set beforehand)
virt-install \
  --cdrom "$ISO_PATH" \
  --connect qemu:///system \
  --disk "size=${DISK_GB},format=raw,bus=virtio" \
  --graphics none \
  --hvm \
  --memory "$MEMORY_MB" \
  --name "$VM_NAME" \
  --osinfo alpinelinux3.18 \
  --network network=default,model=virtio \
  --vcpus "$CPUS" \
  --virt-type kvm

Debian

# Provision a headless Debian guest from a network install tree (variables are set beforehand)
virt-install \
  --connect qemu:///system \
  --disk "size=${DISK_GB},format=raw,bus=virtio" \
  --extra-args="console=ttyS0,115200n8 serial" \
  --graphics none \
  --hvm \
  --location https://debian.osuosl.org/debian/dists/stable/main/installer-amd64/ \
  --memory "$MEMORY_MB" \
  --name "$VM_NAME" \
  --osinfo debian12 \
  --network network=default,model=virtio \
  --vcpus "$CPUS" \
  --virt-type kvm
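
Both commands rely on a handful of shell variables. A minimal sketch of how they might be defined, with entirely hypothetical default values and ISO path (the linked scripts may set these differently):

```shell
#!/bin/sh
# Hypothetical defaults; each can be overridden from the environment.
VM_NAME="${VM_NAME:-dev-vm}"
CPUS="${CPUS:-2}"
MEMORY_MB="${MEMORY_MB:-2048}"   # MiB, as expected by --memory
DISK_GB="${DISK_GB:-20}"         # GiB, as expected by --disk size=
ISO_PATH="${ISO_PATH:-/var/lib/libvirt/images/install.iso}"  # only used with --cdrom

# Fail early on non-numeric sizes instead of letting virt-install reject them later.
for value in "$CPUS" "$MEMORY_MB" "$DISK_GB"; do
  case "$value" in
    ''|*[!0-9]*) echo "expected a whole number, got '$value'" >&2; exit 1 ;;
  esac
done
```

Keeping the sizing in variables means the same virt-install invocation can be reused for differently sized guests.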

General options

# https://man.archlinux.org/man/virt-install.1#GENERAL_OPTIONS
--name

Pretty self-explanatory: the name of the virtual machine. It shows up in virsh list --all and must be unique within that list.

# https://man.archlinux.org/man/virt-install.1#GENERAL_OPTIONS
--memory

Memory to allocate to the guest VM (in MiB). There are sub-options if you want to get clever about setting a maximum amount of memory while initially provisioning less. For my usage, deterministic memory across a small number of VMs is more useful, so I haven't needed those sub-options yet.

# https://man.archlinux.org/man/virt-install.1#GENERAL_OPTIONS
--vcpus

Number of virtual CPUs allocated to the guest VM. Similar to memory, there are sub-options to allow a "current" and "max" number of CPUs, but again, I haven't needed those yet.
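
For reference, the sub-options mentioned above might be combined as follows. This is only a sketch with hypothetical sizes and variable names; the scripts in this post use plain fixed values instead.

```shell
#!/bin/sh
# Hypothetical sizes: the guest boots with less than its configured maximum.
MAX_MEMORY_MB=4096
BOOT_MEMORY_MB=2048
MAX_CPUS=4
BOOT_CPUS=2

# memory= is the maximum allocation, currentmemory= is what the guest starts with.
MEMORY_ARG="memory=${MAX_MEMORY_MB},currentmemory=${BOOT_MEMORY_MB}"
# The guest starts with BOOT_CPUS and may hotplug up to maxvcpus.
VCPUS_ARG="${BOOT_CPUS},maxvcpus=${MAX_CPUS}"

# These would replace the fixed values in the commands above:
#   virt-install ... --memory "$MEMORY_ARG" --vcpus "$VCPUS_ARG" ...
echo "--memory $MEMORY_ARG --vcpus $VCPUS_ARG"
```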

Installation options

# https://man.archlinux.org/man/virt-install.1#INSTALLATION_OPTIONS
--cdrom

Path to the installation ISO, typically downloaded to the host or located on a shared/mounted drive.

# https://man.archlinux.org/man/virt-install.1#INSTALLATION_OPTIONS
--extra-args
--location

If the distro publishes an installation tree, it can be passed to --location, though I've never had success pointing --location at a Debian ISO. --extra-args passes additional kernel command-line arguments to the installer, but it only works in combination with --location (not with --cdrom).

Storage options

# https://man.archlinux.org/man/virt-install.1#STORAGE_OPTIONS
--disk

The storage allocated to the virtual machine (size is given in GiB). This can have some nuance, so reading the docs is pretty important for this one.

For my simple Linux dev machines, I pick the raw format, as it's roughly the fastest option, at the expense of some of the bells and whistles that qcow2 provides (e.g. snapshots, migrations, extra corruption protection). I do this knowing that, at any point, I'm willing to wipe and re-create these Linux VMs from scratch if there is a problem.

If that's not your use case, consider using the default qcow2 format at the potential expense of performance. I should note that on my machine, I didn't notice a large performance hit. It was under 10% in my basic tests, and that's 10% on a ridiculously fast NVMe storage device, so it's probably not noticeable for me in real-world usage outside of large project compilation.

bus=virtio was the only other sub-option I set for my Linux VMs, as virtio is generally the fastest disk interface short of PCI passthrough.
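
To make switching between raw and qcow2 less error-prone, the --disk value could be built by a small helper. This is a sketch only: disk_arg is a hypothetical function, not part of virt-install or the linked scripts, and it deliberately supports just the two formats discussed here.

```shell
#!/bin/sh
# disk_arg SIZE_GIB [FORMAT]
# Prints a --disk value for a new virtio disk; FORMAT defaults to qcow2.
disk_arg() {
  disk_size_gib="$1"
  disk_format="${2:-qcow2}"
  case "$disk_format" in
    raw|qcow2) ;;
    *) echo "unsupported disk format: $disk_format" >&2; return 1 ;;
  esac
  printf 'size=%s,format=%s,bus=virtio\n' "$disk_size_gib" "$disk_format"
}
```

It would be used as virt-install ... --disk "$(disk_arg 20 raw)" ..., which expands to the same value as in the scripts above.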

I could probably squeeze out more performance by tailoring some of the other sub-options, but this got me the "good enough" numbers I needed for my ephemeral dev VMs. I generally only need these to live for a few days, or a handful of compilations/tests, so I wasn't going to spend too much time tuning.

If I run more testing in the future, I will update the scripts and this post with modified options/parameters.

Networking options

# https://man.archlinux.org/man/virt-install.1#NETWORKING_OPTIONS
--network

This argument provides several ways to connect the guest to the host's network. In the Linux examples above, I'm connecting to libvirt's automatically generated default network, which NATs guest traffic through the host. Functionally, this means that - by default - the guest has internet access, but it has no direct presence on the physical router/switch, as the host handles this networking layer.

model=virtio forces the virtio network device model as a performance optimization.
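
The default network has to be active before a guest can attach to it. A minimal sketch of checking for that with virsh, assuming the network is named default and is already defined on the system connection (ensure_default_network is a hypothetical helper name):

```shell
#!/bin/sh
ensure_default_network() {
  net_uri="qemu:///system"
  # Without --all, net-list only shows active networks; --name prints one name per line.
  if ! virsh --connect "$net_uri" net-list --name | grep -qx 'default'; then
    virsh --connect "$net_uri" net-start default
  fi
  # Bring the network up automatically whenever libvirtd starts.
  virsh --connect "$net_uri" net-autostart default
}
```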

Graphics options

# https://man.archlinux.org/man/virt-install.1#none~2
--graphics none

This is doubling down on the headless approach. No graphical console (spice/vnc) is allocated, so this is more like plugging a serial cable into a machine (or ssh'ing in) to provision and use it. Connect to the guest using virsh console {my-vm} after provisioning.

Guest OS options

# https://man.archlinux.org/man/virt-install.1#GUEST_OS_OPTIONS
--osinfo

Performs guest OS-specific optimizations. The documentation suggests that this must be detectable or specified for performance critical features to be enabled (e.g. virtio).

Since the OS is already known from the cdrom or location being installed, it's worth specifying this arg explicitly rather than relying on auto-detection. To find a list of possible IDs, use the "Short ID" value emitted from osinfo-query os:

osinfo-query os

 Short ID             | Name                                               | Version  | ID
----------------------+----------------------------------------------------+----------+-----------------------------------------
 almalinux8           | AlmaLinux 8                                        | 8        | http://almalinux.org/almalinux/8
 almalinux9           | AlmaLinux 9                                        | 9        | http://almalinux.org/almalinux/9
 alpinelinux3.10      | Alpine Linux 3.10                                  | 3.10     | http://alpinelinux.org/alpinelinux/3.10
 alpinelinux3.11      | Alpine Linux 3.11                                  | 3.11     | http://alpinelinux.org/alpinelinux/3.11
...
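
Since the full list is long, it can help to filter it. A small sketch, assuming osinfo-query's --fields option and the table-style output shown above (short_ids is a hypothetical helper name):

```shell
#!/bin/sh
# Print only the Short IDs that start with the given prefix, e.g. "alpinelinux".
short_ids() {
  osinfo-query --fields=short-id os |
    awk -v prefix="$1" 'index($1, prefix) == 1 { print $1 }'
}
```

For example, short_ids debian would list the Debian IDs known to the installed osinfo database, one per line.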

Libvirt connection

# https://man.archlinux.org/man/virt-install.1#CONNECTING_TO_LIBVIRT
--connect qemu:///system

Use the system libvirtd instance. This is what gets used when calling as root, but --connect qemu:///session is also possible when calling as a non-root user. I always want to use the system libvirtd. The docs say that virt-manager uses system, but they also say that libvirt will choose a default if this arg is not specified, so I set it explicitly.
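
The same choice of instance applies to virsh afterwards. As a sketch, the URI can be passed per command or set once through libvirt's LIBVIRT_DEFAULT_URI environment variable, so later virsh calls from a non-root shell don't silently land on the per-user session instance:

```shell
#!/bin/sh
# Make the system instance the default for libvirt clients started from this shell.
export LIBVIRT_DEFAULT_URI="qemu:///system"

# Per-command forms (shown as comments, since they need a running libvirtd):
#   virsh --connect qemu:///system list --all    # VMs on the system instance
#   virsh --connect qemu:///session list --all   # VMs on the per-user instance
echo "libvirt default URI: $LIBVIRT_DEFAULT_URI"
```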

Virtualization options

Here are some settings which should already be the defaults, but which I make explicit to avoid ambiguity, or virt-install/libvirt changing a value on upgrade.

# https://man.archlinux.org/man/virt-install.1#VIRTUALIZATION_OPTIONS
--hvm

Uses full virtualization instead of paravirtualization. This should be the default when using QEMU, but the documentation also mentions that paravirt may be assumed instead of full if the host supports both and neither is specified.

# https://man.archlinux.org/man/virt-install.1#VIRTUALIZATION_OPTIONS
--virt-type kvm

Selects the hypervisor. I don't know if kvm is the default offhand, but since kvm and qemu are both valid on my machine - I want to ensure plain qemu is never used by accident.
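
Before relying on --virt-type kvm, it can be worth confirming that the host actually exposes KVM. A rough sketch, assuming the standard /dev/kvm device node (kvm_available is a hypothetical helper, and it only checks access for the current user):

```shell
#!/bin/sh
# Succeeds if the KVM device node exists and the current user can open it.
kvm_available() {
  kvm_dev="${1:-/dev/kvm}"
  [ -c "$kvm_dev" ] && [ -r "$kvm_dev" ] && [ -w "$kvm_dev" ]
}

if kvm_available; then
  echo "KVM looks available"
else
  echo "KVM does not look available; --virt-type kvm would likely fail here" >&2
fi
```

libvirt's own virt-host-validate qemu command performs a more thorough version of this check.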

Debugging

# https://man.archlinux.org/man/virt-install.1#MISCELLANEOUS_OPTIONS
--dry-run
--print-xml

Use --dry-run and --print-xml together to see the XML that virt-install would generate, without actually creating anything. The XML values can be compared against the libvirt docs to see what's actually happening.
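
For example, a thin wrapper can append both flags to whatever arguments would normally be passed. This is a sketch; preview_vm_xml and the VIRT_INSTALL override variable are hypothetical names:

```shell
#!/bin/sh
# Binary to call; overridable so the wrapper can be exercised without libvirt installed.
VIRT_INSTALL="${VIRT_INSTALL:-virt-install}"

# Print the XML virt-install would generate for the given arguments, creating nothing.
preview_vm_xml() {
  "$VIRT_INSTALL" "$@" --dry-run --print-xml
}
```

Calling preview_vm_xml with the same options as the scripts above, and redirecting the output to a file, gives something to diff after a virt-install or libvirt upgrade.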

What's next?

An explanation of the Windows (with GPU passthrough) virt-install script

References

ArchLinux virt-install docs - Manual page for virt-install
virt-install source code - Useful to see defaults, or reduce ambiguity, in docs
libvirt XML explanation - Provides details on the output of virt-install
Heiko Sieger's blog on tuning disk performance - Good reference info, if I need to optimize the storage
Benchmarking VM disk performance - Information on a VM-centric disk testing tool