Sleep is Important
I have 10 VMs running on my workstation. Home Assistant. The Build Swarm orchestrator. A testing sandbox. If the NVMe drive died today, they would be gone.
I spent a Saturday writing a script to fix that anxiety.
The Strategy
Backing up a running KVM (libvirt) VM involves two things:
- The Definition: The XML configuration (CPU, RAM, Network map).
- The Disk: The .qcow2 image.
Crucially, you cannot just copy the disk while the VM is writing to it. The copy catches some writes and misses others, so the image can come out corrupt. So we have two choices:
- Snapshot Mode: Use virsh blockcommit (Complex, efficient).
- The Sledgehammer: Shut down, copy, start up (Simple, disruptive).
Since these are homelab VMs, I chose the Sledgehammer (scheduled for 3 AM).
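For comparison, here is a rough sketch of what Snapshot Mode could look like for a VM with a single disk. The helper name snapshot_backup, the snapshot name backup-snap, and the overlay path are placeholders chosen for illustration, and the result is only crash-consistent unless the guest is quiesced. Treat it as a starting point under those assumptions, not a tested replacement for the Sledgehammer.

```shell
#!/bin/bash
# Sketch: live backup of one disk through an external snapshot.
# Assumes a single-disk VM; names and paths here are illustrative.
snapshot_backup() {
    local vm="$1" dev="$2" disk="$3" dest="$4"
    local overlay="${disk%.qcow2}.backup-overlay.qcow2"
    local rc

    # 1. Send new writes to an overlay so the base image stops changing.
    virsh snapshot-create-as "$vm" backup-snap \
        --disk-only --atomic --no-metadata \
        --diskspec "$dev,file=$overlay" || return 1

    # 2. Copy the base image while the VM keeps running on the overlay.
    cp --sparse=always "$disk" "$dest"
    rc=$?

    # 3. Merge the overlay back into the base and pivot the VM onto it.
    #    On failure the overlay is still in use, so leave it alone.
    virsh blockcommit "$vm" "$dev" --active --pivot --verbose || return 1

    # 4. The overlay is no longer referenced by the VM.
    rm -f "$overlay"
    return "$rc"
}
```

A call would look like snapshot_backup homeassistant vda /var/lib/libvirt/images/homeassistant.qcow2 /backups/vms/homeassistant.qcow2, where the VM name, target device, and paths are examples only.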
The Script
#!/bin/bash
# /usr/local/bin/vm-backup.sh
# Cold backup: dump each running VM's XML, shut it down,
# copy its disk images, then start it again.
BACKUP_ROOT="/backups/vms"
DATE=$(date +%Y%m%d)
TARGET_DIR="$BACKUP_ROOT/$DATE"
mkdir -p "$TARGET_DIR" || exit 1

# Get list of running VMs
VMS=$(virsh list --name)

for VM in $VMS; do
    echo "Processing $VM..."

    # 1. Dump XML Config
    virsh dumpxml "$VM" > "$TARGET_DIR/$VM.xml"

    # 2. Get Disk Paths (file-backed disks only, so CD-ROM ISOs are skipped;
    #    assumes paths contain no whitespace)
    DISK_PATHS=$(virsh domblklist "$VM" --details | awk '$1 == "file" && $2 == "disk" {print $4}')

    # 3. Shutdown
    echo "Stopping $VM..."
    virsh shutdown "$VM"

    # Wait for shutdown (timeout 60s)
    TIMEOUT=0
    while virsh list --name | grep -qxF -- "$VM"; do
        sleep 5
        TIMEOUT=$((TIMEOUT + 5))
        if [ "$TIMEOUT" -ge 60 ]; then
            echo "Timeout waiting for shutdown. Forcing..."
            virsh destroy "$VM"
            break
        fi
    done

    # 4. Copy Disks
    for DISK_PATH in $DISK_PATHS; do
        DISK_NAME=$(basename "$DISK_PATH")
        echo "Backing up $DISK_NAME..."
        # Use sparse copy to save space!
        cp --sparse=always "$DISK_PATH" "$TARGET_DIR/$DISK_NAME"
    done

    # 5. Start
    echo "Starting $VM..."
    virsh start "$VM"
done

# Cleanup old backups (Keep 7 days); -mindepth 1 protects BACKUP_ROOT itself
find "$BACKUP_ROOT" -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} +
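Because the cleanup line ends in rm -rf, it can be worth previewing what it would match before trusting it. The sketch below does a dry run with the same find tests, printing instead of deleting. The function name list_expired_backups is hypothetical. Note that find counts age in whole 24-hour periods and discards the fraction, so -mtime +7 only matches directories at least 8 days old.

```shell
#!/bin/bash
# Dry run of the retention step: print the dated backup directories
# that are old enough to be removed, without deleting anything.
# list_expired_backups is a hypothetical helper name.
list_expired_backups() {
    local root="$1" days="${2:-7}"
    # -mindepth 1 keeps the backup root itself out of the results.
    find "$root" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" -print
}
```

Running list_expired_backups /backups/vms 7 lists the candidates, and an empty result means nothing is due for deletion yet.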
The “Sparse” Trick
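The flag cp --sparse=always is magic: wherever the source holds a long enough run of zero bytes, cp leaves a hole in the destination instead of writing the zeros.
My Windows VM has a 100GB allocated disk, but it only uses 20GB of space.
If the image file is fully allocated, a plain byte-for-byte copy creates a file that occupies all 100GB.
A sparse copy creates a 100GB logical file that only takes up about 20GB on disk.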
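The effect is easy to see on a small scale. This sketch, which assumes GNU coreutils and a filesystem that supports holes, writes a file full of real zero bytes, copies it with --sparse=always, and reports apparent size next to allocated size for both. The function names are made up for the demo.

```shell
#!/bin/bash
# Apparent (logical) size of a file in KiB.
apparent_kib() { du -k --apparent-size "$1" | cut -f1; }

# Space actually allocated on disk for a file, in KiB.
allocated_kib() { du -k "$1" | cut -f1; }

# Demo: copy a zero-filled file sparsely and compare the two sizes.
sparse_demo() {
    local dir
    dir=$(mktemp -d) || return 1

    # 64 MiB of zeros written out explicitly, so blocks get allocated.
    dd if=/dev/zero of="$dir/full.img" bs=1M count=64 status=none

    # The copy turns the zero runs into holes.
    cp --sparse=always "$dir/full.img" "$dir/sparse.img"

    for f in full.img sparse.img; do
        echo "$f: $(apparent_kib "$dir/$f") KiB apparent," \
             "$(allocated_kib "$dir/$f") KiB allocated"
    done
    rm -rf "$dir"
}
```

Calling sparse_demo should show the same apparent size for both files and a much smaller allocated size for the copy. The exact allocated figures depend on the filesystem.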
Automation
I added a systemd timer (because cron is so 2010).
/etc/systemd/system/vm-backup.service:
[Unit]
Description=VM Backup Script
[Service]
Type=oneshot
ExecStart=/usr/local/bin/vm-backup.sh
/etc/systemd/system/vm-backup.timer:
[Unit]
Description=Run VM Backup Daily at 3 AM
[Timer]
OnCalendar=*-*-* 03:00:00
Persistent=true
[Install]
WantedBy=timers.target
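Writing the two unit files does not schedule anything by itself. The timer still has to be enabled. A minimal sketch of that step, wrapped in a hypothetical enable_backup_timer function and assuming it runs as root with the unit names used above:

```shell
#!/bin/bash
# Load the new unit files, enable the timer, and show its next run.
# enable_backup_timer is a hypothetical wrapper; run it as root.
enable_backup_timer() {
    # Make systemd re-read /etc/systemd/system.
    systemctl daemon-reload &&
    # Enable the timer at boot and start it now. The service itself
    # needs no enabling because the timer triggers it.
    systemctl enable --now vm-backup.timer &&
    # Confirm the next scheduled run.
    systemctl list-timers vm-backup.timer
}
```

For a one-off test without waiting for 3 AM, systemctl start vm-backup.service runs the backup immediately.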
Result
Every morning, I wake up to a folder full of .qcow2 images.
I’ve restored from them twice. It works perfectly.
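A restore is the backup in reverse: put the disk image back, re-register the XML, start the VM. The sketch below is one way that could look, not a record of the exact commands used. The restore_vm name is hypothetical, and it assumes a single disk whose destination path matches the path recorded in the saved XML.

```shell
#!/bin/bash
# Sketch: restore one VM from a dated backup directory.
# Assumes disk_dest is the same path the saved XML points at.
restore_vm() {
    local backup_dir="$1" vm="$2" disk_dest="$3"
    local disk_name
    disk_name=$(basename "$disk_dest")

    # Copy the image back, keeping it sparse.
    cp --sparse=always "$backup_dir/$disk_name" "$disk_dest" || return 1

    # Recreate the VM definition from the dumped XML, then boot it.
    virsh define "$backup_dir/$vm.xml" || return 1
    virsh start "$vm"
}
```

An example call, with made-up names, would be restore_vm /backups/vms/20240101 sandbox /var/lib/libvirt/images/sandbox.qcow2.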
Cost: $0.
Peace of mind: Infinite.