Managing Remote Hypervisors
Not all infrastructure is local. I run Proxmox VE at a secondary location — a site I can’t just walk over to if something goes wrong.
This changes the approach. Reliability becomes paramount. Self-healing matters. Remote management tools become essential.
The Remote Setup
Two primary hosts at the remote site:
- Compute node - Heavy lifting, VMs, builds
- Storage server - Unraid/ZFS array, backups, low-power services
Both connected to my local network via Tailscale. Managed through the Proxmox web interface as if they were local.
Why Proxmox for Remote?
When you’re managing systems you can’t physically access, you need:
1. Web Interface
The Proxmox GUI provides visibility without SSH:
- VM and container status at a glance
- Console access through the browser
- Resource graphs (CPU, memory, storage)
- Snapshot management
One browser tab per site. Full visibility.
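The same data is scriptable when a dashboard isn't enough. A quick sketch using `pvesh` (the node name `remote1` is a placeholder):

```bash
# List every VM and container in the cluster with its status
pvesh get /cluster/resources --type vm

# CPU, memory, and uptime for one node
pvesh get /nodes/remote1/status
```

Handy for wiring status checks into monitoring without scraping the GUI.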
2. LXC Containers
VMs are heavy. LXC containers are lightweight.
For services that don’t need full isolation:
- Faster startup
- Lower memory overhead
- Easier snapshots
Example: A build agent in an LXC container needs VPN access:
```
# /etc/pve/lxc/100.conf
lxc.cgroup2.devices.allow: c 10:200 rwm
lxc.mount.entry: /dev/net/tun dev/net/tun none bind,create=file
```
That config allows TUN device passthrough — required for Tailscale inside the container.
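To confirm the passthrough works, restart the container and check for the device. A quick sanity check with `pct` (container ID 100 from the example above):

```bash
# Apply the new config with a restart
pct stop 100 && pct start 100

# The TUN device should now exist inside the container
pct exec 100 -- ls -l /dev/net/tun
```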
3. ZFS for Storage
Benefits for remote storage:
Snapshots:
```bash
zfs snapshot tank/vms@before-upgrade
# Do risky thing
zfs rollback tank/vms@before-upgrade
```
Replication:
```bash
zfs send tank/data@snapshot | ssh local-nas zfs recv backup/remote
```
Send snapshots across the VPN for off-site backup.
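Full sends get expensive over a VPN. Incremental sends ship only the blocks changed since the last common snapshot; a sketch, assuming `@yesterday` already exists on both sides:

```bash
# Send only the delta between yesterday and today
zfs send -i tank/data@yesterday tank/data@today | ssh local-nas zfs recv backup/remote
```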
Self-healing:

```bash
zpool scrub tank
```

A scrub reads every block, verifies checksums, and repairs damaged data from redundant copies. ZFS also heals bad blocks it hits during normal reads; a scheduled scrub catches rot in data you rarely touch.
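Scheduling the scrub keeps it hands-off. A minimal cron entry, assuming the pool is named `tank` (Debian-based installs typically ship a similar job in `/etc/cron.d/zfsutils-linux` already):

```bash
# /etc/cron.d/zfs-scrub: run a monthly scrub on the first at 03:00
0 3 1 * * root /sbin/zpool scrub tank
```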
Network Architecture
Remote site on 192.168.20.x. Local site on 10.42.0.x. Connected via Tailscale mesh.
Subnet Routing
One node at each site advertises its local subnet:
```bash
tailscale up --advertise-routes=192.168.20.0/24
```
Now local machines reach remote machines by IP. No port forwarding required.
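Two setup details matter here: the advertised route has to be approved in the Tailscale admin console, and the subnet router needs kernel IP forwarding. Per Tailscale's docs:

```bash
# Enable forwarding so the node can route for its subnet
echo 'net.ipv4.ip_forward = 1' | sudo tee -a /etc/sysctl.d/99-tailscale.conf
echo 'net.ipv6.conf.all.forwarding = 1' | sudo tee -a /etc/sysctl.d/99-tailscale.conf
sudo sysctl -p /etc/sysctl.d/99-tailscale.conf
```

Linux clients also need `tailscale up --accept-routes` to actually use the advertised subnet.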
VLAN Isolation
Even at remote sites, network segmentation matters:
| VLAN | Purpose |
|---|---|
| 10 | Management (Proxmox UI, SSH) |
| 20 | Services (containers, VMs) |
| 66 | IoT/Quarantine |
If a container is compromised, it can’t reach the hypervisor.
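On Proxmox the usual mechanism is a VLAN-aware bridge; each guest then tags its virtual NIC. A sketch of `/etc/network/interfaces`, assuming `eno1` is the physical NIC and the addresses are placeholders:

```
# /etc/network/interfaces (excerpt)
auto vmbr0
iface vmbr0 inet static
    address 192.168.20.5/24
    gateway 192.168.20.1
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094
```

Guests set their VLAN with the `tag=` option on the network device, so a container on VLAN 20 never sees management traffic on VLAN 10.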
Backup Strategy
Remote infrastructure needs local backups. Can’t rely on the remote site to back up to itself.
Proxmox Backup Server
Runs locally. Remote hosts push backups across the VPN:
Remote Proxmox → VPN → Local PBS
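Registering the PBS instance on the remote node is a single `pvesm` call. Everything here (storage ID, hostname, datastore name, fingerprint) is a placeholder:

```bash
# Add the local PBS as a backup target on the remote Proxmox node
pvesm add pbs local-pbs \
    --server pbs.tailnet.example \
    --datastore remote-backups \
    --username backup@pbs \
    --fingerprint 'aa:bb:cc:...'
```

Backup jobs are then scheduled under Datacenter → Backup in the GUI.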
ZFS Replication
For raw data (not VM images):
```bash
# Nightly cron on local server
ssh remote-nas zfs send tank/data@today | zfs recv backup/remote
```
Two copies: remote (production) and local (backup).
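The snapshot has to exist before it can be sent, so in practice the cron job calls a small wrapper. A minimal sketch (hypothetical script; assumes yesterday's snapshot survived):

```bash
#!/bin/sh
# pull-remote.sh: snapshot the remote dataset, pull the incremental delta
set -eu

TODAY=$(date +%F)
YESTERDAY=$(date -d yesterday +%F)

# Create today's snapshot on the remote box
ssh remote-nas zfs snapshot "tank/data@$TODAY"

# Receive only the changed blocks into the local backup dataset
ssh remote-nas zfs send -i "tank/data@$YESTERDAY" "tank/data@$TODAY" \
    | zfs recv backup/remote
```

Purpose-built tools like syncoid or zrepl handle missed runs, retention, and resume; this just shows the shape.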
Monitoring
Remote means no blinking lights. Monitoring fills the gap.
- Uptime Kuma - Service availability checks, alerts via Discord/email
- Prometheus + Grafana - Metrics collection, dashboards for CPU/memory/disk
- Netdata - Real-time debugging when things go wrong
If the remote Proxmox starts struggling, I know before users notice.
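Metrics travel over the tailnet too, so nothing is exposed publicly. A minimal Prometheus job, assuming node_exporter runs on each host (hostnames are placeholders):

```yaml
# prometheus.yml (excerpt): scrape the remote hosts over Tailscale
scrape_configs:
  - job_name: "remote-site"
    static_configs:
      - targets:
          - "compute-node.tailnet.example:9100"
          - "storage-server.tailnet.example:9100"
```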
Failure Recovery
“I can’t reach the Proxmox UI”
- Check Tailscale status (is the node online?)
- SSH directly to the host
- Check if `pveproxy` is running
- Worst case: contact someone on-site
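If SSH works but the UI doesn't, the web proxy service is the usual suspect:

```bash
# Check and, if needed, restart the Proxmox web UI service
systemctl status pveproxy
systemctl restart pveproxy
```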
“A VM won’t start”
- Check Proxmox GUI for errors
- Verify storage health (is ZFS okay? Pool full?)
- Check resource limits
- Restore from snapshot
“I need to reboot the host”
Proxmox handles reboots gracefully. VMs/containers auto-start if configured:
Options → Start at boot: Yes
Options → Start/Shutdown order: 1
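The same settings can be applied from the shell, `qm` for VMs and `pct` for containers:

```bash
# VM 100 starts first at boot; container 101 follows
qm set 100 --onboot 1 --startup order=1
pct set 101 --onboot 1 --startup order=2
```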
Lessons Learned
Automate everything. Can’t SSH in at 3 AM to fix things. Self-healing required.
Monitor aggressively. Know there’s a problem before anyone else.
Test failover. Deliberately break things. Know the recovery path.
Document physical access. Where is the server? What’s the management IP? Who can power cycle it?
Remote infrastructure is real infrastructure. It just requires more planning.