Managing Remote Hypervisors
Not all infrastructure is local. I run Proxmox VE at a secondary location — a site I can’t just walk over to if something goes wrong.
This changes the approach. Reliability becomes paramount. Self-healing matters. Remote management tools become essential.
The Remote Setup
Two primary hosts at the remote site:
- Compute node - Heavy lifting, VMs, builds
- Storage server - Unraid/ZFS array, backups, low-power services
Both connected to my local network via Tailscale. Managed through the Proxmox web interface as if they were local.
Why Proxmox for Remote?
When you’re managing systems you can’t physically access, you need:
1. Web Interface
The Proxmox GUI provides visibility without SSH:
- VM and container status at a glance
- Console access through the browser
- Resource graphs (CPU, memory, storage)
- Snapshot management
One browser tab per site. Full visibility.
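The same data is scriptable when a dashboard isn't enough. A quick sketch using `pvesh` (the node name `remote1` is a placeholder):

```bash
# List every VM and container in the cluster with its status
pvesh get /cluster/resources --type vm

# CPU, memory, and uptime for one node
pvesh get /nodes/remote1/status
```

Handy for wiring status checks into monitoring without scraping the GUI.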
2. LXC Containers
VMs are heavy. LXC containers are lightweight.
For services that don’t need full isolation:
- Faster startup
- Lower memory overhead
- Easier snapshots
Example: A build agent in an LXC container needs VPN access:
```
# /etc/pve/lxc/100.conf
lxc.cgroup2.devices.allow: c 10:200 rwm
lxc.mount.entry: /dev/net/tun dev/net/tun none bind,create=file
```
That config allows TUN device passthrough — required for Tailscale inside the container.
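To confirm the passthrough works, restart the container and check for the device. A quick sanity check with `pct` (container ID 100 from the example above):

```bash
# Apply the new config with a restart
pct stop 100 && pct start 100

# The TUN device should now exist inside the container
pct exec 100 -- ls -l /dev/net/tun
```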
3. ZFS for Storage
Benefits for remote storage:
Snapshots:
```bash
zfs snapshot tank/vms@before-upgrade
# Do risky thing
zfs rollback tank/vms@before-upgrade
```
Replication:
```bash
zfs send tank/data@snapshot | ssh local-nas zfs recv backup/remote
```
Send snapshots across the VPN for off-site backup.
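Full sends get expensive over a VPN. Incremental sends ship only the blocks changed since the last common snapshot; a sketch, assuming `@yesterday` already exists on both sides:

```bash
# Send only the delta between yesterday and today
zfs send -i tank/data@yesterday tank/data@today | ssh local-nas zfs recv backup/remote
```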
Self-healing:

```bash
zpool scrub tank
```

A scrub reads every block, verifies checksums, and repairs damaged data from redundant copies. ZFS also heals bad blocks it hits during normal reads; a scheduled scrub catches rot in data you rarely touch.
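Scheduling the scrub keeps it hands-off. A minimal cron entry, assuming the pool is named `tank` (Debian-based installs typically ship a similar job in `/etc/cron.d/zfsutils-linux` already):

```bash
# /etc/cron.d/zfs-scrub: run a monthly scrub on the first at 03:00
0 3 1 * * root /sbin/zpool scrub tank
```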
Network Architecture
Remote site on 192.168.20.x. Local site on 10.42.0.x. Connected via Tailscale mesh.
Subnet Routing
One node at each site advertises its local subnet:
```bash
tailscale up --advertise-routes=192.168.20.0/24
```
Now local machines reach remote machines by IP. No port forwarding required.
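Two setup details matter here: the advertised route has to be approved in the Tailscale admin console, and the subnet router needs kernel IP forwarding. Per Tailscale's docs:

```bash
# Enable forwarding so the node can route for its subnet
echo 'net.ipv4.ip_forward = 1' | sudo tee -a /etc/sysctl.d/99-tailscale.conf
echo 'net.ipv6.conf.all.forwarding = 1' | sudo tee -a /etc/sysctl.d/99-tailscale.conf
sudo sysctl -p /etc/sysctl.d/99-tailscale.conf
```

Linux clients also need `tailscale up --accept-routes` to actually use the advertised subnet.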
VLAN Isolation
Even at remote sites, network segmentation matters:
| VLAN | Purpose |
|---|---|
| 10 | Management (Proxmox UI, SSH) |
| 20 | Services (containers, VMs) |
| 66 | IoT/Quarantine |
If a container is compromised, it can’t reach the hypervisor.
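On Proxmox the usual mechanism is a VLAN-aware bridge; each guest then tags its virtual NIC. A sketch of `/etc/network/interfaces`, assuming `eno1` is the physical NIC and the addresses are placeholders:

```
# /etc/network/interfaces (excerpt)
auto vmbr0
iface vmbr0 inet static
    address 192.168.20.5/24
    gateway 192.168.20.1
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094
```

Guests set their VLAN with the `tag=` option on the network device, so a container on VLAN 20 never sees management traffic on VLAN 10.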
Backup Strategy
Remote infrastructure needs local backups. Can’t rely on the remote site to back up to itself.
Proxmox Backup Server
Runs locally. Remote hosts push backups across the VPN:
Remote Proxmox → VPN → Local PBS
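Registering the PBS instance on the remote node is a single `pvesm` call. Everything here (storage ID, hostname, datastore name, fingerprint) is a placeholder:

```bash
# Add the local PBS as a backup target on the remote Proxmox node
pvesm add pbs local-pbs \
    --server pbs.tailnet.example \
    --datastore remote-backups \
    --username backup@pbs \
    --fingerprint 'aa:bb:cc:...'
```

Backup jobs are then scheduled under Datacenter → Backup in the GUI.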
ZFS Replication
For raw data (not VM images):
```bash
# Nightly cron on local server
ssh remote-nas zfs send tank/data@today | zfs recv backup/remote
```
Two copies: remote (production) and local (backup).
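The snapshot has to exist before it can be sent, so in practice the cron job calls a small wrapper. A minimal sketch (hypothetical script; assumes yesterday's snapshot survived):

```bash
#!/bin/sh
# pull-remote.sh: snapshot the remote dataset, pull the incremental delta
set -eu

TODAY=$(date +%F)
YESTERDAY=$(date -d yesterday +%F)

# Create today's snapshot on the remote box
ssh remote-nas zfs snapshot "tank/data@$TODAY"

# Receive only the changed blocks into the local backup dataset
ssh remote-nas zfs send -i "tank/data@$YESTERDAY" "tank/data@$TODAY" \
    | zfs recv backup/remote
```

Purpose-built tools like syncoid or zrepl handle missed runs, retention, and resume; this just shows the shape.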
Monitoring
Remote means no blinking lights. Monitoring fills the gap.
- Uptime Kuma - Service availability checks, alerts via Discord/email
- Prometheus + Grafana - Metrics collection, dashboards for CPU/memory/disk
- Netdata - Real-time debugging when things go wrong
If the remote Proxmox starts struggling, I know before users notice.
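Metrics travel over the tailnet too, so nothing is exposed publicly. A minimal Prometheus job, assuming node_exporter runs on each host (hostnames are placeholders):

```yaml
# prometheus.yml (excerpt): scrape the remote hosts over Tailscale
scrape_configs:
  - job_name: "remote-site"
    static_configs:
      - targets:
          - "compute-node.tailnet.example:9100"
          - "storage-server.tailnet.example:9100"
```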
Failure Recovery
“I can’t reach the Proxmox UI”
- Check Tailscale status (is the node online?)
- SSH directly to the host
- Check if `pveproxy` is running
- Worst case: contact someone on-site
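If SSH works but the UI doesn't, the web proxy service is the usual suspect:

```bash
# Check and, if needed, restart the Proxmox web UI service
systemctl status pveproxy
systemctl restart pveproxy
```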
“A VM won’t start”
- Check Proxmox GUI for errors
- Verify storage health (is ZFS okay? Pool full?)
- Check resource limits
- Restore from snapshot
“I need to reboot the host”
Proxmox handles reboots gracefully. VMs/containers auto-start if configured:
Options → Start at boot: Yes
Options → Start/Shutdown order: 1
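The same settings can be applied from the shell, `qm` for VMs and `pct` for containers:

```bash
# VM 100 starts first at boot; container 101 follows
qm set 100 --onboot 1 --startup order=1
pct set 101 --onboot 1 --startup order=2
```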
Lessons Learned
Automate everything. Can’t SSH in at 3 AM to fix things. Self-healing required.
Monitor aggressively. Know there’s a problem before anyone else.
Test failover. Deliberately break things. Know the recovery path.
Document physical access. Where is the server? What’s the management IP? Who can power cycle it?
Remote infrastructure is real infrastructure. It just requires more planning.