The Kernel That Panicked Every Three Minutes
Date: 2026-01-21
Duration: About an hour (felt longer)
Issue: Continuous reboot loop
Root Cause: K3s pod crashes → kernel panic → auto-reboot → repeat
The Problem
Alpha-Centauri kept rebooting. Every 1-3 minutes. No warning, no pattern, just — reboot.
I'd SSH in, run a command, and before I could finish typing, the connection would drop. The system was back up 30 seconds later. I'd start the investigation again. Dropped again.
The Timeline
18:30 - System boot
18:33 - System reboot (3 min uptime)
18:34 - System boot
18:35 - System reboot (1 min uptime)
18:36 - System boot
18:39 - System reboot (3 min uptime)
... continues for 45 minutes ...
The system couldn’t stay up long enough to debug itself.
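For what it's worth, you don't have to reconstruct that timeline by hand. Either of these shows the reboot history at a glance on a systemd box (nothing here is specific to my setup):
# reboot records from wtmp, newest first
last reboot | head
# the journal's view: one line per boot with start and end timestamps
journalctl --list-boots | tail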
The Hunt
Had to work fast. SSH in, run one command, copy the output before the connection died.
First check: what happened before the last reboot?
journalctl -b -1 -n 50
CNI bridge state changes. Hundreds of them.
cni0: port 1(veth...) entered disabled state
cni0: port 1(veth...) entered forwarding state
cni0: port 1(veth...) entered disabled state
The Kubernetes CNI network was thrashing. Interfaces being created and destroyed faster than I could scroll.
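If you want to watch the churn live during one of the short uptime windows, something like this works (cni0 is the default K3s/flannel bridge name, so adjust if yours differs):
# bridge messages from the previous boot
journalctl -b -1 -k | grep cni0 | tail -n 20
# count the veth interfaces right now; run it twice a few seconds apart
ip -o link show | grep -c veth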
The Pods
Checked K3s:
kubectl get pods --all-namespaces
NAME READY STATUS RESTARTS
openwebui-xxx 0/1 CrashLoopBackOff 47
quartz-vault-xxx 0/1 CrashLoopBackOff 39
Two pods crash-looping. 47 restarts. 39 restarts. They’d been crashing for hours.
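The next step would normally be asking why they crash; I only got fragments out before the next panic. The pod name and namespace below are placeholders, since the full names are truncated above:
# recent events: OOMKilled? failed probes? image pull errors?
kubectl describe pod <pod-name> -n <namespace> | tail -n 30
# logs from the previous, crashed container instance
kubectl logs <pod-name> -n <namespace> --previous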
Every crash:
- Pod dies
- Network namespace destroyed
- CNI bridge interface removed
- Pod restarts
- Network namespace created
- CNI bridge interface added
- Pod crashes again
- Repeat
Hundreds of network interface state changes per minute.
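You can put a rough number on the churn by counting the bridge messages from the boot that panicked (same pattern as the log lines above):
# state transitions cni0 logged before the last panic
journalctl -b -1 -k | grep -c 'entered disabled state'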
The Kernel Panic
The network stack couldn’t handle it. All those rapid interface state changes, the memory churn from pod restarts, the CNI bridge thrashing — something broke deep in the kernel.
Panic.
But I didn’t see the panic. Because of this:
sysctl kernel.panic
# kernel.panic = 10
Ubuntu’s default. When the kernel panics, wait 10 seconds, then automatically reboot.
For production servers with monitoring, this is smart — automatic recovery.
For debugging, this is a nightmare — the system reboots before you can read the panic message.
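If you really need the panic text despite the auto-reboot, the stock netconsole module can stream kernel messages to another machine over UDP so the message survives. The ports, addresses, and interface name below are example values, not my actual LAN:
# on the crashing host: send kernel messages from eth0 to 192.168.1.50:6666
modprobe netconsole netconsole=6665@192.168.1.10/eth0,6666@192.168.1.50/
# on the receiving host: capture whatever arrives, including the panic
nc -u -l 6666 | tee panic.log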
The Loop
The full sequence:
- System boots
- K3s starts
- Pods start crashing (within 30 seconds)
- CNI network thrashes
- Kernel panics (1-3 minutes)
- Wait 10 seconds
- Automatic reboot
- Return to step 1
Every. Single. Time.
The Fix
First, break the reboot loop:
sysctl -w kernel.panic=0
echo 'kernel.panic = 0' >> /etc/sysctl.conf
Now the system will halt on panic instead of rebooting. Not ideal for production, but essential for debugging.
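The same knob exists as a kernel boot parameter, which helps if a panic fires before anything in userspace gets a chance to run sysctl. A sketch assuming GRUB, Ubuntu's default bootloader:
# in /etc/default/grub, append panic=0 to the default command line, e.g.:
# GRUB_CMDLINE_LINUX_DEFAULT="quiet splash panic=0"
update-grub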
Second, stop the chaos:
systemctl stop k3s
No K3s, no pod crashes, no CNI thrashing, no kernel panic.
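One caveat: stop doesn't survive a reboot. If anything else takes the box down, K3s comes straight back on the next boot, so disabling it while debugging is the safer move (plain systemd, nothing K3s-specific):
systemctl disable --now k3s
# re-enable later, once the pods are fixed
systemctl enable --now k3s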
The system stayed up. First stable boot in an hour.
Was It My Code?
I was running a custom gateway service. First instinct: I broke something.
Searched the entire codebase:
grep -r "reboot\|shutdown.*-r\|systemctl.*reboot" bin/ lib/ scripts/
Zero matches. My code doesn’t reboot anything.
Checked the systemd service:
[Service]
Restart=always
Restart the process on failure. Not reboot the system.
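You can confirm what systemd would actually do straight from the unit's properties; the unit name below is a placeholder for the gateway's real service:
# prints Restart= plus the restart rate-limit settings, nothing about the host
systemctl show <gateway>.service -p Restart -p StartLimitBurst -p StartLimitIntervalUSec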
The gateway was a victim, not a perpetrator. It crashed because the kernel underneath it crashed.
The Actual Culprit
Primary cause: K3s pods crash-looping
Contributing factors:
- 8GB RAM shared between K3s, monitoring, gateway, and other services
- Aggressive pod restart policy
- CNI network bridge instability under rapid state changes
Trigger: Something caused the pods to start crashing (OOM? config error? dependency failure?)
Amplifier: kernel.panic=10 turned crashes into an unbreakable loop
The Lessons
Ubuntu’s panic default is dangerous for development. Set kernel.panic=0 on any machine you might need to debug.
K3s on 8GB RAM is risky. The control plane alone wants 1GB. Add pods, and you’re living on the edge. Consider 16GB+ or dedicated K3s nodes.
Crash-looping pods can take down a host. The CNI network changes cascade into kernel-level instability. Resource limits and proper health checks matter.
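For the resource-limit half of that, there is a quick way to bolt limits onto an existing workload without editing manifests. The deployment name and the numbers are placeholders, not what I actually applied:
# cap memory so a misbehaving pod gets OOM-killed on its own instead of starving the host
kubectl set resources deployment <name> --requests=cpu=100m,memory=128Mi --limits=cpu=500m,memory=512Mi
Probes are the other half, and those have to go into the pod spec itself; there is no equivalent one-liner.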
Check system logs before blaming your code. I spent 15 minutes suspecting my gateway before checking journalctl. The kernel panic was right there in the logs.
The Prevention
Immediate:
- kernel.panic=0 prevents the reboot loop
- K3s stopped until pods are fixed
Short-term:
- Fix or delete the crash-looping pods
- Add resource limits to K3s workloads
- Migrate gateway to isolated LXC container
Long-term:
- Dedicated K3s node(s) with more RAM
- Proper monitoring with reboot alerts
- Health checks that prevent infinite crash loops
The kernel wasn’t broken. It was being tortured by Kubernetes. When pods crash 47 times in an hour, something has to give.