The Logs

"Document Everything."
Dev Logs, Personal Ramblings, and the raw reality of the lab. ⚠️ Raw Output

journal_tree.exe
$ pstree -p journal
journal
├── 2026
│   ├── 03
│   ├── 02
│   └── 01
├── 2025
│   ├── 12
│   ├── 11
│   ├── 10
│   ├── 09
│   ├── 08
│   ├── 06
│   ├── 04
│   └── 03
├── 2024
│   ├── 11
│   └── 05
└── 2023
    ├── 12
    ├── 09
    └── 08

The Silent Monitoring Problem

55 monitors running in Uptime Kuma. Zero notification channels. I had monitoring that monitored nothing to nobody. Also: rewired OpenClaw from 63 daily LLM calls down to 9, and split the status page so my homelab instability doesn't tank the public uptime score.

What OpenClaw Does at 3 AM

I have an AI agent running 24/7 on my homelab. It checks 55 monitors, audits security configs, tests playground health, and reports to Telegram. Here's what it actually does when I'm asleep — and what happens when it breaks.

The Recovery Script Meets Production

March 9, 2026. A 28KB Bash script with 8 phases, 5 log files, and zero interactivity. It's supposed to fix a Gentoo system that lost authentication. I also extracted two more ArgoBox modules and restarted the RAG embedding pipeline. Some days you fix infrastructure. Some days you build new infrastructure while fixing old infrastructure.

Forty-Two Files, One Saturday

Woke up and decided to harden every API endpoint in ArgoBox. Error messages leaking internal URLs. Stack traces in production. A KV cache that retried on success and gave up on failure. 42 files changed across 3 commits by the end of the day.
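For reference, the retry logic that cache should have had looks roughly like this. It is a minimal Python sketch, not the ArgoBox code: `get_with_retry` and its parameters are hypothetical names, and `fetch` stands in for whatever call reads from the KV store.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def get_with_retry(fetch: Callable[[], T], attempts: int = 3, delay: float = 0.0) -> T:
    """Return on the first success; retry only when fetch() raises."""
    last_error: Exception | None = None
    for attempt in range(attempts):
        try:
            # Success path: return immediately, never retry.
            return fetch()
        except Exception as exc:  # hypothetical: narrow this to the store's error type
            last_error = exc
            if attempt < attempts - 1:
                time.sleep(delay)
    # Every attempt failed: surface the last error instead of hiding it.
    assert last_error is not None
    raise last_error
```

The inverted version of this loop is easy to write by accident: a single flipped condition retries the healthy path and bails on the first error.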

The Placeholder That Broke Telegram

OpenClaw's Telegram bot was in a restart loop for who knows how long. The root cause? A config value that literally said REPLACE_WITH_TELEGRAM_BOT_TOKEN. Nobody replaced it. Not me, not the deployment process, not anyone.
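A startup check along these lines would have caught it before the first restart. This is a sketch under assumptions: the config is a flat dict of strings, and `find_placeholders`, the prefix list, and the key names are made up for illustration rather than taken from OpenClaw.

```python
# Prefixes that suggest a template value was never filled in (illustrative list).
PLACEHOLDER_PREFIXES = ("REPLACE_WITH_", "CHANGEME", "YOUR_")


def find_placeholders(config: dict[str, str]) -> list[str]:
    """Return the keys whose values still look like unreplaced placeholders."""
    return sorted(
        key
        for key, value in config.items()
        if isinstance(value, str)
        and value.strip().upper().startswith(PLACEHOLDER_PREFIXES)
    )


if __name__ == "__main__":
    # Hypothetical config; only the placeholder string comes from the post.
    config = {
        "telegram_bot_token": "REPLACE_WITH_TELEGRAM_BOT_TOKEN",
        "chat_id": "12345",
    }
    leftover = find_placeholders(config)
    if leftover:
        print(f"Refusing to start, placeholders left in: {leftover}")
```

Failing once with a clear message beats a restart loop that nobody is watching.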

The System That Wouldn't Let Me In

March 4-6, 2026. A Gentoo system that crashed on March 2 finally boots on March 4. The emerge claims it finished. The system boots. But then... your password is wrong. Your password is always wrong. Even though you KNOW you're typing it correctly.

The Crash You Don't See Coming

March 2, 2026, 10:02 AM. A Gentoo @world update decided to unmerge X11 libraries while the display server was actively using them. The entire kernel locked up in less than a second. The weird part? I didn't notice.

Five Projects Burning

March 1, 2026. The day I kicked off a Gentoo @world update and then ignored it. Also extracted EdgeMail from production, wrote 6,500 lines of code, optimized deployment pipelines, and built a Wompus Protocol to save tokens. Normal Tuesday.

RAG Multi-Model: The Day I Discovered My Vector Database Was Broken

Rebuilt the RAG system to support three embedding models across three content tiers. Found a mixed-dimension bug that broke vector search. Fixed it. Created tooling to prevent it happening again. 294,000 chunks and counting.
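The kind of guard that prevents a mixed-dimension collection can be sketched in a few lines. This is not the actual tooling from the post: the function names are hypothetical, and 384 and 768 are only example sizes, not the dimensions of the three models used.

```python
from collections import Counter
from typing import Sequence


def dimension_report(vectors: Sequence[Sequence[float]]) -> Counter:
    """Count how many vectors exist at each dimension."""
    return Counter(len(vector) for vector in vectors)


def find_mismatched(vectors: Sequence[Sequence[float]], expected_dim: int) -> list[int]:
    """Return the indexes of vectors whose length differs from expected_dim."""
    return [i for i, vector in enumerate(vectors) if len(vector) != expected_dim]


if __name__ == "__main__":
    # Two vectors from one model, one stray vector from another (example sizes).
    chunks = [[0.0] * 384, [0.0] * 384, [0.0] * 768]
    print(dimension_report(chunks))      # how many vectors per dimension
    print(find_mismatched(chunks, 384))  # indexes that need re-embedding
```

Running a check like this before each insert keeps a collection to a single dimension, which is what distance calculations between vectors require.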

Contrast Fixes and Module Audits: Two Small Tasks That Matter

Fixed documentation page contrast so text is actually readable on mobile. Audited all 28 ArgoBox modules and resolved orphaned pages. Two small tasks that freed up cognitive space for the next thing.

Knowledge Graph Physics: Bounding Boxes and Settle Detection

Spent a Saturday evening tuning the physics engine for the blog knowledge graph. Added viewport bounds clamping and settle detection to Tendril. Made the graph stop bouncing around when you're trying to read it.
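Both ideas are small enough to sketch. The Python below shows the general technique only and is not Tendril's implementation: the `Node` fields, the margin, and the speed threshold are all assumptions.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Node:
    x: float
    y: float
    vx: float = 0.0
    vy: float = 0.0


def clamp_to_viewport(node: Node, width: float, height: float, margin: float = 0.0) -> None:
    """Keep a node inside the viewport and kill velocity on the axis that hit a wall."""
    if node.x < margin or node.x > width - margin:
        node.x = min(max(node.x, margin), width - margin)
        node.vx = 0.0
    if node.y < margin or node.y > height - margin:
        node.y = min(max(node.y, margin), height - margin)
        node.vy = 0.0


def is_settled(nodes: Iterable[Node], speed_threshold: float = 0.01) -> bool:
    """The graph counts as settled once every node moves slower than the threshold."""
    limit = speed_threshold ** 2  # compare squared speeds to avoid a sqrt per node
    return all(n.vx * n.vx + n.vy * n.vy < limit for n in nodes)
```

A simulation loop would call `clamp_to_viewport` after each position update and stop ticking once `is_settled` returns true, which is what keeps the graph still while someone is reading it.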

Bug Fixes, Identity Systems, and Responsive Design

Fixed three critical bugs in the job automation external ATS flow. Built an identity ground truth system for the site. Redesigned the admin area to actually work on mobile phones.

Modularization: Building Foundations That Scale

Extracted job automation into a standalone package, built a module system for ArgoBox, and created user settings infrastructure. One day of intentional architecture that enables future features without breaking the present.

Applyr Standalone: When Internal Tools Become Products

Extracted a job automation system into a standalone Python package. Multi-platform (LinkedIn + Indeed), evidence capture for compliance, modular architecture. When your internal tool is good enough to use yourself, it's good enough to open source.

Infrastructure and Automation: One Day, Two Worlds

Built a dynamic homelab dashboard and started on job automation in a single day. One session debugging infrastructure APIs and proxy routes. Another session laying foundations for a system that applies to 50 jobs while you sleep.

The Admin Panel Fix & The Duplicate Build Crisis

3 port mismatches that broke admin connectivity, 690MB of binpkgs stuck in staging, duplicate drone builds costing compute, and the design of a package control system to prevent it all happening again.

The Crash Recovery & The Lab Engine Goes Live

System crash at 00:30, crash recovery, fleet package drift discovered, lab engine deployed to production, UX fixed, and multi-container support planned. 4 parallel sessions, 7 critical infrastructure issues resolved.

The Security Overhaul and the Hardcoded IP

Network page rewrite, 5-layer lab engine hardening, and the internal IP that shouldn't have been in production code. 6 files, 1 embarrassing discovery.

25 Bugs, One Night

12.5 hours, 25+ bugs across 4 components, deployed to every node before sunrise. The v2.5.1 overnight marathon.

Monitoring Stack in 60 Minutes

Deployed Prometheus, Grafana, Loki, cAdvisor, Smokeping, Healthchecks, and Promtail to Altair-Link. Also, nodejs decided it needed to compile.

The Stale Queue That Grounded My Drones

7 hours diagnosing why 5 drones sat idle while 149 packages needed building. Version mismatch, corrupt binpkgs, and a fresh start.

The 3-Hour Physics Session

I finally cracked why my knowledge graph felt dead. Variable edge lengths. Spines and bridges. The Obsidian feel, achieved at last.

The Dashboard That Lied To Me

I built a fancy NOC dashboard with storage metrics and network stats. Then I noticed the numbers changed every time I refreshed. Math.random() had been running production monitoring for who knows how long.

The Orchestrator That Rebooted My Workstation

I built an auto-healing build swarm. Then it SSH'd into my development machine and ran 'reboot'. The container reported the wrong IP and the orchestrator executed its cleanup protocol. On my workstation. At 8:30 PM.

81MB of Markdown

Obsidian was crashing every time it tried to index my vault. Turns out I had 81MB of conversation archives in there. The indexer was not amused.

The Local Library Refactor

The knowledge graph was fragile. Remote dependencies kept breaking. Today I ripped it all out and built @argobox/tendril-graph locally. It works now.

Three Rsync Bugs In One Day

A 7-hour debugging session uncovered three separate rsync bugs: missing timeouts, an invalid SSH flag, and uploading 3GB instead of one package. Also built a CLI tool because I was tired of SSHing everywhere.

Tendril Goes Open Source

The knowledge graph is no longer just an ArgoBox feature. Today I extracted it into its own library. MIT license. Free forever.

The Binpkg Cleanup Bug

Drones were hoarding 5GB of binary packages because the cleanup code was looking for directories that don't exist anymore. Three hours to fix what should have been obvious.

The Drone That Wasn't A Drone

I SSH'd into my drone to debug why it was offline. Got a Zorin OS login prompt instead of Gentoo. Spent 20 minutes troubleshooting the wrong machine because two devices had the same IP address.

The Printer That Forgot Its Subnet

The printer was on the same physical network. CUPS could see it via mDNS. But packets weren't going anywhere. Turns out a power surge knocked it back to a static IP from an old network config — 192.168.0.104 on a 10.42.0.x network.
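The mismatch is easy to confirm with Python's standard `ipaddress` module. The `/24` mask is an assumption based on the "10.42.0.x" description, and `on_same_subnet` is a helper name made up for this sketch.

```python
import ipaddress


def on_same_subnet(host: str, network: str) -> bool:
    """True if the host address falls inside the given network."""
    return ipaddress.ip_address(host) in ipaddress.ip_network(network)


if __name__ == "__main__":
    lan = "10.42.0.0/24"  # assumed mask for the 10.42.0.x network
    print(on_same_subnet("192.168.0.104", lan))  # the printer's stale static IP
    print(on_same_subnet("10.42.0.50", lan))     # an example address that would work
```

mDNS discovery is multicast on the local link, so the printer can still announce itself while unicast traffic to an address outside the subnet has nowhere to go.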

The Drone That Rebooted the Wrong Server

When the orchestrator tried to reboot a misbehaving build drone, it accidentally rebooted the gateway instead — four times in one day. NAT masquerade, Tailscale routing, hardcoded blocks from Past Me, and the full story of getting 58 cores back online.

The Language Server That Froze Everything

System hard-locked during a coding session. The culprit: a language server using 41% CPU and 3.3GB RAM while 'idle', with active connections to Google's cloud.

The Taskbar That Stopped Responding

Plasma frozen. Three plasmashell processes. Three weather widgets. One evening of debugging that ended with a better reset script.

The Tunnel With Two Heads

My websites were flipping between working and broken at random. Same URL, same moment, different results. Turns out I had two cloudflared instances fighting over the same tunnel — and Cloudflare was helpfully load-balancing between them.

The Race Condition That Ate My Binaries

Drones were deleting their packages before the orchestrator could validate them. Also: a Docker container crash-looping because it was looking for SSH keys that don't exist in the new architecture.

The AudioBooks Folder That Ate Itself Three Times

When you lose your Claude context mid-cleanup and discover your Unraid server has 3.5TB of audiobooks nested three folders deep with 3,582 empty placeholder folders for good measure.

The Kernel That Panicked Every Three Minutes

Server rebooting every 1-3 minutes. Couldn't stay up long enough to investigate. Turned out K3s pods were crash-looping so hard they destabilized the kernel, and Ubuntu's default panic setting auto-rebooted before I could catch it.

The Reboot Loop That Blamed the Wrong Code

When Alpha-Centauri started rebooting every 90 seconds, I was convinced my build swarm code had achieved sentience and was trying to escape. Spoiler: it was innocent.

The 1.36 Million Segment Stream

When someone clicks play on an audiobook and the server tries to transcode 2,272 hours of audio in one stream, you know something's wrong. Also discovered 3,584 empty folders and a deleted log file eating 112MB of RAM.
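The two numbers are consistent with each other if the stream was cut into 6-second segments. That segment length is an assumption, since the post summary does not state it, but the arithmetic lines up:

```python
import math


def segment_count(hours: float, segment_seconds: float) -> int:
    """Number of fixed-length segments needed to cover a stream of the given duration."""
    return math.ceil(hours * 3600 / segment_seconds)


if __name__ == "__main__":
    # 2,272 hours is from the post; the 6-second segment length is assumed.
    print(segment_count(2272, 6))
```

2,272 hours is 8,179,200 seconds, and dividing by 6 gives 1,363,200 segments, which rounds to the 1.36 million in the title.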

The RAID That Almost Ate Christmas

Four HGST drives. One dying Synology. USB docks that kept disconnecting. The week between Christmas and New Year's became a crash course in mdadm, LVM, Btrfs, and why you should never trust USB for data recovery.

The Nix That Contaminated Everything

SDDM wouldn't log me in. X11 wouldn't start. Wayland was dead. The culprit? Home Manager had quietly rewritten my shell configuration.

The Extent Tree That Vanished

ERROR: could not find extent tree. Seven words that meant my Btrfs filesystem had lost track of which blocks were in use. Recovery time.

The Package Manager That Deleted Itself

Home Manager and nix-env conflicted. The solution was to remove all nix-env packages. Including Home Manager. Which was installed via nix-env.

The RAID That Refused to Rebuild

Synology NAS RAID degraded. One drive failed. The replacement wouldn't integrate. 86 messages across two days to figure out why - and it wasn't the drive's fault.

The 244-Message Waybar

Customizing Waybar for Hyprland. Modules, colors, spacing, hover effects - 244 messages to get a status bar that looks exactly right. Sometimes the details matter more than the function.

The Namespace Steam Demanded

Steam wanted user namespaces. Gentoo said 'what namespaces?' Turns out when you compile your own kernel, you have to actually enable the features your software needs.

The TTY That Saved Everything

Gentoo wouldn't boot to GUI. KDE Plasma broken. SDDM wouldn't start. 322 messages, multiple recovery attempts, and the realization that Ctrl+Alt+F2 is the most important shortcut in Linux.

The VM That Couldn't Find Its Disk

Dracut complained about missing /dev/nbd0p3. The VM's XML had PCI slots at 0. And GRUB had the root device listed twice. Three problems. One boot failure.

The Monitors That Windows Stole

Rebooted from Windows. Two of three monitors vanished. The fix involved removing NVIDIA 390, installing NVIDIA 580, and realizing the kernel module never got installed.

The GRUB That Forgot Everything

Deleted a corrupted GRUB. Now /etc/grub.d/ was empty. os-prober couldn't see Windows or CachyOS. NVIDIA parameters were wrong. Found the working config in a backup file I didn't know existed.

The VPN That Only Worked the Second Time

LibreWolf through a VPN namespace. Worked perfectly — on the second launch. First try always failed. Turned out the fix that was supposed to help made everything worse.

The 88 Reboots Mystery

88 reboots in 3 weeks. Every login was a coin flip. Turned out PCIe Gen4 and my aging motherboard were having a disagreement about timing. Fixed it, then immediately broke my right monitor.

The Mounts That Wouldn't Come Back

Lost network connectivity. NAS mounts died. Network came back. Mounts didn't. Device busy, no such file, stale handles everywhere. Found duplicates in fstab and an ancient SMB version.

The Intruder That Wasn't

ruTorrent containers stopped working. An unknown 'admin' user appeared in the logs. External IP addresses hitting the login page. Was it a hacker? The investigation said no.

The Phone That Kept Redirecting

My daughter's phone was redirecting speedtest.net to bbump-me-push.com. Then to Etsy affiliate links. Antivirus found nothing. Play Protect found nothing. Turned out to be a game that modified the APN settings.

The Clock That Forgot Its Timezone

Installed Linux next to Windows. Now Windows thinks it's seven hours earlier. Every time. Turns out Windows and Linux disagree on what 'time' even means at the hardware level.
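The disagreement comes down to how each OS reads the same hardware clock value: Windows treats it as local time by default, while Linux typically treats it as UTC. A minimal sketch, assuming a fixed UTC-7 offset (Mountain Standard Time, ignoring DST) and hypothetical helper names:

```python
from datetime import datetime, timedelta, timezone

# Assumed zone for illustration: a fixed UTC-7 offset with no DST handling.
MOUNTAIN_STANDARD = timezone(timedelta(hours=-7))


def read_rtc_as_utc(rtc: datetime) -> datetime:
    """Interpret the naive hardware clock value as UTC."""
    return rtc.replace(tzinfo=timezone.utc)


def read_rtc_as_local(rtc: datetime, local_zone: timezone) -> datetime:
    """Interpret the same naive hardware clock value as local wall-clock time."""
    return rtc.replace(tzinfo=local_zone)


def disagreement(rtc: datetime, local_zone: timezone) -> timedelta:
    """How far apart the two interpretations of one clock value are."""
    return abs(read_rtc_as_local(rtc, local_zone) - read_rtc_as_utc(rtc))


if __name__ == "__main__":
    rtc_value = datetime(2024, 1, 15, 12, 0)  # arbitrary example reading
    print(disagreement(rtc_value, MOUNTAIN_STANDARD))
```

The size of the error always equals the zone's UTC offset, which is why a seven-hour jump points at the hardware clock convention rather than a bad time server.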

The Hour That Kept Shifting

Dual-booted EndeavourOS next to Windows. Now my clock is wrong. Mountain Time, but off by an hour. Turns out Windows and Linux disagree about what time the hardware clock should store.

The BIOS That Wouldn't Show

ASUS board with a 4790K. Wireless keyboard. Four monitors connected to a 4070 Ti. I was mashing F2 and Delete for ten minutes. Turns out I was probably getting into BIOS the whole time.

The VLAN for the Surveillance Phone

Work phone with MDM. Wanted to see what it was sending home. Set up a quarantine VLAN on the MikroTik, plugged in a WAP. Phone kept getting the wrong IP. Turned out I was connecting to the wrong SSID.

The Build That Panicked

Astro build failing on Cloudflare Pages with 'panic: html: bad parser state: originalIM was set twice'. Spent an hour debugging SVG components. The real issue? Using 'latest' for dependencies.

The Obsidian Graph Dream

12:20 AM. Staring at my blog. Something's missing. I want it to feel like my Obsidian vault. Time to build a knowledge graph.

The Obsidian Container That Wouldn't Connect

Obsidian running in a K3s pod via XPRA. Works internally. 502 Bad Gateway externally. The container was alive, the process was running, but something between XPRA and Cloudflare wasn't speaking the same language.

The Namespace That Wouldn't Die

cattle-system and cert-manager stuck in 'Terminating' for 15 days. Force deletes did nothing. JSON patches did nothing. Turns out you can't delete a namespace when the API server still thinks a stale custom resource exists.

The Update That Broke Storage Manager

DSM update. SMB reinstall. Now 9 services won't start. 'Storage abnormalities detected.' Even Storage Manager itself was broken. 116TB of data sitting there, accessible but unmanageable.

The Pool That Refused to Import

Fresh Proxmox install over an old one. 'Failed to start Import ZFS pool' on every boot. No pools listed. But there was a pool - it just wouldn't admit it.

The Plex That Couldn't See

Plex on one machine. Media on the NAS. Same network. But the library was empty. The files existed. The shares were mounted. Plex just... couldn't see them.

The 34-Message tmux Install

Installing tmux on a Synology NAS. Should be simple. Except DSM isn't standard Linux, and package managers don't exist. Enter Entware and 34 messages of troubleshooting.

The Vault That Opens Itself

Daily notes should exist whether I'm at the computer or not. A bash script, a cron job, and the obsidian:// URI scheme. Now the vault maintains itself.

The Dataview That Pulled Everything

52 messages to write one Dataview query. Pulling text from specific subheadings, across dated folders, displaying in chronological order. When the query finally worked, it felt like magic.

The 463-Message Saga

I spent 4 days and 463 ChatGPT messages trying to get Docker and Traefik working. Day 3 alone was 238 messages. But when that curl finally returned 200 OK at 11 PM on day 4, I may have scared the neighbors.

The Vault That Needed Boundaries

Work notes, personal journal, letters to my daughter, technical documentation - all in one Obsidian vault. Time to create structure without losing connections.

The 176-Message Obsidian Setup

Setting up Obsidian journaling templates. 176 messages to get daily notes, templater, and dataview working together. The result: a second brain that actually thinks.

The Scope That Could Save You

My employer wanted me to pentest a client from my home IP. Without a signed scope of work. This conversation might have saved my career.

The VNC That Wouldn't Connect

126 messages to get VNC working on Debian. Residual configs from failed attempts, conflicting packages, systemd units that wouldn't die. Sometimes you have to burn it all down and start fresh.

The Honeypots That Lie in Wait

Researching honeypot options for the home lab. Kippo, Cowrie, Dionaea, Honeyd - each one a different trap for a different kind of attacker. The question: which one catches the most interesting flies?

The NAS That Needed a Fence

August 2023. I wanted to access my Synology from work. The question: VPN or expose it to the internet? The answer involved pfSense firewall rules, port restrictions, and learning why 'just forward port 445' is a terrible idea.

The Lab That Started It All

ArgoBox didn't start in 2023. It started around 2011 as a seedbox - ruTorrent, Plex, bare metal scripts. Then ESXi. Then distributed. Then unified. August 2023 was just when I started documenting the journey.