The Argo OS Journey - Part 3: The Cloud & The Code

November 28, 2025

By late November, I had an operating system that could survive a bad update. Btrfs snapshots had saved me from the Qt6/elogind disaster. I was feeling invincible.

Then I looked at my desk. Coffee mug. Precariously close to the keyboard. And I realized:

If I spill this on the motherboard, every snapshot on that NVMe dies with it.

Snapshots are local. They’re on the same drive as the data they’re protecting. A drive failure, a fire, a particularly aggressive cup of coffee—and five weeks of work vanishes.

I needed off-site backups. But I’m running Gentoo. I’m not installing some proprietary sync client that scans my files and phones home to the mothership.


The Off-Site Requirement

I had 2TB of Google Drive storage doing absolutely nothing. Might as well use it.

But there was a non-negotiable requirement: Encryption at rest. Google doesn’t get to see my files. Ever.

Enter Rclone

Rclone is rsync for the cloud. It supports 50+ storage backends, and the killer feature is the crypt remote—client-side encryption that happens before anything leaves your machine.

The architecture:

┌─────────────────────────────────────────────────────────────┐
│                    RCLONE ENCRYPTION FLOW                    │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  LOCAL DATA                    ENCRYPTED REMOTE              │
│  /home/commander/Obsidian/ →  gdrive-crypt:Obsidian/        │
│                                                              │
│  ┌──────────────┐             ┌──────────────────────┐      │
│  │ notes.md     │  encrypt →  │ 83d9f8a7b2c1e4f6.bin │      │
│  │ projects/    │  encrypt →  │ f7a2b9c8d1e6f3a4/    │      │
│  │ journal.md   │  encrypt →  │ a1b2c3d4e5f6g7h8.bin │      │
│  └──────────────┘             └──────────────────────┘      │
│                                                              │
│  Google sees: Random binary blobs with gibberish names       │
│  I see: My files, decrypted on-the-fly                      │
└─────────────────────────────────────────────────────────────┘
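Once both remotes are configured (setup below), you can verify this split yourself with rclone’s own listing command. The comments describe what to expect, not captured output:

rclone ls gdrive:encrypted      # the raw remote: scrambled names, opaque contents
rclone ls gdrive-crypt:         # through the crypt layer: real names, real files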

The Setup:

# Configure the base Google Drive remote
rclone config

# n) New remote
# name> gdrive
# Storage> drive
# ... OAuth flow ...

# Now create the encrypted overlay
# n) New remote
# name> gdrive-crypt
# Storage> crypt
# remote> gdrive:encrypted
# filename_encryption> standard
# directory_name_encryption> true
# Password> [generate 128-char random string]
# Password2> [generate another 128-char random string]

Critical: Those passwords are everything. Lose them, and your data is gone forever. I store them on a hardware security key, plus a printed copy in a fireproof safe. Yes, I’m paranoid. No, I don’t care.
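For what it’s worth, rclone’s config wizard offers to generate a random password for you at that prompt. Rolling your own is also a one-liner:

# One way to produce a 128-character random string
tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 128; echo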

The rclone.conf

[gdrive]
type = drive
client_id = [redacted]
client_secret = [redacted]
scope = drive
token = {"access_token":"...","expiry":"..."}
team_drive =

[gdrive-crypt]
type = crypt
remote = gdrive:encrypted
password = [encrypted password blob]
password2 = [encrypted password2 blob]
filename_encryption = standard
directory_name_encryption = true
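A caveat about those password blobs: rclone obscures them, it doesn’t hash them, and rclone reveal will happily turn them back into plaintext. The config file itself is therefore sensitive:

# Anyone who can read this file can decrypt the remote, so lock it down
chmod 600 /home/commander/.config/rclone/rclone.conf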

The OpenRC Challenge

Here’s where Gentoo makes things interesting.

Rclone can mount a remote as a FUSE filesystem. You run rclone mount gdrive-crypt: ~/gdrive and suddenly your cloud storage looks like a local folder.
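Before any init plumbing, a manual smoke test is worth doing. A minimal version, with flags trimmed to the essentials:

mkdir -p ~/gdrive
rclone mount gdrive-crypt: ~/gdrive --vfs-cache-mode writes &
ls ~/gdrive              # your cloud files, decrypted on the fly
fusermount -u ~/gdrive   # unmount when done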

But I wanted it to mount automatically at boot. Rclone is a user-space program. OpenRC is a system init. They don’t naturally get along.

Failure #1: The Hung Boot

My first init script:

#!/sbin/openrc-run
# /etc/init.d/rclone-mount (BROKEN VERSION)

command="/usr/bin/rclone"
command_args="mount gdrive-crypt: /home/commander/gdrive"

depend() {
    need net
}

Result: Boot hung forever. The script never returned because rclone mount runs in the foreground. OpenRC was waiting for it to “finish” starting.

Failure #2: DNS Wasn’t Ready

#!/sbin/openrc-run
# /etc/init.d/rclone-mount (STILL BROKEN)

command="/usr/bin/rclone"
command_args="mount gdrive-crypt: /home/commander/gdrive"
command_background="yes"  # Fixed the hang!
pidfile="/run/rclone-mount.pid"

depend() {
    need net
}

Result: Started, then immediately crashed. The logs:

Failed to mount: failed to get oauth token: Get "https://oauth2.googleapis.com/token": dial tcp: lookup oauth2.googleapis.com: no such host

The net service was “ready” but DNS resolution wasn’t working yet. Classic race condition.

Failure #3: Permission Denied

#!/sbin/openrc-run
# /etc/init.d/rclone-mount (ALMOST WORKING)

command="/usr/bin/rclone"
command_args="mount gdrive-crypt: /home/commander/gdrive --config /home/commander/.config/rclone/rclone.conf"
command_background="yes"
pidfile="/run/rclone-mount.pid"

depend() {
    need net
    after dns  # Wait for DNS!
}

Result: It mounted! But only root could read it. My user got “Permission denied” on every file.

The Working Version

#!/sbin/openrc-run
# /etc/init.d/rclone-mount (FINALLY WORKS)

name="rclone-mount"
description="Mount encrypted Google Drive via rclone"

command="/usr/bin/rclone"
command_args="mount gdrive-crypt: /home/commander/gdrive \
    --config /home/commander/.config/rclone/rclone.conf \
    --allow-other \
    --vfs-cache-mode full \
    --vfs-cache-max-size 10G \
    --vfs-read-ahead 128M \
    --dir-cache-time 72h \
    --poll-interval 15s \
    --log-file /var/log/rclone-mount.log \
    --log-level INFO"

command_background="yes"
command_user="commander"
pidfile="/run/rclone-mount.pid"

depend() {
    need net localmount
    after dns bootmisc
    use logger
}

start_pre() {
    # Ensure mount point exists
    checkpath --directory --owner commander:commander --mode 0755 /home/commander/gdrive

    # Wait for DNS to actually work (belt and suspenders).
    # The host command comes from net-dns/bind-tools.
    local attempts=0
    while ! host oauth2.googleapis.com >/dev/null 2>&1; do
        attempts=$((attempts + 1))
        if [ $attempts -ge 30 ]; then
            eerror "DNS not ready after 30 seconds"
            return 1
        fi
        sleep 1
    done
}

stop() {
    # Graceful unmount
    fusermount -u /home/commander/gdrive 2>/dev/null
    eend $?
}

The key fixes:

  • command_user="commander": Run as my user, not root
  • --allow-other: Let other users (including my GUI session) access the mount (one FUSE prerequisite; see below)
  • --vfs-cache-mode full: Cache files locally for performance
  • start_pre() DNS check: Don’t even try until we can resolve Google’s OAuth endpoint
  • after dns bootmisc: Explicit ordering dependencies

Now ~/gdrive just exists. It looks like a folder. It acts like a folder. The encryption is completely transparent.


The Backup Scripts

Having the mount is step one. Actually backing up is step two.

The Obsidian Sync

My Obsidian vault is my second brain. Losing it would be catastrophic.

#!/bin/bash
# /usr/local/bin/backup-obsidian

SOURCE="/home/commander/Obsidian"
DEST="gdrive-crypt:Obsidian"
LOG="/var/log/backup-obsidian.log"

echo "=== Obsidian Backup Started: $(date) ===" >> "$LOG"

rclone sync "$SOURCE" "$DEST" \
    --config /home/commander/.config/rclone/rclone.conf \
    --exclude ".obsidian/workspace.json" \
    --exclude ".obsidian/workspace-mobile.json" \
    --exclude ".trash/**" \
    --progress \
    --stats-one-line \
    >> "$LOG" 2>&1

EXIT_CODE=$?

if [ $EXIT_CODE -eq 0 ]; then
    echo "Backup completed successfully" >> "$LOG"
else
    echo "BACKUP FAILED with exit code $EXIT_CODE" >> "$LOG"
    # Send notification (optional)
    notify-send "Obsidian Backup Failed" "Check $LOG for details"
fi

echo "=== Backup Ended: $(date) ===" >> "$LOG"

The Btrfs Send/Receive

For system snapshots, I don’t sync files—I send entire Btrfs snapshots.

#!/bin/bash
# /usr/local/bin/backup-snapshot-to-cloud

SNAPSHOT_DIR="/.snapshots"
DEST="gdrive-crypt:snapshots"
TEMP_DIR="/tmp/btrfs-backup"

# Get the latest snapshot
LATEST=$(snapper -c root list --columns number | tail -1 | tr -d ' ')

if [ -z "$LATEST" ]; then
    echo "No snapshots found"
    exit 1
fi

SNAPSHOT_PATH="$SNAPSHOT_DIR/$LATEST/snapshot"

# Create temp directory
mkdir -p "$TEMP_DIR"

# Send snapshot to file (compressed)
echo "Sending snapshot $LATEST to temp file..."
btrfs send "$SNAPSHOT_PATH" | zstd -9 > "$TEMP_DIR/snapshot-$LATEST.btrfs.zst"

# Upload to cloud
echo "Uploading to cloud..."
rclone copy "$TEMP_DIR/snapshot-$LATEST.btrfs.zst" "$DEST/" \
    --config /home/commander/.config/rclone/rclone.conf \
    --progress

# Cleanup
rm -rf "$TEMP_DIR"

echo "Snapshot $LATEST backed up to cloud"

To restore: Download the file, decompress with zstd -d, pipe to btrfs receive.
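Concretely, with a hypothetical snapshot number 42 (note that btrfs receive wants a directory on a btrfs filesystem, such as /.snapshots, not / itself):

# Pull the compressed stream back down
rclone copy gdrive-crypt:snapshots/snapshot-42.btrfs.zst /tmp/ \
    --config /home/commander/.config/rclone/rclone.conf

# Decompress and replay; this creates a new read-only subvolume
zstd -d -c /tmp/snapshot-42.btrfs.zst | btrfs receive /.snapshots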


Building apkg: The Accidental Package Manager

I had solved storage. Now I had a UI problem.

emerge is powerful. It’s also verbose. To install Firefox from my binhost:

emerge --usepkg --getbinpkg --with-bdeps=n --ask www-client/firefox

I’m lazy. I wanted:

apkg install firefox

So I wrote a wrapper. It started as 10 lines of Bash.

It’s now 2,146 lines.

What Happened: Scope Creep

The initial version:

#!/bin/bash
# apkg v0.1 - 10 lines

case "$1" in
    install) emerge --usepkg --getbinpkg --ask "${@:2}" ;;
    remove)  emerge --unmerge --ask "${@:2}" ;;
    update)  emerge --update --deep --usepkg --getbinpkg @world ;;
    search)  emerge --search "${@:2}" ;;
    *)       echo "Usage: apkg {install|remove|update|search} [package]" ;;
esac

Fine. Done. Ship it.

Except… I kept thinking of edge cases.

The Service Detection Feature

When you install Docker on Gentoo, you get the package. You don’t get it running. You have to manually:

  1. rc-update add docker default
  2. rc-service docker start

I always forgot. So I added service detection:

post_install_check() {
    local package="$1"
    local pkg_name=$(echo "$package" | sed 's/.*\///')

    # Check if the package installed a service file
    if [ -f "/etc/init.d/$pkg_name" ]; then
        echo ""
        echo "┌─────────────────────────────────────────────────────┐"
        echo "│  SERVICE DETECTED: $pkg_name"
        echo "├─────────────────────────────────────────────────────┤"
        echo "│  This package installed an OpenRC service file.     │"
        echo "│                                                     │"
        echo "│  To enable at boot:                                 │"
        echo "│    rc-update add $pkg_name default                  │"
        echo "│                                                     │"
        echo "│  To start now:                                      │"
        echo "│    rc-service $pkg_name start                       │"
        echo "└─────────────────────────────────────────────────────┘"
        echo ""

        read -p "Add to default runlevel? [y/N] " choice
        if [[ "$choice" =~ ^[Yy]$ ]]; then
            rc-update add "$pkg_name" default
            read -p "Start service now? [y/N] " start_choice
            if [[ "$start_choice" =~ ^[Yy]$ ]]; then
                rc-service "$pkg_name" start
            fi
        fi
    fi
}

Now when I run apkg install docker, it doesn’t just install. It asks:

SERVICE DETECTED: docker

This package installed an OpenRC service file.

Add to default runlevel? [y/N] y
Adding docker to default runlevel...
Start service now? [y/N] y
Starting docker...

The Snapshot Integration

From Part 2, I had Portage hooks for snapshots. I moved them into apkg:

pre_emerge() {
    if ! snapper -c root create --description "Pre-apkg: $*"; then
        echo "CRITICAL: Could not create snapshot."
        echo "Disk full? Snapper broken? Fix this first."
        read -p "Continue anyway? [y/N] " choice
        [[ ! "$choice" =~ ^[Yy]$ ]] && exit 1
    fi
}

post_emerge() {
    snapper -c root create --description "Post-apkg: $*"

    # Verify critical services
    local issues=0
    for service in dbus elogind; do
        if ! rc-service "$service" status 2>/dev/null | grep -q "started"; then
            echo "WARNING: $service is not running!"
            ((issues++))
        fi
    done

    if [ $issues -gt 0 ]; then
        echo ""
        echo "Your session might break on next login."
        echo "Consider restarting these services before logging out."
    fi
}
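For context, here’s roughly how those hooks wrap the actual emerge call. This is a simplified sketch, not the real 2,146-line flow, and install_cmd is a hypothetical name:

install_cmd() {
    pre_emerge "$@"
    emerge --usepkg --getbinpkg --ask "$@" || return 1
    post_emerge "$@"

    # Offer to enable any services the new packages shipped
    local pkg
    for pkg in "$@"; do
        post_install_check "$pkg"
    done
}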

The GitHub Feature

Then I got ambitious.

Gentoo’s repos are massive, but sometimes you want that one random tool from GitHub that isn’t packaged anywhere.

apkg github https://github.com/someone/cool-tool

What it does:

github_install() {
    local url="$1"
    local repo_name=$(basename "$url" .git)
    local clone_dir="/tmp/apkg-github/$repo_name"

    # Clone
    git clone --depth 1 "$url" "$clone_dir"
    cd "$clone_dir"

    # Detect language and build
    if [ -f "Cargo.toml" ]; then
        echo "Detected: Rust project"
        cargo build --release
        BINARY=$(find target/release -maxdepth 1 -type f -executable | head -1)
    elif [ -f "go.mod" ]; then
        echo "Detected: Go project"
        go build -o "$repo_name"
        BINARY="./$repo_name"
    elif [ -f "setup.py" ] || [ -f "pyproject.toml" ]; then
        echo "Detected: Python project"
        pip install --user .
        BINARY=""  # pip handles installation
    elif [ -f "Makefile" ]; then
        echo "Detected: Makefile project"
        make
        BINARY=$(find . -maxdepth 1 -type f -executable | head -1)
    else
        echo "Unknown project type. Aborting."
        return 1
    fi

    # Install binary
    if [ -n "$BINARY" ] && [ -f "$BINARY" ]; then
        sudo cp "$BINARY" "/usr/local/bin/$repo_name"

        # Track in JSON for later uninstall
        local track_file="/var/lib/apkg/github-packages.json"
        jq --arg name "$repo_name" \
           --arg url "$url" \
           --arg date "$(date -Iseconds)" \
           '. += [{"name": $name, "url": $url, "installed": $date}]' \
           "$track_file" > "${track_file}.tmp" && mv "${track_file}.tmp" "$track_file"
    fi

    # Cleanup
    rm -rf "$clone_dir"

    echo "Installed $repo_name to /usr/local/bin/"
}

Is it a “proper” package manager? No. It doesn’t handle dependencies. It doesn’t track files. It’s held together with string and hope.

Does it work? Yes. And that’s what matters at 2 AM when you just want to install that one CLI tool.
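That tracking file is also what makes the untrack command (and a quick inventory) possible. Assuming the JSON array exists, listing is a one-liner:

# Name, install date, and source URL for everything installed via apkg github
jq -r '.[] | "\(.name)\t\(.installed)\t\(.url)"' /var/lib/apkg/github-packages.json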

The Sanity Check

After the Qt6/elogind incident, I added a pre-flight check:

sanity_check() {
    local issues=0

    echo "Running pre-update sanity check..."

    # D-Bus must be running
    if ! rc-service dbus status | grep -q "started"; then
        echo "ERROR: D-Bus is not running"
        echo "       Session management will fail"
        ((issues++))
    fi

    # elogind must be running
    if ! rc-service elogind status | grep -q "started"; then
        echo "ERROR: elogind is not running"
        echo "       You won't be able to log in graphically"
        ((issues++))
    fi

    # Check if we can create sessions
    if ! loginctl list-sessions &>/dev/null; then
        echo "ERROR: loginctl not responding"
        echo "       Session management is broken"
        ((issues++))
    fi

    # Check disk space
    local root_usage=$(df / | tail -1 | awk '{print $5}' | tr -d '%')
    if [ "$root_usage" -gt 90 ]; then
        echo "WARNING: Root filesystem is ${root_usage}% full"
        echo "         Snapshot creation may fail"
        ((issues++))
    fi

    # Check if snapper is working
    if ! snapper -c root list &>/dev/null; then
        echo "ERROR: Snapper is not responding"
        echo "       No rollback capability"
        ((issues++))
    fi

    if [ $issues -gt 0 ]; then
        echo ""
        echo "Found $issues issue(s). Fix before updating!"
        return 1
    fi

    echo "All checks passed."
    return 0
}

Now every apkg update runs this first. If something’s wrong, I know before I break my system.


The “Production” Crisis (November 30)

I was feeling good. Deployed Argo OS to my laptop. Everything syncing. Backups flowing to the cloud.

I rebooted my desktop.

KDE Plasma crashed immediately.

The logs:

Nov 30 14:22:31 kwin_wayland[1234]: Authorization denied
Nov 30 14:22:31 kwin_wayland[1234]: Failed to connect to socket
Nov 30 14:22:31 polkitd[567]: Unregistered Authentication Agent

The Debugging

# What's running?
rc-status

# Output (trimmed):
# dbus            [ started ]
# elogind         [ started ]
# sddm            [ started ]
# ...
# udisks2         [ stopped ]   <-- WAIT WHAT

udisks2 was stopped. And I had removed it from the default runlevel.

Why?

Because two weeks ago, in my zeal to optimize, I thought: “I don’t need auto-mounting. I’ll mount drives manually like a Real Linux User.”

The Dependency Chain of Doom

udisks2 (stopped)
    → polkit can't query storage permissions
        → KDE's device manager fails initialization
            → Plasma shell crashes during startup
                → Session ends immediately
                    → Back to SDDM login screen

It wasn’t even about mounting drives. KDE just… needs udisks2. It queries it during startup to enumerate available storage devices.

The Fix

Drop to TTY2:

rc-update add udisks2 default
rc-service udisks2 start

Restart SDDM. Desktop loads.

Time lost: 45 minutes of debugging.

Lesson learned: Don’t remove services because you think you don’t need them. You probably do.

The Expanded Sanity Check

I added udisks2 to the sanity check:

# In sanity_check()

# KDE session requirements
for service in dbus elogind udisks2 polkit; do
    if ! rc-service "$service" status 2>/dev/null | grep -q "started"; then
        echo "ERROR: $service is not running"
        echo "       KDE session will likely fail"
        ((issues++))
    fi
done

Now apkg warns me if I’ve accidentally stopped something critical.


The Statistics

By the end of November:

Backup System:

  • Cloud storage used: 847 GB of 2 TB
  • Obsidian vault: 12 GB (encrypted)
  • System snapshots: 15 full snapshots (compressed with zstd)
  • Backup frequency: Obsidian every 2 hours, snapshots weekly

apkg:

  • Lines of code: 2,146
  • Functions: 47
  • Supported commands: install, remove, update, search, info, files, deps, rdeps, sync, clean, github, untrack
  • GitHub packages tracked: 23

Recovery Capabilities:

  • Local snapshot rollback: ~2 minutes
  • Cloud snapshot restore: ~30 minutes (limited by download speed)
  • Full system rebuild from scratch: Never tested. Don’t want to.

What I Learned

Local Backups Are Not Backups

If your backups are on the same physical device as your data, they’re not backups. They’re a false sense of security.

The 3-2-1 rule exists for a reason:

  • 3 copies of your data
  • 2 different storage media
  • 1 off-site

I now have: NVMe (working copy) + Btrfs snapshots (local backup) + Google Drive (off-site). Three copies, two media types, one off-site.

Encryption Is Non-Negotiable

Cloud storage providers can:

  • Read your files
  • Comply with government requests
  • Get breached
  • Change their terms of service

Client-side encryption means none of that matters. Even if Google gets hacked, the attackers get encrypted blobs with encrypted filenames. Useless without my keys.

Scope Creep Can Be Good

apkg was supposed to be 10 lines. It’s now 2,146. That sounds like a failure of project management.

But every feature I added solved a real problem I actually had. Service detection saves me from forgetting to enable daemons. Snapshot integration ensures I can rollback. The sanity check prevents me from breaking my session.

Sometimes scope creep is just… building the tool you actually need.

Don’t Remove Things You Don’t Understand

I removed udisks2 because I thought I understood what it did. I was wrong.

The lesson: If you’re going to remove a system service, first understand everything that depends on it. equery depends udisks2 would have shown me that half of KDE needs it.


The Recovery Checklist (Updated)

For future reference, when something breaks:

# 1. Don't panic. Switch to TTY.
Ctrl+Alt+F2

# 2. Check critical services
rc-status
loginctl list-sessions

# 3. If session-related
rc-service elogind restart
rc-service dbus restart
rc-service udisks2 restart
rc-service polkit restart

# 4. If package-related
apkg sanity  # Run the sanity check

# 5. If all else fails
snapper list
# Find last working snapshot
snapper rollback <number>
reboot

# 6. If local snapshots are toast
# Download from cloud:
rclone copy gdrive-crypt:snapshots/snapshot-XX.btrfs.zst /tmp/
zstd -d /tmp/snapshot-XX.btrfs.zst
# Receive into a btrfs directory (e.g. /.snapshots), never / itself
btrfs receive /.snapshots < /tmp/snapshot-XX.btrfs

In Part 4, we look 10 years into the future. I realize that managing dotfiles with Bash scripts is a losing game. Gentoo is perfect for hardware optimization—but what about configuration reproducibility? The answer is Nix. And no, I’m not switching distros.

Continue to Part 4: The Hybrid Vision →