Build Swarm

Build Swarm Handbook

Complete reference for operating the Gentoo Build Swarm - daily operations, workflows, and troubleshooting

January 25, 2026

Gentoo Build Swarm - Complete Handbook

Version: 2.6

This is the complete reference for operating the Gentoo Build Swarm. Start here.

Quick Start

The One Command You Need

build-swarm status

This shows you what’s happening AND tells you what to do next:

═══ BUILD SWARM STATUS ═══

Gateway:      ✓ 10.42.0.199
Orchestrator: ✓ 10.42.0.201 (API Online)

Build Progress:
  Needed:    0
  Building:  2
  Complete:  75
  Blocked:   1

═══ NEXT ACTION ═══
⚠  1 package(s) blocked:
     • =www-client/brave-browser-1.86.139

  → Run: build-swarm fix-blocked

The Complete Workflow

  1. Run build-swarm status
  2. Do whatever it tells you
  3. Repeat

That’s it.
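The loop above can be partly scripted. The following is a minimal sketch that pulls the suggested command out of the status output. It assumes the output always carries the suggestion on a line containing "→ Run:", as in the sample shown earlier; the handbook does not say whether that format is stable. The helper name next_action is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: read `build-swarm status` output on stdin and print
# the command from the first "→ Run:" line. Prints nothing when the output
# contains no suggestion.
next_action() {
    sed -n 's/^.*→ Run: *//p' | head -n 1
}

# Usage (not run here):
#   build-swarm status | next_action
```

Printing the suggestion rather than executing it keeps a human in the loop, which matches the "do whatever it tells you" workflow.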

Daily Operations

Starting a Fresh Build

When you want to update your system:

# Option A: Fresh start (recommended after issues)
build-swarm fresh

# Option B: Continue from previous state
build-swarm release

What fresh does:

  1. Clears staging directory
  2. Resets orchestrator state
  3. Syncs portage trees on all nodes
  4. Discovers needed packages
  5. Distributes builds to drones

Monitoring Progress

# Interactive TUI dashboard
build-swarm monitor

# Or just check status
build-swarm status

Monitor keybindings:

  • q - Quit
  • b - Balance workload
  • u - Unblock failed packages
  • R - Reset swarm (careful!)

When Builds Complete

# 1. Verify the build is safe
build-swarm verify

# 2. If Risk Score < 10, release to production
build-swarm finalize

# 3. Update your desktop
sudo apkg update
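The "Risk Score < 10" rule in step 2 can be written as a small guard. This is only a sketch: the handbook does not document how build-swarm verify reports the score in machine-readable form, so the score is passed in by hand, and it is assumed to be a whole number. The function name risk_ok is hypothetical.

```shell
#!/bin/sh
# Hypothetical guard: succeed (exit 0) only when the given Risk Score is
# below 10, the threshold used in this handbook.
# Assumption: the score is a non-negative whole number read off the
# `build-swarm verify` output by the operator.
risk_ok() {
    score=$1
    case $score in
        ''|*[!0-9]*)
            echo "risk_ok: not a whole number: '$score'" >&2
            return 2
            ;;
    esac
    [ "$score" -lt 10 ]
}

# Usage (not run here):
#   risk_ok 7 && build-swarm finalize
```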

Handling Blocked Packages

# Diagnose the issue
build-swarm fix-blocked

# Common fixes:
build-swarm sync-overlays     # Missing overlay
build-swarm build-local <pkg> # Kernel-specific (nvidia-drivers)
build-swarm unblock           # Retry transient failures
build-swarm sync-fix          # Out-of-sync portage trees

Build Workflows

Standard Update Workflow

┌─────────────────────────────────────────────────────────────┐
│                    YOUR TYPICAL DAY                          │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│   1. build-swarm fresh          # Start a clean build       │
│          │                                                   │
│          ▼                                                   │
│   2. build-swarm monitor        # Watch progress (optional) │
│          │                                                   │
│          ▼                                                   │
│   3. build-swarm status         # Check when complete       │
│          │                                                   │
│          ▼                                                   │
│   4. build-swarm verify         # Safety check              │
│          │                                                   │
│          ▼                                                   │
│   5. build-swarm finalize       # Release to production     │
│          │                                                   │
│          ▼                                                   │
│   6. sudo apkg update           # Update your desktop       │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Quick Reference Table

I want to…                  Run this
Check what’s happening      build-swarm status
Start a fresh build         build-swarm fresh
Watch builds live           build-swarm monitor
See blocked packages        build-swarm fix-blocked
Retry failed packages       build-swarm unblock
Check if release is safe    build-swarm verify
Release to production       build-swarm finalize
Update my desktop           sudo apkg update
Update a drone              build-swarm update-drone <name>

Staging vs Production

The swarm uses a two-tier storage model for safety:

STAGING (/var/cache/binpkgs-staging/)
  ↓ Drones upload here
  ↓ NOT served to clients
  ↓ Work in progress

    ──── build-swarm finalize ────

PRODUCTION (/var/cache/binpkgs/)
  ↓ Nginx serves from here
  ↓ Atomic release (all or nothing)
  ↓ What clients actually see

Key point: Packages in staging are invisible to your desktop until you run finalize.
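To see how much is waiting in staging compared with what clients can already fetch, you can count the package files in each directory. The sketch below uses the two paths given above and the standard Gentoo binary package suffixes (.tbz2 and .gpkg.tar). The function name count_binpkgs is made up for this example, and build-swarm status may well report the same numbers.

```shell
#!/bin/sh
# Hypothetical helper: count binary package files under a directory.
# Gentoo binary packages end in .tbz2 (XPAK) or .gpkg.tar (GPKG).
# A missing or unreadable directory counts as 0.
count_binpkgs() {
    find "$1" -type f \( -name '*.tbz2' -o -name '*.gpkg.tar' \) 2>/dev/null \
        | wc -l | tr -d ' '
}

# Usage (not run here; paths from this handbook):
#   echo "staging:    $(count_binpkgs /var/cache/binpkgs-staging)"
#   echo "production: $(count_binpkgs /var/cache/binpkgs)"
```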

Command Reference

Status & Monitoring

Command                     Description
build-swarm status          Show current status and next action
build-swarm monitor         Launch interactive TUI dashboard
build-swarm info            Show connection info (Gateway IP, Orchestrator IP)
build-swarm logs <drone>    Stream live build logs from a drone

Build Operations

Command                     Description
build-swarm release         Full pipeline: sync → build → verify → finalize
build-swarm fresh           Reset everything and start clean build
build-swarm verify          Run safety checks (Risk Score)
build-swarm finalize        Move staging → production (release packages)

Troubleshooting

Command                        Description
build-swarm fix-blocked        Diagnose blocked packages and suggest fixes
build-swarm unblock            Retry all blocked packages
build-swarm sync-overlays      Sync custom overlays to all drones
build-swarm sync-verify        Check if all drones have synced portage
build-swarm sync-fix           Auto-fix out-of-sync drones
build-swarm sync-all           Full portage sync (upstream + all nodes)
build-swarm build-local <pkg>  Build package locally and upload

Infrastructure Management

Command                             Description
build-swarm push                    Git pull + deploy code to all nodes
build-swarm push drones             Deploy only to drones
build-swarm push <name>             Deploy to specific node
build-swarm rename <ip|name> <new>  Rename a node
build-swarm test                    Run full system integration tests
build-swarm stress [N]              Stress test with N dummy packages

Code Deployment

# The standard workflow for updating swarm code:
cd ~/Development/gentoo-build-swarm
git add -A && git commit -m "Your changes" && git push  # Save to Gitea
build-swarm push                                         # Deploy to all nodes

Troubleshooting

Package Build Failed

# 1. See what's blocked
build-swarm fix-blocked

# 2. Check drone logs for details
build-swarm logs drone-Izar-Host

# 3. Fix based on error type:
Error Type                  Fix
Missing overlay             build-swarm sync-overlays
Portage tree mismatch       build-swarm sync-fix
Kernel-specific (nvidia)    build-swarm build-local nvidia-drivers
Transient failure           build-swarm unblock
USE flag mismatch           Update /etc/portage/ and build-swarm fresh
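If you triage failures often, the error-to-fix table can be kept as a lookup function. This is a sketch only: the category keywords (overlay, tree, kernel, transient, useflags) are invented for the example, and only the commands on the right-hand side come from this handbook. Deciding which category a log belongs to is still up to you.

```shell
#!/bin/sh
# Hypothetical lookup: print the fix this handbook lists for an error
# category. The category keywords are made up for this sketch.
suggest_fix() {
    case $1 in
        overlay)   echo "build-swarm sync-overlays" ;;
        tree)      echo "build-swarm sync-fix" ;;
        kernel)    echo "build-swarm build-local nvidia-drivers" ;;
        transient) echo "build-swarm unblock" ;;
        useflags)  echo "update /etc/portage/, then build-swarm fresh" ;;
        *)         echo "suggest_fix: unknown category: $1" >&2; return 1 ;;
    esac
}

# Usage:
#   suggest_fix overlay
```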

Drone Offline

# Check if drone can reach gateway
ssh root@<drone-ip> 'curl -s http://10.42.0.199:8090/health'

# Check service status
ssh root@<drone-ip> 'rc-service swarm-drone status'

# Check logs
ssh root@<drone-ip> 'tail -50 /var/log/build-swarm/drone.log'

# Restart if needed
ssh root@<drone-ip> 'rc-service swarm-drone restart'
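The three read-only checks above can be bundled into one function so they run in the same order every time. The sketch below is dry-run by default: it only prints the ssh commands unless DRY_RUN=0 is set. The function name drone_checks and the DRY_RUN switch are assumptions made for this example. The restart step is left out on purpose so that nothing is changed on the drone.

```shell
#!/bin/sh
# Gateway address as given in this handbook.
GATEWAY_URL="http://10.42.0.199:8090"

# Hypothetical helper: run (or, by default, just print) the read-only
# drone checks for one drone IP: gateway reachability, service status,
# and the last 50 log lines.
drone_checks() {
    ip=$1
    for remote_cmd in \
        "curl -s $GATEWAY_URL/health" \
        "rc-service swarm-drone status" \
        "tail -50 /var/log/build-swarm/drone.log"
    do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            # Dry run: show what would be executed.
            echo "ssh root@$ip '$remote_cmd'"
        else
            ssh "root@$ip" "$remote_cmd"
        fi
    done
}

# Usage:
#   drone_checks 10.42.0.50             # print the commands
#   DRY_RUN=0 drone_checks 10.42.0.50   # actually run them over ssh
```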

Orchestrator Unreachable

# Check orchestrator status
build-swarm status

# If the primary is down, the gateway auto-routes to the backup.
# Check which orchestrator is active:
curl -s http://10.42.0.199:8090/api/v1/orchestrator

Packages Stuck in Staging

# Check staging count
build-swarm status

# If builds complete but packages not visible:
build-swarm verify   # Check if safe
build-swarm finalize # Move to production

Configuration

Drone Configuration

File: /etc/build-swarm/drone.conf

# Required
GATEWAY_URL="http://10.42.0.199:8090"

# Optional
NODE_NAME="my-drone"           # Display name (default: hostname)
REPORT_IP="100.x.x.x"          # Override reported IP (Tailscale)
UPLOAD_HOST="100.x.x.x"        # Override upload destination
HEARTBEAT_INTERVAL=30          # Seconds between gateway heartbeats
POLL_INTERVAL=30               # Seconds between work polling
AUTO_REBOOT=true               # Kill on stuck builds (1hr timeout)
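Since GATEWAY_URL is the only required setting, a quick sanity check before starting the drone service can catch an empty or mistyped file. The sketch below assumes drone.conf can be sourced as shell, which its KEY="value" syntax suggests but this handbook does not state. The function name check_drone_conf is hypothetical.

```shell
#!/bin/sh
# Hypothetical check: source a drone.conf-style file in a subshell (so the
# caller's environment is untouched) and confirm GATEWAY_URL is set and
# looks like an HTTP(S) URL.
# Assumption: the file uses plain shell KEY="value" assignments.
check_drone_conf() (
    conf=$1
    [ -r "$conf" ] || { echo "cannot read $conf" >&2; exit 1; }
    unset GATEWAY_URL
    . "$conf" || exit 1
    case ${GATEWAY_URL:-} in
        http://*|https://*)
            echo "GATEWAY_URL=$GATEWAY_URL"
            ;;
        '')
            echo "GATEWAY_URL is not set in $conf" >&2
            exit 1
            ;;
        *)
            echo "GATEWAY_URL does not look like a URL: $GATEWAY_URL" >&2
            exit 1
            ;;
    esac
)

# Usage (not run here):
#   check_drone_conf /etc/build-swarm/drone.conf
```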

Portage Configuration (Drones)

File: /etc/portage/make.conf

# Set to core count
MAKEOPTS="-j16"

# Required features
FEATURES="buildpkg fail-clean -getbinpkg -binpkg-multi-instance"
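For "set to core count", the value can be generated per machine rather than typed by hand. A minimal sketch, assuming nproc from GNU coreutils is available on the drone; the function name makeopts_line is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print a MAKEOPTS line sized to the number of
# processing units available on this machine, as reported by nproc.
makeopts_line() {
    printf 'MAKEOPTS="-j%s"\n' "$(nproc)"
}

# Usage:
#   makeopts_line
# Compare the result with the MAKEOPTS line in /etc/portage/make.conf.
```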

Orchestrator Configuration

File: /etc/build-swarm/orchestrator.conf

GATEWAY_URL="http://10.42.0.199:8090"
ORCHESTRATOR_PORT=8080
BUILD_MODE="delegate_first"    # delegate_only, delegate_first, hybrid

Maintenance

Weekly Tasks

# Check for blocked packages
build-swarm status

# Verify all drones are synced
build-swarm sync-verify

# Check drone disk space
for drone in drone-Izar-Host drone-Tarn; do
  ssh "root@$drone" 'df -h /var/cache/binpkgs'
done
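To turn the disk space check into a warning instead of something you read by eye, the use percentage can be parsed from df -P, whose POSIX output format has a fixed column layout. This is a sketch: the function name df_use_pct and the 90% threshold in the usage comment are assumptions, not part of the swarm tooling.

```shell
#!/bin/sh
# Hypothetical helper: read `df -P <path>` output on stdin and print the
# capacity figure from the data row, without the % sign.
# In the POSIX (-P) format the fifth column is the capacity.
df_use_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Usage (not run here; drone names from this handbook, threshold arbitrary):
#   for drone in drone-Izar-Host drone-Tarn; do
#       pct=$(ssh "root@$drone" 'df -P /var/cache/binpkgs' | df_use_pct)
#       [ "$pct" -ge 90 ] && echo "$drone: /var/cache/binpkgs at ${pct}%"
#   done
```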

Updating Swarm Code

cd ~/Development/gentoo-build-swarm
git pull                        # Get latest
build-swarm push                # Deploy to all nodes

Adding a New Drone

# Method A: Remote installation
build-swarm add drone Worker-01 --ip 10.42.0.50

# Method B: Local installation (on the drone)
git clone https://github.com/Arcturus-Prime/gentoo-build-swarm.git
cd gentoo-build-swarm
sudo ./install.sh drone 10.42.0.199

Renaming Nodes

build-swarm rename 10.42.0.184 drone-Tau-Host
build-swarm rename drone-old drone-new

Quick Reference Card

┌────────────────────────────────────────────────────────────┐
│              GENTOO BUILD SWARM - QUICK REFERENCE          │
├────────────────────────────────────────────────────────────┤
│                                                            │
│  CHECK STATUS:     build-swarm status                      │
│  START BUILD:      build-swarm fresh                       │
│  WATCH PROGRESS:   build-swarm monitor                     │
│  FIX PROBLEMS:     build-swarm fix-blocked                 │
│  VERIFY SAFE:      build-swarm verify                      │
│  RELEASE:          build-swarm finalize                    │
│  UPDATE DESKTOP:   sudo apkg update                        │
│                                                            │
│  DEPLOY CODE:      build-swarm push                        │
│  VIEW LOGS:        build-swarm logs <drone>                │
│  SYNC PORTAGE:     build-swarm sync-all                    │
│                                                            │
├────────────────────────────────────────────────────────────┤
│  Gateway:      10.42.0.199:8090                            │
│  Orchestrator: 10.42.0.201:8080                            │
│  Binhost:      http://10.42.0.201/packages                 │
└────────────────────────────────────────────────────────────┘