Argo Knowledge RAG
The brain behind ArgoBox's AI assistant
Hybrid BM25 + vector search across 166K+ chunks — 100% local, zero API cost
What It Powers
Public AI Assistant
The “Ask Argonaut” chat on every page uses this RAG system to answer questions about ArgoBox with accurate, sourced responses.
Try it live →
Private Knowledge Engine
Admin-only tiers search across 10,000+ documents from Obsidian vaults, legal PDFs, and raw session transcripts — all running locally.
Claude Code Integration
Powers Claude Code's AI context system, letting it find relevant documentation and past decisions across the entire project history.
How It Works
Ingest
Reads MD, PDF, DOCX, RTF, EML, MSG, XLSX and extracts text and frontmatter
Chunk
Paragraph-aware splitting (400-word chunks, 80-word overlap) preserves semantic units
Embed
GPU-accelerated via Ollama (qwen3-embedding or nomic-embed-text)
Search
Hybrid FTS5 BM25 + vector cosine similarity with configurable weights
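The Chunk step can be illustrated with a minimal sketch. It assumes chunks are measured in whitespace-separated words, that paragraphs are separated by blank lines, and that an oversized paragraph is simply cut at the word limit; the function name chunk_paragraphs is hypothetical and not taken from the ArgoBox code.

```python
import re

def chunk_paragraphs(text, max_words=400, overlap=80):
    """Pack whole paragraphs into chunks of at most max_words words.

    Each new chunk starts with the last `overlap` words of the previous one.
    """
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be in [0, max_words)")
    paragraphs = [p.split() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks, current = [], []

    def flush():
        # Emit the current chunk and seed the next one with the overlap tail.
        nonlocal current
        chunks.append(" ".join(current))
        current = current[max(0, len(current) - overlap):]

    for words in paragraphs:
        # Close the chunk at a paragraph boundary if this paragraph won't fit.
        if current and len(current) + len(words) > max_words:
            flush()
        # A paragraph longer than the remaining room is cut at the word limit.
        while len(current) + len(words) > max_words:
            room = max_words - len(current)
            current += words[:room]
            words = words[room:]
            flush()
        current += words
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Breaking at paragraph boundaries first keeps related sentences together, and the overlap gives a query that matches text near a boundary a chance to hit either neighboring chunk.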
Four-Tier Architecture
Each tier adds scope and security. Data flows down — never up.
Blog posts, docs, project pages. Served from CDN via Cloudflare Pages. Powers the public AI assistant.
Sanitized content safe for external AI. Runs through 148 regex patterns that strip real IPs, hostnames, and usernames.
Full Obsidian vaults with raw content. Local access only — never leaves the machine.
Everything including legal documents, personal notes, and raw transcripts. Maximum security.
Key Features
100% Local
All embedding runs on a local GPU. Data never leaves the machine. Zero per-query cost for local tiers.
Hybrid Search
FTS5 BM25 for keyword precision + cosine similarity for semantic understanding. Configurable 0.3/0.7 weighting.
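One way to combine the two signals is a weighted sum of normalized scores, sketched below. The page does not say which weight belongs to which signal, so this sketch assumes 0.3 for keyword and 0.7 for vector. The min-max normalization and the names min_max and hybrid_rank are likewise illustrative assumptions. Note that SQLite's FTS5 bm25() function returns lower values for better matches, so its output would need to be negated before being passed in.

```python
def min_max(scores):
    """Rescale a {chunk_id: score} map to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 1.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}

def hybrid_rank(keyword_scores, vector_scores, w_keyword=0.3, w_vector=0.7):
    """Rank chunk ids by a weighted sum of normalized scores.

    Both inputs map chunk_id -> score where higher means more relevant.
    A chunk missing from one result set contributes 0 for that signal.
    """
    kw = min_max(keyword_scores)
    vec = min_max(vector_scores)
    combined = {
        chunk_id: w_keyword * kw.get(chunk_id, 0.0) + w_vector * vec.get(chunk_id, 0.0)
        for chunk_id in set(kw) | set(vec)
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)
```

Normalizing first matters because BM25 scores are unbounded while cosine similarity is not, so a raw weighted sum would let one signal dominate regardless of the configured weights.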
Multi-Format Parser
Ingests MD, PDF, DOCX, RTF, EML, MSG, XLSX via subprocess parsers. Handles corrupted files gracefully.
Content-Hash Dedup
SHA-256 hashing prevents duplicate ingestion. Re-run safely — only new or changed files are processed.
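The idea can be sketched in a few lines, assuming the set of previously seen hashes is persisted between runs (a plain in-memory set stands in for that store here). The function names are hypothetical.

```python
import hashlib

def content_hash(data: bytes) -> str:
    """Return the SHA-256 hex digest of a document's raw bytes."""
    return hashlib.sha256(data).hexdigest()

def select_new(documents: dict, seen_hashes: set) -> list:
    """Return the paths whose content has not been ingested before.

    documents maps path -> raw bytes. seen_hashes is updated in place,
    so calling this again with the same content returns nothing.
    """
    fresh = []
    for path, data in documents.items():
        digest = content_hash(data)
        if digest in seen_hashes:
            continue  # identical content already ingested
        seen_hashes.add(digest)
        fresh.append(path)
    return fresh
```

Because the key is the content rather than the path or modification time, an edited file produces a new hash and is picked up, while a renamed or copied file with identical bytes is skipped.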
Identity Sanitization
148-pattern regex system strips real names, IPs, and credentials for safe external sharing.
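A pattern-based sanitizer of this kind might look like the sketch below. The three patterns and the placeholder tokens are illustrative assumptions only; the actual 148 patterns are not shown on this page.

```python
import re

# Hypothetical examples of (pattern, replacement) pairs, applied in order.
PATTERNS = [
    # IPv4-looking addresses such as 192.168.1.50
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "[REDACTED_IP]"),
    # Internal hostnames such as nas.local or build01.lan
    (re.compile(r"\b[\w-]+\.(?:lan|local|internal)\b"), "[REDACTED_HOST]"),
    # key=value or key: value credentials; the key name is kept
    (re.compile(r"(?i)\b(password|token|api[_-]?key)(\s*[:=]\s*)\S+"), r"\1\2[REDACTED]"),
]

def sanitize(text: str) -> str:
    """Apply every redaction pattern to the text and return the result."""
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

For example, sanitize("ssh root@nas.local at 192.168.1.50 password=hunter2") returns "ssh root@[REDACTED_HOST] at [REDACTED_IP] password=[REDACTED]".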
Streaming Vector Scan
Never loads all 166K chunks into memory. Streams through the database for constant memory usage at any scale.
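A constant-memory scan can be sketched with a lazily iterated SQLite cursor and a bounded heap. The table layout (a chunks table with id and embedding columns, embeddings stored as float32 blobs) is an assumption made for illustration, not the documented ArgoBox schema.

```python
import heapq
import math
import sqlite3
from array import array

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(conn: sqlite3.Connection, query, k=5):
    """Return the k best (score, chunk_id) pairs, best first.

    Rows are pulled from the cursor one at a time and only k candidates
    are held in the heap, so memory use depends on k, not on table size.
    """
    heap = []  # min-heap: the weakest kept candidate sits at heap[0]
    for chunk_id, blob in conn.execute("SELECT id, embedding FROM chunks"):
        vector = array("f")
        vector.frombytes(blob)  # decode the float32 blob
        score = cosine(query, vector)
        if len(heap) < k:
            heapq.heappush(heap, (score, chunk_id))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, chunk_id))
    return sorted(heap, reverse=True)
```

The trade-off is that every query touches every row, so latency grows linearly with the number of chunks even though memory does not.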