RAG Embedding Auto-Detection
Automatic embedding model detection for RAG stores to prevent dimension mismatches
Overview
The RAG system now includes automatic embedding model detection to prevent errors when initializing RAG stores with different embedding models.
The Problem
When using multiple RAG stores with different embedding models, dimension mismatches can occur:
Store A: Created with nomic-embed-text (768-dimensional embeddings)
Store B: Created with qwen3-embedding:0.6b (1024-dimensional embeddings)
If the Agent initializes Store B with nomic-embed-text:
❌ ERROR: Vector dimension mismatch (768 vs 1024)
❌ Semantic search fails
❌ RAG queries error out
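To make the failure concrete, here is a minimal sketch of a similarity function that rejects vectors of different lengths. The function name `cosineSimilarity` and the error text are illustrative assumptions, not the Agent's actual search code.

```typescript
// Illustrative only: not the Agent's actual search implementation.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) {
    throw new Error(`Vector dimension mismatch (${a.length} vs ${b.length})`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// A 768-dim query vector cannot be compared with a 1024-dim stored vector.
const queryVector = new Float32Array(768).fill(0.1);
const storedVector = new Float32Array(1024).fill(0.1);

let mismatchError = "";
try {
  cosineSimilarity(queryVector, storedVector);
} catch (err) {
  mismatchError = (err as Error).message;
}
```

Every stored vector in Store B would hit the same guard, which is why a single wrong model choice breaks all queries against that store.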
The Solution
The Agent now probes each RAG store to detect the actual embedding dimensions and automatically selects the matching model:
// Detect which model was used to create the store
const dims = probeStore(store); // Returns 768, 1024, or 4096
// Map to the correct model
const model = {
  768: 'nomic-embed-text', // 768-dimensional
  1024: 'qwen3-embedding:0.6b', // 1024-dimensional
  4096: 'qwen3-embedding:8b' // 4096-dimensional
}[dims];
// Use the correct model automatically
createRAGEngine(store, { model }); // ✅ Works!
How It Works
Detection Process
- Probe the store: check whether stored embeddings exist
  SELECT embedding FROM chunks WHERE embedding IS NOT NULL LIMIT 1
- Measure dimensions: calculate the byte length of the embedding vector
  dims = embedding.byteLength / 4 // Each float32 is 4 bytes
- Map to model: match the dimensions to known models
  768-dim → nomic-embed-text
  1024-dim → qwen3-embedding:0.6b
  4096-dim → qwen3-embedding:8b
- Use detected model: initialize the RAG engine with the correct model
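The "measure dimensions" step can be tried in isolation. The sketch below uses hypothetical helpers (`encodeEmbedding`, `dimsFromBlob`) and plain `Uint8Array` values; in Node.js a `Buffer` is a `Uint8Array` subclass, so the same arithmetic applies to a BLOB read from the store.

```typescript
// Hypothetical helpers; the names are assumptions for illustration.
const BYTES_PER_FLOAT32 = 4;

// Encode a vector the way a float32 BLOB column would hold it.
function encodeEmbedding(values: number[]): Uint8Array {
  return new Uint8Array(new Float32Array(values).buffer);
}

// Recover the dimension count from the raw bytes, or null if the
// byte length is not a whole number of float32 values.
function dimsFromBlob(blob: Uint8Array): number | null {
  if (blob.byteLength === 0 || blob.byteLength % BYTES_PER_FLOAT32 !== 0) {
    return null;
  }
  return blob.byteLength / BYTES_PER_FLOAT32;
}

const sampleBlob = encodeEmbedding(new Array<number>(768).fill(0));
const detectedDims = dimsFromBlob(sampleBlob); // 768 (3072 bytes / 4)
```

Returning `null` for an uneven byte length is a design choice of this sketch; it keeps a damaged BLOB from producing a fractional dimension count.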
Behavior
- Silent success: If the detected model matches the configured one → no logging
- Log on mismatch: If the auto-detected model differs from the configured one → logs:
  [Agent] Auto-detected 768-dim embeddings, using nomic-embed-text (configured: qwen3-embedding:0.6b)
- Fallback: If detection fails → uses the configured model as the default
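The three behaviors can be expressed as one small decision function. This is a sketch under assumptions: `chooseModel`, `KNOWN_MODELS`, and the injected `log` callback are hypothetical names, and `null` stands in for "the probe found nothing or failed".

```typescript
// Sketch only: names are assumptions that mirror the behavior list.
const KNOWN_MODELS: Record<number, string> = {
  768: "nomic-embed-text",
  1024: "qwen3-embedding:0.6b",
  4096: "qwen3-embedding:8b",
};

function chooseModel(
  dims: number | null, // null means the probe found nothing or failed
  configured: string,
  log: (message: string) => void,
): string {
  if (dims === null) return configured; // fallback
  const detected = KNOWN_MODELS[dims];
  if (detected === undefined) return configured; // unknown dimension: fallback
  if (detected !== configured) {
    log(`[Agent] Auto-detected ${dims}-dim embeddings, using ${detected} (configured: ${configured})`);
  }
  return detected; // silent when it already matches
}

const messages: string[] = [];
const record = (message: string): void => { messages.push(message); };

const matched = chooseModel(768, "nomic-embed-text", record);   // silent success
const switched = chooseModel(1024, "nomic-embed-text", record); // logs once
const fellBack = chooseModel(null, "nomic-embed-text", record); // fallback
```

Injecting the logger keeps the function easy to test; only the mismatch case adds an entry to `messages`.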
Applied To
The detection works across all RAG stores:
Main RAG Store
const ragModel = await detectEmbeddingModel(mainStore, 'nomic-embed-text');
this.rag = createRAGEngine(mainStore, { model: ragModel });
Blog RAG Store
const blogModel = await detectEmbeddingModel(blogStore, 'nomic-embed-text');
this.blogRag = createRAGEngine(blogStore, { model: blogModel });
Vaults RAG Store
const vaultsModel = await detectEmbeddingModel(vaultsStore, 'nomic-embed-text');
this.vaultsRag = createRAGEngine(vaultsStore, { model: vaultsModel });
Use Cases
Safe Model Upgrades
Gradually upgrade from one embedding model to another:
Phase 1: Store A has nomic-embed-text (768-dim)
Agent initializes with nomic-embed-text
Phase 2: Rebuild Store A with qwen3-embedding:0.6b (1024-dim)
Agent auto-detects → switches to qwen3-embedding:0.6b
No configuration change needed!
Phase 3: Gradually migrate other stores
Each detected and used automatically
Mixed-Generation Stores
If your stores use different models:
Store 1 (old): nomic-embed-text (768-dim)
Store 2 (new): qwen3-embedding:0.6b (1024-dim)
Store 3 (new): qwen3-embedding:8b (4096-dim)
Agent automatically:
✅ Detects each store's dimensions
✅ Uses correct model for each
✅ No errors or configuration needed
Fallback Behavior
If detection fails:
try {
  const dims = queryStore(store);
  return DIM_TO_MODEL[dims] || defaultModel;
} catch {
  // Detection failed, use default
  return defaultModel;
}
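A self-contained version of this fallback can be run against stand-in stores. The `ProbeableStore` interface and its `probeDims` method are hypothetical; the real store API may look different.

```typescript
// Hypothetical store shape for illustration; the real store API may differ.
interface ProbeableStore {
  probeDims(): number; // may throw if the store cannot be read
}

const MODEL_BY_DIMS: Record<number, string> = {
  768: "nomic-embed-text",
  1024: "qwen3-embedding:0.6b",
  4096: "qwen3-embedding:8b",
};

function detectOrDefault(store: ProbeableStore, defaultModel: string): string {
  try {
    // Unknown dimensions are not in the table, so they also use the default.
    return MODEL_BY_DIMS[store.probeDims()] ?? defaultModel;
  } catch {
    return defaultModel; // probe failed, keep the configured model
  }
}

const healthyStore: ProbeableStore = { probeDims: () => 1024 };
const unknownStore: ProbeableStore = { probeDims: () => 384 };
const brokenStore: ProbeableStore = {
  probeDims: () => { throw new Error("cannot read store"); },
};

const fromHealthy = detectOrDefault(healthyStore, "nomic-embed-text");
const fromUnknown = detectOrDefault(unknownStore, "nomic-embed-text");
const fromBroken = detectOrDefault(brokenStore, "nomic-embed-text");
```

Both the unknown-dimension store and the broken store end up on the default, so a failed probe never blocks startup.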
Limitations
What Auto-Detection Does
- ✅ Detect embedding dimensions from stored data
- ✅ Select matching embedding model
- ✅ Prevent dimension mismatch errors
- ✅ Enable safe model upgrades
What Auto-Detection Does NOT
- ✗ Rebuild existing stores to new dimensions
- ✗ Migrate data between models
- ✗ Validate embedding quality
- ✗ Handle corrupted/invalid embeddings
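Because auto-detection does not handle corrupted or invalid embeddings, a separate sanity check can be useful when auditing a store. The sketch below is not part of the Agent: `isPlausibleEmbedding` and `encodeFloat32LE` are hypothetical helpers, and the code assumes embeddings are stored as little-endian float32 values.

```typescript
// Hypothetical helpers; auto-detection itself performs no such check.
// Assumes embeddings are stored as little-endian float32 values.
function encodeFloat32LE(values: number[]): Uint8Array {
  const bytes = new Uint8Array(values.length * 4);
  const view = new DataView(bytes.buffer);
  values.forEach((value, i) => view.setFloat32(i * 4, value, true));
  return bytes;
}

function isPlausibleEmbedding(blob: Uint8Array, expectedDims: number): boolean {
  if (blob.byteLength !== expectedDims * 4) return false; // wrong size
  const view = new DataView(blob.buffer, blob.byteOffset, blob.byteLength);
  for (let i = 0; i < expectedDims; i++) {
    // Reject NaN and Infinity, which cannot come from a healthy embedding.
    if (!Number.isFinite(view.getFloat32(i * 4, true))) return false;
  }
  return true;
}

const validBlob = encodeFloat32LE([0.1, -0.2, 0.3, 0.4]);
const truncatedBlob = validBlob.subarray(0, 15); // one byte short
const nanBlob = encodeFloat32LE([0.1, NaN, 0.3, 0.4]);

const validOk = isPlausibleEmbedding(validBlob, 4);
const truncatedOk = isPlausibleEmbedding(truncatedBlob, 4);
const nanOk = isPlausibleEmbedding(nanBlob, 4);
```

A check like this catches structural damage only; it says nothing about whether the vectors are semantically useful.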
Configuration
Default Models
Each RAG store has a configured default:
// Main store default
model: this.config.embeddingModel ?? 'nomic-embed-text'
// Blog store default
model: this.config.embeddingModel ?? 'nomic-embed-text'
// Vaults store default
model: this.config.embeddingModel ?? 'nomic-embed-text'
Override Default
To specify a default model:
const agent = new Agent({
  embeddingModel: 'qwen3-embedding:8b',
  // Will use 8b unless detection finds different dimensions
});
Troubleshooting
"Auto-detected different model"
Log: [Agent] Auto-detected 1024-dim embeddings, using qwen3-embedding:0.6b (configured: nomic-embed-text)
Meaning: The store was created with one model, but config specifies another.
Actions:
- ✅ This is fine — auto-detection fixed it
- Update the config to match if you want consistency
- Or rebuild the store with the configured model
Detection failed, using default
Meaning: The RAG store had no embeddings to probe.
Reasons:
- Store is empty (no chunks with embeddings)
- Store is corrupted or incomplete
- Permission issue accessing the data
Actions:
- Check if the store has data:
  SELECT COUNT(*) FROM chunks
- Verify embeddings exist:
  SELECT COUNT(*) FROM chunks WHERE embedding IS NOT NULL
- Rebuild the store if it is corrupted
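The two counts from the queries above are enough to tell the common cases apart. This is a minimal sketch; `diagnoseStore` and the `StoreDiagnosis` labels are hypothetical names, and the counts are assumed to have been read from the store already.

```typescript
// Hypothetical helper for interpreting the two COUNT(*) results.
type StoreDiagnosis = "empty" | "no-embeddings" | "probeable";

// totalChunks:    result of SELECT COUNT(*) FROM chunks
// embeddedChunks: result of SELECT COUNT(*) FROM chunks WHERE embedding IS NOT NULL
function diagnoseStore(totalChunks: number, embeddedChunks: number): StoreDiagnosis {
  if (totalChunks === 0) return "empty"; // nothing was ever indexed
  if (embeddedChunks === 0) return "no-embeddings"; // chunks exist, vectors do not
  return "probeable"; // detection has something to read
}

const emptyCase = diagnoseStore(0, 0);
const missingCase = diagnoseStore(250, 0);
const healthyCase = diagnoseStore(250, 250);
```

If the result is "probeable" and detection still falls back, a permission problem or corruption is the more likely cause.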
Embedding dimension mismatch still occurring
Meaning: Auto-detection ran but didn’t catch the issue.
Reasons:
- Store has mixed dimensions (multiple embedding models in same store)
- Store has invalid embeddings
- Detection skipped due to error
Actions:
- Check logs for detection errors
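- Query the store for distinct dimensions (assuming a SQLite-backed store, where length() returns a BLOB's size in bytes, so divide by 4):
  SELECT DISTINCT length(embedding) / 4 AS dims FROM chunks WHERE embedding IS NOT NULL
- Consider rebuilding the store with a single model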
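To see how mixed a store is, the byte lengths of its embeddings can be tallied by dimension. The sketch below assumes the lengths were already fetched (for a SQLite-backed store, something like `SELECT length(embedding) FROM chunks WHERE embedding IS NOT NULL`); `countDimensions` is a hypothetical helper.

```typescript
// Hypothetical helper: tally how many embeddings exist per dimension.
function countDimensions(byteLengths: number[]): Map<number, number> {
  const counts = new Map<number, number>();
  for (const bytes of byteLengths) {
    const dims = bytes / 4; // each float32 is 4 bytes
    counts.set(dims, (counts.get(dims) ?? 0) + 1);
  }
  return counts;
}

// Two 768-dim embeddings (3072 bytes) and one 4096-dim embedding (16384 bytes).
const histogram = countDimensions([3072, 3072, 16384]);
const isMixed = histogram.size > 1; // true: more than one dimension present
```

A store with more than one key in the histogram was written by more than one model, and detection only sees whichever row the probe happens to return.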
Performance Impact
Startup Cost
- Detection time: ~50-100ms per store (single database query)
- When it runs: Once per RAG engine initialization (startup only)
- Impact: Negligible (human-imperceptible)
Query Performance
- No impact: Detection runs only at initialization, never per query
- Semantic search: Same speed as before
- Response time: Unaffected
Implementation Details
Detection Code
Located in: packages/argonaut/src/core/agent.ts
async function detectEmbeddingModel(store: RAGStore, defaultModel: string): Promise<string> {
  // Stores that do not expose db.prepare() cannot be probed
  if (typeof (store as any).db?.prepare !== 'function') return defaultModel;
  try {
    const row = (store as any).db.prepare(
      'SELECT embedding FROM chunks WHERE embedding IS NOT NULL AND deleted = 0 LIMIT 1'
    ).get() as { embedding: Buffer } | undefined;
    if (!row) return defaultModel; // no embeddings to probe
    const dims = row.embedding.byteLength / 4; // each float32 is 4 bytes
    const match = DIM_TO_MODEL[dims];
    if (match && match !== defaultModel) {
      console.log(`[Agent] Auto-detected ${dims}-dim embeddings, using ${match} (configured: ${defaultModel})`);
      return match;
    }
  } catch { /* probe failed, use default */ }
  // Matching model, unknown dimension, or failed probe: keep the configured model
  return defaultModel;
}
Dimension Mapping
const DIM_TO_MODEL: Record<number, string> = {
  768: 'nomic-embed-text', // ~50MB per 10k chunks
  1024: 'qwen3-embedding:0.6b', // ~68MB per 10k chunks
  4096: 'qwen3-embedding:8b', // ~270MB per 10k chunks
};
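For a rough sense of scale, the raw float32 payload per model can be computed directly. This sketch covers the vector bytes only; the per-10k-chunk figures in the comments above are larger, which presumably reflects additional per-chunk storage, but that reading is an assumption. `rawVectorBytes` is a hypothetical helper.

```typescript
// Raw float32 payload only; per-chunk text, metadata, and index
// overhead are deliberately left out (they are not known from this page).
function rawVectorBytes(dims: number, chunkCount: number): number {
  return dims * 4 * chunkCount; // 4 bytes per float32 value
}

const CHUNK_COUNT = 10_000;
const rawMegabytes = [768, 1024, 4096].map(
  (dims) => rawVectorBytes(dims, CHUNK_COUNT) / 1_000_000,
);
// rawMegabytes is [30.72, 40.96, 163.84]
```

The payload grows linearly with the dimension count, so moving from 768 to 4096 dimensions multiplies vector storage by a little over five.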
Best Practices
1. Keep stores consistent
Good: All stores use same model
knowledge store: nomic-embed-text (768-dim)
blog store: nomic-embed-text (768-dim)
vaults store: nomic-embed-text (768-dim)
Okay: Different models, but each detected correctly
knowledge store: nomic-embed-text (768-dim)
blog store: qwen3-embedding:8b (4096-dim) ← Different, but both work
Avoid: Mixing models in same store
chunks 1-1000: nomic-embed-text (768-dim)
chunks 1001+: qwen3-embedding:8b (4096-dim) ← ❌ Causes errors
2. Document your model choice
# Knowledge Base
Using `nomic-embed-text` because:
- Fast (CPU-only inference)
- Good quality (OpenAI compatible)
- Small dimensions (768-d, ~50MB per 10k chunks)
- Widely supported
3. Test after model changes
# After updating a store's embedding model:
1. Rebuild the store with new model
2. Restart the Agent (to trigger detection)
3. Run sample RAG queries
4. Verify results quality
5. Check logs for auto-detection message
4. Monitor logs
Watch for auto-detection messages:
✅ Good: No logs = detected model matches configured
⚠️ Okay: Logs show detection = mismatch caught automatically
❌ Bad: Errors about dimension mismatch = detection failed
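If logs are collected centrally, the auto-detection message can be parsed into structured fields. The sketch below matches the log format shown on this page; `parseAutoDetectLine` and the `AutoDetectEvent` shape are hypothetical, and the pattern would need updating if the message text changes.

```typescript
// Hypothetical parser for the auto-detection log line shown on this page.
interface AutoDetectEvent {
  dims: number;
  detected: string;
  configured: string;
}

const AUTO_DETECT_PATTERN =
  /^\[Agent\] Auto-detected (\d+)-dim embeddings, using (\S+) \(configured: ([^)]+)\)$/;

function parseAutoDetectLine(line: string): AutoDetectEvent | null {
  const match = AUTO_DETECT_PATTERN.exec(line.trim());
  if (match === null) return null; // not an auto-detection message
  return {
    dims: Number(match[1]),
    detected: match[2],
    configured: match[3],
  };
}

const parsedEvent = parseAutoDetectLine(
  "[Agent] Auto-detected 1024-dim embeddings, using qwen3-embedding:0.6b (configured: nomic-embed-text)",
);
const unrelatedLine = parseAutoDetectLine("[Agent] RAG engine ready");
```

Counting parsed events per store over time shows which configurations still disagree with the data on disk.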
Related Topics
- Embedding Models — Choosing embedding models
- RAG Architecture — How RAG system works
- Argonaut Agent — Agent initialization and configuration
Deployed Version
Feature: Auto-detection of embedding model dimensions
Commit: 84d182f
Date: 2026-03-08
Status: Production-ready
Last Updated
Documentation: 2026-03-08
Code: 2026-03-08 (commit 84d182f)