RAG Embedding Auto-Detection

Overview

The RAG system now detects each store's embedding model automatically, preventing dimension-mismatch errors when a RAG store is initialized with a different embedding model than the one that created it.

The Problem

When using multiple RAG stores with different embedding models, dimension mismatches can occur:

Store A: Created with nomic-embed-text (768-dimensional embeddings)
Store B: Created with qwen3-embedding:0.6b (1024-dimensional embeddings)

If Agent initializes Store B with nomic-embed-text:
❌ ERROR: Vector dimension mismatch (768 vs 1024)
❌ Semantic search fails
❌ RAG queries error out
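
To see why the mismatch is fatal rather than merely degrading, consider a minimal cosine-similarity sketch. The function name `cosineSimilarity` and the error text are illustrative assumptions, not the project's actual search code.

```typescript
// Hypothetical helper: cosine similarity between a query vector and a stored vector.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) {
    // A 768-dim query cannot be compared against a 1024-dim stored vector.
    throw new Error(`Vector dimension mismatch (${a.length} vs ${b.length})`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

const query = new Float32Array(768).fill(0.1);   // e.g. embedded with nomic-embed-text
const stored = new Float32Array(1024).fill(0.1); // e.g. stored with qwen3-embedding:0.6b

try {
  cosineSimilarity(query, stored);
} catch (err) {
  console.log((err as Error).message); // Vector dimension mismatch (768 vs 1024)
}
```

Every stored chunk in the store has the same problem, so no query can return a ranked result until the query model matches the store.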

The Solution

The Agent now probes each RAG store to detect the actual embedding dimensions and automatically selects the matching model:

// Detect what model was used to create the store
const dims = probeStore(store);  // Returns 768, 1024, or 4096

// Map to correct model
const model = {
  768: 'nomic-embed-text',        // 768-dimensional
  1024: 'qwen3-embedding:0.6b',   // 1024-dimensional
  4096: 'qwen3-embedding:8b'      // 4096-dimensional
}[dims];

// Use correct model automatically
createRAGEngine(store, { model });  // ✅ Works!

How It Works

Detection Process

Probe the store — Check if stored embeddings exist

SELECT embedding FROM chunks WHERE embedding IS NOT NULL LIMIT 1

Measure dimensions — Calculate byte length of embedding vector
```
dims = embedding.byteLength / 4  // Each float32 is 4 bytes
```

Map to model — Match dimensions to known models

768-dim  → nomic-embed-text
1024-dim → qwen3-embedding:0.6b
4096-dim → qwen3-embedding:8b

Use detected model — Initialize RAG engine with correct model
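
The measuring step above can be sketched in isolation. This version works on a plain `Uint8Array` (Node's `Buffer` is a subclass, so a blob read from the database can be passed directly) and adds a divisibility check that the steps above do not mention. Both the function name `dimsFromBlob` and that extra check are assumptions for illustration.

```typescript
// Hypothetical helper: derive the embedding dimension from a raw float32 blob.
function dimsFromBlob(blob: Uint8Array): number | undefined {
  const bytesPerFloat = Float32Array.BYTES_PER_ELEMENT; // 4
  // An empty blob, or one whose length is not a multiple of 4,
  // cannot be a float32 vector.
  if (blob.byteLength === 0 || blob.byteLength % bytesPerFloat !== 0) {
    return undefined;
  }
  return blob.byteLength / bytesPerFloat;
}

// Simulate a stored 768-dimensional embedding as raw bytes.
const vector = new Float32Array(768);
const blob = new Uint8Array(vector.buffer);

console.log(dimsFromBlob(blob));               // 768
console.log(dimsFromBlob(new Uint8Array(10))); // undefined
```

Returning `undefined` for a malformed blob lets the caller treat it the same way as a failed probe and fall back to the configured model.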

Behavior

Silent success: If detected model matches configured → no logging

Log on mismatch: If auto-detected differs from configured → logs:

[Agent] Auto-detected 768-dim embeddings, using nomic-embed-text (configured: qwen3-embedding:0.6b)

Fallback: If detection fails → uses configured model as default
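
The three outcomes above can be expressed as one pure function, which makes them easy to test without a database. This is a sketch under assumptions: `resolveModel`, `Resolution`, and `KNOWN_MODELS` are hypothetical names, and the dimension table is copied from this document.

```typescript
// Dimension-to-model table taken from this document.
const KNOWN_MODELS = new Map<number, string>([
  [768, "nomic-embed-text"],
  [1024, "qwen3-embedding:0.6b"],
  [4096, "qwen3-embedding:8b"],
]);

interface Resolution {
  model: string;
  log?: string; // present only when detection overrides the configured model
}

// Hypothetical pure function: `dims` is undefined when the probe failed.
function resolveModel(dims: number | undefined, configured: string): Resolution {
  const detected = dims === undefined ? undefined : KNOWN_MODELS.get(dims);
  if (detected === undefined) {
    return { model: configured }; // fallback: probe failed or unknown dimension
  }
  if (detected === configured) {
    return { model: configured }; // silent success
  }
  return {
    model: detected,
    log: `[Agent] Auto-detected ${dims}-dim embeddings, using ${detected} (configured: ${configured})`,
  };
}

console.log(resolveModel(768, "nomic-embed-text"));       // silent success
console.log(resolveModel(768, "qwen3-embedding:0.6b"));   // mismatch, with log line
console.log(resolveModel(undefined, "nomic-embed-text")); // fallback
```

Note that an unrecognized dimension takes the same path as a failed probe: the configured model is used and nothing is logged.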

Applied To

The detection works across all RAG stores:

Main RAG Store

const ragModel = await detectEmbeddingModel(mainStore, 'nomic-embed-text');
this.rag = createRAGEngine(mainStore, { model: ragModel });

Blog RAG Store

const blogModel = await detectEmbeddingModel(blogStore, 'nomic-embed-text');
this.blogRag = createRAGEngine(blogStore, { model: blogModel });

Vaults RAG Store

const vaultsModel = await detectEmbeddingModel(vaultsStore, 'nomic-embed-text');
this.vaultsRag = createRAGEngine(vaultsStore, { model: vaultsModel });

Use Cases

Safe Model Upgrades

Gradually upgrade from one embedding model to another:

Phase 1: Store A has nomic-embed-text (768-dim)
         Agent initializes with nomic-embed-text

Phase 2: Rebuild Store A with qwen3-embedding:0.6b (1024-dim)
         Agent auto-detects → switches to qwen3-embedding:0.6b
         No configuration change needed!

Phase 3: Gradually migrate other stores
         Each detected and used automatically

Mixed-Generation Stores

If your stores use different models:

Store 1 (old):  nomic-embed-text (768-dim)
Store 2 (new):  qwen3-embedding:0.6b (1024-dim)
Store 3 (new):  qwen3-embedding:8b (4096-dim)

Agent automatically:
✅ Detects each store's dimensions
✅ Uses correct model for each
✅ No errors or configuration needed

Fallback Behavior

If detection fails:

try {
  const dims = probeStore(store);
  return DIM_TO_MODEL[dims] || defaultModel;
} catch {
  // Detection failed, use default
  return defaultModel;
}

Limitations

What Auto-Detection Does

✅ Detect embedding dimensions from stored data
✅ Select matching embedding model
✅ Prevent dimension mismatch errors
✅ Enable safe model upgrades

What Auto-Detection Does NOT

✗ Rebuild existing stores to new dimensions
✗ Migrate data between models
✗ Validate embedding quality
✗ Handle corrupted/invalid embeddings

Configuration

Default Models

Each RAG store has a configured default:

// Main store default
model: this.config.embeddingModel ?? 'nomic-embed-text'

// Blog store default
model: this.config.embeddingModel ?? 'nomic-embed-text'

// Vaults store default
model: this.config.embeddingModel ?? 'nomic-embed-text'

Override Default

To specify a default model:

const agent = new Agent({
  embeddingModel: 'qwen3-embedding:8b',
  // Will use 8b unless detection finds different dimensions
});

Troubleshooting

"Auto-detected different model"

Log: [Agent] Auto-detected 1024-dim embeddings, using qwen3-embedding:0.6b (configured: nomic-embed-text)

Meaning: The store was created with one model, but config specifies another.

Actions:

✅ This is fine — auto-detection fixed it
Update config to match if you want consistency
Or rebuild the store with configured model

Detection failed, using default

Meaning: The probe could not read an embedding from the RAG store, so the configured model was used.

Reasons:

Store is empty (no chunks with embeddings)
Store is corrupted or incomplete
Permission issue accessing the data

Actions:

Check if store has data: SELECT COUNT(*) FROM chunks
Verify embeddings exist: SELECT COUNT(*) FROM chunks WHERE embedding IS NOT NULL
Rebuild store if corrupted
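
The two checks above can be combined into a small diagnostic. This is a sketch, not project code: `ProbeDb`, `diagnoseStore`, and `fakeDb` are hypothetical names, and the `prepare(...).get()` shape is an assumption modelled on the detection code shown in this document.

```typescript
// Minimal shape of the database handle this sketch needs (an assumption).
interface ProbeDb {
  prepare(sql: string): { get(): unknown };
}

type StoreState = "empty" | "no-embeddings" | "ok" | "unreadable";

// Hypothetical diagnostic: runs the two COUNT queries listed above.
function diagnoseStore(db: ProbeDb): StoreState {
  try {
    const total = db.prepare("SELECT COUNT(*) AS n FROM chunks").get() as { n: number };
    if (total.n === 0) return "empty";
    const embedded = db
      .prepare("SELECT COUNT(*) AS n FROM chunks WHERE embedding IS NOT NULL")
      .get() as { n: number };
    return embedded.n === 0 ? "no-embeddings" : "ok";
  } catch {
    return "unreadable"; // corrupted file, missing table, or permission error
  }
}

// In-memory stand-in so the sketch runs without a real database.
function fakeDb(total: number, embedded: number): ProbeDb {
  return {
    prepare: (sql: string) => ({
      get: () => ({ n: sql.includes("IS NOT NULL") ? embedded : total }),
    }),
  };
}

console.log(diagnoseStore(fakeDb(0, 0)));   // empty
console.log(diagnoseStore(fakeDb(10, 0)));  // no-embeddings
console.log(diagnoseStore(fakeDb(10, 10))); // ok
```

Only the `unreadable` result points at corruption or permissions; the other two failure states are resolved by indexing content into the store.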

Embedding dimension mismatch still occurring

Meaning: Auto-detection ran but didn't catch the issue.

Reasons:

Store has mixed dimensions (multiple embedding models in same store)
Store has invalid embeddings
Store uses a dimension that is not in DIM_TO_MODEL, so the configured default was used
Detection skipped due to error

Actions:

Check logs for detection errors
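Query store (SQLite syntax; byte length divided by 4 gives dimensions): SELECT DISTINCT length(embedding) / 4 AS dims FROM chunks WHERE embedding IS NOT NULL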
Consider rebuilding the store with single model
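
A store with mixed dimensions can slip past detection because the probe reads only a single row. The sketch below shows a fuller check that counts embeddings per dimension. `countByDimension` and `asBlob` are hypothetical helpers, and the blobs are assumed to have already been read from the store as raw bytes.

```typescript
// Hypothetical helper: count stored embeddings per dimension.
function countByDimension(blobs: Uint8Array[]): Map<number, number> {
  const counts = new Map<number, number>();
  for (const blob of blobs) {
    const dims = blob.byteLength / Float32Array.BYTES_PER_ELEMENT;
    counts.set(dims, (counts.get(dims) ?? 0) + 1);
  }
  return counts;
}

// Build a zero-filled float32 blob of the given dimension for the demo.
const asBlob = (dims: number): Uint8Array => new Uint8Array(new Float32Array(dims).buffer);

const blobs = [asBlob(768), asBlob(768), asBlob(4096)];
const counts = countByDimension(blobs);

if (counts.size > 1) {
  // More than one distinct dimension means the store mixes embedding models.
  console.log("Mixed dimensions found:", Array.from(counts.entries()));
}
```

A non-integer key in the result indicates a blob whose byte length is not a multiple of 4, which points at invalid embeddings rather than a second model.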

Performance Impact

Startup Cost

Detection time: ~50-100ms per store (single database query)
When it runs: Once per RAG engine initialization (startup only)
Impact: Negligible (human-imperceptible)

Query Performance

No impact: Detection runs only at startup, never per query
Semantic search: Same speed as before
Response time: Unaffected

Implementation Details

Detection Code

Located in: packages/argonaut/src/core/agent.ts

async function detectEmbeddingModel(store: RAGStore, defaultModel: string): Promise<string> {
  if (typeof (store as any).db?.prepare !== 'function') return defaultModel;
  try {
    const row = (store as any).db.prepare(
      'SELECT embedding FROM chunks WHERE embedding IS NOT NULL AND deleted = 0 LIMIT 1'
    ).get() as { embedding: Buffer } | undefined;

    if (!row) return defaultModel;

    const dims = row.embedding.byteLength / 4;
    const match = DIM_TO_MODEL[dims];

    if (match && match !== defaultModel) {
      console.log(`[Agent] Auto-detected ${dims}-dim embeddings, using ${match} (configured: ${defaultModel})`);
      return match;
    }
  } catch { /* probe failed, use default */ }
  return defaultModel;
}

Dimension Mapping

const DIM_TO_MODEL: Record<number, string> = {
  768: 'nomic-embed-text',        // ~50MB per 10k chunks
  1024: 'qwen3-embedding:0.6b',   // ~68MB per 10k chunks
  4096: 'qwen3-embedding:8b',     // ~270MB per 10k chunks
};
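
As a rough cross-check on the storage comments above, the raw float32 payload can be computed from the dimension alone. The helper name `rawEmbeddingMB` is hypothetical, and the sketch counts only vector bytes. It does not model chunk text, indexes, or database overhead, so its results are a lower bound rather than a replacement for the figures quoted in the mapping.

```typescript
// Raw float32 payload for `chunks` embeddings of `dims` dimensions, in MB (10^6 bytes).
function rawEmbeddingMB(dims: number, chunks: number): number {
  return (dims * Float32Array.BYTES_PER_ELEMENT * chunks) / 1e6;
}

for (const dims of [768, 1024, 4096]) {
  console.log(`${dims}-dim: ${rawEmbeddingMB(dims, 10000).toFixed(1)} MB raw per 10k chunks`);
}
// 768-dim: 30.7 MB raw per 10k chunks
// 1024-dim: 41.0 MB raw per 10k chunks
// 4096-dim: 163.8 MB raw per 10k chunks
```

Storage grows linearly with dimension, so moving a store from 768 to 4096 dimensions multiplies the vector payload by a little over five.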

Best Practices

1. Keep stores consistent

Good: All stores use same model

knowledge store: nomic-embed-text (768-dim)
blog store: nomic-embed-text (768-dim)
vaults store: nomic-embed-text (768-dim)

Okay: Different models, but each detected correctly

knowledge store: nomic-embed-text (768-dim)
blog store: qwen3-embedding:8b (4096-dim)  ← Different, but both work

Avoid: Mixing models in same store

chunks 1-1000: nomic-embed-text (768-dim)
chunks 1001+: qwen3-embedding:8b (4096-dim)  ← ❌ Causes errors

2. Document your model choice

# Knowledge Base

Using `nomic-embed-text` because:
- Fast (CPU-only inference)
- Good quality (OpenAI compatible)
- Small dimensions (768-d, ~50MB per 10k chunks)
- Widely supported

3. Test after model changes

# After updating a store's embedding model:
1. Rebuild the store with new model
2. Restart the Agent (to trigger detection)
3. Run sample RAG queries
4. Verify results quality
5. Check logs for auto-detection message

4. Monitor logs

Watch for auto-detection messages:

✅ Good: No logs = detected model matches configured
⚠️ Okay: Logs show detection = mismatch caught automatically
❌ Bad: Errors about dimension mismatch = detection failed

Deployed Version

Feature: Auto-detection of embedding model dimensions
Commit: 84d182f
Date: 2026-03-08
Status: Production-ready

Last Updated

Documentation: 2026-03-08
Code: 2026-03-08 (commit 84d182f)
