The Knowledge module supports advanced retrieval strategies beyond basic vector similarity search. Combine full-text search (BM25), diversity reranking (MMR), and temporal decay to get better, more diverse, and fresher results.
Hybrid Search (Vector + Full-Text)
Hybrid search merges vector similarity results with BM25 full-text search for better recall, especially for keyword-heavy queries.
HybridSearchConfig
Weight for vector similarity scores in the merged results.
Weight for BM25 full-text search scores.
"rrf" (Reciprocal Rank Fusion) or "weighted" (normalized score combination).
RRF smoothing constant. Higher values reduce the impact of rank differences.
Fetch this many times the limit from FTS to ensure good coverage before merging.
FTSIndex
The full-text search index uses SQLite FTS5 for fast keyword matching.
When used with Knowledge, the FTS index is populated automatically during aadd(), so no manual indexing is needed.
Merge Strategies
RRF (Reciprocal Rank Fusion): the default. Combines rankings from both sources using score = sum(1 / (k + rank)). Works well when score distributions differ between vector and text search.
Weighted: normalizes both score sets to [0, 1], then combines them as score = vector_weight * vector_score + text_weight * text_score. Best when scores are comparable.
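The two merge formulas above can be sketched in plain Python. This is an illustration, not the library's implementation: the function names, the defaults (k=60, weights 0.7 and 0.3), and the use of min-max scaling for the normalization step are all assumptions made for the example.

```python
from collections import defaultdict


def rrf_merge(rankings, k=60):
    """Reciprocal Rank Fusion: score(doc) = sum of 1 / (k + rank) over all lists.

    `rankings` is a list of ranked doc-id lists, best hit first.
    k=60 is an illustrative default, not a documented one.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):  # ranks are 1-based
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def min_max(scores):
    """Rescale a {doc_id: score} map to [0, 1] (assumed normalization)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_merge(vector_scores, text_scores, vector_weight=0.7, text_weight=0.3):
    """score = vector_weight * vector_score + text_weight * text_score."""
    v, t = min_max(vector_scores), min_max(text_scores)
    merged = {
        # A document missing from one source contributes 0 from that source.
        doc_id: vector_weight * v.get(doc_id, 0.0) + text_weight * t.get(doc_id, 0.0)
        for doc_id in set(v) | set(t)
    }
    return sorted(merged.items(), key=lambda item: item[1], reverse=True)


vector_hits = ["a", "b", "c"]  # ranked by vector similarity
text_hits = ["b", "d", "a"]    # ranked by BM25
fused = rrf_merge([vector_hits, text_hits])
print([doc_id for doc_id, _ in fused])  # ['b', 'a', 'd', 'c']
```

RRF only looks at ranks, so it needs no normalization; the weighted variant depends on how the raw scores are rescaled, which is why it suits sources with comparable score ranges.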
MMR (Maximal Marginal Relevance)
MMR reranking balances relevance with diversity. It prevents near-duplicate results by penalizing documents that are too similar to already-selected ones.
Balance between relevance (1.0 = pure relevance) and diversity (0.0 = maximum diversity).
Whether MMR reranking is active.
- Compute a relevance score for each document (cosine similarity to the query embedding, or Jaccard text similarity as a fallback)
- For each selection step, compute MMR = lambda * relevance - (1 - lambda) * max_similarity_to_selected
- Select the document with the highest MMR score
MMR works best with embeddings. If documents lack embeddings, it falls back to Jaccard text similarity, which is less accurate but still provides diversity.
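The selection loop described above can be sketched as a greedy reranker over embeddings. This is a minimal illustration under assumptions: `mmr_rerank`, `cosine`, and the `lambda_` parameter name are hypothetical, and the Jaccard fallback is not shown.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_rerank(query_vec, doc_vecs, lambda_=0.5, limit=None):
    """Return document indices in MMR selection order."""
    limit = len(doc_vecs) if limit is None else limit
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    selected = []
    remaining = list(range(len(doc_vecs)))

    while remaining and len(selected) < limit:

        def mmr_score(i):
            # Penalty is the highest similarity to anything already selected;
            # it is 0 for the first pick because nothing is selected yet.
            max_sim = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            return lambda_ * relevance[i] - (1 - lambda_) * max_sim

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
docs = [
    [1.0, 0.0],   # exact match
    [0.99, 0.1],  # near-duplicate of the first document
    [0.6, 0.8],   # less relevant, but different
]
print(mmr_rerank(query, docs, lambda_=1.0))  # [0, 1, 2]  pure relevance
print(mmr_rerank(query, docs, lambda_=0.3))  # [0, 2, 1]  diversity favored
```

With a low lambda, the near-duplicate drops behind the dissimilar document, which is the behavior the reranker is meant to produce.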
Temporal Decay
Score documents lower as they age. Useful for news, social media, and other time-sensitive content.
Number of days until a document’s score is halved. Smaller values = faster decay.
Whether temporal decay is active.
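A half-life setting is usually read as exponential decay: the score is multiplied by 0.5 for every half-life that has elapsed. The sketch below assumes that form. The function name, parameter names, and the 30-day default are hypothetical, and the library's exact formula may differ.

```python
from datetime import datetime, timedelta, timezone


def decayed_score(score, created_at, half_life_days=30.0, now=None):
    """Multiply `score` by 0.5 for every `half_life_days` of document age."""
    now = now or datetime.now(timezone.utc)
    # Clamp at 0 so timestamps in the future are not boosted.
    age_days = max((now - created_at).total_seconds() / 86400.0, 0.0)
    return score * 0.5 ** (age_days / half_life_days)


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
fresh = decayed_score(1.0, now, now=now)
one_half_life = decayed_score(1.0, now - timedelta(days=30), now=now)
two_half_lives = decayed_score(1.0, now - timedelta(days=60), now=now)
print(fresh, one_half_life, two_half_lives)  # 1.0 0.5 0.25
```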
Timestamps
Temporal decay reads timestamps from document metadata.
Evergreen Documents
Mark documents as evergreen to exempt them from decay.
Combining Strategies
All scoring strategies compose. The search pipeline applies them in order.
Fallback Embedder
Automatically fail over across multiple embedding providers:
- Tries providers in order (primary first)
- Automatically classifies errors (auth, rate limit, timeout, network)
- Switches to the next provider on failure
- Call embedder.reset() to return to the primary provider
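The failover behavior in the list above follows a common pattern, sketched generically here. `FallbackEmbedderSketch` and its providers are hypothetical stand-ins, not the library's class or constructor signature, and error classification is omitted: any exception simply moves on to the next provider.

```python
class FallbackEmbedderSketch:
    """Hypothetical illustration of provider failover; not the library's API."""

    def __init__(self, providers):
        if not providers:
            raise ValueError("at least one provider is required")
        self.providers = list(providers)  # callables: text -> list[float]
        self.active = 0                   # index of the provider in use

    def embed(self, text):
        last_error = None
        # Start from the active provider so a failed primary is not retried
        # on every call; reset() makes it eligible again.
        for index in range(self.active, len(self.providers)):
            try:
                vector = self.providers[index](text)
            except Exception as exc:  # a real embedder would classify this
                last_error = exc
                continue
            self.active = index
            return vector
        raise RuntimeError("all embedding providers failed") from last_error

    def reset(self):
        """Return to the primary provider."""
        self.active = 0


def flaky_primary(text):
    raise ConnectionError("primary provider unreachable")


def stable_secondary(text):
    return [float(len(text)), 1.0]


embedder = FallbackEmbedderSketch([flaky_primary, stable_secondary])
vector = embedder.embed("hello")  # served by the secondary provider
```

Keeping the active index sticky avoids paying the primary's failure latency on each request, which is why an explicit reset step is needed once the primary recovers.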