OpenAIEmbedder
Uses OpenAI’s embedding models. The most common choice.

- Model: the OpenAI embedding model. Options include `text-embedding-3-small`, `text-embedding-3-large`, and `text-embedding-ada-002`.
- Dimensions: output vector dimensions. Defaults to the model’s native dimensions; `text-embedding-3-small` supports up to 1536.
- API key: your OpenAI API key. Defaults to the `OPENAI_API_KEY` environment variable.

Model Comparison
| Model | Dimensions | Performance | Cost |
|---|---|---|---|
| text-embedding-3-large | 3072 | Highest | Higher |
| text-embedding-3-small | 1536 | Good | Low |
| text-embedding-ada-002 | 1536 | Good | Low |
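For example, selecting a model and reducing the output dimensions might look like this. The import path is an assumption (adjust it to your framework's module layout); note that only the v3 models accept reduced dimensions — `text-embedding-ada-002` does not.

```python
# Hypothetical import path -- adjust to your framework's module layout.
from embedders.openai import OpenAIEmbedder

embedder = OpenAIEmbedder(
    model="text-embedding-3-large",
    dimensions=1024,  # optional: truncate below the native 3072
    # api_key is read from OPENAI_API_KEY when not passed explicitly
)
```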
VoyageAIEmbedder
Uses Voyage AI’s embedding models, which excel at domain-specific and multilingual content.

- Model: the Voyage AI model. Options include `voyage-2`, `voyage-large-2`, and others.
- Dimensions: output vector dimensions.
- API key: your Voyage AI API key. Defaults to the `VOYAGE_API_KEY` environment variable.

Requires the `voyageai` package. Install with `pip install voyageai`.

Using Embedders
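Every embedder is constructed the same way; for instance, a Voyage AI embedder. This is a sketch — the import path is an assumption, and the `voyageai` package plus a valid API key are required.

```python
# Hypothetical import path -- adjust to your framework's module layout.
from embedders.voyageai import VoyageAIEmbedder

embedder = VoyageAIEmbedder(
    model="voyage-2",
    # api_key is read from VOYAGE_API_KEY when not passed explicitly
)
```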
With Knowledge
Pass an embedder when creating a knowledge base.

Standalone
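An embedder can also be used on its own, outside a knowledge base. A minimal sketch, assuming an import path like the one below and a valid `OPENAI_API_KEY`:

```python
# Hypothetical import path -- adjust to your framework's module layout.
from embedders.openai import OpenAIEmbedder

embedder = OpenAIEmbedder(model="text-embedding-3-small")

# Single text in, list of floats out (1536 dimensions for this model).
vector = embedder.get_embedding("The quick brown fox")

# When you need token accounting, ask for usage stats as well.
vector, usage = embedder.get_embedding_and_usage("The quick brown fox")
```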
Generate embeddings directly with `get_embedding()`.

Batch Embedding
Embed multiple texts efficiently in a single API call.

FallbackEmbedder
Automatically fails over across multiple embedding providers. If the primary provider fails (rate limit, auth error, timeout), the next one is tried. The embedder inherits its dimensions from the primary provider and switches providers automatically on failure. Call `embedder.reset()` to return to the primary provider.
Errors are classified by type (auth, rate limit, timeout, network) using duck typing on exception class names and messages — no provider SDK imports needed.
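The failover behavior can be sketched in plain Python. This is an illustrative re-implementation of the semantics described above, not the library's actual code:

```python
from typing import Callable, List

class FallbackEmbedder:
    """Illustrative sketch of the failover behavior (not the library's code)."""

    def __init__(self, providers: List[Callable[[str], List[float]]]):
        self.providers = providers
        self.current = 0  # index of the provider currently in use

    def get_embedding(self, text: str) -> List[float]:
        # Try the current provider first; on any failure, advance to the next.
        for i in range(self.current, len(self.providers)):
            try:
                vector = self.providers[i](text)
                self.current = i  # stay on the first provider that succeeds
                return vector
            except Exception:
                continue
        raise RuntimeError("All embedding providers failed")

    def reset(self) -> None:
        # Return to the primary provider, e.g. after a rate limit clears.
        self.current = 0

# Two toy providers: the primary always times out, the fallback works.
def primary(text: str) -> List[float]:
    raise TimeoutError("primary provider timed out")

def fallback(text: str) -> List[float]:
    return [0.0, 1.0]

embedder = FallbackEmbedder([primary, fallback])
print(embedder.get_embedding("hello"))  # prints [0.0, 1.0]
embedder.reset()  # primary will be retried on the next call
```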
Creating a Custom Embedder
Subclass Embedder and implement the embedding methods:
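A minimal, self-contained sketch. The `Embedder` base class below is a stand-in for the one the library ships; the method names follow the interface table, and the hash-based embedding is a toy for illustration only:

```python
import hashlib
from typing import List, Tuple

class Embedder:
    """Stand-in for the library's Embedder base class (illustrative only)."""
    dimensions: int = 8

class HashEmbedder(Embedder):
    """Toy embedder: deterministic vectors from a SHA-256 digest.

    Handy for offline tests; the vectors carry no semantic meaning."""

    def get_embedding(self, text: str) -> List[float]:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        # Map the first `dimensions` bytes onto the range [0, 1).
        return [b / 256 for b in digest[: self.dimensions]]

    async def async_get_embedding(self, text: str) -> List[float]:
        return self.get_embedding(text)

    def get_embedding_and_usage(self, text: str) -> Tuple[List[float], dict]:
        # No API call is made, so report zero token usage.
        return self.get_embedding(text), {"total_tokens": 0}

    async def async_get_embedding_and_usage(self, text: str) -> Tuple[List[float], dict]:
        return self.get_embedding_and_usage(text)

embedder = HashEmbedder()
vector = embedder.get_embedding("hello")
print(len(vector))  # prints 8 -- matches embedder.dimensions
```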
Embedder Interface
| Method | Description |
|---|---|
| `get_embedding(text) -> List[float]` | Get embedding synchronously |
| `async_get_embedding(text) -> List[float]` | Get embedding asynchronously |
| `get_embedding_and_usage(text)` | Get embedding with usage stats |
| `async_get_embedding_and_usage(text)` | Async variant with usage stats |