Embedders convert text into high-dimensional vectors. Similar texts produce similar vectors, which enables semantic search in a vector database.
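As a concrete illustration of "similar vectors": semantic search typically ranks candidates by cosine similarity between embedding vectors. A minimal sketch, independent of any particular embedder:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Cosine similarity: 1.0 for identical directions, near 0.0 for
    # orthogonal (semantically unrelated) vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
```

A vector database applies the same idea at scale, comparing a query vector against every stored document vector.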

OpenAIEmbedder

Uses OpenAI’s embedding models. The most common choice.
from definable.embedder import OpenAIEmbedder

embedder = OpenAIEmbedder(
    id="text-embedding-3-small",
    dimensions=1536,
)
id (str, default "text-embedding-3-small")
    OpenAI embedding model. Options include text-embedding-3-small, text-embedding-3-large, and text-embedding-ada-002.

dimensions (int)
    Output vector dimensions. Defaults to the model's native dimensions; text-embedding-3-small supports up to 1536.

api_key (str)
    OpenAI API key. Defaults to the OPENAI_API_KEY environment variable.

Model Comparison

Model                   Dimensions  Performance  Cost
text-embedding-3-large  3072        Highest      Higher
text-embedding-3-small  1536        Good         Low
text-embedding-ada-002  1536        Good         Low

VoyageAIEmbedder

Uses Voyage AI’s embedding models, which excel at domain-specific and multilingual content.
from definable.embedder import VoyageAIEmbedder

embedder = VoyageAIEmbedder(
    id="voyage-2",
    dimensions=1024,
)
id (str, default "voyage-2")
    Voyage AI model. Options include voyage-2, voyage-large-2, and others.

dimensions (int, default 1024)
    Output vector dimensions.

api_key (str)
    Voyage AI API key. Defaults to the VOYAGE_API_KEY environment variable.
Requires the voyageai package. Install with pip install voyageai.

Using Embedders

With Knowledge

Pass an embedder when creating a knowledge base:
from definable.embedder import OpenAIEmbedder
from definable.knowledge import Knowledge
from definable.vectordb import InMemoryVectorDB

knowledge = Knowledge(
    vector_db=InMemoryVectorDB(),
    embedder=OpenAIEmbedder(),
)

Standalone

Generate embeddings directly:
embedding = embedder.get_embedding("Hello, world!")
print(len(embedding))  # 1536

Batch Embedding

Embed multiple texts efficiently in a single API call:
texts = ["First document", "Second document", "Third document"]
embeddings, usages = await embedder.async_get_embeddings_batch_and_usage(texts)
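The batch call returns one embedding per input text plus per-item usage. The sketch below shows one way to consume both; the stub embedder stands in for a real provider, and the exact shape of the usage objects may differ in practice:

```python
import asyncio

class StubEmbedder:
    # Stand-in for a real embedder: returns fixed-size zero vectors and
    # a word count per text, mimicking the batch interface.
    dimensions = 4

    async def async_get_embeddings_batch_and_usage(self, texts):
        embeddings = [[0.0] * self.dimensions for _ in texts]
        usages = [{"tokens": len(t.split())} for t in texts]
        return embeddings, usages

async def main():
    texts = ["First document", "Second document", "Third document"]
    embedder = StubEmbedder()
    embeddings, usages = await embedder.async_get_embeddings_batch_and_usage(texts)
    # Pair each text with its vector and tally token usage across the batch.
    by_text = dict(zip(texts, embeddings))
    total_tokens = sum(u["tokens"] for u in usages)
    return by_text, total_tokens

by_text, total_tokens = asyncio.run(main())
```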

FallbackEmbedder

Automatically fail over across multiple embedding providers. If the primary provider fails (rate limit, auth error, timeout), the next one is tried.
from definable.knowledge import FallbackEmbedder
from definable.embedder import OpenAIEmbedder, VoyageAIEmbedder

embedder = FallbackEmbedder(providers=[
    OpenAIEmbedder(),       # Primary
    VoyageAIEmbedder(),     # Fallback
])

# Use with Knowledge
knowledge = Knowledge(
    vector_db=InMemoryVectorDB(),
    embedder=embedder,
)
The fallback embedder inherits dimensions from the primary provider and automatically switches providers on failure. Call embedder.reset() to return to the primary provider.
Errors are classified by type (auth, rate limit, timeout, network) using duck typing on exception class names and messages — no provider SDK imports needed.
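The classifier sketched below illustrates the duck-typing idea; the category names and matching rules here are illustrative, not the library's actual implementation:

```python
def classify_error(exc: Exception) -> str:
    # Classify a provider error by inspecting the exception's class name
    # and message -- no provider SDK imports required.
    text = f"{type(exc).__name__} {exc}".lower()
    if "auth" in text or "401" in text or "api key" in text:
        return "auth"
    if "rate" in text or "429" in text:
        return "rate_limit"
    if "timeout" in text or "timed out" in text:
        return "timeout"
    if "connection" in text or "network" in text:
        return "network"
    return "unknown"
```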

Creating a Custom Embedder

Subclass Embedder and implement the embedding methods:
from definable.knowledge.embedder import Embedder

from functools import lru_cache

@lru_cache(maxsize=1)
def _load_model():
    # Load the sentence-transformers model once, not on every call.
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer("all-MiniLM-L6-v2")

class LocalEmbedder(Embedder):
    dimensions: int = 384

    def get_embedding(self, text: str) -> list[float]:
        return _load_model().encode(text).tolist()

    async def async_get_embedding(self, text: str) -> list[float]:
        return self.get_embedding(text)

Embedder Interface

Method                                        Description
get_embedding(text) -> List[float]            Get embedding synchronously
async_get_embedding(text) -> List[float]      Get embedding asynchronously
get_embedding_and_usage(text)                 Get embedding with usage stats
async_get_embedding_and_usage(text)           Async variant with usage stats
Make sure the dimensions on your embedder match the dimensions on your vector database. Mismatched dimensions will cause errors during search.
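A guard like the following (a hypothetical helper, not part of the library) catches a mismatch at setup time, before any documents are indexed:

```python
def check_dimensions(embedder_dims: int, vector_db_dims: int) -> None:
    # Fail fast: a mismatch here would otherwise surface later as
    # opaque errors during search.
    if embedder_dims != vector_db_dims:
        raise ValueError(
            f"Embedder outputs {embedder_dims}-dim vectors but the "
            f"vector DB expects {vector_db_dims}-dim vectors."
        )

check_dimensions(1536, 1536)  # OK: dimensions agree
```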