Vector databases store document embeddings and enable fast similarity search. Definable includes two implementations and a base class for building your own.

InMemoryVectorDB

Stores everything in memory. Great for development, testing, and small datasets.
from definable.knowledge import InMemoryVectorDB

vector_db = InMemoryVectorDB(
    collection_name="my_docs",
    dimensions=1536,
)
Parameters:
  • collection_name (str, default: "default"): Name for this collection of documents.
  • dimensions (int, default: 1536): Vector dimensions. Must match your embedder's output.
Characteristics:
  • No external dependencies
  • Uses cosine similarity for search
  • Supports metadata filtering
  • Data is lost when the process exits
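To make "cosine similarity for search" concrete, here is a minimal brute-force sketch. It is not InMemoryVectorDB's actual code; the `store` dictionary and both function names are hypothetical stand-ins for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        # Mirrors the "dimensions must match your embedder" requirement.
        raise ValueError(f"dimension mismatch: {len(a)} != {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch; zero vectors have no direction
    return dot / (norm_a * norm_b)


def brute_force_search(
    store: Dict[str, List[float]], query: Sequence[float], top_k: int = 10
) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and keep the best top_k."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical two-dimensional embeddings keyed by document ID.
store = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
top = brute_force_search(store, [1.0, 0.1], top_k=2)
```

Scanning every vector costs time linear in the number of documents, which is consistent with the advice to keep in-memory stores for small datasets.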

PgVectorDB

Uses PostgreSQL with the pgvector extension. Suitable for production workloads with persistent storage and scalable search.
from definable.knowledge import PgVectorDB

vector_db = PgVectorDB(
    connection_string="postgresql://user:pass@localhost:5432/mydb",
    table_name="documents",
    dimensions=1536,
)
Parameters:
  • connection_string (str): PostgreSQL connection string. Defaults to the DATABASE_URL environment variable.
  • table_name (str, default: "documents"): Database table name for storing documents and embeddings.
  • dimensions (int, default: 1536): Vector dimensions. Must match your embedder's output.
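The DATABASE_URL fallback can be pictured with a small helper. This is a sketch of the behavior described above, not PgVectorDB's own code; `resolve_connection_string` and its `env` parameter are hypothetical names.

```python
import os
from typing import Mapping, Optional


def resolve_connection_string(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Prefer an explicit connection string, else fall back to DATABASE_URL."""
    env = os.environ if env is None else env
    value = explicit or env.get("DATABASE_URL")
    if not value:
        raise ValueError(
            "No connection string given and DATABASE_URL is not set"
        )
    return value


# An explicit value wins over the environment.
explicit = resolve_connection_string(
    "postgresql://user:pass@localhost:5432/mydb",
    env={"DATABASE_URL": "postgresql://other/db"},
)
# With no explicit value, the environment variable is used.
from_env = resolve_connection_string(env={"DATABASE_URL": "postgresql://other/db"})
```

Keeping credentials in the environment rather than in source code is the usual reason to rely on this fallback in production.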
Requires psycopg[binary] and pgvector. Install with:
pip install "psycopg[binary]" pgvector
Your PostgreSQL instance must have the pgvector extension enabled:
CREATE EXTENSION IF NOT EXISTS vector;
Characteristics:
  • Persistent storage across restarts
  • Uses an IVFFlat index for fast approximate nearest-neighbor search
  • Supports metadata filtering
  • Scales to millions of documents
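For readers unfamiliar with IVFFlat, the sketch below builds the kind of pgvector SQL such an index and query typically involve. The `ivfflat` index type, the `vector_cosine_ops` operator class, the `lists` option, and the `<=>` cosine-distance operator are part of pgvector. The column names (`id`, `content`, `embedding`), the helper functions, and the assumption that PgVectorDB issues statements like these are illustrative only.

```python
import re

# Only plain identifiers are interpolated into SQL text; values use placeholders.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _check_identifier(name: str) -> str:
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unsafe SQL identifier: {name!r}")
    return name


def ivfflat_index_sql(table: str, column: str = "embedding", lists: int = 100) -> str:
    """CREATE INDEX statement for an IVFFlat index using cosine distance."""
    table, column = _check_identifier(table), _check_identifier(column)
    return (
        f"CREATE INDEX IF NOT EXISTS {table}_{column}_ivfflat_idx "
        f"ON {table} USING ivfflat ({column} vector_cosine_ops) "
        f"WITH (lists = {int(lists)});"
    )


def nearest_neighbors_sql(table: str, column: str = "embedding") -> str:
    """Top-k query ordered by cosine distance.

    Placeholders, in order: query vector, query vector, limit.
    """
    table, column = _check_identifier(table), _check_identifier(column)
    return (
        f"SELECT id, content, 1 - ({column} <=> %s::vector) AS similarity "
        f"FROM {table} ORDER BY {column} <=> %s::vector LIMIT %s;"
    )


index_sql = ivfflat_index_sql("documents")
query_sql = nearest_neighbors_sql("documents")
```

IVFFlat groups vectors into `lists` clusters and searches only the closest ones, trading some recall for speed. pgvector's `ivfflat.probes` setting controls how many clusters each query visits.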

Using with Knowledge

from definable.knowledge import Knowledge, OpenAIEmbedder, InMemoryVectorDB

knowledge = Knowledge(
    vector_db=InMemoryVectorDB(),
    embedder=OpenAIEmbedder(),
)

VectorDB Interface

Both implementations share the same interface:
| Method | Description |
| --- | --- |
| add(documents) -> List[str] | Add documents; returns their IDs |
| aadd(documents) -> List[str] | Async add |
| search(query_embedding, top_k, filter) | Search by vector similarity |
| asearch(query_embedding, top_k, filter) | Async search |
| delete(ids) | Delete documents by ID |
| clear() | Remove all documents |
| count() -> int | Number of stored documents |
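The sync and async method pairs can be related as in the sketch below, where the async variants delegate to the sync ones on a worker thread. This is one possible design and not necessarily how Definable implements aadd and asearch. `AsyncWrappersMixin` and `ListStore` are hypothetical names, and `ListStore.search` is a stub that ignores the embedding.

```python
import asyncio
from typing import Any, Dict, List, Optional


class AsyncWrappersMixin:
    """Hypothetical mixin deriving aadd/asearch from the sync methods."""

    async def aadd(self, documents: List[Any]) -> List[str]:
        # Run the blocking call on a worker thread (Python 3.9+).
        return await asyncio.to_thread(self.add, documents)

    async def asearch(
        self,
        query_embedding: List[float],
        top_k: int = 10,
        filter: Optional[Dict[str, Any]] = None,
    ) -> List[Any]:
        return await asyncio.to_thread(self.search, query_embedding, top_k, filter)


class ListStore(AsyncWrappersMixin):
    """Stub store: keeps documents in insertion order, no real similarity."""

    def __init__(self) -> None:
        self._docs: Dict[str, Any] = {}

    def add(self, documents: List[Any]) -> List[str]:
        ids = []
        for doc in documents:
            doc_id = str(len(self._docs))
            self._docs[doc_id] = doc
            ids.append(doc_id)
        return ids

    def search(self, query_embedding, top_k=10, filter=None) -> List[Any]:
        # Stub: returns the first top_k documents regardless of the query.
        return list(self._docs.values())[:top_k]


async def main():
    store = ListStore()
    ids = await store.aadd(["alpha", "beta"])
    hits = await store.asearch([0.0], top_k=1)
    return ids, hits


ids, hits = asyncio.run(main())
```

A store backed by a natively asynchronous client would more likely implement the async methods directly instead of wrapping the sync ones.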

Metadata Filtering

Filter search results by document metadata:
results = knowledge.search(
    "Python tutorials",
    filter={"category": "tutorial", "language": "python"},
)
The exact filter syntax depends on the vector database implementation.
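One common interpretation of a filter like the one above is exact matching on every key. The sketch below assumes that interpretation. `matches_filter` and the sample metadata are hypothetical, and a given implementation may support richer operators.

```python
from typing import Any, Dict, List, Optional


def matches_filter(
    metadata: Dict[str, Any], filter: Optional[Dict[str, Any]] = None
) -> bool:
    """True when every filter key is present in metadata with an equal value."""
    if not filter:
        return True  # no filter means every document qualifies
    return all(
        key in metadata and metadata[key] == value
        for key, value in filter.items()
    )


# Hypothetical metadata for three documents.
docs: List[Dict[str, Any]] = [
    {"category": "tutorial", "language": "python"},
    {"category": "tutorial", "language": "go"},
    {"category": "reference"},
]
wanted = {"category": "tutorial", "language": "python"}
kept = [meta for meta in docs if matches_filter(meta, wanted)]
```

Under this reading, a document missing a filtered key is excluded rather than treated as a wildcard match.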

Creating a Custom VectorDB

Subclass VectorDB to integrate your preferred vector store:
from definable.knowledge.vector_dbs import VectorDB
from definable.knowledge import Document

class QdrantVectorDB(VectorDB):
    collection_name: str = "default"
    dimensions: int = 1536

    def __init__(self, url: str = "localhost", port: int = 6333, **kwargs):
        super().__init__(**kwargs)
        from qdrant_client import QdrantClient
        self.client = QdrantClient(url=url, port=port)

    def add(self, documents: list[Document]) -> list[str]:
        # Implementation: upsert points to Qdrant
        ...

    def search(self, query_embedding, top_k=10, filter=None) -> list[Document]:
        # Implementation: search Qdrant collection
        ...

    def delete(self, ids: list[str]) -> None:
        # Implementation: delete points from Qdrant
        ...

    def clear(self) -> None:
        # Implementation: delete and recreate collection
        ...

    def count(self) -> int:
        # Implementation: count points in collection
        ...
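Once the five methods are filled in, a short smoke test helps confirm that a custom store behaves as the interface expects. The sketch below is self-contained so it can run without Definable installed. `Doc` is a stand-in for Document with assumed fields, and `DictVectorDB` is a hypothetical dictionary-backed store that ranks by dot product.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Doc:
    """Stand-in for definable's Document; these fields are assumptions."""
    content: str
    embedding: List[float]
    id: str = field(default_factory=lambda: uuid.uuid4().hex)


class DictVectorDB:
    """Hypothetical minimal store implementing the five core methods."""

    def __init__(self) -> None:
        self._docs: Dict[str, Doc] = {}

    def add(self, documents: List[Doc]) -> List[str]:
        for doc in documents:
            self._docs[doc.id] = doc
        return [doc.id for doc in documents]

    def search(self, query_embedding, top_k=10, filter=None) -> List[Doc]:
        # Rank by dot product; the filter is ignored in this sketch.
        def score(doc: Doc) -> float:
            return sum(q * x for q, x in zip(query_embedding, doc.embedding))
        return sorted(self._docs.values(), key=score, reverse=True)[:top_k]

    def delete(self, ids: List[str]) -> None:
        for doc_id in ids:
            self._docs.pop(doc_id, None)

    def clear(self) -> None:
        self._docs.clear()

    def count(self) -> int:
        return len(self._docs)


def smoke_test(db) -> bool:
    """Exercise add, count, search, delete, and clear in sequence."""
    db.clear()
    ids = db.add([Doc("a", [1.0, 0.0]), Doc("b", [0.0, 1.0])])
    assert db.count() == 2
    hits = db.search([1.0, 0.0], top_k=1)
    assert len(hits) == 1 and hits[0].content == "a"
    db.delete(ids[:1])
    assert db.count() == 1
    db.clear()
    assert db.count() == 0
    return True


ok = smoke_test(DictVectorDB())
```

The same `smoke_test` function could be pointed at a real subclass such as the QdrantVectorDB outline above, provided its documents carry the attributes the test reads.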

Choosing a Vector Database

| | InMemoryVectorDB | PgVectorDB |
| --- | --- | --- |
| Setup | None | Requires PostgreSQL |
| Persistence | No | Yes |
| Scale | Thousands of docs | Millions of docs |
| Best for | Development, testing, demos | Production |