Vector databases store document embeddings and enable fast similarity search. Definable includes seven implementations in the definable.vectordb module and a base class for custom backends.
All vector DB classes are imported from definable.vectordb, not definable.knowledge. The definable.knowledge module re-exports InMemoryVectorDB for backward compatibility but will show a deprecation warning.
## InMemoryVectorDB

Stores everything in memory. Great for development, testing, and small datasets.

```python
from definable.vectordb import InMemoryVectorDB

vector_db = InMemoryVectorDB(name="my_docs")
```
Characteristics:
- No external dependencies beyond numpy
- Uses cosine similarity for search
- Data is lost when the process exits
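The cosine-similarity ranking described above can be illustrated with a small, self-contained sketch. This is plain Python for illustration, not Definable's internal implementation:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Rank stored embeddings against a query embedding, highest similarity first.
store = {"doc_a": [1.0, 0.0], "doc_b": [0.7, 0.7], "doc_c": [0.0, 1.0]}
query = [1.0, 0.1]
ranked = sorted(store, key=lambda k: cosine_similarity(query, store[k]), reverse=True)
print(ranked)  # doc_a points closest to the query direction
```

Cosine similarity compares direction rather than magnitude, which is why it works well for embeddings of different lengths and scales.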
## PgVector

Uses PostgreSQL with the pgvector extension. Suitable for production workloads with persistent storage and scalable search.

```python
from definable.vectordb import PgVector

vector_db = PgVector(
    db_url="postgresql://user:pass@localhost:5432/mydb",
    table_name="documents",
)
```
Requires `psycopg[binary]` and `pgvector`. Install with `pip install "psycopg[binary]" pgvector`.

Your PostgreSQL instance must also have the pgvector extension enabled: `CREATE EXTENSION IF NOT EXISTS vector;`
## Qdrant

High-performance vector search engine.

```python
from definable.vectordb import Qdrant

vector_db = Qdrant(
    url="localhost",
    port=6333,
    collection="my_docs",
    dimensions=1536,
)
```
## ChromaDb

Chroma vector store, running in-memory or persisted to a local directory.

```python
from definable.vectordb import ChromaDb

vector_db = ChromaDb(
    collection="my_docs",
    path="./chroma_data",  # Omit for in-memory mode
)
```
## MongoDb

MongoDB Atlas vector search.

```python
from definable.vectordb import MongoDb

vector_db = MongoDb(
    connection_string="mongodb+srv://...",
    database="mydb",
    collection="documents",
    dimensions=1536,
)
```
## RedisDB

Redis with RediSearch for vector similarity.

```python
from definable.vectordb import RedisDB

vector_db = RedisDB(
    url="redis://localhost:6379",
    index_name="my_docs",
    dimensions=1536,
)
```
## PineconeDb

Pinecone managed vector database.

```python
from definable.vectordb import PineconeDb

vector_db = PineconeDb(
    api_key="your-pinecone-api-key",
    index_name="my_docs",
    dimensions=1536,
)
```
## Using with Knowledge

Pass any vector DB instance to Knowledge:

```python
from definable.embedder import OpenAIEmbedder
from definable.knowledge import Knowledge
from definable.vectordb import InMemoryVectorDB

knowledge = Knowledge(
    vector_db=InMemoryVectorDB(),
    embedder=OpenAIEmbedder(),
)
```
## VectorDB Interface

All implementations share the same base interface from definable.vectordb.VectorDB:

| Method | Description |
|---|---|
| `create()` | Create the collection / table if it doesn't exist |
| `insert(content_hash, documents)` | Insert pre-embedded documents |
| `upsert(content_hash, documents)` | Insert or update pre-embedded documents |
| `search(query, limit, filters)` | Search by text query (backend embeds internally) |
| `count() -> int` | Number of stored documents |
| `delete_by_id(id)` | Delete a document by its ID |
| `delete()` | Delete the entire collection / table |
| `drop()` | Drop the collection / table from the backend |
| `ainsert(content_hash, documents)` | Async insert |
| `asearch(query, limit, filters)` | Async search |
## Creating a Custom VectorDB

Subclass VectorDB from definable.vectordb to integrate any vector store. The key abstract methods to implement are:

```python
from definable.knowledge import Document
from definable.vectordb import VectorDB


class MyVectorDB(VectorDB):
    def create(self) -> None:
        # Create the collection/table if it doesn't exist
        ...

    async def async_create(self) -> None:
        self.create()

    def insert(self, content_hash: str, documents: list[Document], filters=None) -> None:
        # Store pre-embedded documents
        ...

    async def async_insert(self, content_hash: str, documents: list[Document], filters=None) -> None:
        self.insert(content_hash, documents, filters)

    def upsert(self, content_hash: str, documents: list[Document], filters=None) -> None:
        self.insert(content_hash, documents, filters)

    async def async_upsert(self, content_hash: str, documents: list[Document], filters=None) -> None:
        self.upsert(content_hash, documents, filters)

    def search(self, query: str, limit: int = 5, filters=None) -> list[Document]:
        # Embed the query and run a similarity search
        ...

    async def async_search(self, query: str, limit: int = 5, filters=None) -> list[Document]:
        return self.search(query, limit, filters)

    def get_count(self) -> int:
        ...

    def delete(self) -> bool:
        # Delete the entire collection
        ...

    def delete_by_id(self, id: str) -> bool:
        ...

    def delete_by_name(self, name: str) -> bool:
        ...

    def delete_by_metadata(self, metadata: dict) -> bool:
        ...

    def delete_by_content_id(self, content_id: str) -> bool:
        ...

    def drop(self) -> None:
        ...

    async def async_drop(self) -> None:
        self.drop()

    def exists(self) -> bool:
        ...

    async def async_exists(self) -> bool:
        return self.exists()

    def name_exists(self, name: str) -> bool:
        ...

    async def async_name_exists(self, name: str) -> bool:
        ...

    def id_exists(self, id: str) -> bool:
        ...

    def content_hash_exists(self, content_hash: str) -> bool:
        ...

    def get_supported_search_types(self) -> list[str]:
        return ["vector"]
```
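The skeleton above delegates each async method to its sync counterpart, which is the simplest correct bridge when the underlying client is synchronous. A standalone sketch of that pattern (plain Python, no Definable imports; `SyncBackedStore` is a hypothetical name for illustration):

```python
import asyncio


class SyncBackedStore:
    """Sync methods do the real work; async methods just delegate."""

    def __init__(self) -> None:
        self._docs: dict[str, str] = {}

    def insert(self, doc_id: str, text: str) -> None:
        self._docs[doc_id] = text

    async def async_insert(self, doc_id: str, text: str) -> None:
        # Fine for fast calls; consider asyncio.to_thread(self.insert, ...)
        # if the sync call blocks the event loop for a long time.
        self.insert(doc_id, text)

    def get_count(self) -> int:
        return len(self._docs)


async def main() -> int:
    store = SyncBackedStore()
    await store.async_insert("a", "hello")
    await store.async_insert("b", "world")
    return store.get_count()

print(asyncio.run(main()))  # 2
```

Delegating keeps sync and async behavior identical with no duplicated logic; only swap in a real async client (or `asyncio.to_thread`) when calls are genuinely I/O-heavy.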
## Choosing a Vector Database

| | InMemoryVectorDB | PgVector | Qdrant | ChromaDb | MongoDb | RedisDB | PineconeDb |
|---|---|---|---|---|---|---|---|
| Setup | None | PostgreSQL + pgvector | Qdrant server | None (in-memory) or local dir | MongoDB Atlas | Redis + RediSearch | Managed |
| Persistence | No | Yes | Yes | Optional | Yes | Yes | Yes |
| Scale | Thousands | Millions | Millions | Thousands to millions | Millions | Millions | Billions |
| Best for | Dev, testing | Existing PG infra | High performance | Local dev | Existing Mongo | Low latency | Serverless |