CognitiveMemory gives agents the ability to remember information across conversations. It stores episodes (conversation turns), extracts knowledge atoms (facts), learns procedures (behavioral patterns), and predicts topic transitions — all automatically.

Quick Example

from definable.agents import Agent
from definable.memory import CognitiveMemory, SQLiteMemoryStore
from definable.models import OpenAIChat

store = SQLiteMemoryStore(db_path="./memory.db")
memory = CognitiveMemory(store=store)

agent = Agent(
  model=OpenAIChat(id="gpt-4o"),
  instructions="You are a helpful assistant.",
  memory=memory,
)

# First conversation — memory stores what it learns
output = agent.run("My name is Alice and I work at Acme Corp.", user_id="alice")

# Later conversation — memory recalls relevant context
output = agent.run("Where do I work?", user_id="alice")
print(output.content)  # "You work at Acme Corp."

How It Works

Before each model call, the agent recalls relevant memories and injects them into the system prompt. After each response, the agent stores the conversation as new episodes in the background (fire-and-forget).
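This loop can be sketched in miniature. The sketch below is a self-contained illustration, not the real CognitiveMemory implementation: ToyMemory, its word-overlap "relevance", and the run_turn helper are all invented stand-ins for the recall/inject/store pattern described above.

```python
import re

class ToyMemory:
    """Illustrative stand-in for CognitiveMemory (not the real API)."""

    def __init__(self):
        self.episodes = []  # list of (user_id, text)

    def recall(self, query, user_id):
        # Naive relevance: keep episodes sharing any word with the query.
        words = set(re.findall(r"\w+", query.lower()))
        return "\n".join(
            text for uid, text in self.episodes
            if uid == user_id and words & set(re.findall(r"\w+", text.lower()))
        )

    def store(self, text, user_id):
        # The real system does this in the background (fire-and-forget).
        self.episodes.append((user_id, text))

def run_turn(memory, instructions, user_message, user_id):
    # 1. Recall relevant memories and inject them into the system prompt.
    context = memory.recall(user_message, user_id)
    system_prompt = instructions
    if context:
        system_prompt += "\n<memory>\n" + context + "\n</memory>"
    # 2. The model call would happen here, using system_prompt.
    # 3. Store the new turn as an episode for future recall.
    memory.store(user_message, user_id)
    return system_prompt

memory = ToyMemory()
run_turn(memory, "You are a helpful assistant.",
         "My name is Alice and I work at Acme Corp.", "alice")
prompt = run_turn(memory, "You are a helpful assistant.",
                  "Where do I work?", "alice")
# The second turn's system prompt now carries the remembered fact.
```

The real system does the same dance asynchronously, with embedding-based retrieval and token budgeting instead of word overlap.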

Architecture

CognitiveMemory has four memory tiers:
| Tier | Type | Description |
| --- | --- | --- |
| Episodes | Raw conversations | Individual messages with embeddings, topics, and sentiment |
| Knowledge Atoms | Extracted facts | Subject-predicate-object triples (e.g., "Alice works-at Acme Corp") |
| Procedures | Behavioral patterns | Trigger-action pairs learned from repeated behavior |
| Topic Transitions | Predictive | Tracks which topics follow which, for anticipatory recall |
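To make the knowledge-atom tier concrete, here is a minimal sketch of a triple and a wildcard lookup over it. This is purely illustrative data modeling, not the library's internal schema; the Atom type and match helper are invented for this example.

```python
from typing import NamedTuple

class Atom(NamedTuple):
    # One knowledge atom: a subject-predicate-object triple.
    subject: str
    predicate: str
    obj: str

atoms = [
    Atom("Alice", "works-at", "Acme Corp"),
    Atom("Alice", "lives-in", "Berlin"),  # toy data for illustration
]

def match(atoms, subject=None, predicate=None):
    # Return atoms whose fields equal the given filters (None = wildcard).
    return [a for a in atoms
            if (subject is None or a.subject == subject)
            and (predicate is None or a.predicate == predicate)]
```

A query like match(atoms, subject="Alice", predicate="works-at") answers "where does Alice work?" without re-reading raw conversation text.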

CognitiveMemory Constructor

from definable.memory import CognitiveMemory

memory = CognitiveMemory(
  store=store,                    # Required: MemoryStore backend
  token_budget=500,               # Max tokens for recalled context (default: 500)
  embedder=None,                  # Optional: Embedder for semantic search
  distillation_model=None,        # Optional: Model for summarization/extraction
  config=None,                    # Optional: MemoryConfig for fine-tuning
)
store (MemoryStore, required)
The storage backend. See Memory Stores for available options.

token_budget (int, default: 500)
Maximum number of tokens to include in the recalled memory context injected into the system prompt.

embedder (Embedder, default: None)
An embedder for generating vector representations. When None, semantic search falls back to text matching on recent episodes.

distillation_model (Model, default: None)
A model used to compress episodes into summaries, facts, and atoms. When None, distillation is skipped.

config (MemoryConfig, default: None)
Fine-tuning configuration. See MemoryConfig below.

Public Methods

| Method | Description |
| --- | --- |
| await recall(query, *, user_id=None, session_id=None) | Retrieve relevant memories as a MemoryPayload |
| await store_messages(messages, *, user_id=None, session_id=None) | Store conversation messages as episodes |
| await run_distillation(*, user_id=None) | Manually trigger distillation (episode compression) |
| await forget(*, user_id=None, session_id=None) | Delete stored memories for a user or session |
| await close() | Close the underlying store connection |
CognitiveMemory can also be used as an async context manager (async with):
async with CognitiveMemory(store=store) as memory:
    payload = await memory.recall("What's Alice's job?", user_id="alice")
    print(payload.context)

MemoryPayload

The recall() method returns a MemoryPayload:
| Field | Type | Description |
| --- | --- | --- |
| context | str | Formatted XML string ready for system prompt injection |
| tokens_used | int | Number of tokens in the context |
| chunks_included | int | Number of memory chunks included |
| chunks_available | int | Total chunks available before budget trimming |
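The relationship between these fields follows from budget trimming. The sketch below shows one plausible greedy cut-off (an assumption — the library's actual trimming strategy may differ): chunks are included in relevance order until the next one would exceed token_budget.

```python
def trim_to_budget(chunks, token_budget):
    # chunks: list of (text, token_count), assumed pre-sorted by relevance.
    included, used = [], 0
    for text, tokens in chunks:
        if used + tokens > token_budget:
            break  # greedy cut-off once the budget would be exceeded
        included.append(text)
        used += tokens
    return {
        "context": "\n".join(included),
        "tokens_used": used,
        "chunks_included": len(included),
        "chunks_available": len(chunks),
    }

payload = trim_to_budget(
    [("Alice works at Acme Corp.", 300),
     ("Alice likes coffee.", 150),
     ("Alice met Bob on Tuesday.", 200)],
    token_budget=500,
)
# Two chunks fit (450 tokens); the third would push past the 500 budget.
```

Comparing chunks_included against chunks_available tells you how much relevant memory was dropped to fit the budget.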

MemoryConfig

Fine-tune memory behavior with MemoryConfig:
from definable.memory import MemoryConfig, ScoringWeights

config = MemoryConfig(
  decay_half_life_days=14.0,          # Recency decay half-life
  scoring_weights=ScoringWeights(
    semantic_similarity=0.35,          # How relevant to current query
    recency=0.25,                      # How recent the memory is
    access_frequency=0.15,             # How often accessed
    predicted_need=0.15,               # Topic transition prediction
    emotional_salience=0.10,           # Sentiment intensity
  ),
  retrieval_top_k=20,                  # Candidates to score
  recent_episodes_limit=5,            # Always include N most recent
  distillation_stage_0_age=3600.0,    # Raw → summary after 1 hour
  distillation_stage_1_age=86400.0,   # Summary → facts after 1 day
  distillation_stage_2_age=604800.0,  # Facts → atoms after 1 week
  distillation_stage_3_age=2592000.0, # Final compression after 30 days
  distillation_batch_size=10,
  reinforcement_boost=0.15,
  topic_transition_min_count=3,
  topic_transition_min_probability=0.3,
)
The five ScoringWeights must sum to 1.0. Validation runs at construction time.
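The weighted retrieval score can be illustrated with a small re-implementation. This is a hedged sketch, not the library's code: the sum-to-1.0 validation mirrors the documented constraint, and the half-life decay formula is an assumption consistent with decay_half_life_days.

```python
from dataclasses import dataclass

@dataclass
class ScoringWeights:
    # Illustrative re-implementation; mirrors the documented defaults.
    semantic_similarity: float = 0.35
    recency: float = 0.25
    access_frequency: float = 0.15
    predicted_need: float = 0.15
    emotional_salience: float = 0.10

    def __post_init__(self):
        total = (self.semantic_similarity + self.recency + self.access_frequency
                 + self.predicted_need + self.emotional_salience)
        if abs(total - 1.0) > 1e-6:
            raise ValueError(f"ScoringWeights must sum to 1.0, got {total}")

def recency_score(age_days, half_life_days=14.0):
    # Exponential decay: the score halves every half_life_days.
    return 0.5 ** (age_days / half_life_days)

def combined_score(w, similarity, age_days, frequency, predicted, salience):
    # Each component is a value in [0, 1]; the result is their weighted sum.
    return (w.semantic_similarity * similarity
            + w.recency * recency_score(age_days)
            + w.access_frequency * frequency
            + w.predicted_need * predicted
            + w.emotional_salience * salience)

w = ScoringWeights()
score = combined_score(w, similarity=0.8, age_days=14.0,
                       frequency=0.5, predicted=0.0, salience=0.2)
```

Raising a weight shifts the retrieval bias: e.g., a higher recency weight favors fresh episodes over semantically closer but older ones.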

Distillation

Distillation progressively compresses memories over time to stay within storage and retrieval budgets:
| Stage | Name | Triggered After | Description |
| --- | --- | --- | --- |
| 0 | Raw | — | Original message text |
| 1 | Summary | 1 hour | Condensed version of the episode |
| 2 | Facts | 1 day | Key facts extracted from the summary |
| 3 | Atoms | 1 week | Subject-predicate-object triples |
Distillation requires a distillation_model to be set. Without it, episodes remain at stage 0 indefinitely.
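The stage an episode should reach is a function of its age. A minimal sketch of that mapping, with parameter names invented for clarity (the thresholds match the documented MemoryConfig defaults):

```python
def target_stage(age_seconds,
                 summary_age=3600.0,   # raw -> summary after 1 hour
                 facts_age=86400.0,    # summary -> facts after 1 day
                 atoms_age=604800.0):  # facts -> atoms after 1 week
    # Return the highest stage whose age threshold the episode has passed.
    if age_seconds >= atoms_age:
        return 3  # Atoms
    if age_seconds >= facts_age:
        return 2  # Facts
    if age_seconds >= summary_age:
        return 1  # Summary
    return 0      # Raw
```

A background pass over episodes would compare each one's current stage to its target stage and run the corresponding compression step for any that lag behind.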

What’s Next