Quick Example
How It Works
Before each model call, the agent recalls relevant memories and injects them into the system prompt. After each response, the agent stores the conversation as new episodes in the background (fire-and-forget).Architecture
CognitiveMemory has four memory tiers:| Tier | Type | Description |
|---|---|---|
| Episodes | Raw conversations | Individual messages with embeddings, topics, and sentiment |
| Knowledge Atoms | Extracted facts | Subject-predicate-object triples (e.g., “Alice works-at Acme Corp”) |
| Procedures | Behavioral patterns | Trigger-action pairs learned from repeated behavior |
| Topic Transitions | Predictive | Tracks which topics follow which, for anticipatory recall |
CognitiveMemory Constructor
The storage backend. See Memory Stores for available options.
Maximum number of tokens to include in the recalled memory context injected into the system prompt.
An embedder for generating vector representations. When
None, semantic search falls back to text matching on recent episodes.A model used to compress episodes into summaries, facts, and atoms. When
None, distillation is skipped.Fine-tuning configuration. See MemoryConfig below.
Public Methods
| Method | Description |
|---|---|
await recall(query, *, user_id=None, session_id=None) | Retrieve relevant memories as a MemoryPayload |
await store_messages(messages, *, user_id=None, session_id=None) | Store conversation messages as episodes |
await run_distillation(*, user_id=None) | Manually trigger distillation (episode compression) |
await forget(*, user_id=None, session_id=None) | Delete stored memories for a user or session |
await close() | Close the underlying store connection |
async with):
MemoryPayload
Therecall() method returns a MemoryPayload:
| Field | Type | Description |
|---|---|---|
context | str | Formatted XML string ready for system prompt injection |
tokens_used | int | Number of tokens in the context |
chunks_included | int | Number of memory chunks included |
chunks_available | int | Total chunks available before budget trimming |
MemoryConfig
Fine-tune memory behavior withMemoryConfig:
The five
ScoringWeights must sum to 1.0. Validation runs at construction time.Distillation
Distillation progressively compresses memories over time to stay within storage and retrieval budgets:| Stage | Name | Triggered After | Description |
|---|---|---|---|
| 0 | Raw | — | Original message text |
| 1 | Summary | 1 hour | Condensed version of the episode |
| 2 | Facts | 1 day | Key facts extracted from the summary |
| 3 | Atoms | 1 week | Subject-predicate-object triples |
distillation_model to be set. Without it, episodes remain at stage 0 indefinitely.