When you pass memory= to an Agent, the agent automatically recalls relevant context before each model call and stores new memories after each response.
Setup
Execution Flow
- Recall: Before calling the model, the agent calls memory.recall() with the user’s query. The returned context is injected into the system prompt.
- Model call: The model receives the conversation plus the memory context, allowing it to reference past interactions.
- Store: After the response, the agent runs memory.store_messages() as a background task, so storage never blocks the response.
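The three steps above can be sketched with plain asyncio. This is a minimal illustration, not the library's implementation: InMemoryStore, fake_model, and run_turn are hypothetical names invented for the example, and only the method names recall() and store_messages() come from the text.

```python
import asyncio


class InMemoryStore:
    """Hypothetical stand-in for a memory backend (not the library's class)."""

    def __init__(self):
        self.messages = []

    async def recall(self, query):
        # Naive recall: return stored messages that share a word with the query.
        words = set(query.lower().split())
        hits = [m for m in self.messages if words & set(m.lower().split())]
        return "\n".join(hits)

    async def store_messages(self, messages):
        self.messages.extend(messages)


async def fake_model(system_prompt, user_message):
    # Placeholder for the real model call.
    return f"echo: {user_message}"


async def run_turn(memory, user_message, background_tasks):
    # 1. Recall: fetch context for the user's query.
    context = await memory.recall(user_message)
    system_prompt = "You are a helpful assistant."
    if context:
        system_prompt += f"\n\nRelevant memory:\n{context}"
    # 2. Model call: the model sees the conversation plus the memory context.
    reply = await fake_model(system_prompt, user_message)
    # 3. Store: scheduled as a background task so the reply is not delayed.
    task = asyncio.create_task(memory.store_messages([user_message, reply]))
    background_tasks.add(task)  # keep a reference so the task is not collected
    task.add_done_callback(background_tasks.discard)
    return reply, system_prompt


async def demo():
    memory = InMemoryStore()
    tasks = set()
    _, first_prompt = await run_turn(memory, "my name is Ada", tasks)
    await asyncio.gather(*tasks)  # only the demo waits for storage to finish
    _, second_prompt = await run_turn(memory, "what is my name", tasks)
    await asyncio.gather(*tasks)
    return first_prompt, second_prompt


first_prompt, second_prompt = asyncio.run(demo())
```

On the first turn nothing has been stored, so the system prompt carries no memory section; on the second turn the recalled text from the first turn is appended to it.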
Multi-User Scoping
Pass user_id to scope memories per user:
user_id flows through RunContext and is passed to both recall() and store_messages().
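To make the effect of scoping concrete, here is a small sketch of a store that partitions memories by user_id. ScopedMemory and its keyword matching are assumptions made for illustration; the real backend's storage and signatures may differ.

```python
from collections import defaultdict


class ScopedMemory:
    """Hypothetical store that partitions memories by user_id."""

    def __init__(self):
        self._by_user = defaultdict(list)

    def store_messages(self, messages, user_id=None):
        self._by_user[user_id].extend(messages)

    def recall(self, query, user_id=None):
        # Only the given user's partition is searched.
        words = set(query.lower().split())
        return [
            m for m in self._by_user.get(user_id, [])
            if words & set(m.lower().split())
        ]


memory = ScopedMemory()
memory.store_messages(["I prefer dark mode"], user_id="alice")
memory.store_messages(["I prefer light mode"], user_id="bob")

alice_hits = memory.recall("which mode do I prefer", user_id="alice")
bob_hits = memory.recall("which mode do I prefer", user_id="bob")
carol_hits = memory.recall("which mode do I prefer", user_id="carol")
```

Because the same user_id is used for both storing and recalling, each user only ever sees their own memories, and a user with no history gets an empty result.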
Multi-Turn Conversations
Memory works alongside session-based multi-turn conversations. The session_id is used to group episodes within a conversation, while user_id scopes across all conversations:
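The relationship between the two identifiers can be pictured as a two-level filter. The sketch below uses a hypothetical Episode record and helper functions invented for the example; it says nothing about how the library stores episodes internally.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    # Hypothetical record shape, for illustration only.
    user_id: str
    session_id: str
    text: str


episodes = [
    Episode("alice", "s1", "Planning a trip to Lisbon"),
    Episode("alice", "s1", "Budget is 1500 EUR"),
    Episode("alice", "s2", "Asked about Python typing"),
    Episode("bob", "s3", "Debugging a Dockerfile"),
]


def for_user(items, user_id):
    # user_id scope: everything belonging to the user, across all sessions.
    return [e for e in items if e.user_id == user_id]


def by_session(items):
    # session_id grouping: episodes bucketed per conversation.
    groups = {}
    for e in items:
        groups.setdefault(e.session_id, []).append(e.text)
    return groups


alice_sessions = by_session(for_user(episodes, "alice"))
```

Here alice_sessions holds two conversations, s1 and s2, while Bob's episode is excluded by the user_id filter.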
Memory Events
When using run_stream() or arun_stream(), memory operations emit events:
| Event | Key Fields | Description |
|---|---|---|
| MemoryRecallStarted | query | Memory recall began |
| MemoryRecallCompleted | query, tokens_used, chunks_included, chunks_available, duration_ms | Memory recall finished |
| MemoryUpdateStarted | message_count | Memory storage began |
| MemoryUpdateCompleted | message_count, duration_ms | Memory storage finished |
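A consumer typically dispatches on the event type. The sketch below defines local dataclasses that mirror the event names and fields in the table; the field types, the fake_stream generator, and the summarize helper are assumptions for illustration, and a real application would import the event classes from the library and iterate over run_stream() instead.

```python
from dataclasses import dataclass


# Local stand-ins mirroring the table above; field types are assumed.
@dataclass
class MemoryRecallStarted:
    query: str


@dataclass
class MemoryRecallCompleted:
    query: str
    tokens_used: int
    chunks_included: int
    chunks_available: int
    duration_ms: float


@dataclass
class MemoryUpdateStarted:
    message_count: int


@dataclass
class MemoryUpdateCompleted:
    message_count: int
    duration_ms: float


def fake_stream():
    # Hypothetical event sequence for a single run.
    yield MemoryRecallStarted(query="what is my name")
    yield MemoryRecallCompleted("what is my name", 120, 3, 5, 8.5)
    yield MemoryUpdateStarted(message_count=2)
    yield MemoryUpdateCompleted(message_count=2, duration_ms=4.0)


def summarize(events):
    # Report only the "completed" events; the "started" events are skipped.
    lines = []
    for event in events:
        if isinstance(event, MemoryRecallCompleted):
            lines.append(
                f"recall: {event.chunks_included}/{event.chunks_available} "
                f"chunks, {event.tokens_used} tokens"
            )
        elif isinstance(event, MemoryUpdateCompleted):
            lines.append(f"store: {event.message_count} messages")
    return lines


summary = summarize(fake_stream())
```

Comparing chunks_included with chunks_available in this way is one simple way to notice when recall is being truncated by a token budget.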
Non-Fatal Behavior
All memory operations are designed to be non-fatal. If a memory store is unavailable, the recall returns empty context and the store operation is logged as a warning. The agent continues to function normally without memory.
RunContext Fields
Memory adds two fields to RunContext:
| Field | Type | Description |
|---|---|---|
| memory_context | str \| None | The formatted memory XML injected into the system prompt |
| user_id | str \| None | User ID for memory scoping |
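The sketch below ties these two fields to the non-fatal behavior described earlier. The RunContext dataclass here is a hypothetical stand-in containing only the two fields from the table, and BrokenStore, safe_recall, and build_system_prompt are names invented for the example; empty context is modeled as None.

```python
import logging
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger("memory_sketch")


@dataclass
class RunContext:
    # Hypothetical stand-in with only the two fields from the table.
    memory_context: Optional[str] = None
    user_id: Optional[str] = None


class BrokenStore:
    """Simulates an unavailable memory backend."""

    def recall(self, query, user_id=None):
        raise ConnectionError("memory store unavailable")


def safe_recall(memory, query, ctx):
    # A recall failure is logged and leaves the context empty; it never raises.
    try:
        ctx.memory_context = memory.recall(query, user_id=ctx.user_id)
    except Exception as exc:
        logger.warning("memory recall failed: %s", exc)
        ctx.memory_context = None
    return ctx


def build_system_prompt(base, ctx):
    # Memory is appended only when some context was actually recalled.
    if ctx.memory_context:
        return f"{base}\n\n{ctx.memory_context}"
    return base


ctx = safe_recall(BrokenStore(), "hello", RunContext(user_id="alice"))
prompt = build_system_prompt("You are a helpful assistant.", ctx)
```

With the store down, the run proceeds with the unmodified base prompt, which matches the documented fallback of continuing without memory.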