Streaming displays tokens as they are generated instead of waiting for the full response. Total generation time is unchanged, but the first output appears much sooner, which reduces perceived latency.
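To make the latency claim concrete, the sketch below times a simulated token stream and compares time to first token with total time. It does not call the library: `fake_token_stream` and `measure` are hypothetical helpers, and the 10 ms per-token delay is an arbitrary assumption.

```python
import time
from typing import Iterator, Optional


def fake_token_stream(tokens: list[str], delay: float = 0.01) -> Iterator[str]:
    """Hypothetical stand-in for a model: yields one token every `delay` seconds."""
    for token in tokens:
        time.sleep(delay)
        yield token


def measure(tokens: list[str], delay: float = 0.01) -> tuple[Optional[float], float]:
    """Return (time_to_first_token, total_time) in seconds for one streamed run."""
    start = time.perf_counter()
    first: Optional[float] = None
    for _ in fake_token_stream(tokens, delay):
        if first is None:
            # Record the moment the first token becomes available.
            first = time.perf_counter() - start
    total = time.perf_counter() - start
    return first, total


ttft, total = measure(["Once", " upon", " a", " time", "."])
print(f"first token after {ttft:.3f}s, full response after {total:.3f}s")
```

A non-streaming caller would see nothing until `total` has elapsed; a streaming caller sees output after `ttft`.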
Agent Streaming
# Assumes `agent` is an already-configured agent instance.
for event in agent.run_stream("Tell me a story."):
    if event.event == "RunContent" and event.content:
        print(event.content, end="", flush=True)
Event Types
| Event | Description |
|---|---|
| RunStarted | Agent execution began |
| RunContent | A chunk of the agent’s text response |
| RunContentCompleted | Content generation done |
| ToolCallStarted | A tool call is about to execute |
| ToolCallCompleted | A tool call finished |
| ToolCallError | A tool call failed |
| ReasoningStarted | Thinking phase began |
| ReasoningStep | A reasoning step |
| RunCompleted | Entire run finished (includes final RunOutput) |
| RunError | Run failed |
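One way to organize handling for these event names is a dispatch table keyed by the event string. The sketch below is illustrative only: `FakeEvent` and `dispatch` are hypothetical names, and `FakeEvent` mirrors just the `event` and `content` attributes used elsewhere on this page rather than the library's real event classes.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class FakeEvent:
    """Hypothetical stand-in for a streamed event (not the library's class)."""
    event: str
    content: Optional[str] = None


def dispatch(
    events: Iterable[FakeEvent],
    handlers: dict[str, Callable[[FakeEvent], None]],
) -> list[str]:
    """Call the handler registered for each event name.

    Returns the names of events that had no registered handler, in order.
    """
    unhandled: list[str] = []
    for ev in events:
        handler = handlers.get(ev.event)
        if handler is None:
            unhandled.append(ev.event)
        else:
            handler(ev)
    return unhandled


chunks: list[str] = []
handlers = {
    "RunContent": lambda ev: chunks.append(ev.content or ""),
}

# A hand-written event sequence standing in for a real stream.
stream = [
    FakeEvent("RunStarted"),
    FakeEvent("RunContent", "Hello"),
    FakeEvent("RunContent", ", world"),
    FakeEvent("RunContentCompleted"),
    FakeEvent("RunCompleted"),
]
unhandled = dispatch(stream, handlers)
text = "".join(chunks)
print(text)
```

Returning the unhandled names makes it easy to notice event types that the calling code silently ignores.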
Full Event Handling
for event in agent.run_stream("Research quantum computing."):
    match event.event:
        case "RunContent":
            print(event.content, end="", flush=True)
        case "ToolCallStarted":
            print(f"\n> Calling {event.tool.tool_name}...")
        case "ToolCallCompleted":
            print(f" Done: {event.content[:80]}")
        case "RunCompleted":
            print(f"\n\nTokens: {event.metrics.total_tokens}")
Model Streaming
Stream directly from a model:
from definable.model.openai import OpenAIChat
from definable.model.message import Message

model = OpenAIChat(id="gpt-4o")
for chunk in model.invoke_stream(
    messages=[Message(role="user", content="Explain DNS.")],
    assistant_message=Message(role="assistant", content=""),
):
    if chunk.content:
        print(chunk.content, end="", flush=True)
The RunCompleted event in agent streaming contains the full RunOutput object in event.output, giving you access to aggregated metrics and the complete message history.
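A common pattern is to accumulate the streamed text while also keeping the final output for later inspection. The following is a minimal sketch under stated assumptions: `StreamEvent` and `consume` are hypothetical, the dictionary used as `output` is a placeholder rather than a real RunOutput, and carrying an error message in `content` on RunError is an assumption, not documented behavior.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Optional


@dataclass
class StreamEvent:
    """Hypothetical stand-in for a streamed agent event."""
    event: str
    content: Optional[str] = None
    output: Any = None  # assumed to be populated only on RunCompleted


def consume(events: Iterable[StreamEvent]) -> tuple[str, Any]:
    """Join all RunContent chunks and return them with the final output."""
    parts: list[str] = []
    final: Any = None
    for ev in events:
        if ev.event == "RunContent" and ev.content:
            parts.append(ev.content)
        elif ev.event == "RunCompleted":
            final = ev.output
        elif ev.event == "RunError":
            # Assumption: the error message, if any, is carried in `content`.
            raise RuntimeError(ev.content or "run failed")
    return "".join(parts), final


text, final = consume([
    StreamEvent("RunStarted"),
    StreamEvent("RunContent", "DNS maps "),
    StreamEvent("RunContent", "names."),
    StreamEvent("RunCompleted", output={"total_tokens": 12}),  # placeholder output
])
print(text)
```

With a real agent, the same loop would iterate over `agent.run_stream(...)` and read `event.output` on the RunCompleted event.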