Streaming displays tokens as they are generated instead of waiting for the full response. This dramatically improves perceived latency.
## Agent Streaming

```python
for event in agent.run_stream("Tell me a story."):
    if event.event == "RunContent" and event.content:
        print(event.content, end="", flush=True)
```
## Event Types

| Event | Description |
|---|---|
| `RunStarted` | Agent execution began |
| `RunContent` | A chunk of the agent's text response |
| `RunContentCompleted` | Content generation done |
| `ToolCallStarted` | A tool call is about to execute |
| `ToolCallCompleted` | A tool call finished |
| `ToolCallError` | A tool call failed |
| `ReasoningStarted` | Thinking phase began |
| `ReasoningStep` | A reasoning step |
| `RunCompleted` | Entire run finished (includes the final `RunOutput`) |
| `RunError` | Run failed |
## Full Event Handling

```python
for event in agent.run_stream("Research quantum computing."):
    match event.event:
        case "RunContent":
            print(event.content, end="", flush=True)
        case "ToolCallStarted":
            print(f"\n> Calling {event.tool.tool_name}...")
        case "ToolCallCompleted":
            print(f"  Done: {event.content[:80]}")
        case "RunCompleted":
            print(f"\n\nTokens: {event.metrics.total_tokens}")
```
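Beyond printing, a common pattern is to accumulate the streamed chunks into the full response text as they arrive. A minimal sketch of that pattern, using a stubbed event stream in place of `agent.run_stream(...)` (the `Event` dataclass here is a hypothetical stand-in, not the library's real type):

```python
from dataclasses import dataclass


@dataclass
class Event:
    event: str
    content: str = ""


def fake_stream():
    # Stand-in for agent.run_stream(...), emitting events
    # in the order described by the table above.
    yield Event("RunStarted")
    for chunk in ["Once ", "upon ", "a time."]:
        yield Event("RunContent", chunk)
    yield Event("RunCompleted")


parts = []
for event in fake_stream():
    if event.event == "RunContent" and event.content:
        parts.append(event.content)  # accumulate while streaming

full_text = "".join(parts)
print(full_text)  # → Once upon a time.
```

The same loop works unchanged against a real stream: collect `RunContent` chunks as they arrive, then join them once the stream ends.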
## Model Streaming

Stream directly from a model:

```python
from definable.model.openai import OpenAIChat
from definable.model.message import Message

model = OpenAIChat(id="gpt-4o")

for chunk in model.invoke_stream(
    messages=[Message(role="user", content="Explain DNS.")],
    assistant_message=Message(role="assistant", content=""),
):
    if chunk.content:
        print(chunk.content, end="", flush=True)
```
The `RunCompleted` event in agent streaming contains the full `RunOutput` object in `event.output`, giving you access to aggregated metrics and the complete message history.