Middleware

Middleware lets you wrap agent execution with reusable logic such as logging, retries, metrics collection, and custom pre/post-processing. Middleware composes cleanly and executes in a predictable order.

How Middleware Works

Middleware wraps the agent’s core execution. Each middleware receives a RunContext and a next_handler function, and decides what to do before and after calling next_handler. The response travels back through the same chain in reverse order. The last middleware added is the outermost wrapper.
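The wrapping can be illustrated without the library. The following is a minimal sketch, not definable's implementation: make_middleware, build_chain, and core are hypothetical stand-ins, and the context is a plain string rather than a RunContext.

```python
import asyncio

trace = []  # records the order in which each layer runs


def make_middleware(name):
    """Build a toy middleware that records when it runs (hypothetical helper)."""
    async def middleware(context, next_handler):
        trace.append(f"{name}: before")
        result = await next_handler(context)
        trace.append(f"{name}: after")
        return result
    return middleware


async def core(context):
    """Stand-in for the agent's core execution."""
    trace.append("core")
    return f"echo: {context}"


def build_chain(middlewares, core_handler):
    """Wrap core_handler so the last middleware in the list ends up outermost."""
    handler = core_handler
    for mw in middlewares:
        # Bind the current handler as a default argument so each layer
        # calls exactly the handler it wraps.
        def wrapped(context, mw=mw, inner=handler):
            return mw(context, inner)
        handler = wrapped
    return handler


chain = build_chain([make_middleware("first"), make_middleware("second")], core)
result = asyncio.run(chain("Hello!"))
print(trace)
```

The trace shows "second" (added last) entering first and leaving last, with the core call in the middle.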

Using Middleware

Add middleware with the .use() method:

from definable.agents import Agent, LoggingMiddleware, RetryMiddleware, MetricsMiddleware
from definable.models import OpenAIChat
import logging

logger = logging.getLogger("my_agent")

agent = (
    Agent(
        model=OpenAIChat(id="gpt-4o"),
        instructions="You are a helpful assistant.",
    )
    .use(LoggingMiddleware(logger))
    .use(RetryMiddleware(max_retries=3))
    .use(MetricsMiddleware())
)

output = agent.run("Hello!")

Built-in Middleware

LoggingMiddleware

Logs the start, completion, and any errors for each run.

from definable.agents import LoggingMiddleware
import logging

logger = logging.getLogger("my_agent")
agent.use(LoggingMiddleware(logger, level=logging.INFO))

logger (logging.Logger, required): The Python logger instance to write to.

level (int, default: logging.INFO): Log level for normal events. Errors are always logged at ERROR.

RetryMiddleware

Retries the entire run on transient errors with exponential backoff.

from definable.agents import RetryMiddleware

agent.use(RetryMiddleware(
    max_retries=3,
    backoff_base=1.0,
    backoff_max=60.0,
))

max_retries (int, default: 3): Maximum number of retry attempts.

backoff_base (float, default: 1.0): Base delay in seconds (doubles on each retry).

backoff_max (float, default: 60.0): Maximum backoff delay in seconds.

Retries are triggered for ConnectionError, TimeoutError, and OSError.
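To see how the three parameters interact, the delay schedule can be sketched as below. This assumes the first retry waits backoff_base seconds, each later retry doubles the previous delay, and no jitter is applied; the library's exact formula may differ, and backoff_delays is a hypothetical helper, not part of definable.

```python
def backoff_delays(max_retries=3, backoff_base=1.0, backoff_max=60.0):
    """Return the assumed wait (in seconds) before each retry attempt."""
    return [
        min(backoff_base * 2 ** attempt, backoff_max)  # double, then cap
        for attempt in range(max_retries)
    ]


default_delays = backoff_delays()
capped_delays = backoff_delays(max_retries=8)
print(default_delays)  # [1.0, 2.0, 4.0]
print(capped_delays)   # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```

Under these assumptions, backoff_max only matters once the doubled delay exceeds it, which with the defaults happens on the seventh retry.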

MetricsMiddleware

Collects timing and count metrics across runs.

from definable.agents import MetricsMiddleware

metrics_mw = MetricsMiddleware()
agent.use(metrics_mw)

# After running the agent
output = agent.run("Hello!")

print(f"Total runs:       {metrics_mw.run_count}")
print(f"Total errors:     {metrics_mw.error_count}")
print(f"Avg latency (ms): {metrics_mw.average_latency_ms:.1f}")
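For intuition about what these counters measure, here is a rough sketch of how a middleware could gather them. It is not the library's MetricsMiddleware: SimpleMetricsMiddleware, ok_core, and failing_core are hypothetical, and the sketch assumes failed runs count toward both run_count and the latency average.

```python
import asyncio
import time


class SimpleMetricsMiddleware:
    """Toy metrics collector (hypothetical; not definable's MetricsMiddleware)."""

    def __init__(self):
        self.run_count = 0
        self.error_count = 0
        self.total_latency_ms = 0.0

    @property
    def average_latency_ms(self):
        # Avoid dividing by zero before the first run.
        return self.total_latency_ms / self.run_count if self.run_count else 0.0

    async def __call__(self, context, next_handler):
        start = time.perf_counter()
        try:
            return await next_handler(context)
        except Exception:
            self.error_count += 1
            raise
        finally:
            # Runs for both successful and failed calls.
            self.run_count += 1
            self.total_latency_ms += (time.perf_counter() - start) * 1000


async def ok_core(context):
    return "ok"


async def failing_core(context):
    raise TimeoutError("simulated failure")


async def demo():
    metrics = SimpleMetricsMiddleware()
    await metrics("first run", ok_core)
    try:
        await metrics("second run", failing_core)
    except TimeoutError:
        pass  # the error still propagates to the caller
    return metrics


metrics = asyncio.run(demo())
print(metrics.run_count, metrics.error_count)
```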

KnowledgeMiddleware

Automatically retrieves relevant documents from a knowledge base and injects them into the agent’s context. See Agent Integration for details.

Writing Custom Middleware

Implement the Middleware protocol — a callable that accepts context and next_handler:

import time

from definable.agents import Middleware
from definable.run import RunContext, RunOutput

class TimingMiddleware:
    """Measures and prints execution time for each run."""

    async def __call__(self, context: RunContext, next_handler) -> RunOutput:
        # Start the clock before handing off to the rest of the chain.
        start = time.perf_counter()

        result = await next_handler(context)

        # Control returns here once the inner handlers have finished.
        elapsed = time.perf_counter() - start
        print(f"Run {context.run_id} took {elapsed:.2f}s")
        return result

agent.use(TimingMiddleware())

Modifying Context

Middleware can modify the RunContext before passing it to the next handler:

class InjectMetadataMiddleware:
    async def __call__(self, context, next_handler):
        context.metadata = context.metadata or {}
        context.metadata["source"] = "web_app"
        context.metadata["version"] = "2.1"
        return await next_handler(context)
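The effect on downstream handlers can be checked in isolation. In this sketch, FakeContext is a hypothetical stand-in with only the fields the example needs (the real RunContext has its own definition), and core simply returns whatever metadata it receives.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeContext:
    """Hypothetical stand-in for RunContext, reduced to two fields."""
    run_id: str
    metadata: Optional[dict] = None


class InjectMetadataMiddleware:
    async def __call__(self, context, next_handler):
        context.metadata = context.metadata or {}
        context.metadata["source"] = "web_app"
        context.metadata["version"] = "2.1"
        return await next_handler(context)


async def core(context):
    """Stand-in for the next handler; reports the metadata it was given."""
    return dict(context.metadata)


seen = asyncio.run(InjectMetadataMiddleware()(FakeContext(run_id="run-1"), core))
print(seen)  # {'source': 'web_app', 'version': '2.1'}
```

Because the same context object is passed along, every handler further down the chain sees the injected keys.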

Error Handling

Middleware can catch and handle errors:

class ErrorNotificationMiddleware:
    async def __call__(self, context, next_handler):
        try:
            return await next_handler(context)
        except Exception as e:
            notify_team(f"Agent error: {e}")
            raise
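The snippet above leaves notify_team undefined. A self-contained sketch follows, where notify_team is a hypothetical hook that only records messages and failing_core stands in for an agent run that raises. It shows that the bare raise lets the caller still see the original exception after the notification is sent.

```python
import asyncio

notifications = []


def notify_team(message):
    """Hypothetical alerting hook; a real one might post to chat or a pager."""
    notifications.append(message)


class ErrorNotificationMiddleware:
    async def __call__(self, context, next_handler):
        try:
            return await next_handler(context)
        except Exception as e:
            notify_team(f"Agent error: {e}")
            raise  # re-raise so outer middleware and the caller still see it


async def failing_core(context):
    raise RuntimeError("boom")


async def run_once():
    try:
        await ErrorNotificationMiddleware()("Hello!", failing_core)
    except RuntimeError as e:
        return f"caller saw: {e}"
    return "no error"


outcome = asyncio.run(run_once())
print(outcome)
print(notifications)
```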

Execution Order

Middleware executes in reverse registration order (last added = outermost):

agent.use(A())  # Innermost — runs closest to the agent core
agent.use(B())  # Middle
agent.use(C())  # Outermost — runs first on request, last on response

Request flow: C → B → A → Agent Core

Response flow: Agent Core → A → B → C

This means that if you want retries to wrap logging, add logging first, then retries.
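The practical consequence of ordering can be seen with toy stand-ins. This sketch does not use definable's RetryMiddleware or LoggingMiddleware: make_logging, make_retry, make_flaky_core, and wrap are hypothetical helpers, the retry loop has no backoff, and the core fails twice before succeeding.

```python
import asyncio


def make_logging(log):
    """Toy logging middleware that appends events to the given list."""
    async def logging_mw(context, next_handler):
        log.append("start")
        try:
            result = await next_handler(context)
        except Exception:
            log.append("error")
            raise
        log.append("done")
        return result
    return logging_mw


def make_retry(max_retries):
    """Toy retry middleware: re-runs the inner handler on ConnectionError."""
    async def retry_mw(context, next_handler):
        for attempt in range(max_retries + 1):
            try:
                return await next_handler(context)
            except ConnectionError:
                if attempt == max_retries:
                    raise
    return retry_mw


def make_flaky_core(failures):
    """Core stand-in that fails a fixed number of times, then succeeds."""
    remaining = {"n": failures}
    async def core(context):
        if remaining["n"] > 0:
            remaining["n"] -= 1
            raise ConnectionError("simulated")
        return "ok"
    return core


def wrap(middleware, inner):
    """Return a handler that runs middleware around inner."""
    async def handler(context):
        return await middleware(context, inner)
    return handler


# Logging added first (inner), retries added second (outer).
log_inner = []
chain = wrap(make_retry(3), wrap(make_logging(log_inner), make_flaky_core(2)))
asyncio.run(chain("Hello!"))

# Reverse order: retries inner, logging outer.
log_outer = []
chain = wrap(make_logging(log_outer), wrap(make_retry(3), make_flaky_core(2)))
asyncio.run(chain("Hello!"))

print(log_inner)  # ['start', 'error', 'start', 'error', 'start', 'done']
print(log_outer)  # ['start', 'done']
```

With retries outermost, every attempt is logged; with logging outermost, the retries are invisible to the log and only the final outcome appears.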
