Definable automatically tracks token usage and calculates costs for every model call. This data is available at every level of the stack, from individual model invocations to aggregated agent runs.
Per-Call Metrics
Every model call returns usage metrics:
from definable.model import OpenAIChat
from definable.model.message import Message
model = OpenAIChat(id="gpt-4o")
response = model.invoke(
    messages=[Message(role="user", content="Hello!")],
    assistant_message=Message(role="assistant", content=""),
)
metrics = response.response_usage
print(f"Input: {metrics.input_tokens} tokens")
print(f"Output: {metrics.output_tokens} tokens")
print(f"Total: {metrics.total_tokens} tokens")
print(f"Cost: ${metrics.cost:.6f}")
Agent Run Metrics
Agent runs aggregate metrics across all model calls in the run (including tool execution loops):
from definable.agent import Agent
agent = Agent(model=model, tools=[my_tool])
output = agent.run("Analyze this data and create a summary.")
print(f"Total tokens: {output.metrics.total_tokens}")
print(f"Total cost: ${output.metrics.cost:.4f}")
print(f"Duration: {output.metrics.duration:.2f}s")
UsageTracker (Recommended)
UsageTracker is the simplest way to track costs across runs. Enable it by passing usage=True to the Agent constructor:
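import asyncio
from definable.agent import Agent
agent = Agent(model="openai/gpt-4o-mini", usage=True)
# arun() is a coroutine, so it has to be awaited inside an event loop
async def main():
    await agent.arun("What is 2+2?")
    await agent.arun("What is the capital of France?")
asyncio.run(main())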
tracker = agent.usage_tracker
print(tracker.session_total) # Usage(350 tokens, $0.0012, 2 runs)
print(tracker.last_run) # Most recent run only
print(tracker.run_count) # 2
The UsageSnapshot provides:
| Property | Type | Description |
|---|---|---|
| input_tokens | int | Total input tokens |
| output_tokens | int | Total output tokens |
| total_tokens | int | Combined tokens |
| estimated_cost | float | Estimated cost in USD |
| runs | int | Number of runs in this snapshot |
Snapshots support addition (a + b) and serialization (to_dict()).
Tracking Across Multiple Runs (Manual)
For more control, use Metrics addition to aggregate costs across a session or batch:
from definable.model.metrics import Metrics
session_metrics = Metrics()
for question in customer_questions:
    output = agent.run(question)
    session_metrics = session_metrics + output.metrics
print("Session total:")
print(f" Tokens: {session_metrics.total_tokens}")
print(f" Cost: ${session_metrics.cost:.4f}")
Or use Python’s sum():
all_metrics = [agent.run(q).metrics for q in questions]
total = sum(all_metrics)
print(f"Batch cost: ${total.cost:.4f}")
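Both patterns above rely on the metrics object supporting `+`, and `sum()` additionally starts from the integer 0, so the class has to accept `0 + metrics`. The sketch below shows one way a metrics-like class can support both. `SimpleMetrics` is a hypothetical stand-in written for illustration; it is not Definable's Metrics implementation, and its fields are a small subset chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class SimpleMetrics:
    """Hypothetical stand-in for a metrics object (not Definable's Metrics)."""

    input_tokens: int = 0
    output_tokens: int = 0
    cost: float = 0.0

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens

    def __add__(self, other):
        if not isinstance(other, SimpleMetrics):
            return NotImplemented
        # Field-wise addition produces a new object; neither operand is mutated.
        return SimpleMetrics(
            input_tokens=self.input_tokens + other.input_tokens,
            output_tokens=self.output_tokens + other.output_tokens,
            cost=self.cost + other.cost,
        )

    def __radd__(self, other):
        # sum() begins with the int 0, so "0 + metrics" must return the metrics.
        if isinstance(other, int) and other == 0:
            return self
        return NotImplemented


runs = [SimpleMetrics(100, 20, 0.0005), SimpleMetrics(50, 30, 0.0004)]
total = sum(runs)
print(f"{total.total_tokens} tokens, ${total.cost:.4f}")
```

If a class only defines `__add__`, passing an explicit start value such as `sum(runs, SimpleMetrics())` avoids the `0 + metrics` step entirely.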
MetricsMiddleware
The MetricsMiddleware tracks aggregate stats across all runs for an agent:
from definable.agent import Agent, MetricsMiddleware
metrics_mw = MetricsMiddleware()
agent = Agent(model=model).use(metrics_mw)
# Run the agent multiple times
for q in questions:
    agent.run(q)
print(f"Total runs: {metrics_mw.run_count}")
print(f"Error count: {metrics_mw.error_count}")
print(f"Avg latency (ms): {metrics_mw.average_latency_ms:.0f}")
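To make the three counters concrete, here is a minimal sketch of how a middleware could collect them. It assumes the `__call__(context, next_handler)` middleware signature used in the Cost Budgets section of this page. `SimpleMetricsMiddleware` and `fake_handler` are hypothetical names for illustration, and whether Definable's MetricsMiddleware counts failed runs in `run_count` is an assumption here.

```python
import asyncio
import time


class SimpleMetricsMiddleware:
    """Hypothetical sketch; not Definable's MetricsMiddleware."""

    def __init__(self):
        self.run_count = 0
        self.error_count = 0
        self._total_latency_ms = 0.0

    @property
    def average_latency_ms(self) -> float:
        # Guard against division by zero before the first run.
        if self.run_count == 0:
            return 0.0
        return self._total_latency_ms / self.run_count

    async def __call__(self, context, next_handler):
        start = time.perf_counter()
        try:
            return await next_handler(context)
        except Exception:
            self.error_count += 1
            raise  # re-raise so callers still see the failure
        finally:
            # Assumption: failed runs count toward run_count and latency too.
            self.run_count += 1
            self._total_latency_ms += (time.perf_counter() - start) * 1000


async def fake_handler(context):
    # Stub standing in for the real agent run.
    await asyncio.sleep(0.01)
    if context == "fail":
        raise ValueError("simulated failure")
    return "ok"


async def demo(middleware):
    for ctx in ["a", "b", "fail"]:
        try:
            await middleware(ctx, fake_handler)
        except ValueError:
            pass  # the error was already counted by the middleware


mw = SimpleMetricsMiddleware()
asyncio.run(demo(mw))
print(mw.run_count, mw.error_count)
```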
Cost Breakdown
The Metrics class tracks all cost dimensions:
| Field | Description |
|---|---|
| input_tokens | Tokens in the prompt |
| output_tokens | Tokens generated |
| cache_read_tokens | Tokens served from provider cache |
| cache_write_tokens | Tokens written to provider cache |
| reasoning_tokens | Tokens used for chain-of-thought |
| audio_input_tokens | Audio input tokens |
| audio_output_tokens | Audio output tokens |
| cost | Total estimated cost in USD |
Pricing Registry
Definable includes a built-in pricing registry (model_pricing.json) with rates for all supported models. Cost is calculated automatically based on the model and token counts.
Prices are defined per million tokens:
{
  "openai": {
    "gpt-4o": {
      "input_per_million": 2.50,
      "output_per_million": 10.00,
      "cached_input_per_million": 1.25
    }
  }
}
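As a worked example of per-million pricing, the sketch below applies the gpt-4o rates shown above to a call with 150 input tokens and 87 output tokens. `estimate_cost` is a hypothetical helper, not part of Definable's API, and the treatment of cached tokens (counted as a subset of the input tokens and billed at the cached rate) is an assumption; the library's own calculation may differ.

```python
import json

# Same structure and rates as the registry excerpt above.
PRICING = json.loads("""
{
  "openai": {
    "gpt-4o": {
      "input_per_million": 2.50,
      "output_per_million": 10.00,
      "cached_input_per_million": 1.25
    }
  }
}
""")


def estimate_cost(pricing, provider, model, input_tokens, output_tokens,
                  cached_input_tokens=0):
    """Hypothetical helper: estimate the USD cost of one call."""
    rates = pricing[provider][model]
    # Assumption: cached tokens are part of input_tokens and billed at the
    # cached rate; fall back to the normal input rate if none is listed.
    cached_rate = rates.get("cached_input_per_million", rates["input_per_million"])
    uncached = input_tokens - cached_input_tokens
    total = (
        uncached * rates["input_per_million"]
        + cached_input_tokens * cached_rate
        + output_tokens * rates["output_per_million"]
    )
    return total / 1_000_000  # rates are quoted per million tokens


cost = estimate_cost(PRICING, "openai", "gpt-4o", input_tokens=150, output_tokens=87)
print(f"${cost:.6f}")  # (150 * 2.50 + 87 * 10.00) / 1,000,000 = $0.001245
```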
Serializing Metrics
Export metrics for logging, dashboards, or billing systems:
metrics_dict = output.metrics.to_dict()
# {
# 'input_tokens': 150,
# 'output_tokens': 87,
# 'total_tokens': 237,
# 'cost': 0.001245,
# 'duration': 1.83
# }
Zero values and None fields are excluded automatically for clean output.
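The filtering behavior can be illustrated with a small stand-in class. `MetricsRecord` below is hypothetical and only mimics the documented rule (drop zero and None fields); it is not Definable's Metrics class.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class MetricsRecord:
    """Hypothetical stand-in used to illustrate sparse serialization."""

    input_tokens: int = 0
    output_tokens: int = 0
    total_tokens: int = 0
    reasoning_tokens: int = 0
    cost: Optional[float] = None
    duration: Optional[float] = None

    def to_dict(self) -> dict:
        # Keep only fields that carry information: skip None and zero values.
        return {k: v for k, v in asdict(self).items() if v is not None and v != 0}


record = MetricsRecord(
    input_tokens=150, output_tokens=87, total_tokens=237, cost=0.001245, duration=1.83
)
line = json.dumps(record.to_dict())  # one JSON object per run, ready for a log line
print(line)
```

Because empty fields are omitted, code that reads these dictionaries back should use `d.get("reasoning_tokens", 0)` rather than indexing directly.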
Cost Budgets
Implement a simple cost guard using middleware:
class CostBudgetMiddleware:
    def __init__(self, max_cost: float):
        self.max_cost = max_cost
        self.total_cost = 0.0
    async def __call__(self, context, next_handler):
        # Run the agent first; the cost is only known once the run completes
        result = await next_handler(context)
        if result.metrics and result.metrics.cost:
            self.total_cost += result.metrics.cost
            # Fail once cumulative spend passes the budget
            if self.total_cost > self.max_cost:
                raise Exception(
                    f"Cost budget exceeded: ${self.total_cost:.4f} > ${self.max_cost:.4f}"
                )
        return result
# Limit spending to $1.00
agent.use(CostBudgetMiddleware(max_cost=1.00))
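The guard's behavior can be checked without calling a real model. The sketch below repeats the middleware in compact form and drives it with a stub handler that reports a fixed cost of $0.40 per run. `fake_handler`, the `SimpleNamespace` result objects, and the `CostBudgetExceeded` exception type are hypothetical additions for the demo.

```python
import asyncio
from types import SimpleNamespace


class CostBudgetExceeded(RuntimeError):
    """Hypothetical exception type so callers can catch budget errors specifically."""


class CostBudgetMiddleware:
    def __init__(self, max_cost: float):
        self.max_cost = max_cost
        self.total_cost = 0.0

    async def __call__(self, context, next_handler):
        result = await next_handler(context)
        if result.metrics and result.metrics.cost:
            self.total_cost += result.metrics.cost
            if self.total_cost > self.max_cost:
                raise CostBudgetExceeded(
                    f"Cost budget exceeded: ${self.total_cost:.4f} > ${self.max_cost:.4f}"
                )
        return result


async def fake_handler(context):
    # Stub standing in for an agent run that costs $0.40.
    return SimpleNamespace(metrics=SimpleNamespace(cost=0.40))


async def demo(guard):
    completed = 0
    for _ in range(5):
        try:
            await guard(None, fake_handler)
            completed += 1
        except CostBudgetExceeded:
            break  # stop issuing runs once the budget is spent
    return completed


guard = CostBudgetMiddleware(max_cost=1.00)
completed = asyncio.run(demo(guard))
print(completed, round(guard.total_cost, 2))
```

Note that the check happens after each run, so the run that crosses the limit has already been paid for: here two runs complete normally and the third raises with about $1.20 spent. Setting max_cost slightly below the hard limit leaves room for that final run.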