## Token Usage
Access usage metrics on any `ModelResponse`:
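For example (a minimal sketch: `agent.run(...)` stands in for whatever call in your code returns a `ModelResponse`; only the attribute access is the point here):

```python
# `agent.run(...)` is a stand-in for any call that returns a ModelResponse.
response = agent.run("Summarize this document in two sentences.")

metrics = response.metrics
print(metrics.input_tokens)   # tokens in the prompt
print(metrics.output_tokens)  # tokens generated
print(metrics.total_tokens)   # total tokens consumed
```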
### The Metrics Class
The `Metrics` dataclass tracks all usage dimensions:
| Field | Type | Description |
|---|---|---|
| `input_tokens` | `int` | Tokens in the prompt |
| `output_tokens` | `int` | Tokens generated |
| `total_tokens` | `int` | Total tokens consumed |
| `reasoning_tokens` | `int` | Tokens used for chain-of-thought reasoning |
| `cache_read_tokens` | `int` | Tokens served from cache |
| `cache_write_tokens` | `int` | Tokens written to cache |
| `audio_input_tokens` | `int` | Audio input tokens |
| `audio_output_tokens` | `int` | Audio output tokens |
| `cost` | `float` | Estimated cost in USD |
| `duration` | `float` | Total call duration in seconds |
| `time_to_first_token` | `float` | Time to first token in seconds |
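Roughly, the class has the shape below (a sketch only: defaults, ordering, and any extra fields in the real dataclass may differ):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Metrics:
    input_tokens: int = 0                         # tokens in the prompt
    output_tokens: int = 0                        # tokens generated
    total_tokens: int = 0                         # total tokens consumed
    reasoning_tokens: int = 0                     # chain-of-thought reasoning tokens
    cache_read_tokens: int = 0                    # tokens served from cache
    cache_write_tokens: int = 0                   # tokens written to cache
    audio_input_tokens: int = 0                   # audio input tokens
    audio_output_tokens: int = 0                  # audio output tokens
    cost: Optional[float] = None                  # estimated cost in USD
    duration: Optional[float] = None              # total call duration in seconds
    time_to_first_token: Optional[float] = None   # seconds until the first token
```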
### Cost Calculation
Definable includes a built-in pricing registry, defined in `model_pricing.json`, with per-token rates for all supported models; it covers input, output, cached, reasoning, and audio token rates for each model. Cost is calculated automatically when pricing data is available:
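For example (continuing the sketch above, where `response` is any `ModelResponse`):

```python
m = response.metrics

# cost is only populated when the model has an entry in the pricing registry
if m.cost:
    print(f"Estimated cost: ${m.cost:.6f}")
else:
    print("No pricing data for this model; cost was not calculated.")
```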
### Aggregating Metrics
`Metrics` objects can be added together, which is useful for tracking total usage across multiple calls:
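For example (a sketch, with `first_response` and `second_response` standing in for any two `ModelResponse` objects):

```python
# Addition sums the two objects field by field.
combined = first_response.metrics + second_response.metrics

print(combined.input_tokens)
print(combined.output_tokens)
print(combined.cost)
```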
The `Metrics` class also works with Python’s built-in `sum()`:
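For example (a sketch; if your version does not accept the integer start value `0` that `sum()` uses by default, pass `Metrics()` as the second argument):

```python
responses = [agent.run(p) for p in prompts]   # prompts: any list of strings

# Fold the per-call metrics into a single aggregate.
total = sum(r.metrics for r in responses)

print(total.total_tokens)
print(total.cost)
```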
### Agent-Level Metrics
When using agents, metrics are aggregated across all model calls in a run:
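A sketch, assuming an agent constructed elsewhere and a run result that exposes the aggregated metrics under the same `metrics` attribute (exact names may differ):

```python
result = agent.run("Research the topic and write a short summary.")

# These totals cover every model call made during the run.
print(result.metrics.total_tokens)
print(result.metrics.cost)
print(result.metrics.duration)
```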
### Serialization

Convert metrics to a dictionary for logging or storage. Zero values and `None` fields are excluded automatically:
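A sketch, assuming the conversion method is named `to_dict()` (the method name is an assumption; check your version):

```python
import json
import logging

logger = logging.getLogger(__name__)

data = response.metrics.to_dict()   # zero values and None fields are dropped
logger.info("model call finished: %s", json.dumps(data))
```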