Models provide a consistent interface for invoking any supported LLM provider. Call synchronously, asynchronously, or via streaming — with the same API regardless of provider.

Supported Providers

| Provider | Class | Default Model | Install Extra |
|---|---|---|---|
| OpenAI | `OpenAIChat` | `gpt-4o` | (core) |
| DeepSeek | `DeepSeekChat` | `deepseek-chat` | (core) |
| Moonshot | `MoonshotChat` | `kimi-k2-turbo-preview` | (core) |
| xAI | `xAI` | `grok-3` | (core) |
| Anthropic | `Claude` | `claude-sonnet-4-20250514` | `pip install 'definable[anthropic]'` |
| Mistral | `MistralChat` | `mistral-large-latest` | `pip install 'definable[mistral]'` |
| Google | `Gemini` | `gemini-2.0-flash-001` | `pip install 'definable[google]'` |
| Perplexity | `Perplexity` | `pplx-70b-online` | (core) |
| Ollama | `Ollama` | `llama3` | `pip install 'definable[ollama]'` |
| OpenRouter | `OpenRouter` | | (core) |
| Custom | `OpenAILike` | | (core) |

Basic Usage

```python
from definable.model.openai import OpenAIChat
from definable.model.message import Message

model = OpenAIChat(id="gpt-4o")
response = model.invoke(
    messages=[Message(role="user", content="Hello!")],
    assistant_message=Message(role="assistant", content=""),
)
print(response.content)
```

Common Parameters

All model classes accept these parameters:

- `id` (str, required): The model identifier (e.g., `"gpt-4o"`, `"deepseek-chat"`).
- `api_key` (str): API key for authentication. Defaults to the provider's environment variable.
- `base_url` (str): Override the API base URL. Useful for proxies or self-hosted endpoints.
- `temperature` (float): Sampling temperature (0.0 to 2.0). Lower values are more deterministic.
- `max_tokens` (int): Maximum number of tokens to generate.
- `timeout` (float): Request timeout in seconds.
- `max_retries` (int): Maximum number of retries on transient failures.

Invocation Methods

Every model supports four ways to call it:
| Method | Sync/Async | Streaming | Returns |
|---|---|---|---|
| `invoke()` | Sync | No | `ModelResponse` |
| `ainvoke()` | Async | No | `ModelResponse` |
| `invoke_stream()` | Sync | Yes | `Iterator[ModelResponse]` |
| `ainvoke_stream()` | Async | Yes | `AsyncIterator[ModelResponse]` |

```python
from definable.model.message import Message

response = model.invoke(
    messages=[Message(role="user", content="Hello!")],
    assistant_message=Message(role="assistant", content=""),
)
print(response.content)
```
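The streaming variants yield chunks incrementally instead of returning one final response, so the usual consumption pattern prints or accumulates each chunk's content as it arrives. A minimal sketch of that pattern, using a stand-in chunk type rather than definable's actual `ModelResponse`:

```python
from dataclasses import dataclass
from typing import Iterator

# Stand-in for a streamed chunk; definable's real ModelResponse
# carries more fields (usage, tool calls, etc.).
@dataclass
class Chunk:
    content: str

def fake_stream() -> Iterator[Chunk]:
    # Simulates invoke_stream(): text arrives in pieces.
    for piece in ["Hel", "lo", "!"]:
        yield Chunk(content=piece)

# Accumulate the full text from the stream of deltas.
full = ""
for chunk in fake_stream():
    full += chunk.content
print(full)  # Hello!
```

With a real model, replace `fake_stream()` with `model.invoke_stream(...)` (or `async for` over `model.ainvoke_stream(...)`).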

Retry Configuration

All models support automatic retries with configurable backoff:
```python
model = OpenAIChat(
    id="gpt-4o",
    retries=3,
    delay_between_retries=1,
    exponential_backoff=True,
)
```
- `retries` (int, default: `0`): Number of retry attempts on failure.
- `delay_between_retries` (int, default: `1`): Base delay in seconds between retries.
- `exponential_backoff` (bool, default: `false`): Whether to use exponential backoff for retries.
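With exponential backoff enabled, the wait typically doubles on each successive attempt. A sketch of the resulting delay schedule under that common scheme (definable's exact formula may differ):

```python
def retry_delays(retries: int, base: int, exponential: bool) -> list:
    # Delay before each retry attempt: a fixed base delay, or
    # base * 2**attempt when exponential backoff is enabled.
    return [base * (2 ** i if exponential else 1) for i in range(retries)]

print(retry_delays(3, 1, True))   # [1, 2, 4]
print(retry_delays(3, 1, False))  # [1, 1, 1]
```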

Response Caching

Cache model responses locally for development and testing:
```python
model = OpenAIChat(
    id="gpt-4o",
    cache_response=True,
    cache_ttl=3600,
    cache_dir=".cache/models",
)
```
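Conceptually, a response cache like this keys on the model id plus the exact request payload, and serves an entry only while it is younger than `cache_ttl` seconds. A minimal sketch of that idea (the key scheme and freshness check are assumptions for illustration, not definable's actual implementation):

```python
import hashlib
import json
import time

def cache_key(model_id: str, messages: list) -> str:
    # Identical (model, messages) pairs map to the same key,
    # so a repeated request can be served from the cache.
    payload = json.dumps({"model": model_id, "messages": messages}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def is_fresh(created_at: float, ttl: int, now: float = None) -> bool:
    # An entry is valid only within ttl seconds of creation.
    now = time.time() if now is None else now
    return (now - created_at) < ttl

key = cache_key("gpt-4o", [{"role": "user", "content": "Hello!"}])
print(len(key))  # 64
```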

Model Resolution from Strings

Agents accept string model shorthand — no explicit model import needed:
```python
from definable.agent import Agent

# Bare model name (defaults to OpenAI)
agent = Agent(model="gpt-4o-mini", instructions="Hello")

# Provider/model-id format
agent = Agent(model="anthropic/claude-sonnet-4-20250514", instructions="Hello")
agent = Agent(model="google/gemini-2.0-flash-001", instructions="Hello")
```
You can also resolve strings programmatically:
```python
from definable.model.utils import resolve_model_string

model = resolve_model_string("openai/gpt-4o")
model = resolve_model_string("deepseek/deepseek-chat")
```
Supported providers: `openai`, `deepseek`, `moonshot`, `xai`, `anthropic`, `mistral`, `google`, `perplexity`, `ollama`, `openrouter`
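The resolution rule can be approximated as: split the string on the first `/`, treat the prefix as the provider, and default bare model names to OpenAI. A hedged sketch of that parsing logic, not definable's actual resolver:

```python
def parse_model_string(spec: str) -> tuple:
    # "provider/model-id" -> (provider, model-id);
    # a bare model name defaults to the openai provider.
    provider, sep, model_id = spec.partition("/")
    if not sep:
        return ("openai", spec)
    return (provider, model_id)

print(parse_model_string("gpt-4o-mini"))                         # ('openai', 'gpt-4o-mini')
print(parse_model_string("anthropic/claude-sonnet-4-20250514"))  # ('anthropic', 'claude-sonnet-4-20250514')
```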

Next Steps

- **String Shorthand**: Use string format instead of class imports.
- **Model Resilience**: Key rotation, failover, and rate limit handling.
- **Streaming**: Stream responses token-by-token.
- **Structured Output**: Return Pydantic models from LLM calls.
- **Multimodal**: Images, audio, video, and files.
- **Metrics & Pricing**: Token counting and cost calculation.