Models provide a consistent interface for invoking any supported LLM provider. Call synchronously, asynchronously, or via streaming — with the same API regardless of provider.

Supported Providers

| Provider | Class | Default Model | Install Extra |
|---|---|---|---|
| OpenAI | `OpenAIChat` | `gpt-4o` | (core) |
| DeepSeek | `DeepSeekChat` | `deepseek-chat` | (core) |
| Moonshot | `MoonshotChat` | `kimi-k2-turbo-preview` | (core) |
| xAI | `xAI` | `grok-3` | (core) |
| Anthropic | `Claude` | `claude-sonnet-4-20250514` | `pip install 'definable[anthropic]'` |
| Mistral | `MistralChat` | `mistral-large-latest` | `pip install 'definable[mistral]'` |
| Google | `Gemini` | `gemini-2.0-flash-001` | `pip install 'definable[google]'` |
| Perplexity | `Perplexity` | `pplx-70b-online` | (core) |
| Ollama | `Ollama` | `llama3` | `pip install 'definable[ollama]'` |
| OpenRouter | `OpenRouter` | | (core) |
| Custom | `OpenAILike` | | (core) |

Basic Usage

```python
from definable.model.openai import OpenAIChat
from definable.model.message import Message

model = OpenAIChat(id="gpt-4o")
response = model.invoke(
    messages=[Message(role="user", content="Hello!")],
    assistant_message=Message(role="assistant", content=""),
)
print(response.content)
```

Common Parameters

All model classes accept these parameters:

- `id` (str, required): The model identifier (e.g., `"gpt-4o"`, `"deepseek-chat"`).
- `api_key` (str): API key for authentication. Defaults to the provider's environment variable.
- `base_url` (str): Override the API base URL. Useful for proxies or self-hosted endpoints.
- `temperature` (float): Sampling temperature (0.0 to 2.0). Lower values are more deterministic.
- `max_tokens` (int): Maximum number of tokens to generate.
- `timeout` (float): Request timeout in seconds.
- `max_retries` (int): Maximum number of retries on transient failures.

Invocation Methods

Every model supports four ways to call it:
| Method | Sync/Async | Streaming | Returns |
|---|---|---|---|
| `invoke()` | Sync | No | `ModelResponse` |
| `ainvoke()` | Async | No | `ModelResponse` |
| `invoke_stream()` | Sync | Yes | `Iterator[ModelResponse]` |
| `ainvoke_stream()` | Async | Yes | `AsyncIterator[ModelResponse]` |

```python
from definable.model.message import Message

response = model.invoke(
    messages=[Message(role="user", content="Hello!")],
    assistant_message=Message(role="assistant", content=""),
)
print(response.content)
```
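The streaming variants yield chunks incrementally instead of returning one final response, so the usual consumption pattern prints or accumulates each chunk's content as it arrives. A minimal sketch of that pattern, using a stand-in chunk type rather than definable's actual `ModelResponse`:

```python
from dataclasses import dataclass
from typing import Iterator

# Stand-in for a streamed chunk; definable's real ModelResponse
# carries more fields (usage, tool calls, etc.).
@dataclass
class Chunk:
    content: str

def fake_stream() -> Iterator[Chunk]:
    # Simulates invoke_stream(): text arrives in pieces.
    for piece in ["Hel", "lo", "!"]:
        yield Chunk(content=piece)

# Accumulate the full text from the stream of deltas.
full = ""
for chunk in fake_stream():
    full += chunk.content
print(full)  # Hello!
```

With a real model, replace `fake_stream()` with `model.invoke_stream(...)` (or `async for` over `model.ainvoke_stream(...)`).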

Retry Configuration

All models support automatic retries with configurable backoff:
```python
model = OpenAIChat(
    id="gpt-4o",
    retries=3,
    delay_between_retries=1,
    exponential_backoff=True,
)
```
- `retries` (int, default: `0`): Number of retry attempts on failure.
- `delay_between_retries` (int, default: `1`): Base delay in seconds between retries.
- `exponential_backoff` (bool, default: `false`): Whether to use exponential backoff for retries.
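With exponential backoff enabled, the wait typically doubles on each successive attempt. A sketch of the resulting delay schedule under that common scheme (definable's exact formula may differ):

```python
def retry_delays(retries: int, base: int, exponential: bool) -> list:
    # Delay before each retry attempt: a fixed base delay, or
    # base * 2**attempt when exponential backoff is enabled.
    return [base * (2 ** i if exponential else 1) for i in range(retries)]

print(retry_delays(3, 1, True))   # [1, 2, 4]
print(retry_delays(3, 1, False))  # [1, 1, 1]
```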

Response Caching

Cache model responses locally for development and testing:
```python
model = OpenAIChat(
    id="gpt-4o",
    cache_response=True,
    cache_ttl=3600,
    cache_dir=".cache/models",
)
```
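Conceptually, a response cache like this keys on the model id plus the exact request payload, and serves an entry only while it is younger than `cache_ttl` seconds. A minimal sketch of that idea (the key scheme and freshness check are assumptions for illustration, not definable's actual implementation):

```python
import hashlib
import json
import time

def cache_key(model_id: str, messages: list) -> str:
    # Identical (model, messages) pairs map to the same key,
    # so a repeated request can be served from the cache.
    payload = json.dumps({"model": model_id, "messages": messages}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def is_fresh(created_at: float, ttl: int, now: float = None) -> bool:
    # An entry is valid only within ttl seconds of creation.
    now = time.time() if now is None else now
    return (now - created_at) < ttl

key = cache_key("gpt-4o", [{"role": "user", "content": "Hello!"}])
print(len(key))  # 64
```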

Model Resolution from Strings

Agents accept string model shorthand — no explicit model import needed:
```python
from definable.agent import Agent

# Bare model name (defaults to OpenAI)
agent = Agent(model="gpt-4o-mini", instructions="Hello")

# Provider/model-id format
agent = Agent(model="anthropic/claude-sonnet-4-20250514", instructions="Hello")
agent = Agent(model="google/gemini-2.0-flash-001", instructions="Hello")
```
You can also resolve strings programmatically:
```python
from definable.model.utils import resolve_model_string

model = resolve_model_string("openai/gpt-4o")
model = resolve_model_string("deepseek/deepseek-chat")
```
Supported providers: `openai`, `deepseek`, `moonshot`, `xai`, `anthropic`, `mistral`, `google`, `perplexity`, `ollama`, `openrouter`
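The resolution rule can be approximated as: split the string on the first `/`, treat the prefix as the provider, and default bare model names to OpenAI. A hedged sketch of that parsing logic, not definable's actual resolver:

```python
def parse_model_string(spec: str) -> tuple:
    # "provider/model-id" -> (provider, model-id);
    # a bare model name defaults to the openai provider.
    provider, sep, model_id = spec.partition("/")
    if not sep:
        return ("openai", spec)
    return (provider, model_id)

print(parse_model_string("gpt-4o-mini"))                         # ('openai', 'gpt-4o-mini')
print(parse_model_string("anthropic/claude-sonnet-4-20250514"))  # ('anthropic', 'claude-sonnet-4-20250514')
```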

Next Steps

- **String Shorthand**: Use string format instead of class imports.
- **Model Resilience**: Key rotation, failover, and rate limit handling.
- **Streaming**: Stream responses token-by-token.
- **Structured Output**: Return Pydantic models from LLM calls.
- **Multimodal**: Images, audio, video, and files.
- **Metrics & Pricing**: Token counting and cost calculation.