The deep research layer conducts automated multi-wave web research before the agent responds. It decomposes queries into sub-questions, searches the web, reads pages, compresses them into Compressed Knowledge Units (CKUs), accumulates knowledge with deduplication and contradiction detection, and synthesizes the results into context for the agent’s system prompt.

Quick Start

from definable.agent import Agent
from definable.model.openai import OpenAIChat

agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    instructions="You are a research assistant.",
    deep_research=True,
)

output = await agent.arun("Compare React and Vue frameworks in 2025.")
print(output.content)  # Response informed by live web research
With deep_research=True, the agent automatically:
  1. Breaks the question into sub-questions
  2. Searches the web for each sub-question
  3. Reads and compresses relevant pages
  4. Accumulates facts and detects contradictions
  5. Injects the research context into the system prompt
  6. Generates a response grounded in the research

How It Works

User Query
      │
┌─────▼──────┐
│ Decompose  │ Break into sub-questions
└─────┬──────┘
      │
┌─────▼──────┐
│  Wave N    │ ◄── Repeat until coverage is sufficient
│            │
│  Search    │ Parallel web searches
│    ▼       │
│  Read      │ Fetch + extract page content
│    ▼       │
│ Compress   │ Extract CKUs via a cheap model
│    ▼       │
│ Accumulate │ Knowledge graph + dedup + contradiction detection
│    ▼       │
│ Gap Check  │ Identify remaining knowledge gaps
└─────┬──────┘
      │
┌─────▼──────┐
│ Synthesize │ Format context for the system prompt
└────────────┘
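The wave loop in the diagram can be sketched in plain Python. This is an illustrative model only — `run_waves`, the `search_fn` stand-in (which collapses search, read, and compress into one call), and the novelty formula are assumptions for the sketch, not the library's internals:

```python
def run_waves(sub_questions, search_fn, max_waves=3, novelty_threshold=0.2):
    knowledge: set[str] = set()  # accumulated, deduplicated facts
    wave = 0
    for wave in range(1, max_waves + 1):
        found: set[str] = set()
        for question in sub_questions:
            # search_fn stands in for search -> read -> compress for one question
            found.update(search_fn(question, wave))
        new_facts = found - knowledge            # dedup against earlier waves
        knowledge |= new_facts
        novelty = len(new_facts) / max(len(found), 1)
        if novelty < novelty_threshold:          # gap check: almost nothing new
            break
    return knowledge, wave
```

Once a wave mostly re-discovers facts already accumulated, the novelty ratio collapses and the loop terminates early rather than spending the full wave budget.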

Configuration

Simple Enable

# Uses standard depth (3 waves, 15 sources, DuckDuckGo)
agent = Agent(model=model, deep_research=True)

Custom Configuration

from definable.agent.research import DeepResearchConfig

agent = Agent(
    model=model,
    deep_research=DeepResearchConfig(
        depth="deep",                       # 5 waves, 30 sources
        search_provider="duckduckgo",       # Free, no API key
        include_citations=True,
        include_contradictions=True,
        context_format="xml",
        max_context_tokens=4000,
    ),
)

Via DeepResearch Engine

You can also pass a pre-built DeepResearch engine instance directly:
from definable.agent.research import DeepResearch, DeepResearchConfig
from definable.agent.research.search import create_search_provider

researcher = DeepResearch(
    model=model,
    search_provider=create_search_provider("duckduckgo"),
    config=DeepResearchConfig(depth="deep"),
)
agent = Agent(model=model, deep_research=researcher)

Depth Presets

| Preset | Waves | Max Sources | Parallel Searches | Best For |
|---|---|---|---|---|
| `"quick"` | 1 | 8 | 3 | Fast lookups, simple questions |
| `"standard"` | 3 | 15 | 5 | Balanced research (default) |
| `"deep"` | 5 | 30 | 8 | Thorough investigation, complex topics |
# Quick — single wave, fast
agent = Agent(model=model, deep_research=DeepResearchConfig(depth="quick"))

# Deep — thorough multi-wave research
agent = Agent(model=model, deep_research=DeepResearchConfig(depth="deep"))

Search Providers

DuckDuckGo (Default)

Free, no API key required. Works out of the box.
agent = Agent(model=model, deep_research=True)  # Uses DuckDuckGo by default

Google Custom Search Engine

Requires a Google API key and Custom Search Engine ID.
from definable.agent.research import DeepResearchConfig

agent = Agent(
    model=model,
    deep_research=DeepResearchConfig(
        search_provider="google",
        search_provider_config={
            "api_key": "your-google-api-key",
            "cse_id": "your-cse-id",
        },
    ),
)

SerpAPI

Requires a SerpAPI key.
agent = Agent(
    model=model,
    deep_research=DeepResearchConfig(
        search_provider="serpapi",
        search_provider_config={"api_key": "your-serpapi-key"},
    ),
)

Custom Search Function

Provide any async callable that returns search results:
from definable.agent.research.search.base import SearchResult

async def my_search(query: str, max_results: int = 10) -> list[SearchResult]:
    # Your custom search logic
    return [SearchResult(url="...", title="...", snippet="...")]

agent = Agent(
    model=model,
    deep_research=DeepResearchConfig(search_fn=my_search),
)

Trigger Modes

Control when research runs:
| Mode | Description |
|---|---|
| `"always"` | Run research on every `arun()` call (default) |
| `"auto"` | Model decides whether the query needs research |
| `"tool"` | Research only runs when explicitly invoked as a tool |
agent = Agent(
    model=model,
    deep_research=DeepResearchConfig(trigger="auto"),
)

Standalone Usage

Use DeepResearch directly without an agent:
from definable.model.openai import OpenAIChat
from definable.agent.research import DeepResearch, DeepResearchConfig
from definable.agent.research.search import create_search_provider

model = OpenAIChat(id="gpt-4o-mini")
researcher = DeepResearch(
    model=model,
    search_provider=create_search_provider("duckduckgo"),
    config=DeepResearchConfig(depth="deep"),
)

result = await researcher.arun("What are the latest AI safety developments?")
print(result.context)         # Formatted context string
print(result.report)          # Standalone report
print(result.sources)         # List of SourceInfo
print(result.facts)           # Extracted facts
print(result.contradictions)  # Contradictions found
print(result.metrics)         # ResearchMetrics
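The result fields compose easily into custom output. For example, a small helper (hypothetical, with a local stand-in for `SourceInfo` so it runs on its own) that renders `result.sources` as a numbered citation list:

```python
from dataclasses import dataclass

# Local stand-in for definable's SourceInfo; the real field names may differ.
@dataclass
class SourceInfo:
    url: str
    title: str

def format_citations(sources: list[SourceInfo]) -> str:
    # One numbered line per consulted source.
    return "\n".join(f"[{i}] {s.title} ({s.url})"
                     for i, s in enumerate(sources, start=1))
```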

Events

When streaming, the research pipeline emits progress events:
async for event in agent.arun_stream("Compare React and Vue"):
    match event.event:
        case "DeepResearchStarted":
            print(f"Research started: {event.query}")
        case "DeepResearchProgress":
            print(f"Wave {event.wave}: {event.sources_read} sources, "
                  f"{event.facts_extracted} facts, {event.gaps_remaining} gaps")
        case "DeepResearchCompleted":
            print(f"Done: {event.sources_used} sources, "
                  f"{event.facts_extracted} facts in {event.duration_ms:.0f}ms")
        case "RunContent":
            print(event.content, end="", flush=True)
| Event | `event.event` value | Key Fields |
|---|---|---|
| `DeepResearchStartedEvent` | `"DeepResearchStarted"` | `query`, `depth` |
| `DeepResearchProgressEvent` | `"DeepResearchProgress"` | `wave`, `sources_read`, `facts_extracted`, `gaps_remaining`, `message` |
| `DeepResearchCompletedEvent` | `"DeepResearchCompleted"` | `sources_used`, `facts_extracted`, `waves_executed`, `duration_ms`, `contradictions_found` |

Output Types

ResearchResult

| Field | Type | Description |
|---|---|---|
| `context` | `str` | Formatted context for the system prompt |
| `report` | `str` | Standalone research report |
| `sources` | `List[SourceInfo]` | Sources consulted |
| `facts` | `List[Fact]` | Extracted facts |
| `gaps` | `List[TopicGap]` | Remaining knowledge gaps |
| `contradictions` | `List[Contradiction]` | Contradictions between sources |
| `sub_questions` | `List[str]` | Decomposed sub-questions |
| `metrics` | `ResearchMetrics` | Performance metrics |

Configuration Reference

| Option | Type | Default | Description |
|---|---|---|---|
| `depth` | `str` | `"standard"` | Research depth preset: `"quick"`, `"standard"`, or `"deep"`. |
| `search_provider` | `str` | `"duckduckgo"` | Search backend: `"duckduckgo"`, `"google"`, or `"serpapi"`. |
| `search_provider_config` | `Dict[str, Any]` | | Backend-specific config (API keys, CSE ID, etc.). |
| `search_fn` | `Callable` | | Custom async search callable. Overrides `search_provider`. |
| `compression_model` | `Model` | | Model for CKU extraction. Defaults to the agent's model. |
| `max_sources` | `int` | `15` | Maximum unique sources across all waves. |
| `max_waves` | `int` | `3` | Maximum number of research waves. |
| `parallel_searches` | `int` | `5` | Concurrent search queries per wave. |
| `parallel_reads` | `int` | `10` | Concurrent page reads. |
| `min_relevance` | `float` | `0.3` | Minimum relevance score for CKU inclusion. |
| `include_citations` | `bool` | `True` | Include source citations in the research context. |
| `include_contradictions` | `bool` | `True` | Surface contradictions between sources. |
| `context_format` | `str` | `"xml"` | Format for injected context: `"xml"` or `"markdown"`. |
| `max_context_tokens` | `int` | `4000` | Approximate token budget for the context block. |
| `early_termination_threshold` | `float` | `0.2` | Stop when the novelty ratio between waves drops below this value. |
| `trigger` | `str` | `"always"` | When to run research: `"always"`, `"auto"`, or `"tool"`. |
| `description` | `str` | | Description shown in the layer guide injected into the system prompt. If `None`, the default description is used. |
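To make the `max_context_tokens` budget concrete, here is a rough sketch of greedy fact packing under an approximate 4-characters-per-token heuristic. Both `pack_context` and the heuristic are illustrative assumptions, not the library's actual packer:

```python
def pack_context(facts: list[str], max_tokens: int = 4000) -> str:
    # Greedy packing under an approximate 4-chars-per-token budget.
    budget = max_tokens * 4
    kept: list[str] = []
    used = 0
    for fact in facts:
        cost = len(fact) + 1          # +1 for the newline separator
        if used + cost > budget:
            break
        kept.append(fact)
        used += cost
    return "\n".join(kept)
```

Because the budget is approximate, treat `max_context_tokens` as a target, not a hard guarantee of the injected block's exact token count.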

Installation

Deep research requires the research extra:
pip install 'definable[research]'
This installs duckduckgo-search and curl-cffi for TLS-impersonated web reading.