Guardrails let you enforce content policies on every agent run — blocking dangerous input, redacting PII from output, and restricting which tools the model can call.
## Quick Example

```python
from definable.agent import Agent
from definable.agent.guardrail import Guardrails, max_tokens, pii_filter, tool_blocklist
from definable.model.openai import OpenAIChat

agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    guardrails=Guardrails(
        input=[max_tokens(500)],
        output=[pii_filter()],
        tool=[tool_blocklist({"delete_all"})],
    ),
)

output = agent.run("What's my account balance?")
```
## How It Works

Three checkpoints run automatically on every `arun()` / `arun_stream()` call:

1. **Input** — after memory recall, before the model call. Can block or modify the user message.
2. **Tool** — inside the tool-call loop, before each tool execution. A blocked tool sends an error result back to the model instead of running.
3. **Output** — after the model response, before the memory store. Can block, modify, or redact the response.
## Guardrails Constructor

```python
from definable.agent.guardrail import Guardrails

guardrails = Guardrails(
    input=[...],
    output=[...],
    tool=[...],
    mode="fail_fast",
    on_block="raise",
)
```
**`input`** — `List[InputGuardrail]`, default `[]`
Guardrails that check the user message before the LLM call.

**`output`** — `List[OutputGuardrail]`, default `[]`
Guardrails that check the model response after the LLM call.

**`tool`** — `List[ToolGuardrail]`, default `[]`
Guardrails that check each tool call before execution.

**`mode`**
`"fail_fast"` stops at the first block. `"run_all"` runs every guardrail and collects all results.

**`on_block`** — default `"raise"`
`"raise"` throws `InputCheckError` or `OutputCheckError`. `"return_message"` returns a `RunOutput` with `status=RunStatus.blocked`.
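The difference between the two modes can be sketched in plain Python. The `evaluate` helper below is illustrative only, not the library's API:

```python
def evaluate(guardrails, text, mode="fail_fast"):
    # Each guardrail is a callable returning a block reason, or None to allow.
    failures = []
    for check in guardrails:
        reason = check(text)
        if reason is not None:
            failures.append(reason)
            if mode == "fail_fast":
                break  # stop at the first block
    return failures  # empty list means the input is allowed

checks = [
    lambda t: "too long" if len(t) > 5 else None,
    lambda t: "has digit" if any(c.isdigit() for c in t) else None,
]
print(evaluate(checks, "abc123", mode="run_all"))   # ['too long', 'has digit']
print(evaluate(checks, "abc123", mode="fail_fast")) # ['too long']
```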
## Built-in Guardrails

### Input Guardrails

```python
from definable.agent.guardrail import max_tokens, block_topics, regex_filter
```

**`max_tokens(n, model_id='gpt-4o')`**
Blocks input that exceeds `n` tokens. Uses the specified model's tokenizer for counting.

**`block_topics(topics)`**
Blocks input containing any keyword from the `topics` list (case-insensitive substring match).

**`regex_filter(patterns, action='block')`**
Blocks or redacts input matching any of the given regex patterns. Set `action="modify"` to redact matches instead of blocking.
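The redaction behavior of `action="modify"` can be approximated with `re.sub`; a standalone sketch, not the library's implementation:

```python
import re

def redact(text, patterns, replacement="[REDACTED]"):
    # Replace every match of every pattern with a placeholder token.
    for pattern in patterns:
        text = re.sub(pattern, replacement, text)
    return text

print(redact("Order #12345 shipped", [r"#\d+"]))  # Order [REDACTED] shipped
```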
### Output Guardrails

```python
from definable.agent.guardrail import pii_filter, max_output_tokens
```

**`pii_filter(action='modify')`**
Detects PII (credit cards, SSNs, email addresses, phone numbers) and redacts it with tokens like `[CREDIT_CARD]`, `[SSN]`, `[EMAIL]`, `[PHONE]`. Set `action="block"` to block the entire response instead.
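Typed redaction tokens like these can be produced with per-category regexes. The patterns below are deliberately simplistic illustrations; real PII detection is more involved:

```python
import re

# Illustrative patterns only; production PII detection needs far more care.
PII_PATTERNS = {
    "[SSN]": r"\b\d{3}-\d{2}-\d{4}\b",
    "[EMAIL]": r"\b[\w.+-]+@[\w-]+\.[\w.]+\b",
}

def scrub(text):
    # Replace each category's matches with its named token.
    for token, pattern in PII_PATTERNS.items():
        text = re.sub(pattern, token, text)
    return text

print(scrub("Reach me at bob@example.com, SSN 123-45-6789."))
# Reach me at [EMAIL], SSN [SSN].
```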
**`max_output_tokens(n, model_id='gpt-4o')`**
Blocks output that exceeds `n` tokens.
### Tool Guardrails

```python
from definable.agent.guardrail import tool_allowlist, tool_blocklist
```

**`tool_allowlist(allowed)`**
Only allows tools whose names appear in the allowed set; all others are blocked.

**`tool_blocklist(blocked)`**
Blocks tools whose names appear in the blocked set; all others are allowed.
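As noted above, a blocked tool is never executed; the model receives an error result instead. A standalone sketch of that behavior (hypothetical `run_tool` helper, not the library's code):

```python
def run_tool(name, args, tools, blocked):
    # A blocked tool is never executed; an error string is returned
    # to the model as the tool result instead.
    if name in blocked:
        return f"Error: tool '{name}' is blocked by a guardrail"
    return tools[name](**args)

tools = {"add": lambda a, b: a + b}
print(run_tool("add", {"a": 2, "b": 3}, tools, blocked={"delete_all"}))  # 5
print(run_tool("delete_all", {}, tools, blocked={"delete_all"}))
# Error: tool 'delete_all' is blocked by a guardrail
```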
## Custom Guardrails

### Using Decorators

The fastest way to create a custom guardrail:

```python
from definable.agent.guardrail import input_guardrail, GuardrailResult

@input_guardrail
async def no_profanity(text: str, context) -> GuardrailResult:
    banned = ["badword", "offensive"]
    if any(word in text.lower() for word in banned):
        return GuardrailResult.block("Profanity detected")
    return GuardrailResult.allow()

@input_guardrail(name="length_check")
async def check_length(text: str, context) -> GuardrailResult:
    if len(text) > 10000:
        return GuardrailResult.block("Input too long")
    return GuardrailResult.allow()
```

Also available: `@output_guardrail` and `@tool_guardrail`.
### Class-Based

Implement the protocol directly for more control:

```python
from definable.agent.guardrail import GuardrailResult

class SentimentGuardrail:
    name = "sentiment_check"

    async def check(self, text: str, context) -> GuardrailResult:
        # Your custom logic here
        if is_toxic(text):
            return GuardrailResult.block("Toxic content detected")
        return GuardrailResult.allow()
```
### Modify Action

Guardrails can rewrite content instead of blocking:

```python
from definable.agent.guardrail import output_guardrail, GuardrailResult

@output_guardrail
async def redact_names(text: str, context) -> GuardrailResult:
    cleaned = text.replace("Alice", "[REDACTED]")
    if cleaned != text:
        return GuardrailResult.modify(cleaned, reason="Names redacted")
    return GuardrailResult.allow()
```
## Composable Guardrails

Combine guardrails with logic operators:

```python
from definable.agent.guardrail import ALL, ANY, NOT, when, max_tokens, block_topics

# ALL — every guardrail must allow
strict_input = ALL(
    max_tokens(1000),
    block_topics(["violence", "exploit"]),
    name="strict_input",
)

# ANY — at least one must allow
flexible_check = ANY(
    max_tokens(5000),
    max_tokens(10000),
    name="flexible_check",
)

# NOT — invert a guardrail (allow ↔ block)
must_mention_topic = NOT(
    block_topics(["support"]),
    name="must_mention_support",
)

# when — conditional execution
admin_limit = when(
    condition=lambda ctx: ctx.user_id != "admin",
    guardrail=max_tokens(500),
    name="non_admin_limit",
)
```
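The combinator semantics reduce to boolean logic if you model each guardrail as a predicate (`True` = allow, `False` = block). This is a simplification of the library's combinators, which operate on `GuardrailResult` objects rather than booleans:

```python
# Guardrails modeled as predicates: True = allow, False = block.
def ALL(*checks):
    return lambda text: all(c(text) for c in checks)

def ANY(*checks):
    return lambda text: any(c(text) for c in checks)

def NOT(check):
    return lambda text: not check(text)

short = lambda t: len(t) <= 5
no_digits = lambda t: not any(c.isdigit() for c in t)

print(ALL(short, no_digits)("abc"))    # True  (both allow)
print(ANY(short, no_digits)("abc123")) # False (neither allows)
print(NOT(short)("abcdef"))            # True  (inverted block -> allow)
```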
## Block Handling

### Raise Exceptions (Default)

```python
from definable.exceptions import InputCheckError, OutputCheckError

agent = Agent(
    model=model,
    guardrails=Guardrails(
        input=[max_tokens(100)],
        on_block="raise",  # default
    ),
)

try:
    output = agent.run("a very long message...")
except InputCheckError as e:
    print(f"Blocked: {e.message}")
except OutputCheckError as e:
    print(f"Output blocked: {e.message}")
```
### Return Blocked Status

```python
from definable.agent.run import RunStatus

agent = Agent(
    model=model,
    guardrails=Guardrails(
        input=[max_tokens(100)],
        on_block="return_message",
    ),
)

output = agent.run("a very long message...")
if output.status == RunStatus.blocked:
    print(f"Request was blocked: {output.content}")
```
## Tracing Events

Guardrail activity is captured in the agent's trace stream:

| Event | Fields | Emitted When |
|---|---|---|
| `GuardrailCheckedEvent` | `guardrail_name`, `guardrail_type`, `action`, `message`, `duration_ms` | After each check completes |
| `GuardrailBlockedEvent` | `guardrail_name`, `guardrail_type`, `reason` | When a guardrail blocks |
## What's Next

- **Agents Overview**: Learn how agents orchestrate models, tools, and guardrails.
- **Middleware**: Add request/response transforms alongside guardrails.
- **Testing**: Use `MockModel` to test guardrail behavior without API calls.
- **Error Handling**: Handle `InputCheckError`, `OutputCheckError`, and other exceptions.