# Guardrails - Quick Start

Set up prompt injection detection and PII masking on the LiteLLM Proxy (AI Gateway).

## 1. Define guardrails on your LiteLLM config.yaml

Set your guardrails under the `guardrails` section:
```yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: openai/gpt-3.5-turbo
      api_key: os.environ/OPENAI_API_KEY

guardrails:
  - guardrail_name: general-guard
    litellm_params:
      guardrail: aim
      mode: [pre_call, post_call]
      api_key: os.environ/AIM_API_KEY
      api_base: os.environ/AIM_API_BASE
      default_on: true # Optional

  - guardrail_name: "aporia-pre-guard"
    litellm_params:
      guardrail: aporia # supported values: "aporia", "lakera"
      mode: "during_call"
      api_key: os.environ/APORIA_API_KEY_1
      api_base: os.environ/APORIA_API_BASE_1

  - guardrail_name: "aporia-post-guard"
    litellm_params:
      guardrail: aporia # supported values: "aporia", "lakera"
      mode: "post_call"
      api_key: os.environ/APORIA_API_KEY_2
      api_base: os.environ/APORIA_API_BASE_2
    guardrail_info: # Optional field, info is returned on GET /guardrails/list
      # you can enter any fields under info for consumers of your guardrail
      params:
        - name: "toxicity_score"
          type: "float"
          description: "Score between 0-1 indicating content toxicity level"
        - name: "pii_detection"
          type: "boolean"

  # Example Presidio guardrail config with entity actions + confidence score thresholds
  - guardrail_name: "presidio-pii"
    litellm_params:
      guardrail: presidio
      mode: "pre_call"
      presidio_language: "en"
      pii_entities_config:
        CREDIT_CARD: "MASK"
        EMAIL_ADDRESS: "MASK"
        US_SSN: "MASK"
      presidio_score_thresholds: # minimum confidence scores for keeping detections
        CREDIT_CARD: 0.8
        EMAIL_ADDRESS: 0.6

  # Example Pillar Security config via Generic Guardrail API
  - guardrail_name: "pillar-security"
    litellm_params:
      guardrail: generic_guardrail_api
      mode: [pre_call, post_call]
      api_base: https://api.pillar.security/api/v1/integrations/litellm
      api_key: os.environ/PILLAR_API_KEY
      additional_provider_specific_params:
        plr_mask: true
        plr_evidence: true
        plr_scanners: true
```
For generic guardrail APIs you can also set static headers (`headers`: key/value pairs sent on every request) and dynamic headers (`extra_headers`: a list of client header names to forward). See Generic Guardrail API - Static and dynamic headers.
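A rough sketch of what that can look like (the guardrail name, endpoint, and header values here are hypothetical, and the key placement follows the generic guardrail options above; check the linked page for specifics):

```yaml
- guardrail_name: "generic-with-headers" # hypothetical name
  litellm_params:
    guardrail: generic_guardrail_api
    mode: "pre_call"
    api_base: https://guardrail.example.com/scan # hypothetical endpoint
    api_key: os.environ/GENERIC_GUARDRAIL_API_KEY
    headers: # static: sent on every request to the guardrail API
      x-team: "security"
    extra_headers: # dynamic: client header names forwarded to the guardrail API
      - "x-request-id"
```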
### Supported values for `mode` (Event Hooks)

| Mode | Description |
|------|-------------|
| `pre_call` | Run before the LLM call, on input |
| `post_call` | Run after the LLM call, on input & output |
| `during_call` | Run during the LLM call, on input. Same as `pre_call` but runs in parallel with the LLM call; the response is not returned until the guardrail check completes |

You can also pass a list of the above values to run multiple modes, e.g.:

```yaml
mode: [pre_call, post_call]
```
### Skip system messages in guardrail evaluation

You can stop unified guardrails from scanning `role: system` content while still sending the full `messages` list to the model.

**Global** (in `litellm_settings`):

```yaml
litellm_settings:
  skip_system_message_in_guardrail: true
```

**Per guardrail**: under that guardrail's `litellm_params`, set `skip_system_message_in_guardrail: true` or `false`, as shown in the sketch below. If omitted, the global `litellm_settings` value is used; a per-guardrail `false` forces system messages to be included even when the global flag is `true`.
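A minimal sketch of the per-guardrail override (the guardrail shown is a placeholder):

```yaml
litellm_settings:
  skip_system_message_in_guardrail: true # global default: skip system messages

guardrails:
  - guardrail_name: "pii-guard" # hypothetical guardrail
    litellm_params:
      guardrail: presidio
      mode: "pre_call"
      pii_entities_config:
        EMAIL_ADDRESS: "MASK"
      skip_system_message_in_guardrail: false # override: scan system messages too
```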
**Where this applies:** only the unified guardrail path (providers that implement `apply_guardrail` and run through LiteLLM's message translation layer) on OpenAI Chat Completions (`/v1/chat/completions`) and Anthropic Messages (`/v1/messages`). Examples include Presidio, Bedrock guardrails, `litellm_content_filter`, OpenAI Moderation, Generic Guardrail API, and custom code guardrails that define `apply_guardrail`.

**Where this does not apply:** guardrails that run only via direct hooks on the raw request (e.g. Lakera v2, Aporia, DynamoAI, Javelin, Lasso, Pangea, Model Armor, Azure Content Safety hooks, Guardrails AI, AIM, tool permission, MCP security). It also does not apply to other routes until those endpoints use the same translation layer (e.g. Responses API, embeddings, speech).
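For reference, a custom code guardrail joins the unified path as soon as it defines `apply_guardrail`. A rough sketch follows; the method signature is approximated from the description above, so verify it against LiteLLM's `CustomGuardrail` base class:

```python
from typing import List, Optional

from litellm.integrations.custom_guardrail import CustomGuardrail


class BlockPhoneNumbers(CustomGuardrail):
    async def apply_guardrail(
        self,
        text: str,
        language: Optional[str] = None,
        entities: Optional[List[str]] = None,
    ) -> str:
        # Raise to block the request, or return the (possibly transformed)
        # text to let it through. The check below is a toy for illustration.
        if "+1 " in text:
            raise ValueError("Phone numbers are not allowed")
        return text
```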
### Load Balancing Guardrails
Need to distribute guardrail requests across multiple accounts or regions? See Guardrail Load Balancing for details on:
- Load balancing across multiple AWS Bedrock accounts (useful for rate limit management)
- Weighted distribution across guardrail instances
- Multi-region guardrail deployments
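As a rough illustration, load-balanced guardrails follow the same pattern as `model_list`: multiple entries share one `guardrail_name`. The identifiers below are hypothetical; see the linked page for the exact syntax:

```yaml
guardrails:
  - guardrail_name: bedrock-guard
    litellm_params:
      guardrail: bedrock
      mode: "pre_call"
      guardrailIdentifier: gr-account-a # hypothetical guardrail ID (account A)
      guardrailVersion: "1"
      aws_region_name: us-east-1
  - guardrail_name: bedrock-guard # same name -> requests are distributed
    litellm_params:
      guardrail: bedrock
      mode: "pre_call"
      guardrailIdentifier: gr-account-b # hypothetical guardrail ID (account B)
      guardrailVersion: "1"
      aws_region_name: us-west-2
```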
## 2. Start LiteLLM Gateway

```shell
litellm --config config.yaml --detailed_debug
```
## 3. Test request

See also: Langchain, OpenAI SDK usage examples.

**Unsuccessful call**

Expect this call to fail, since `ishaan@berri.ai` in the request is PII:
```shell
curl -i http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-npnwjPQciVRok5yNZgKmFQ" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {"role": "user", "content": "hi my email is ishaan@berri.ai"}
    ],
    "guardrails": ["aporia-pre-guard", "aporia-post-guard"]
  }'
```
Expected response on failure:

```json
{
  "error": {
    "message": {
      "error": "Violated guardrail policy",
      "aporia_ai_response": {
        "action": "block",
        "revised_prompt": null,
        "revised_response": "Aporia detected and blocked PII",
        "explain_log": null
      }
    },
    "type": "None",
    "param": "None",
    "code": "400"
  }
}
```
**Successful call**

```shell
curl -i http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-npnwjPQciVRok5yNZgKmFQ" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {"role": "user", "content": "hi what is the weather"}
    ],
    "guardrails": ["aporia-pre-guard", "aporia-post-guard"]
  }'
```
## Default On Guardrails

Set `default_on: true` in your guardrail config to run the guardrail on every request. This is useful if you want a guardrail to run on every request without the user having to specify it.

**Note:** default-on guardrails run even if the user specifies a different guardrail or an empty `guardrails` array.
```yaml
guardrails:
  - guardrail_name: "aporia-pre-guard"
    litellm_params:
      guardrail: aporia
      mode: "pre_call"
      default_on: true
```
**Test Request**

The `aporia-pre-guard` guardrail will run on this request, even though it isn't specified, because `default_on: true` is set:
```shell
curl -i http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-npnwjPQciVRok5yNZgKmFQ" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {"role": "user", "content": "hi my email is ishaan@berri.ai"}
    ]
  }'
```
Expected response:

Your response headers will include `x-litellm-applied-guardrails` with the guardrail applied:

```
x-litellm-applied-guardrails: aporia-pre-guard
```
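To check the header programmatically, one option is the OpenAI SDK's raw-response helper (a minimal sketch; the key and base URL are placeholders from the examples above):

```python
import openai

client = openai.OpenAI(
    api_key="sk-npnwjPQciVRok5yNZgKmFQ",
    base_url="http://localhost:4000",
)

# with_raw_response exposes HTTP headers alongside the parsed response
raw = client.chat.completions.with_raw_response.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "hi my email is ishaan@berri.ai"}],
)
print(raw.headers.get("x-litellm-applied-guardrails"))  # e.g. "aporia-pre-guard"
response = raw.parse()  # the usual ChatCompletion object
```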
## Guardrail Policies

Need more control? Use Guardrail Policies to:
- Group guardrails into reusable policies
- Enable/disable guardrails for specific teams, keys, or models
- Inherit from existing policies and override specific guardrails
## Using Guardrails Client Side

### Test yourself (OSS)

Pass `guardrails` in your request body to test it:
```shell
curl -i http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-npnwjPQciVRok5yNZgKmFQ" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {"role": "user", "content": "hi my email is ishaan@berri.ai"}
    ],
    "guardrails": ["aporia-pre-guard", "aporia-post-guard"]
  }'
```
### Expose to your users (Enterprise)

Follow this simple workflow to implement and tune guardrails.

#### 1. View Available Guardrails

First, check what guardrails are available and their parameters. Call `/guardrails/list` to view the available guardrails and their guardrail info (supported parameters, description, etc.):
```shell
curl -X GET 'http://0.0.0.0:4000/guardrails/list'
```
Expected response:

```json
{
  "guardrails": [
    {
      "guardrail_name": "aporia-post-guard",
      "guardrail_info": {
        "params": [
          {
            "name": "toxicity_score",
            "type": "float",
            "description": "Score between 0-1 indicating content toxicity level"
          },
          {
            "name": "pii_detection",
            "type": "boolean"
          }
        ]
      }
    }
  ]
}
```
The config below will return the `/guardrails/list` response above. The `guardrail_info` field is optional, and you can add any fields under it for consumers of your guardrail:
```yaml
- guardrail_name: "aporia-post-guard"
  litellm_params:
    guardrail: aporia # supported values: "aporia", "lakera"
    mode: "post_call"
    api_key: os.environ/APORIA_API_KEY_2
    api_base: os.environ/APORIA_API_BASE_2
  guardrail_info: # Optional field, info is returned on GET /guardrails/list
    # you can enter any fields under info for consumers of your guardrail
    params:
      - name: "toxicity_score"
        type: "float"
        description: "Score between 0-1 indicating content toxicity level"
      - name: "pii_detection"
        type: "boolean"
```
#### 2. Apply Guardrails

Add the selected guardrails to your chat completion request:
```shell
curl -i http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "your message"}],
    "guardrails": ["aporia-pre-guard", "aporia-post-guard"]
  }'
```
#### 3. Test with Mock LLM completions

Send `mock_response` to test guardrails without making an LLM call. More info on `mock_response` here.
```shell
curl -i http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-npnwjPQciVRok5yNZgKmFQ" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {"role": "user", "content": "hi my email is ishaan@berri.ai"}
    ],
    "mock_response": "This is a mock response",
    "guardrails": ["aporia-pre-guard", "aporia-post-guard"]
  }'
```
#### 4. ✨ Pass Dynamic Parameters to Guardrail

✨ This is an Enterprise only feature. Get a free trial.

Use this to pass additional parameters to the guardrail API call, e.g. a success threshold. See the guardrails spec for more details.

**OpenAI Python v1.0.0+**

Set `guardrails={"aporia-pre-guard": {"extra_body": {"success_threshold": 0.9}}}` to pass additional parameters to the guardrail. In this example, `success_threshold=0.9` is passed to the `aporia-pre-guard` guardrail request body:
```python
import openai

client = openai.OpenAI(
    api_key="anything",
    base_url="http://0.0.0.0:4000"
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ],
    extra_body={
        "guardrails": {
            "aporia-pre-guard": {
                "extra_body": {
                    "success_threshold": 0.9
                }
            }
        }
    }
)

print(response)
```
**Curl Request**

```shell
curl --location 'http://0.0.0.0:4000/chat/completions' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "user",
        "content": "what llm are you"
      }
    ],
    "guardrails": {
      "aporia-pre-guard": {
        "extra_body": {
          "success_threshold": 0.9
        }
      }
    }
  }'
```
## Proxy Admin Controls

### Monitoring Guardrails

Monitor which guardrails were executed and whether they passed or failed, e.g. to catch a guardrail going rogue and failing requests you don't intend to fail.

#### Setup

- Connect LiteLLM to a supported logging provider (see the sketch after this list)
- Make a request with a `guardrails` parameter
- Check your logging provider for the guardrail trace
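For example, with Langfuse as the logging provider (a minimal sketch; assumes `LANGFUSE_PUBLIC_KEY` and `LANGFUSE_SECRET_KEY` are set in the environment):

```yaml
litellm_settings:
  success_callback: ["langfuse"] # guardrail traces then appear in Langfuse
```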
#### Traced Guardrail Success

#### Traced Guardrail Failure
## ✨ Control Guardrails per API Key

✨ This is an Enterprise only feature. Get a free trial.

Use this to control which guardrails run per API key. In this tutorial we only want the following guardrails to run for one API key:

```yaml
guardrails: ["aporia-pre-guard", "aporia-post-guard"]
```
**Step 1.** Create a key with guardrail settings.

`/key/generate`:
```shell
curl -X POST 'http://0.0.0.0:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "guardrails": ["aporia-pre-guard", "aporia-post-guard"]
  }'
```
`/key/update`:

```shell
curl --location 'http://0.0.0.0:4000/key/update' \
  --header 'Authorization: Bearer sk-1234' \
  --header 'Content-Type: application/json' \
  --data '{
    "key": "sk-jNm1Zar7XfNdZXp49Z1kSQ",
    "guardrails": ["aporia-pre-guard", "aporia-post-guard"]
  }'
```
**Step 2.** Test it with the new key:
```shell
curl --location 'http://0.0.0.0:4000/chat/completions' \
  --header 'Authorization: Bearer sk-jNm1Zar7XfNdZXp49Z1kSQ' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "user",
        "content": "my email is ishaan@berri.ai"
      }
    ]
  }'
```
## ✨ Tag-based Guardrail Modes

✨ This is an Enterprise only feature. Get a free trial.

Run guardrails based on the `User-Agent` header. This is useful for, e.g., running pre-call checks for OpenWebUI while only masking in logs for the Claude CLI.

Both the `default` and tag values can be a single mode string or a list of modes.

**Single Default Mode**
```yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
      api_key: os.environ/OPENAI_API_KEY

guardrails:
  - guardrail_name: "guardrails_ai-guard"
    litellm_params:
      guardrail: guardrails_ai
      guard_name: "pii_detect" # 👈 Guardrails AI guard name
      mode:
        tags:
          "User-Agent: claude-cli": "logging_only" # Claude CLI - only mask in logs
        default: "pre_call" # Default mode when no tags match
      api_base: os.environ/GUARDRAILS_AI_API_BASE # 👈 Guardrails AI API base; defaults to "http://0.0.0.0:8000"
      default_on: true # run on every request
```
**Multiple Default Modes**

```yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
      api_key: os.environ/OPENAI_API_KEY

guardrails:
  - guardrail_name: "guardrails_ai-guard"
    litellm_params:
      guardrail: guardrails_ai
      guard_name: "pii_detect"
      mode:
        tags:
          "User-Agent: claude-cli": "logging_only"
        default: ["pre_call", "post_call"] # Run on both pre and post call when no tags match
      api_base: os.environ/GUARDRAILS_AI_API_BASE
      default_on: true
```
**Multiple Tag Modes**

```yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
      api_key: os.environ/OPENAI_API_KEY

guardrails:
  - guardrail_name: "guardrails_ai-guard"
    litellm_params:
      guardrail: guardrails_ai
      guard_name: "pii_detect"
      mode:
        tags:
          "User-Agent: claude-cli": ["pre_call", "post_call"] # Run both pre and post call for claude-cli
        default: "logging_only" # Default to logging only when no tags match
      api_base: os.environ/GUARDRAILS_AI_API_BASE
      default_on: true
```
## ✨ Model-level Guardrails

✨ This is an Enterprise only feature. Get a free trial.

This is great for cases where you have both an on-prem and a hosted model, and you just want to prevent sending PII to the hosted model.
```yaml
model_list:
  - model_name: claude-sonnet-4
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514
      api_key: os.environ/ANTHROPIC_API_KEY
      api_base: https://api.anthropic.com/v1
      guardrails: ["azure-text-moderation"]
  - model_name: openai-gpt-4o
    litellm_params:
      model: openai/gpt-4o

guardrails:
  - guardrail_name: "presidio-pii"
    litellm_params:
      guardrail: presidio # supported values: "aporia", "bedrock", "lakera", "presidio"
      mode: "pre_call"
      presidio_language: "en" # optional: set default language for PII analysis
      pii_entities_config:
        PERSON: "BLOCK" # Will block requests containing person names
  - guardrail_name: azure-text-moderation
    litellm_params:
      guardrail: azure/text_moderations
      mode: "post_call"
      api_key: os.environ/AZURE_GUARDRAIL_API_KEY
      api_base: os.environ/AZURE_GUARDRAIL_API_BASE
```
## ✨ Disable team from turning on/off guardrails

✨ This is an Enterprise only feature. Get a free trial.

### 1. Disable team from modifying guardrails
```shell
curl -X POST 'http://0.0.0.0:4000/team/update' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "team_id": "4198d93c-d375-4c83-8d5a-71e7c5473e50",
    "metadata": {"guardrails": {"modify_guardrails": false}}
  }'
```
### 2. Try to disable guardrails for a call
```shell
curl --location 'http://0.0.0.0:4000/chat/completions' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer $LITELLM_VIRTUAL_KEY' \
  --data '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "user",
        "content": "Think of 10 random colors."
      }
    ],
    "metadata": {"guardrails": {"hide_secrets": false}}
  }'
```
### 3. Get 403 Error
```json
{
  "error": {
    "message": {
      "error": "Your team does not have permission to modify guardrails."
    },
    "type": "auth_error",
    "param": "None",
    "code": 403
  }
}
```
## Specification

### `guardrails` Configuration on YAML

```yaml
guardrails:
  - guardrail_name: string # Required: Name of the guardrail
    litellm_params: # Required: Configuration parameters
      guardrail: string # Required: One of "aporia", "bedrock", "guardrails_ai", "lakera", "presidio", "hide-secrets"
      mode: Union[string, List[string], Mode] # Required: One or more of "pre_call", "post_call", "during_call", "logging_only"
      api_key: string # Required: API key for the guardrail service
      api_base: string # Optional: Base URL for the guardrail service
      default_on: boolean # Optional: Default False. When True, runs on every request without the client specifying the guardrail
    guardrail_info: # Optional[Dict]: Additional information about the guardrail
```
**Mode Specification**

Both `default` and tag values accept either a single string or a list of strings:
```python
from litellm.types.guardrails import Mode

# Single default mode
mode = Mode(
    tags={"User-Agent: claude-cli": "logging_only"},
    default="logging_only"
)

# Multiple default modes
mode = Mode(
    tags={"User-Agent: claude-cli": "logging_only"},
    default=["pre_call", "post_call"]
)

# Multiple modes on a tag value
mode = Mode(
    tags={"User-Agent: claude-cli": ["pre_call", "post_call"]},
    default="logging_only"
)
```
### `guardrails` Request Parameter

The `guardrails` parameter can be passed to any LiteLLM Proxy endpoint (`/chat/completions`, `/completions`, `/embeddings`); a sketch for embeddings follows.
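For instance, on `/embeddings` the parameter looks the same as in the chat examples above (a sketch; the embedding model name is an assumption not present in the configs on this page, and `presidio-pii` refers to the earlier example config):

```shell
curl -i http://localhost:4000/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-npnwjPQciVRok5yNZgKmFQ" \
  -d '{
    "model": "text-embedding-ada-002",
    "input": "hi my email is ishaan@berri.ai",
    "guardrails": ["presidio-pii"]
  }'
```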
#### Format Options

**Simple List Format:**

```json
"guardrails": [
  "aporia-pre-guard",
  "aporia-post-guard"
]
```
**Advanced Dictionary Format:**

In this format, each dictionary key is the `guardrail_name` you want to run:

```json
"guardrails": {
  "aporia-pre-guard": {
    "extra_body": {
      "success_threshold": 0.9,
      "other_param": "value"
    }
  }
}
```
#### Type Definition

```python
guardrails: Union[
    List[str],                         # Simple list of guardrail names
    Dict[str, DynamicGuardrailParams]  # Advanced configuration
]

class DynamicGuardrailParams:
    extra_body: Dict[str, Any]  # Additional parameters for the guardrail
```