# Bedrock Guardrails: Preventing AI Hallucinations in Production
AI hallucinations cost companies money and trust. Amazon Bedrock Guardrails provides enterprise-grade controls to filter harmful content, enforce topic restrictions, and ground responses in facts. Here’s how to implement them.
## Guardrails Capabilities
| Feature | What It Does |
|---|---|
| Content Filters | Block hate, violence, sexual content, profanity |
| Topic Blocking | Deny discussions on specified subjects |
| PII Redaction | Mask SSN, credit cards, phone numbers |
| Word Filters | Block competitor names, internal terms |
| Grounding | Verify responses against source documents |
## Creating a Guardrail
```python
import boto3

bedrock = boto3.client('bedrock')

# Create the guardrail
guardrail = bedrock.create_guardrail(
    name='customer-support-guardrail',
    description='Guardrails for customer-facing chatbot',
    # Content filtering
    contentPolicyConfig={
        'filtersConfig': [
            {'type': 'HATE', 'inputStrength': 'HIGH', 'outputStrength': 'HIGH'},
            {'type': 'VIOLENCE', 'inputStrength': 'HIGH', 'outputStrength': 'HIGH'},
            {'type': 'SEXUAL', 'inputStrength': 'HIGH', 'outputStrength': 'HIGH'},
        ]
    },
    # Topic restrictions
    topicPolicyConfig={
        'topicsConfig': [
            {
                'name': 'Competitor Discussion',
                'definition': 'Any mention of competitor products or pricing',
                'type': 'DENY'
            },
            {
                'name': 'Financial Advice',
                'definition': 'Investment or financial planning recommendations',
                'type': 'DENY'
            }
        ]
    },
    # PII handling
    sensitiveInformationPolicyConfig={
        'piiEntitiesConfig': [
            {'type': 'US_SOCIAL_SECURITY_NUMBER', 'action': 'ANONYMIZE'},
            {'type': 'CREDIT_DEBIT_CARD_NUMBER', 'action': 'BLOCK'},
        ]
    },
    # Contextual grounding -- this policy must be configured here for the
    # grounding checks shown later to run (thresholds range from 0 to 0.99)
    contextualGroundingPolicyConfig={
        'filtersConfig': [
            {'type': 'GROUNDING', 'threshold': 0.75},
            {'type': 'RELEVANCE', 'threshold': 0.75},
        ]
    },
    blockedInputMessaging='I cannot help with that request.',
    blockedOutputsMessaging='I cannot provide that information.'
)

guardrail_id = guardrail['guardrailId']  # pass this when invoking models
```
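While you iterate, the guardrail's working copy is addressable as version `DRAFT`. For production traffic you typically snapshot it into an immutable numbered version with `create_guardrail_version`. A minimal sketch (the helper name and IDs are illustrative):

```python
def publish_guardrail(client, guardrail_id, description):
    """Snapshot the guardrail's current DRAFT as an immutable numbered version."""
    resp = client.create_guardrail_version(
        guardrailIdentifier=guardrail_id,
        description=description,
    )
    return resp['version']  # e.g. '1' -- use as guardrailVersion at invoke time

# Usage (client = boto3.client('bedrock')):
# version = publish_guardrail(client, guardrail['guardrailId'], 'Initial release')
```

Pinning a numbered version in production means later edits to the DRAFT cannot silently change what your chatbot blocks.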
## Using Guardrails with Invoke
```python
import json

import boto3

runtime = boto3.client('bedrock-runtime')

response = runtime.invoke_model(
    modelId='amazon.nova-pro-v1:0',
    guardrailIdentifier='my-guardrail-id',
    guardrailVersion='DRAFT',
    body=json.dumps({
        # Nova models take the messages schema, not Titan's 'inputText'
        'schemaVersion': 'messages-v1',
        'messages': [
            {'role': 'user', 'content': [{'text': user_question}]}
        ]
    })
)

# The intervention marker lives in the response body, not the top-level response
response_body = json.loads(response['body'].read())
if response_body.get('amazon-bedrock-guardrailAction') == 'INTERVENED':
    print("Guardrail blocked this request")
```
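The same check is cleaner with the model-agnostic Converse API, where a guardrail intervention surfaces as a `stopReason`. A sketch, with the helper name and the `None`-on-block convention as my own choices:

```python
def ask_with_guardrail(runtime, model_id, guardrail_id, version, question):
    """Send a question through Converse with a guardrail attached.

    Returns the reply text, or None if the guardrail intervened.
    """
    resp = runtime.converse(
        modelId=model_id,
        messages=[{'role': 'user', 'content': [{'text': question}]}],
        guardrailConfig={
            'guardrailIdentifier': guardrail_id,
            'guardrailVersion': version,
        },
    )
    if resp.get('stopReason') == 'guardrail_intervened':
        return None
    return resp['output']['message']['content'][0]['text']
```

Because Converse normalizes the request and response shapes across models, you can swap `modelId` without rewriting the body-parsing code.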
## Grounding Checks (Anti-Hallucination)
```python
# Check the model's answer against source documents. The guardrail must have
# a contextual grounding policy configured for these checks to run.
# ApplyGuardrail takes everything as qualified content items -- there is no
# separate grounding-source parameter.
response = runtime.apply_guardrail(
    guardrailIdentifier='my-guardrail',
    guardrailVersion='1',
    source='OUTPUT',
    content=[
        # The reference documents the answer must be grounded in
        {'text': {'text': source_documents, 'qualifiers': ['grounding_source']}},
        # The user's original question
        {'text': {'text': user_question, 'qualifiers': ['query']}},
        # The model output to evaluate
        {'text': {'text': model_response, 'qualifiers': ['guard_content']}},
    ]
)
```
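The `ApplyGuardrail` response reports whether the guardrail intervened and, for grounding checks, the computed confidence scores. A small parsing helper (the function name is mine; the response fields are from the API shape):

```python
def grounding_scores(response):
    """Extract contextual-grounding results from an ApplyGuardrail response.

    Returns (intervened, scores) where scores maps filter type
    ('GROUNDING' / 'RELEVANCE') to the score the guardrail computed.
    """
    intervened = response.get('action') == 'GUARDRAIL_INTERVENED'
    scores = {}
    for assessment in response.get('assessments', []):
        policy = assessment.get('contextualGroundingPolicy', {})
        for f in policy.get('filters', []):
            scores[f['type']] = f['score']
    return intervened, scores
```

A `GROUNDING` score below your configured threshold means the answer is not supported by the source documents, which is exactly the hallucination case this feature exists to catch.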
## Best Practice
Always apply guardrails to both inputs AND outputs. Users can craft prompts to bypass input filters, so output filtering is your last line of defense.
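That two-sided loop can be sketched with `ApplyGuardrail` on both sides of the model call. A hedged sketch: the helper names are mine, and it treats any input intervention as a block (with ANONYMIZE-only PII policies, `outputs` instead carries masked text you could choose to continue with):

```python
def screen(runtime, guardrail_id, version, text, source):
    """Screen one piece of text; source is 'INPUT' or 'OUTPUT'.

    Returns (intervened, text_to_use). When the guardrail rewrites or blocks
    the text, 'outputs' holds the replacement (masked text or blocked message).
    """
    resp = runtime.apply_guardrail(
        guardrailIdentifier=guardrail_id,
        guardrailVersion=version,
        source=source,
        content=[{'text': {'text': text}}],
    )
    intervened = resp.get('action') == 'GUARDRAIL_INTERVENED'
    outputs = resp.get('outputs') or []
    return intervened, (outputs[0]['text'] if outputs else text)

def guarded_chat(runtime, guardrail_id, version, user_text, generate):
    """Apply the guardrail before AND after the model call."""
    blocked, user_text = screen(runtime, guardrail_id, version, user_text, 'INPUT')
    if blocked:
        return user_text  # the guardrail's configured blocked-input message
    answer = generate(user_text)  # your model call (e.g. Converse)
    _, answer = screen(runtime, guardrail_id, version, answer, 'OUTPUT')
    return answer
```

Even if a crafted prompt slips past the input pass, the output pass still filters what the user actually sees.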