2.3 Text and Image Moderation Implementation
Key Takeaways
- Text moderation supports content up to 10,000 characters per request and can process multiple text segments in a single batch call.
- Image moderation supports JPEG, PNG, GIF, BMP, TIFF, and WEBP formats with a maximum file size of 4 MB.
- Custom categories allow organizations to define domain-specific moderation rules beyond the four default harm categories.
- Blocklists enable exact-match or regex-pattern blocking of specific terms, phrases, or identifiers.
- Moderation decisions should combine automated Content Safety scores with human review workflows for edge cases.
Text and Image Moderation Implementation
Quick Answer: Implement text moderation using AnalyzeTextOptions with configurable severity thresholds per harm category. Use blocklists for exact-match term filtering, custom categories for domain-specific rules, and human review workflows for edge cases. Image moderation uses AnalyzeImageOptions with the same four harm categories.
Text Moderation Best Practices
Setting Severity Thresholds
Choose appropriate severity thresholds based on your application context:
| Application Type | Violence | Self-Harm | Sexual | Hate | Rationale |
|---|---|---|---|---|---|
| Children's app | 0 | 0 | 0 | 0 | Block all potentially harmful content |
| General social media | 2 | 0 | 2 | 2 | Allow mild references, strict on self-harm |
| Medical forum | 4 | 2 | 2 | 2 | Allow medical violence descriptions |
| News platform | 4 | 2 | 2 | 4 | Allow reporting on violence and hate incidents |
| Adult platform | 4 | 0 | 4 | 2 | Higher sexual tolerance, strict on self-harm |
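The profiles in the table above translate directly into threshold dictionaries. A minimal sketch (the profile keys are illustrative; the category names match the service's `TextCategory` values):

```python
# Severity thresholds per harm category, keyed by application type.
# These mirror the table above; category names follow the Azure AI
# Content Safety values "Hate", "SelfHarm", "Sexual", "Violence".
THRESHOLD_PROFILES = {
    "childrens_app":  {"Violence": 0, "SelfHarm": 0, "Sexual": 0, "Hate": 0},
    "social_media":   {"Violence": 2, "SelfHarm": 0, "Sexual": 2, "Hate": 2},
    "medical_forum":  {"Violence": 4, "SelfHarm": 2, "Sexual": 2, "Hate": 2},
    "news_platform":  {"Violence": 4, "SelfHarm": 2, "Sexual": 2, "Hate": 4},
    "adult_platform": {"Violence": 4, "SelfHarm": 0, "Sexual": 4, "Hate": 2},
}

def thresholds_for(app_type: str) -> dict:
    """Return the severity-threshold profile for an application type."""
    return THRESHOLD_PROFILES[app_type]
```

Content at or above a category's threshold is blocked, so lower numbers mean stricter moderation (a threshold of 0 blocks everything the category flags).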
Implementing Threshold-Based Filtering
```python
from azure.ai.contentsafety.models import AnalyzeTextOptions

def moderate_text(client, text, thresholds):
    """
    Moderate text with configurable severity thresholds.
    Returns (is_allowed, violations) tuple.
    """
    request = AnalyzeTextOptions(text=text)
    response = client.analyze_text(request)
    violations = []
    for result in response.categories_analysis:
        category = result.category
        severity = result.severity
        threshold = thresholds.get(category, 2)  # Default threshold for unlisted categories
        if severity >= threshold:
            violations.append({
                "category": category,
                "severity": severity,
                "threshold": threshold
            })
    return len(violations) == 0, violations

# Usage
thresholds = {
    "Violence": 4,
    "SelfHarm": 0,
    "Sexual": 2,
    "Hate": 2
}
is_allowed, violations = moderate_text(client, user_input, thresholds)
```
Blocklists
Blocklists enable exact-match and pattern-based blocking of specific terms, applied alongside the harm-category analysis:
Creating and Managing Blocklists
```python
from azure.ai.contentsafety.models import (
    TextBlocklist,
    AddOrUpdateTextBlocklistItemsOptions,
    TextBlocklistItem
)

# Create a blocklist
client.create_or_update_text_blocklist(
    blocklist_name="competitor-names",
    options=TextBlocklist(description="Block competitor brand mentions")
)

# Add items to the blocklist
items = AddOrUpdateTextBlocklistItemsOptions(
    blocklist_items=[
        TextBlocklistItem(text="CompetitorBrand", description="Main competitor"),
        TextBlocklistItem(text="RivalProduct", description="Competitor product")
    ]
)
client.add_or_update_blocklist_items(
    blocklist_name="competitor-names",
    options=items
)

# Analyze text with blocklist
request = AnalyzeTextOptions(
    text="I recommend CompetitorBrand for this task",
    blocklist_names=["competitor-names"],
    halt_on_blocklist_hit=True  # Stop analysis if a blocklist match is found
)
response = client.analyze_text(request)

# Check blocklist matches
if response.blocklists_match:
    for match in response.blocklists_match:
        print(f"Blocked term: {match.blocklist_item_text}")
        print(f"Blocklist: {match.blocklist_name}")
```
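In practice an application folds the blocklist matches and the category severities into a single verdict. One way to sketch that combination as a pure function (the function name and reason strings are illustrative, not part of the SDK):

```python
def moderation_decision(blocklist_matches, category_severities, thresholds):
    """Combine blocklist hits and category severities into one verdict.

    blocklist_matches:   list of matched term strings (empty if none)
    category_severities: dict mapping category name -> severity score
    thresholds:          dict mapping category name -> blocking threshold

    A blocklist hit always blocks; otherwise any category whose severity
    meets or exceeds its threshold blocks. Returns (is_allowed, reasons).
    """
    reasons = [f"blocklist:{term}" for term in blocklist_matches]
    for category, severity in category_severities.items():
        if severity >= thresholds.get(category, 2):  # same default as moderate_text
            reasons.append(f"category:{category}:{severity}")
    return (len(reasons) == 0, reasons)
```

This mirrors the service's own behavior with `halt_on_blocklist_hit=True`: a blocklist match is decisive regardless of category scores.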
On the Exam: Blocklists are used for organization-specific term filtering (competitor names, internal codes, banned phrases). The four harm categories handle general safety. Questions may test when to use a blocklist vs. the built-in categories.
Custom Categories
Custom categories allow you to define moderation categories specific to your organization:
- Create categories beyond the default four (violence, self-harm, sexual, hate)
- Provide positive and negative examples for training
- Useful for industry-specific content policies (financial compliance, healthcare regulations)
- Available in Standard tier and above
Use Cases for Custom Categories
| Industry | Custom Category | Examples |
|---|---|---|
| Finance | Investment advice | Specific stock recommendations, guaranteed returns |
| Healthcare | Medical misinformation | Unproven treatments, anti-vaccination content |
| Education | Academic dishonesty | Requests to write essays, solve homework problems |
| Gaming | Cheating/exploits | Game hack instructions, exploit sharing |
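As a sketch of the positive/negative example pairs such a category needs, here is the finance row expressed as data. The structure and all sample texts are invented for illustration; the exact custom-category API shape depends on the service version:

```python
# Illustrative definition of a hypothetical "InvestmentAdvice" custom
# category. All example texts are invented for illustration only.
custom_category = {
    "name": "InvestmentAdvice",
    "definition": (
        "Content giving specific, directive financial advice, such as "
        "individual stock picks or promises of guaranteed returns."
    ),
    # Content that SHOULD be flagged by the category
    "positive_examples": [
        "Buy XYZ stock now before earnings; it will double.",
        "This fund guarantees 20% annual returns with zero risk.",
    ],
    # Similar-sounding content that should NOT be flagged
    "negative_examples": [
        "Diversification can reduce portfolio risk.",
        "Index funds track a market benchmark.",
    ],
}
```

The negative examples matter as much as the positive ones: they teach the category to leave general financial education untouched while still catching directive advice.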
Image Moderation Implementation
```python
from azure.ai.contentsafety.models import AnalyzeImageOptions, ImageData

# Option 1: Analyze image from file
with open("user_upload.jpg", "rb") as f:
    image_bytes = f.read()

request = AnalyzeImageOptions(
    image=ImageData(content=image_bytes)
)
response = client.analyze_image(request)

# Option 2: Analyze image from URL
request = AnalyzeImageOptions(
    image=ImageData(blob_url="https://storage.blob.core.windows.net/images/photo.jpg")
)
response = client.analyze_image(request)

# Process results
for result in response.categories_analysis:
    print(f"{result.category}: severity {result.severity}")
    if result.severity >= 4:
        print(f" → BLOCKED: {result.category} severity too high")
```
Image Moderation Limits
| Parameter | Limit |
|---|---|
| Maximum file size | 4 MB |
| Supported formats | JPEG, PNG, GIF, BMP, TIFF, WEBP |
| Minimum dimensions | 50 x 50 pixels |
| Maximum dimensions | 2,048 x 2,048 pixels (larger images are downscaled) |
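Validating uploads against these limits client-side avoids wasted API calls on content the service will reject anyway. A minimal sketch checking format and file size (function name is illustrative; checking the minimum dimensions would additionally require decoding the image, e.g. with Pillow):

```python
import os

# Service limits from the table above
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tiff", ".webp"}
MAX_BYTES = 4 * 1024 * 1024  # 4 MB maximum file size

def precheck_image(path: str, size_bytes: int) -> list:
    """Return a list of limit violations for an upload, or an empty
    list if it passes the client-side format and file-size checks."""
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or 'none'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file too large: {size_bytes} bytes (limit {MAX_BYTES})")
    return problems
```

Note that oversized dimensions (above 2,048 x 2,048 pixels) need no client-side rejection, since the service downscales larger images automatically.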
Human Review Workflows
For edge cases where automated moderation is uncertain (severity scores near the threshold), implement a human review workflow:
[User Content] → [Content Safety API]
   ├── Severity < low_threshold → ✅ Auto-approve
   ├── Severity ≥ high_threshold → ❌ Auto-reject
   └── Severity between thresholds → 🔍 Send to human review queue
Implementation Pattern
- Auto-approve: Severity 0-1 (clearly safe content)
- Human review: Severity 2-3 (borderline content requiring human judgment)
- Auto-reject: Severity 4+ (clearly harmful content)
This three-tier approach balances safety with efficiency and reduces false positives.
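The three-tier routing reduces to a small pure function over the highest category severity in a result (function and tier names are illustrative):

```python
def triage(max_severity: int, approve_below: int = 2, reject_at: int = 4) -> str:
    """Route content by its highest category severity:
    below approve_below -> auto-approve (severity 0-1 by default),
    at or above reject_at -> auto-reject (severity 4+ by default),
    anything in between -> human review queue (severity 2-3)."""
    if max_severity < approve_below:
        return "approve"
    if max_severity >= reject_at:
        return "reject"
    return "review"
```

The two cut-offs can be tuned per application using the same severity-threshold profiles discussed earlier in this section.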
Review Questions
What is the maximum text length supported per request by Azure AI Content Safety text moderation?
When should you use a blocklist instead of the built-in harm categories?
A content moderation system receives a text with a severity score of 3. The auto-approve threshold is 1 and the auto-reject threshold is 4. What should happen?
What parameter in AnalyzeTextOptions causes the analysis to stop immediately if a blocklist term is found?