2.3 Text and Image Moderation Implementation

Key Takeaways

  • Text moderation supports content up to 10,000 characters per request and can process multiple text segments in a single batch call.
  • Image moderation supports JPEG, PNG, GIF, BMP, TIFF, and WEBP formats with a maximum file size of 4 MB.
  • Custom categories allow organizations to define domain-specific moderation rules beyond the four default harm categories.
  • Blocklists enable exact-match or regex-pattern blocking of specific terms, phrases, or identifiers.
  • Moderation decisions should combine automated Content Safety scores with human review workflows for edge cases.
Last updated: March 2026

Quick Answer: Implement text moderation using AnalyzeTextOptions with configurable severity thresholds per harm category. Use blocklists for exact-match term filtering, custom categories for domain-specific rules, and human review workflows for edge cases. Image moderation uses AnalyzeImageOptions with the same four harm categories.

Text Moderation Best Practices

Setting Severity Thresholds

Choose appropriate severity thresholds based on your application context:

Application Type      Violence  Self-Harm  Sexual  Hate  Rationale
Children's app        0         0          0       0     Block all potentially harmful content
General social media  2         0          2       2     Allow mild references, strict on self-harm
Medical forum         4         2          2       2     Allow clinical descriptions of violence
News platform         4         2          2       4     Allow reporting on violence and hate incidents
Adult platform        4         0          4       2     Higher sexual tolerance, strict on self-harm
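The rows of this table translate directly into threshold dictionaries that can be handed to a moderation call. A minimal sketch (the preset keys and the thresholds_for helper are illustrative names, not part of the SDK):

```python
# Per-application severity threshold presets, mirroring the table above.
# Values are the per-category severity thresholds for each profile.
THRESHOLD_PRESETS = {
    "childrens_app":  {"Violence": 0, "SelfHarm": 0, "Sexual": 0, "Hate": 0},
    "social_media":   {"Violence": 2, "SelfHarm": 0, "Sexual": 2, "Hate": 2},
    "medical_forum":  {"Violence": 4, "SelfHarm": 2, "Sexual": 2, "Hate": 2},
    "news_platform":  {"Violence": 4, "SelfHarm": 2, "Sexual": 2, "Hate": 4},
    "adult_platform": {"Violence": 4, "SelfHarm": 0, "Sexual": 4, "Hate": 2},
}

def thresholds_for(app_type):
    """Look up a preset, falling back to the strictest profile."""
    return THRESHOLD_PRESETS.get(app_type, THRESHOLD_PRESETS["childrens_app"])
```

Centralizing the presets this way keeps policy decisions in one reviewable place instead of scattering magic numbers across call sites.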

Implementing Threshold-Based Filtering

from azure.ai.contentsafety.models import AnalyzeTextOptions

def moderate_text(client, text, thresholds):
    """
    Moderate text with configurable severity thresholds.
    Returns (is_allowed, violations) tuple.
    """
    request = AnalyzeTextOptions(text=text)
    response = client.analyze_text(request)

    violations = []
    for result in response.categories_analysis:
        category = result.category
        severity = result.severity
        threshold = thresholds.get(category, 2)  # Highest allowed severity; default 2

        if severity > threshold:
            violations.append({
                "category": category,
                "severity": severity,
                "threshold": threshold
            })

    return len(violations) == 0, violations

# Usage
thresholds = {
    "Violence": 4,
    "SelfHarm": 0,
    "Sexual": 2,
    "Hate": 2
}
is_allowed, violations = moderate_text(client, user_input, thresholds)
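The returned (is_allowed, violations) tuple can then drive the user-facing response. A small sketch (the helper name and message wording are assumptions, not an API convention):

```python
def rejection_message(violations):
    """Summarize moderate_text violations for a user-facing rejection notice."""
    if not violations:
        return None  # nothing to report; content was allowed
    reasons = "; ".join(
        f"{v['category']} (severity {v['severity']}, threshold {v['threshold']})"
        for v in violations
    )
    return f"Content rejected: {reasons}"
```

Logging the same structured violations alongside the message also gives reviewers the context they need later.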

Blocklists

Blocklists enable exact-match and pattern-based blocking of specific terms:

Creating and Managing Blocklists

from azure.ai.contentsafety.models import (
    TextBlocklist,
    AddOrUpdateTextBlocklistItemsOptions,
    TextBlocklistItem
)

# Create a blocklist
client.create_or_update_text_blocklist(
    blocklist_name="competitor-names",
    options=TextBlocklist(description="Block competitor brand mentions")
)

# Add items to the blocklist
items = AddOrUpdateTextBlocklistItemsOptions(
    blocklist_items=[
        TextBlocklistItem(text="CompetitorBrand", description="Main competitor"),
        TextBlocklistItem(text="RivalProduct", description="Competitor product")
    ]
)
client.add_or_update_blocklist_items(
    blocklist_name="competitor-names",
    options=items
)

# Analyze text with blocklist
request = AnalyzeTextOptions(
    text="I recommend CompetitorBrand for this task",
    blocklist_names=["competitor-names"],
    halt_on_blocklist_hit=True  # Stop analysis if blocklist match found
)
response = client.analyze_text(request)

# Check blocklist matches
if response.blocklists_match:
    for match in response.blocklists_match:
        print(f"Blocked term: {match.blocklist_item_text}")
        print(f"Blocklist: {match.blocklist_name}")

On the Exam: Blocklists are used for organization-specific term filtering (competitor names, internal codes, banned phrases). The four harm categories handle general safety. Questions may test when to use a blocklist vs. the built-in categories.

Custom Categories

Custom categories allow you to define moderation categories specific to your organization:

  • Create categories beyond the default four (violence, self-harm, sexual, hate)
  • Provide positive and negative examples for training
  • Useful for industry-specific content policies (financial compliance, healthcare regulations)
  • Available in Standard tier and above

Use Cases for Custom Categories

Industry    Custom Category         Examples
Finance     Investment advice       Specific stock recommendations, guaranteed returns
Healthcare  Medical misinformation  Unproven treatments, anti-vaccination content
Education   Academic dishonesty     Requests to write essays, solve homework problems
Gaming      Cheating/exploits       Game hack instructions, exploit sharing

Image Moderation Implementation

from azure.ai.contentsafety.models import AnalyzeImageOptions, ImageData

# Option 1: Analyze image from file
with open("user_upload.jpg", "rb") as f:
    image_bytes = f.read()

request = AnalyzeImageOptions(
    image=ImageData(content=image_bytes)
)
response = client.analyze_image(request)

# Option 2: Analyze image from URL
request = AnalyzeImageOptions(
    image=ImageData(blob_url="https://storage.blob.core.windows.net/images/photo.jpg")
)
response = client.analyze_image(request)

# Process results
for result in response.categories_analysis:
    print(f"{result.category}: severity {result.severity}")
    if result.severity >= 4:
        print(f"  → BLOCKED: {result.category} severity too high")

Image Moderation Limits

Parameter           Limit
Maximum file size   4 MB
Supported formats   JPEG, PNG, GIF, BMP, TIFF, WEBP
Minimum dimensions  50 x 50 pixels
Maximum dimensions  2,048 x 2,048 pixels (larger images are downscaled)
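These limits can be enforced client-side before calling the API, avoiding requests that are guaranteed to fail. A minimal sketch that checks only size and file extension (real format detection would inspect the bytes, and the dimension limits would need image decoding, both omitted here):

```python
import os

MAX_BYTES = 4 * 1024 * 1024  # 4 MB limit from the table above
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tiff", ".webp"}

def validate_image_upload(path, size_bytes):
    """Check an upload against the documented image limits; returns a list of errors."""
    errors = []
    if size_bytes > MAX_BYTES:
        errors.append(f"File is {size_bytes} bytes; maximum is {MAX_BYTES}")
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        errors.append(f"Unsupported format: {ext or 'none'}")
    return errors  # empty list means the upload passes these checks
```

Returning a list of errors (rather than a boolean) lets the caller show the user everything that needs fixing in one pass.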

Human Review Workflows

For edge cases where automated moderation is uncertain (severity scores near the threshold), implement a human review workflow:

[User Content] → [Content Safety API]
                    ├── Severity ≤ approve threshold → ✅ Auto-approve
                    ├── Severity ≥ reject threshold → ❌ Auto-reject
                    └── Severity between thresholds → 🔍 Send to human review queue

Implementation Pattern

  1. Auto-approve: Severity 0-1 (clearly safe content)
  2. Human review: Severity 2-3 (borderline content requiring human judgment)
  3. Auto-reject: Severity 4+ (clearly harmful content)

This three-tier approach balances safety with efficiency and reduces false positives.
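The three-tier pattern reduces to a small routing function. A sketch using the severity bands listed above (the function and tier names are illustrative):

```python
def triage(max_severity, approve_below=2, reject_at=4):
    """Route content by severity: the defaults mirror the 0-1 / 2-3 / 4+ bands above."""
    if max_severity < approve_below:
        return "approve"
    if max_severity >= reject_at:
        return "reject"
    return "human_review"

def route(categories):
    """Route a full analysis result by its worst category score, e.g. {"Violence": 2, ...}."""
    return triage(max(categories.values()))
```

Routing on the worst category score is the conservative choice: one borderline category is enough to pull content into the review queue.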

Test Your Knowledge

What is the maximum text length supported per request by Azure AI Content Safety text moderation?

Test Your Knowledge

When should you use a blocklist instead of the built-in harm categories?

Test Your Knowledge

A content moderation system receives a text with a severity score of 3. The auto-approve threshold is 1 and the auto-reject threshold is 4. What should happen?

Test Your Knowledge

What parameter in AnalyzeTextOptions causes the analysis to stop immediately if a blocklist term is found?
