2.3 Text and Image Moderation Implementation

Key Takeaways

  • Text moderation accepts up to 10,000 characters per request; split longer content into chunks and analyze each separately.
  • Image moderation supports JPEG, PNG, GIF, BMP, TIFF, and WEBP up to 4 MB, with images downscaled toward a 2,048 px bound before analysis.
  • Blocklists provide exact-match term filtering for org-specific words (competitor names, internal codes); halt_on_blocklist_hit stops analysis on the first match.
  • Custom categories let you train domain-specific moderation beyond the four built-in harms using labeled positive and negative examples.
  • Combine automated severities with a three-tier human-review workflow — auto-approve, human queue, auto-reject — to handle borderline content.
Last updated: June 2026

Quick Answer: Call analyze_text (up to 10,000 characters) or analyze_image (JPEG/PNG/GIF/BMP/TIFF/WEBP, max 4 MB) and compare each category's returned severity against thresholds you define. Use blocklists for exact-match org-specific terms, custom categories for trained domain rules, and a three-tier auto-approve / human-review / auto-reject workflow for borderline severities.

You Set the Thresholds, Not the Service

The API returns a severity per category; your application decides the allow/block boundary. Tune thresholds to the context: a children's app blocks anything that is flagged at all, while a medical forum tolerates clinical descriptions of violence.

| Application | Violence | Self-Harm | Sexual | Hate | Rationale |
| --- | --- | --- | --- | --- | --- |
| Children's app | 0 | 0 | 0 | 0 | Block any flagged content |
| General social | 2 | 0 | 2 | 2 | Allow mild references, zero tolerance on self-harm |
| Medical forum | 4 | 2 | 2 | 2 | Allow clinical injury descriptions |
| News platform | 4 | 2 | 2 | 4 | Allow reporting on violence and hate incidents |

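from azure.ai.contentsafety.models import AnalyzeTextOptions

def moderate_text(client, text, thresholds):
    """Return (allowed, violations); each threshold is the highest severity still allowed."""
    response = client.analyze_text(AnalyzeTextOptions(text=text))
    violations = []
    for r in response.categories_analysis:
        limit = thresholds.get(r.category, 2)  # default: allow up to severity 2
        # Strictly greater: with >=, a limit of 0 would reject even severity-0 (safe) text.
        if r.severity > limit:
            violations.append((r.category, r.severity, limit))
    return (len(violations) == 0), violations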

thresholds = {"Violence": 4, "SelfHarm": 0, "Sexual": 2, "Hate": 2}
allowed, violations = moderate_text(client, user_input, thresholds)

Text longer than 10,000 characters must be chunked; analyze each chunk and aggregate the worst severity per category.
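
The chunk-and-aggregate step can be sketched as follows. This is a minimal illustration, not an SDK feature: split_into_chunks and worst_severity are hypothetical helper names, and the sentence splitter is a simple regex that assumes sentences end in ".", "!" or "?". Each chunk would be sent to analyze_text separately and the per-chunk severities passed to worst_severity.

```python
import re
from typing import Dict, Iterable, List

MAX_CHARS = 10_000  # per-request text limit stated in this section


def split_into_chunks(text: str, limit: int = MAX_CHARS) -> List[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    chunks: List[str] = []
    current = ""
    # Split on whitespace that follows ., ! or ?; punctuation stays with its sentence.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        # A single sentence longer than the limit is hard-split at the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def worst_severity(per_chunk: Iterable[Dict[str, int]]) -> Dict[str, int]:
    """Keep the maximum severity seen for each category across all chunks."""
    worst: Dict[str, int] = {}
    for severities in per_chunk:
        for category, severity in severities.items():
            worst[category] = max(severity, worst.get(category, 0))
    return worst
```

Taking the maximum means one severe chunk decides the outcome for the whole document, which matches the "worst severity per category" rule above.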

Blocklists

The four harm categories are general-purpose. They will not catch a competitor's brand name, an internal project codeword, or a banned phrase specific to your business — that is exactly what a blocklist is for. Blocklists do exact-match (and pattern) term filtering and run alongside the harm classifiers.

from azure.ai.contentsafety.models import (
    AddOrUpdateTextBlocklistItemsOptions, AnalyzeTextOptions,
    TextBlocklist, TextBlocklistItem)

# Assumption: in the 1.0 SDK, blocklist management lives on BlocklistClient
# (blocklist_client below), while analyze_text stays on ContentSafetyClient (client).
blocklist_client.create_or_update_text_blocklist(
    blocklist_name="competitor-names",
    options=TextBlocklist(
        blocklist_name="competitor-names",
        description="Block competitor brand mentions"))

blocklist_client.add_or_update_blocklist_items(
    blocklist_name="competitor-names",
    options=AddOrUpdateTextBlocklistItemsOptions(blocklist_items=[
        TextBlocklistItem(text="CompetitorBrand", description="Main rival")]))

resp = client.analyze_text(AnalyzeTextOptions(
    text="I recommend CompetitorBrand here",
    blocklist_names=["competitor-names"],
    halt_on_blocklist_hit=True))  # skip harm analysis once a blocklist term matches

# blocklists_match may be empty or None when nothing matched
for m in resp.blocklists_match or []:
    print(m.blocklist_name, m.blocklist_item_text)

On the Exam: Blocklist vs harm category is a classic distractor. "Block mentions of a rival product / internal code / specific slur list" → blocklist. "Detect violent or hateful content generally" → built-in categories. halt_on_blocklist_hit=True short-circuits the call when a match is found, skipping further harm analysis for efficiency.

Custom Categories

When you need a category the four defaults do not cover — and a static blocklist is too brittle — train a custom category with labeled positive and negative examples. Custom categories require the standard (S0) tier.

| Industry | Custom category | Examples |
| --- | --- | --- |
| Finance | Unlicensed investment advice | Guaranteed-return promises, stock tips |
| Healthcare | Medical misinformation | Unproven cures, anti-vaccine claims |
| Education | Academic dishonesty | "Write my essay," "solve my exam" |
| Gaming | Cheating / exploits | Hack tutorials, exploit sharing |

Choose blocklist for known exact strings, custom category for fuzzy concepts that need a model to generalize.

Image Moderation

from azure.ai.contentsafety.models import AnalyzeImageOptions, ImageData

# Send the raw bytes of a local file for analysis
with open("upload.jpg", "rb") as f:
    resp = client.analyze_image(AnalyzeImageOptions(
        image=ImageData(content=f.read())))

for r in resp.categories_analysis:
    if r.severity >= 4:
        block(r.category)  # block() stands in for your application's own handler

You can also pass ImageData(blob_url=...) to analyze an image already in Azure Blob Storage rather than uploading bytes.

| Parameter | Limit |
| --- | --- |
| Max file size | 4 MB |
| Formats | JPEG, PNG, GIF, BMP, TIFF, WEBP |
| Min dimensions | 50 x 50 pixels |
| Max dimensions | ~2,048 x 2,048 (larger images are downscaled) |

Image severities are returned on the four-value scale (0, 2, 4, 6).
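
Checking an upload against these limits before calling the service avoids spending a request on a file that will be rejected. The sketch below is illustrative only: image_problems is a hypothetical helper, it assumes you already know the file's size, format name, and pixel dimensions, and it reads "4 MB" as 4 x 1024 x 1024 bytes, which is an assumption.

```python
from typing import List

MAX_BYTES = 4 * 1024 * 1024  # assumption: "4 MB" interpreted as 4 MiB
ALLOWED_FORMATS = {"JPEG", "PNG", "GIF", "BMP", "TIFF", "WEBP"}
FORMAT_ALIASES = {"JPG": "JPEG", "TIF": "TIFF"}  # common extension spellings
MIN_SIDE = 50  # minimum width and height in pixels


def image_problems(size_bytes: int, fmt: str, width: int, height: int) -> List[str]:
    """Return a list of reasons the image falls outside the limits in the table above."""
    problems: List[str] = []
    normalized = FORMAT_ALIASES.get(fmt.upper(), fmt.upper())
    if size_bytes > MAX_BYTES:
        problems.append("file larger than 4 MB")
    if normalized not in ALLOWED_FORMATS:
        problems.append(f"unsupported format: {fmt}")
    if width < MIN_SIDE or height < MIN_SIDE:
        problems.append("smaller than 50 x 50 pixels")
    return problems
```

An empty list means the file is within the stated limits; oversized dimensions are not flagged here because the table says larger images are downscaled rather than rejected.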

Three-Tier Human-Review Workflow

Fully automated allow/block produces false positives near the threshold. The exam-favored pattern routes the uncertain middle band to humans:

[User content] -> [Content Safety]
   severity 0-1   -> auto-approve
   severity 2-3   -> human review queue
   severity 4+    -> auto-reject

  1. Auto-approve severity 0-1 (clearly safe).
  2. Human review severity 2-3 (borderline, needs human judgment).
  3. Auto-reject severity 4+ (clearly harmful).

This balances safety against reviewer cost and is the right answer whenever a scenario mentions "borderline" content or "reducing false positives." Persist the moderation decision and reviewer outcome for audit and Responsible AI reporting.
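
The three tiers reduce to a small routing function. This is a sketch under stated assumptions: route and its parameter names are hypothetical, and it routes on the single worst severity across all categories. On the four-value scale (0, 2, 4, 6) the bands collapse to 0, 2, and 4+.

```python
from typing import Dict


def route(severities: Dict[str, int], review_from: int = 2, reject_from: int = 4) -> str:
    """Map per-category severities to one of the three workflow tiers."""
    worst = max(severities.values(), default=0)  # no categories returned -> treat as 0
    if worst >= reject_from:
        return "auto-reject"
    if worst >= review_from:
        return "human-review"
    return "auto-approve"


# Example: one borderline category is enough to send the item to the review queue.
decision = route({"Violence": 0, "Hate": 3, "Sexual": 0, "SelfHarm": 0})
```

Keeping the two cut-offs as parameters lets a stricter surface widen the human-review band without changing the routing logic.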

Batching, Latency, and Cost

Text and image calls are single-item per request, so high-volume moderation is throughput-bound. Practical patterns:

  • Pre-filter with blocklists. A halt_on_blocklist_hit=True match can short-circuit before the harm classifiers run, saving latency and quota on obvious rejects.
  • Chunk long text at sentence boundaries near the 10,000-character cap and aggregate by taking the maximum severity per category across chunks — never the average, or a single severe sentence buried in benign text gets diluted below threshold.
  • Cache decisions for identical content (hash the input) so repeated submissions of the same comment do not re-bill the API.
  • Tier strictness to risk. Public, anonymous, or child-facing surfaces warrant strict thresholds that block low, medium, and high severities; authenticated, niche professional forums can tolerate higher thresholds.
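
The caching pattern from the list above can be sketched like this. ModerationCache is a hypothetical class, the analyze callable stands in for whatever wrapper you write around analyze_text, and the in-memory dict would normally be replaced by a shared store with an expiry policy.

```python
import hashlib
from typing import Callable, Dict


class ModerationCache:
    """Cache per-category severities keyed by a SHA-256 hash of the exact input text."""

    def __init__(self, analyze: Callable[[str], Dict[str, int]]) -> None:
        self._analyze = analyze  # hypothetical wrapper that calls the moderation API
        self._store: Dict[str, Dict[str, int]] = {}
        self.misses = 0  # number of times the underlying analyzer was actually called

    def severities(self, text: str) -> Dict[str, int]:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._analyze(text)
        return self._store[key]
```

Caching severities rather than allow/block decisions means a threshold change takes effect immediately without flushing the cache. Cached entries would still need invalidating after a blocklist or custom-category update, since those change what the service returns for the same text.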

Blocklist vs Custom Category vs Built-In — Decision Guide

This three-way choice is heavily tested. Anchor on what kind of rule you are expressing:

| Need | Mechanism | Tier |
| --- | --- | --- |
| Detect general hate / violence / sexual / self-harm | Built-in harm categories | F0 / S0 |
| Block a fixed list of known exact strings (names, codes) | Blocklist | S0 |
| Detect a fuzzy domain concept needing generalization | Custom category (trained) | S0 |
| Enforce two of these at once | Combine them in one analyze call | S0 |

The built-in categories and a blocklist run in the same analyze_text call; you do not need separate requests. A blocklist is deterministic and instant to update (no training); a custom category must be trained on labeled examples but generalizes to phrasings you never explicitly listed.
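
Because both signals come back in one response, a single decision function can consume them together. The sketch below works on plain Python values extracted from the response rather than on SDK objects; decide, max_allowed, and the "blocklist:" / "Category:severity" reason strings are all hypothetical conventions chosen for illustration.

```python
from typing import Dict, List, Tuple


def decide(
    blocklist_hits: List[str],
    severities: Dict[str, int],
    max_allowed: Dict[str, int],
    default_max: int = 2,
) -> Tuple[bool, List[str]]:
    """Return (allowed, reasons) from blocklist matches plus per-category severities."""
    # Any blocklist match is a deterministic rejection reason.
    reasons = [f"blocklist:{term}" for term in blocklist_hits]
    # A category is a violation when its severity exceeds the highest allowed value.
    for category, severity in severities.items():
        if severity > max_allowed.get(category, default_max):
            reasons.append(f"{category}:{severity}")
    return (not reasons), reasons
```

When halt_on_blocklist_hit=True short-circuits the call, severities may simply be empty, and the blocklist reason alone explains the rejection.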

Responsible AI Tie-In

Content moderation maps directly to the Microsoft Responsible AI principle of reliability and safety. For AI-102, remember that moderation should be logged, explainable (store the triggering category and severity), and paired with a human appeal path. Automated blocking without recourse is a fairness and transparency failure, so exam scenarios about "users disputing a block" point to a human-review and audit workflow rather than tightening thresholds further.
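
One way to make a decision explainable and auditable is to persist a small record per item. The shape below is an assumption, not a prescribed schema: ModerationRecord, its field names, and to_log_line are hypothetical, and a real system would write to a durable store rather than return a string.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ModerationRecord:
    content_id: str
    decision: str                 # e.g. "auto-approve", "human-review", "auto-reject"
    category: Optional[str]       # category that triggered the decision, if any
    severity: Optional[int]       # severity returned for that category
    threshold: Optional[int]      # threshold that was in force at decision time
    reviewer_outcome: Optional[str] = None  # filled in after human review or appeal
    decided_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_log_line(record: ModerationRecord) -> str:
    """Serialize the record as one JSON line for an append-only audit log."""
    return json.dumps(asdict(record), sort_keys=True)
```

Storing the threshold alongside the severity lets a reviewer see why the item was blocked even after thresholds have since been retuned.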

Test Your Knowledge

What is the maximum text length accepted in a single Azure AI Content Safety text-moderation request?

When should you use a blocklist rather than the built-in harm categories?

A moderation pipeline auto-approves severity 0-1 and auto-rejects severity 4+. Content scores severity 3. What should happen?

Which AnalyzeTextOptions parameter stops analysis as soon as a blocklist term is matched?
