2.3 Text and Image Moderation Implementation
Key Takeaways
- Text moderation accepts up to 10,000 characters per request; split longer content into chunks and analyze each separately.
- Image moderation supports JPEG, PNG, GIF, BMP, TIFF, and WEBP up to 4 MB, with images downscaled toward a 2,048 px bound before analysis.
- Blocklists provide exact-match term filtering for org-specific words (competitor names, internal codes); halt_on_blocklist_hit stops analysis on the first match.
- Custom categories let you train domain-specific moderation beyond the four built-in harms using labeled positive and negative examples.
- Combine automated severities with a three-tier human-review workflow — auto-approve, human queue, auto-reject — to handle borderline content.
Quick Answer: Call analyze_text (up to 10,000 characters) or analyze_image (JPEG/PNG/GIF/BMP/TIFF/WEBP, max 4 MB) and compare each category's returned severity against thresholds you define. Use blocklists for exact-match org-specific terms, custom categories for trained domain rules, and a three-tier auto-approve / human-review / auto-reject workflow for borderline severities.
You Set the Thresholds, Not the Service
The API returns a severity per category; your application decides the allow/block boundary. Tune thresholds to the context: a children's app blocks any flagged content, while a medical forum tolerates clinical descriptions of violence. In the table below, each value is the highest severity that application still allows.
| Application | Violence | Self-Harm | Sexual | Hate | Rationale |
|---|---|---|---|---|---|
| Children's app | 0 | 0 | 0 | 0 | Block any flagged content |
| General social | 2 | 0 | 2 | 2 | Allow mild references, zero tolerance on self-harm |
| Medical forum | 4 | 2 | 2 | 2 | Allow clinical injury descriptions |
| News platform | 4 | 2 | 2 | 4 | Allow reporting on violence and hate incidents |
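from azure.ai.contentsafety.models import AnalyzeTextOptions

def moderate_text(client, text, thresholds):
    """Return (allowed, violations); thresholds map category -> highest allowed severity."""
    response = client.analyze_text(AnalyzeTextOptions(text=text))
    violations = []
    for r in response.categories_analysis:
        limit = thresholds.get(r.category, 2)  # default: allow up to severity 2
        if r.severity > limit:  # strictly above the allowed maximum, so 0 means "block anything flagged"
            violations.append((r.category, r.severity, limit))
    return (len(violations) == 0), violations

# client is an authenticated ContentSafetyClient; user_input is the text to check
thresholds = {"Violence": 4, "SelfHarm": 0, "Sexual": 2, "Hate": 2}
allowed, violations = moderate_text(client, user_input, thresholds)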
Text longer than 10,000 characters must be chunked; analyze each chunk and aggregate the worst severity per category.
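The chunk-and-aggregate step can be sketched as follows. This is a minimal sketch rather than SDK functionality: `chunk_text` and `worst_severity` are hypothetical helpers, `analyze` stands for any callable you supply that returns a `{category: severity}` dict for one chunk (for example, a thin wrapper around `client.analyze_text`), and the split is a naive fixed-width one.

```python
from typing import Callable, Dict, List

MAX_CHARS = 10_000  # per-request text limit stated above


def chunk_text(text: str, limit: int = MAX_CHARS) -> List[str]:
    """Split text into consecutive pieces of at most `limit` characters."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [text[i:i + limit] for i in range(0, len(text), limit)]


def worst_severity(
    text: str,
    analyze: Callable[[str], Dict[str, int]],
    limit: int = MAX_CHARS,
) -> Dict[str, int]:
    """Analyze every chunk and keep the maximum severity seen per category."""
    worst: Dict[str, int] = {}
    for chunk in chunk_text(text, limit):
        for category, severity in analyze(chunk).items():
            worst[category] = max(worst.get(category, 0), severity)
    return worst
```

A fixed-width split can cut a phrase in half at a chunk edge, so a production version would likely split at sentence boundaries or overlap adjacent chunks slightly.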
Blocklists
The four harm categories are general-purpose. They will not catch a competitor's brand name, an internal project codeword, or a banned phrase specific to your business — that is exactly what a blocklist is for. Blocklists do exact-match (and pattern) term filtering and run alongside the harm classifiers.
from azure.ai.contentsafety.models import (
    AddOrUpdateTextBlocklistItemsOptions, AnalyzeTextOptions,
    TextBlocklist, TextBlocklistItem)

# In the 1.0 Python SDK, blocklist management lives on BlocklistClient;
# blocklist_client is assumed to be created with the same endpoint and key as client.
blocklist_client.create_or_update_text_blocklist(
    blocklist_name="competitor-names",
    options=TextBlocklist(blocklist_name="competitor-names",
                          description="Block competitor brand mentions"))
blocklist_client.add_or_update_blocklist_items(
    blocklist_name="competitor-names",
    options=AddOrUpdateTextBlocklistItemsOptions(blocklist_items=[
        TextBlocklistItem(text="CompetitorBrand", description="Main rival")]))

# Analyze text against the blocklist alongside the built-in harm categories.
resp = client.analyze_text(AnalyzeTextOptions(
    text="I recommend CompetitorBrand here",
    blocklist_names=["competitor-names"],
    halt_on_blocklist_hit=True))  # stop on first blocklist match
for m in resp.blocklists_match or []:
    print(m.blocklist_name, m.blocklist_item_text)
On the Exam: Blocklist vs harm category is a classic distractor. "Block mentions of a rival product / internal code / specific slur list" → blocklist. "Detect violent or hateful content generally" → built-in categories.
halt_on_blocklist_hit=True short-circuits the call when a match is found, skipping further harm analysis for efficiency.
Custom Categories
When you need a category the four defaults do not cover — and a static blocklist is too brittle — train a custom category with labeled positive and negative examples. Custom categories require the standard (S0) tier.
| Industry | Custom category | Examples |
|---|---|---|
| Finance | Unlicensed investment advice | Guaranteed-return promises, stock tips |
| Healthcare | Medical misinformation | Unproven cures, anti-vaccine claims |
| Education | Academic dishonesty | "Write my essay," "solve my exam" |
| Gaming | Cheating / exploits | Hack tutorials, exploit sharing |
Choose blocklist for known exact strings, custom category for fuzzy concepts that need a model to generalize.
Image Moderation
from azure.ai.contentsafety.models import AnalyzeImageOptions, ImageData

# Read the image bytes and send them inline with the request.
with open("upload.jpg", "rb") as f:
    resp = client.analyze_image(AnalyzeImageOptions(
        image=ImageData(content=f.read())))
for r in resp.categories_analysis:
    if r.severity >= 4:
        block(r.category)  # block() is a placeholder for your own handling
You can also pass ImageData(blob_url=...) to analyze an image already in Azure Blob Storage rather than uploading bytes.
| Parameter | Limit |
|---|---|
| Max file size | 4 MB |
| Formats | JPEG, PNG, GIF, BMP, TIFF, WEBP |
| Min dimensions | 50 x 50 pixels |
| Max dimensions | ~2,048 x 2,048 (larger images are downscaled) |
Image severities are returned on the four-value scale (0, 2, 4, 6).
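Checking the limits in the table before calling the service avoids spending a request on an upload that will be rejected anyway. The following is a minimal sketch under stated assumptions: `check_image_upload` is a hypothetical helper, it treats "4 MB" as 4 MiB, and it looks only at file size and extension. It does not inspect pixel dimensions, which would need an imaging library.

```python
import os
from typing import List

MAX_BYTES = 4 * 1024 * 1024  # assumption: "4 MB" is read as 4 MiB
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tif", ".tiff", ".webp"}


def check_image_upload(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the upload passes these checks."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems
```

An extension check is only a first filter; the service still validates the actual image content.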
Three-Tier Human-Review Workflow
Fully automated allow/block produces false positives near the threshold. The exam-favored pattern routes the uncertain middle band to humans:
[User content] -> [Content Safety]
severity 0-1 -> auto-approve
severity 2-3 -> human review queue
severity 4+ -> auto-reject
- Auto-approve severity 0-1 (clearly safe).
- Human review severity 2-3 (borderline — human judgment).
- Auto-reject severity 4+ (clearly harmful).
This balances safety against reviewer cost and is the right answer whenever a scenario mentions "borderline" content or "reducing false positives." Persist the moderation decision and reviewer outcome for audit and Responsible AI reporting.
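The routing rule above can be sketched as a small pure function. This is an illustrative sketch, not part of the SDK: `route`, `APPROVE_MAX`, and `REJECT_MIN` are hypothetical names, and the input is assumed to be a `{category: severity}` dict built from the service response.

```python
from typing import Dict

APPROVE_MAX = 1   # severities 0-1 are auto-approved
REJECT_MIN = 4    # severities 4 and above are auto-rejected


def route(severities: Dict[str, int]) -> str:
    """Return 'approve', 'review', or 'reject' based on the worst category."""
    worst = max(severities.values(), default=0)
    if worst >= REJECT_MIN:
        return "reject"
    if worst > APPROVE_MAX:
        return "review"
    return "approve"
```

Routing on the single worst category keeps the rule easy to audit; per-category bands are a straightforward extension if one category needs a stricter cut-off.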
Batching, Latency, and Cost
Text and image calls are single-item per request, so high-volume moderation is throughput-bound. Practical patterns:
- Pre-filter with blocklists. A halt_on_blocklist_hit=True match can short-circuit before the harm classifiers run, saving latency and quota on obvious rejects.
- Chunk long text at sentence boundaries near the 10,000-character cap and aggregate by taking the maximum severity per category across chunks, never the average: a single severe sentence buried in benign text would otherwise be diluted below the threshold.
- Cache decisions for identical content (hash the input) so repeated submissions of the same comment do not re-bill the API.
- Tier strictness to risk. Public, anonymous, child-facing surfaces warrant strict thresholds and Low+Medium+High blocking; authenticated, niche professional forums can tolerate higher thresholds.
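The caching pattern from the list above might look like the sketch below. It is an in-memory illustration only: `ModerationCache` is a hypothetical class, and `analyze` is assumed to be a callable you supply that returns a `{category: severity}` dict for a piece of text.

```python
import hashlib
from typing import Callable, Dict


class ModerationCache:
    """In-memory cache keyed by a SHA-256 hash of the exact input text."""

    def __init__(self, analyze: Callable[[str], Dict[str, int]]):
        self._analyze = analyze
        self._results: Dict[str, Dict[str, int]] = {}
        self.api_calls = 0  # counts cache misses, i.e. real calls to analyze

    def check(self, text: str) -> Dict[str, int]:
        """Return cached severities, calling analyze only for unseen text."""
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._results:
            self.api_calls += 1
            self._results[key] = self._analyze(text)
        return self._results[key]
```

Cached results reflect the blocklists and categories in force when they were stored, so a real deployment would want to expire or clear entries when those change.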
Blocklist vs Custom Category vs Built-In — Decision Guide
This three-way choice is heavily tested. Anchor on what kind of rule you are expressing:
| Need | Mechanism | Tier |
|---|---|---|
| Detect general hate / violence / sexual / self-harm | Built-in harm categories | F0 / S0 |
| Block a fixed list of known exact strings (names, codes) | Blocklist | S0 |
| Detect a fuzzy domain concept needing generalization | Custom category (trained) | S0 |
| Enforce two of these at once | Combine them in one analyze call | S0 |
The built-in categories and a blocklist run in the same analyze_text call; you do not need separate requests. A blocklist is deterministic and instant to update (no training); a custom category must be trained on labeled examples but generalizes to phrasings you never explicitly listed.
Responsible AI Tie-In
Content moderation maps directly to the Microsoft Responsible AI principle of reliability and safety. For AI-102, remember that moderation should be logged, explainable (store the triggering category and severity), and paired with a human appeal path. Automated blocking without recourse is a fairness and transparency failure, so exam scenarios about "users disputing a block" point to a human-review and audit workflow rather than tightening thresholds further.
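One way to keep decisions logged and explainable is to store a small record per decision. The sketch below is an assumption about what such a record could contain: `ModerationRecord` and its field names are hypothetical, not a schema defined by the service.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ModerationRecord:
    content_id: str
    category: str          # category that triggered the decision
    severity: int          # severity returned for that category
    decision: str          # e.g. "approve", "review", "reject"
    reviewer_outcome: Optional[str] = None  # filled in after human review or appeal
    decided_at: str = ""   # ISO 8601 timestamp; set at serialization if empty

    def to_json(self) -> str:
        """Serialize the record as one JSON line for an append-only audit log."""
        data = asdict(self)
        data["decided_at"] = self.decided_at or datetime.now(timezone.utc).isoformat()
        return json.dumps(data, sort_keys=True)
```

Keeping the triggering category and severity next to the reviewer outcome lets an appeal be answered from the log without re-running the analysis.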
What is the maximum text length accepted in a single Azure AI Content Safety text-moderation request?
When should you use a blocklist rather than the built-in harm categories?
A moderation pipeline auto-approves severity 0-1 and auto-rejects severity 4+. Content scores severity 3. What should happen?
Which AnalyzeTextOptions parameter stops analysis as soon as a blocklist term is matched?