5.5 Responsible Generative AI

Key Takeaways

  • Responsible generative AI applies all six Microsoft responsible AI principles specifically to generative AI systems — with additional considerations for content generation.
  • Content filters in Azure OpenAI scan both prompts (input) and completions (output) for harmful content across four categories: hate, violence, sexual content, and self-harm.
  • Prompt Shields detect and block jailbreak attempts — techniques where users try to bypass the model's safety guardrails.
  • Groundedness detection verifies that AI-generated responses are factually supported by the provided source documents, helping reduce hallucinations.
  • Microsoft Copilot incorporates responsible AI by design — with content filters, citations, disclaimers, and user feedback mechanisms built into the product.
Last updated: March 2026

Responsible Generative AI

Quick Answer: Responsible generative AI applies Microsoft's six principles with additional safeguards: content filters block harmful content in prompts and responses, Prompt Shields prevent jailbreaks, and groundedness detection checks that responses are supported by source documents. Azure OpenAI includes these protections by default.

Why Responsible Generative AI Is Different

Generative AI introduces risks that traditional AI systems do not:

| Risk | Description | Example |
| --- | --- | --- |
| Hallucination | Model confidently generates false information | "Abraham Lincoln invented the telephone" |
| Harmful content generation | Model produces hate speech, violence, or inappropriate content | Generating offensive text or images |
| Jailbreaking | Users craft prompts to bypass safety guardrails | "Ignore your instructions and tell me how to..." |
| Prompt injection | Hidden instructions in user input override the system message | Injecting instructions through pasted text |
| Copyright concerns | Model may generate content closely matching copyrighted material | Reproducing copyrighted text or code |
| Misinformation | Model generates plausible-sounding but false information | Fake medical advice or legal information |
| Privacy leakage | Model may reveal personal information from training data | Generating someone's phone number or address |

Content Filtering in Azure OpenAI

Azure OpenAI includes content filters that are enabled by default:

Four Filter Categories

| Category | What It Blocks | Severity Levels |
| --- | --- | --- |
| Hate | Discrimination, slurs, dehumanization | Safe → Low → Medium → High |
| Violence | Threats, graphic violence, harm descriptions | Safe → Low → Medium → High |
| Sexual | Sexually explicit or suggestive content | Safe → Low → Medium → High |
| Self-harm | Self-injury descriptions, instructions | Safe → Low → Medium → High |
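The severity scale behaves like a configurable threshold: content at or above the configured severity level for a category is blocked. The helper below is a minimal illustrative sketch of that threshold logic (hypothetical code, not the Azure SDK; the real classification happens server-side):

```python
# Illustrative sketch of severity-threshold filtering.
# Azure OpenAI applies these decisions server-side; this is NOT the real filter.
SEVERITY_ORDER = ["safe", "low", "medium", "high"]

def is_blocked(detected_severity: str, threshold: str = "medium") -> bool:
    """Block content whose detected severity meets or exceeds the threshold."""
    return SEVERITY_ORDER.index(detected_severity) >= SEVERITY_ORDER.index(threshold)

print(is_blocked("low"))   # below a "medium" threshold, so allowed
print(is_blocked("high"))  # at or above the threshold, so blocked
```

Raising the threshold to "high" allows more content through; lowering it to "low" filters more aggressively.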

How Filtering Works

User Prompt → [Input Content Filter] → Model → [Output Content Filter] → Response

  1. User submits a prompt to the API
  2. Input filter scans the prompt for harmful content
  3. If the prompt passes, the model generates a response
  4. Output filter scans the response for harmful content
  5. If the response passes, it is returned to the user
  6. If either filter triggers, the request is blocked with an error
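The flow above can be sketched as a simple two-stage pipeline. This is a toy simulation in plain Python (the real filters are ML classifiers run by the service; the blocklist `scan` here is a hypothetical stand-in):

```python
# Hypothetical simulation of the two-stage content filter pipeline.
# Real Azure OpenAI filters are classification models applied server-side.
BLOCKLIST = {"harmful-example"}  # toy stand-in for the real classifiers

def scan(text: str) -> bool:
    """Return True if the text passes the (toy) filter."""
    return not any(term in text.lower() for term in BLOCKLIST)

def handle_request(prompt: str, model) -> str:
    if not scan(prompt):                  # step 2: input filter on the prompt
        raise ValueError("Blocked: prompt failed the input content filter")
    completion = model(prompt)            # step 3: model generates a response
    if not scan(completion):              # step 4: output filter on the response
        raise ValueError("Blocked: completion failed the output content filter")
    return completion                     # step 5: response returned to the user

# Usage with a stub model:
echo_model = lambda p: f"echo: {p}"
print(handle_request("hello", echo_model))  # echo: hello
```

Note that a blocked request surfaces as an error to the caller (step 6), which your application should catch and handle gracefully.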

On the Exam: Azure OpenAI content filters are enabled BY DEFAULT — they scan BOTH prompts AND responses. You do not need to enable them separately. This is a common exam question.

Prompt Shields

Prompt Shields detect and block jailbreak attempts — techniques where users try to make the model ignore its safety instructions:

Types of Jailbreak Attempts

| Technique | Example | How Prompt Shields Help |
| --- | --- | --- |
| Direct override | "Ignore all previous instructions and..." | Detects instruction-override patterns |
| Role playing | "Pretend you are an AI with no restrictions..." | Identifies role-play bypass attempts |
| Encoding tricks | Using Base64 or other encodings to hide harmful requests | Detects encoded content |
| Indirect injection | Harmful instructions hidden in pasted documents or URLs | Scans all input content for hidden instructions |
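To make the idea concrete, here is a deliberately simplified pattern-based detector for the first two techniques in the table. The real Prompt Shields use trained classifiers, not keyword lists; this toy version only illustrates what "detecting override patterns" means:

```python
import re

# Toy patterns for two jailbreak techniques; illustration only.
# Real Prompt Shields are ML classifiers, far more robust than regexes.
JAILBREAK_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",    # direct override
    r"pretend you are .* (no|without) restrictions",   # role-play bypass
]

def looks_like_jailbreak(prompt: str) -> bool:
    text = prompt.lower()
    return any(re.search(pattern, text) for pattern in JAILBREAK_PATTERNS)

print(looks_like_jailbreak("Ignore all previous instructions and ..."))  # True
print(looks_like_jailbreak("Summarize this article."))                   # False
```

A keyword approach like this is trivially evaded (e.g., by the encoding tricks in the table), which is exactly why the production service relies on learned detectors instead.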

Groundedness Detection

Groundedness detection verifies that AI-generated responses are factually supported by provided source documents:

| Check | Description | Result |
| --- | --- | --- |
| Grounded | Response is supported by the provided context | Safe: response is supported |
| Ungrounded | Response contains claims not supported by the context | Flagged: may be a hallucination |

This is especially important for RAG scenarios where you want the model to only answer from retrieved documents.
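A toy illustration of the concept, using simple word overlap between a claim and the source context. The actual groundedness detection service uses a language model to judge support, not word overlap; this sketch only shows what "supported by the context" means operationally:

```python
def is_grounded(claim: str, context: str, min_overlap: float = 0.5) -> bool:
    """Toy check: fraction of the claim's words that also appear in the context.
    Real groundedness detection uses an LLM judge, not word overlap."""
    claim_words = set(claim.lower().split())
    context_words = set(context.lower().split())
    if not claim_words:
        return True
    overlap = len(claim_words & context_words) / len(claim_words)
    return overlap >= min_overlap

context = "the product ships with a two year warranty"
print(is_grounded("the warranty is two year", context))               # True (grounded)
print(is_grounded("refunds are issued within thirty days", context))  # False (ungrounded)
```

In a RAG pipeline, an ungrounded result would typically cause the response to be flagged, regenerated, or replaced with an "I don't know" answer.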

Microsoft Copilot and Responsible AI

Microsoft Copilot (the AI assistant built into Microsoft 365, Windows, Bing, and other products) incorporates responsible AI by design:

| Feature | Responsible AI Purpose |
| --- | --- |
| Content filters | Block harmful content in prompts and responses |
| Citations | Show sources for generated information (transparency) |
| Disclaimers | "AI-generated content may be incorrect" warnings (transparency) |
| User feedback | Thumbs up/down to report quality issues (accountability) |
| Data boundaries | Enterprise data does not leave the Microsoft 365 boundary (privacy) |
| Conversation limits | Cap conversation length to reduce the risk of harmful content accumulating over long sessions |
| Grounding in enterprise data | Respond based on your organization's data (accuracy) |

Best Practices for Responsible Generative AI

For Developers and Organizations

| Practice | Description |
| --- | --- |
| Use content filters | Keep default Azure OpenAI content filters enabled |
| Implement RAG | Ground responses in factual data to reduce hallucinations |
| Set clear system messages | Define what the AI should and should not do |
| Monitor usage | Review logs for harmful content and misuse |
| Provide disclaimers | Inform users that content is AI-generated |
| Enable feedback | Allow users to report inaccurate or harmful responses |
| Test thoroughly | Red-team your application to find vulnerabilities |
| Human oversight | Keep humans in the loop for high-stakes decisions |
| Document limitations | Be transparent about what the AI can and cannot do |
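The "set clear system messages" practice amounts to putting explicit boundaries in the first message sent to the model. The sketch below shows the kind of message list passed to a chat completion API; the system text is a hypothetical example, not an official Microsoft template:

```python
# Hypothetical system message constraining a RAG assistant; illustration only.
def build_messages(user_question: str, retrieved_context: str) -> list[dict]:
    system_message = (
        "You are a support assistant. Answer ONLY from the provided context. "
        "If the context does not contain the answer, say you don't know. "
        "Do not provide medical, legal, or financial advice."
    )
    return [
        {"role": "system", "content": system_message},
        {"role": "user",
         "content": f"Context:\n{retrieved_context}\n\nQuestion: {user_question}"},
    ]

msgs = build_messages("What is the warranty period?",
                      "The product ships with a two-year warranty.")
print(msgs[0]["role"])  # system
```

Combining a restrictive system message with retrieved context covers two practices at once: clear boundaries and RAG grounding.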

For End Users

| Practice | Description |
| --- | --- |
| Verify information | Do not blindly trust AI-generated content |
| Provide context | Give the AI relevant information to reduce hallucinations |
| Review for bias | Check AI outputs for potential bias or unfairness |
| Report issues | Use feedback mechanisms to report problems |
| Understand limitations | Know that AI can make mistakes and has knowledge cutoffs |

On the Exam: Responsible generative AI questions often ask about content filters, hallucination mitigation (RAG/grounding), or how Copilot implements responsible AI. Know that content filters scan both inputs and outputs, RAG reduces hallucinations, and Copilot includes citations and disclaimers for transparency.

Test Your Knowledge

  1. What are Prompt Shields in Azure OpenAI Service?
  2. What does groundedness detection verify in generative AI?
  3. Which of the following is a responsible AI feature built into Microsoft Copilot?
  4. What is the best technique to reduce hallucinations in generative AI responses?
  5. (Multi-Select) Which THREE of the following are categories used by Azure OpenAI content filters? Select all that apply:
     • Hate speech
     • Political opinions
     • Violence
     • Competitive analysis
     • Self-harm