5.2 AI Enrichment with Skillsets
Key Takeaways
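- Skillsets are arrays of skills that define the AI enrichment pipeline. Each skill takes inputs, processes them, and produces outputs; the indexer runs skills in an order determined by their input/output dependencies, not by their position in the array.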
- Built-in skills include OCR, image analysis, entity recognition, key phrase extraction, sentiment analysis, language detection, text translation, and PII detection.
- Custom skills (Web API skill) allow calling external HTTP endpoints for custom processing logic not covered by built-in skills.
- The enrichment tree is a hierarchical document structure where each skill reads from and writes to specific paths (e.g., /document/content, /document/organizations).
- Knowledge stores persist enrichment outputs to Azure Table Storage, Blob Storage, or both for analytics beyond the search index.
Quick Answer: Skillsets define AI enrichment pipelines using built-in skills (OCR, entity recognition, key phrases, sentiment) and custom skills (Web API calls). Skills read from and write to the enrichment tree, a hierarchical document structure. Knowledge stores persist enriched data outside the search index.
Skillset Structure
{
  "name": "my-ai-skillset",
  "description": "Extract entities, key phrases, and perform OCR",
  "skills": [
    {
      "@odata.type": "#Microsoft.Skills.Vision.OcrSkill",
      "name": "ocr-skill",
      "context": "/document/normalized_images/*",
      "inputs": [
        {"name": "image", "source": "/document/normalized_images/*"}
      ],
      "outputs": [
        {"name": "text", "targetName": "ocrText"}
      ]
    },
    {
      "@odata.type": "#Microsoft.Skills.Text.V3.EntityRecognitionSkill",
      "name": "entity-recognition",
      "context": "/document",
      "categories": ["Organization", "Person", "Location"],
      "inputs": [
        {"name": "text", "source": "/document/content"}
      ],
      "outputs": [
        {"name": "organizations", "targetName": "organizations"},
        {"name": "persons", "targetName": "people"},
        {"name": "locations", "targetName": "locations"}
      ]
    },
    {
      "@odata.type": "#Microsoft.Skills.Text.KeyPhraseExtractionSkill",
      "name": "key-phrase-extraction",
      "context": "/document",
      "inputs": [
        {"name": "text", "source": "/document/content"}
      ],
      "outputs": [
        {"name": "keyPhrases", "targetName": "keyPhrases"}
      ]
    }
  ],
  "knowledgeStore": {
    "storageConnectionString": "<storage-connection-string>",
    "projections": [
      {
        "tables": [
          {
            "tableName": "documentsTable",
            "generatedKeyName": "docId",
            "source": "/document"
          }
        ]
      }
    ]
  }
}
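Each output lands in the enrichment tree at the skill's context plus the output's targetName, and every input source must point at a path that the indexer or some skill produces. The Python sketch below checks a skillset definition for that. The helper names (`output_paths`, `unresolved_inputs`) and the `ROOT_PATHS` set are assumptions made for illustration, not part of any Azure SDK, and the check is simpler than the validation the service itself performs.

```python
from typing import Dict, List

# Paths assumed to exist before any skill runs (an assumption for this sketch;
# normalized_images only exists when the indexer extracts images).
ROOT_PATHS = {"/document/content", "/document/normalized_images/*"}


def output_paths(skill: Dict) -> List[str]:
    """Return the enrichment-tree paths a skill writes: context + '/' + targetName."""
    context = skill.get("context", "/document")
    return [
        f"{context}/{out.get('targetName', out['name'])}"
        for out in skill.get("outputs", [])
    ]


def unresolved_inputs(skillset: Dict) -> List[str]:
    """List input sources that neither the root paths nor any skill output provide."""
    available = set(ROOT_PATHS)
    for skill in skillset["skills"]:
        available.update(output_paths(skill))
    missing = []
    for skill in skillset["skills"]:
        for inp in skill.get("inputs", []):
            if inp["source"] not in available:
                missing.append(f"{skill['name']}: {inp['source']}")
    return missing


skillset = {
    "skills": [
        {
            "name": "ocr-skill",
            "context": "/document/normalized_images/*",
            "inputs": [{"name": "image", "source": "/document/normalized_images/*"}],
            "outputs": [{"name": "text", "targetName": "ocrText"}],
        },
        {
            "name": "entity-recognition",
            "context": "/document",
            "inputs": [{"name": "text", "source": "/document/content"}],
            "outputs": [{"name": "persons", "targetName": "people"}],
        },
    ]
}

print(output_paths(skillset["skills"][0]))
print(unresolved_inputs(skillset))
```

A misspelled source such as /document/contents would show up in the second list, which is the kind of path error exam questions tend to ask you to spot.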
Built-in Skills
Text Skills
| Skill | @odata.type | Input | Output |
|---|---|---|---|
| Entity Recognition | #Microsoft.Skills.Text.V3.EntityRecognitionSkill | text | organizations, persons, locations, etc. |
| Key Phrase Extraction | #Microsoft.Skills.Text.KeyPhraseExtractionSkill | text | keyPhrases |
| Language Detection | #Microsoft.Skills.Text.LanguageDetectionSkill | text | languageCode, languageName |
| Sentiment Analysis | #Microsoft.Skills.Text.V3.SentimentSkill | text | sentiment, confidenceScores |
| PII Detection | #Microsoft.Skills.Text.PIIDetectionSkill | text | piiEntities, maskedText |
| Text Merge | #Microsoft.Skills.Text.MergeSkill | text, itemsToInsert, offsets | mergedText |
| Text Split | #Microsoft.Skills.Text.SplitSkill | text | textItems (chunks) |
| Translation | #Microsoft.Skills.Text.TranslationSkill | text | translatedText |
Vision Skills
| Skill | @odata.type | Input | Output |
|---|---|---|---|
| OCR | #Microsoft.Skills.Vision.OcrSkill | image | text |
| Image Analysis | #Microsoft.Skills.Vision.ImageAnalysisSkill | image | tags, description, categories |
Utility Skills
| Skill | @odata.type | Purpose |
|---|---|---|
| Shaper | #Microsoft.Skills.Util.ShaperSkill | Reshape enrichment tree for knowledge store projections |
| Conditional | #Microsoft.Skills.Util.ConditionalSkill | If-then-else logic in the pipeline |
The Enrichment Tree
The enrichment tree is a hierarchical structure representing the document and its enrichments:
/document
├── content (original text content)
├── metadata_storage_name (file name)
├── metadata_storage_path (file path)
├── normalized_images/ (extracted images)
│ ├── [0]
│ │ ├── ocrText (OCR output)
│ │ └── imageTags (image analysis output)
│ └── [1]
│ ├── ocrText
│ └── imageTags
├── organizations (NER output)
├── people (NER output)
├── locations (NER output)
├── keyPhrases (key phrase output)
└── language (language detection output)
On the Exam: Understanding the enrichment tree path syntax is critical. Skills reference paths like /document/content (text content), /document/normalized_images/* (each image), and /document/organizations (enrichment output). Questions may ask you to fix a skillset by correcting the input/output paths.
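To make the path syntax concrete, here is a minimal Python sketch that resolves a path against an in-memory stand-in for the enrichment tree, with * fanning out over every item of a collection. The `resolve` function and the nested-dictionary model are assumptions for illustration only; they are not how the service stores the tree internally.

```python
from typing import Any, List


def resolve(tree: dict, path: str) -> List[Any]:
    """Return every value that `path` matches; '*' fans out over list items."""
    segments = [s for s in path.split("/") if s]
    nodes: List[Any] = [tree]
    for segment in segments:
        next_nodes: List[Any] = []
        for node in nodes:
            if segment == "*" and isinstance(node, list):
                next_nodes.extend(node)  # one match per collection item
            elif isinstance(node, dict) and segment in node:
                next_nodes.append(node[segment])
        nodes = next_nodes
    return nodes


tree = {
    "document": {
        "content": "Contoso opened an office in Seattle.",
        "normalized_images": [
            {"ocrText": "Invoice 42"},
            {"ocrText": "Total due"},
        ],
        "organizations": ["Contoso"],
    }
}

print(resolve(tree, "/document/content"))
print(resolve(tree, "/document/normalized_images/*/ocrText"))
print(resolve(tree, "/document/missing"))
```

The second call returns one value per image, which mirrors why a skill with context /document/normalized_images/* runs once for each image.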
Custom Skills (Web API Skill)
Custom skills call external HTTP endpoints for processing not covered by built-in skills:
{
  "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
  "name": "custom-classification",
  "description": "Call a custom classification API",
  "uri": "https://my-function-app.azurewebsites.net/api/classify",
  "httpMethod": "POST",
  "timeout": "PT30S",
  "batchSize": 10,
  "context": "/document",
  "inputs": [
    {"name": "text", "source": "/document/content"}
  ],
  "outputs": [
    {"name": "category", "targetName": "documentCategory"}
  ]
}
Custom Skill Request/Response Contract
The custom endpoint must accept and return data in a specific format:
Request:
{
  "values": [
    {
      "recordId": "1",
      "data": {
        "text": "Document content to classify..."
      }
    }
  ]
}
Response:
{
  "values": [
    {
      "recordId": "1",
      "data": {
        "category": "Legal"
      },
      "errors": [],
      "warnings": []
    }
  ]
}
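The contract above can be sketched as a plain Python function that turns a request body into a response body, returning one result per input record with the same recordId. The `classify` function is a hypothetical keyword-based stand-in for a real model, and the HTTP hosting layer (for example an Azure Function) is left out; only the request and response shapes come from the contract.

```python
import json
from typing import Any, Dict


def classify(text: str) -> str:
    """Hypothetical stand-in classifier: keyword match instead of a real model."""
    return "Legal" if "contract" in text.lower() else "General"


def handle_request(body: Dict[str, Any]) -> Dict[str, Any]:
    """Map a custom-skill request body to a response body, record by record."""
    results = []
    for record in body.get("values", []):
        # Every result echoes the recordId so it can be matched to its input.
        result: Dict[str, Any] = {
            "recordId": record["recordId"],
            "data": {},
            "errors": [],
            "warnings": [],
        }
        text = record.get("data", {}).get("text")
        if isinstance(text, str):
            result["data"]["category"] = classify(text)
        else:
            result["errors"].append({"message": "Missing or non-string 'text' input."})
        results.append(result)
    return {"values": results}


request_body = {
    "values": [
        {"recordId": "1", "data": {"text": "This contract is binding."}},
        {"recordId": "2", "data": {}},
    ]
}
print(json.dumps(handle_request(request_body), indent=2))
```

Reporting a failure per record, rather than failing the whole batch, lets the other records in the same batch still be enriched.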
Knowledge Stores
Knowledge stores persist enriched data outside the search index for analytics and downstream processing.
Projection Types
| Projection | Storage | Best For |
|---|---|---|
| Table projections | Azure Table Storage | Structured data for Power BI analytics |
| Object projections | Azure Blob Storage | Enriched JSON documents |
| File projections | Azure Blob Storage | Normalized images from documents |
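For orientation, the Python sketch below assembles a knowledgeStore definition that uses all three projection types in one projection group. The function `build_knowledge_store` and the container names are hypothetical, and the table entry reuses the values from the sample skillset; in practice the source of a table or object projection is often a Shaper skill output rather than /document itself.

```python
import json
from typing import Dict


def build_knowledge_store(connection_string: str) -> Dict:
    """Assemble a knowledgeStore definition with table, object, and file projections."""
    return {
        "storageConnectionString": connection_string,
        "projections": [
            {
                # Table projection: rows in Azure Table Storage.
                "tables": [
                    {
                        "tableName": "documentsTable",
                        "generatedKeyName": "docId",
                        "source": "/document",
                    }
                ],
                # Object projection: JSON blobs (container name is hypothetical).
                "objects": [
                    {"storageContainer": "enriched-docs", "source": "/document"}
                ],
                # File projection: normalized images (container name is hypothetical).
                "files": [
                    {
                        "storageContainer": "doc-images",
                        "source": "/document/normalized_images/*",
                    }
                ],
            }
        ],
    }


knowledge_store = build_knowledge_store("<storage-connection-string>")
print(json.dumps(knowledge_store, indent=2))
```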
Use Cases
- Power BI dashboards: Connect to table projections for business analytics
- Data science: Use object projections as training data for ML models
- Document archives: Store enriched documents with extracted metadata
What is the enrichment tree in Azure AI Search?
When should you use a Custom Web API Skill instead of a built-in skill?
Which knowledge store projection type should you use for Power BI dashboards?