3.4 Azure AI Document Intelligence

Key Takeaways

  • Azure AI Document Intelligence (formerly Form Recognizer) extracts structured data from documents — text, tables, key-value pairs, and layout information.
  • Pre-built models handle common document types: invoices, receipts, ID documents, W-2 tax forms, and business cards without training.
  • Custom models can be trained to extract data from your organization's specific document formats (purchase orders, contracts, custom forms).
  • The Layout API extracts text, tables, selection marks, and document structure (paragraphs, sections, headers) from any document.
  • Document Intelligence goes beyond OCR by understanding document structure: it knows which field a value belongs to, not just what text is present.
Last updated: March 2026

Quick Answer: Azure AI Document Intelligence extracts structured data from documents including text, tables, key-value pairs, and layout information. It provides pre-built models for invoices, receipts, ID documents, and tax forms. Custom models can be trained for your specific document types. It goes beyond basic OCR by understanding document structure.

What Is Document Intelligence?

Azure AI Document Intelligence (formerly Azure Form Recognizer) is a service that uses machine learning to extract text, key-value pairs, tables, and structure from documents. Unlike basic OCR, which only reads text, Document Intelligence understands how a document is organized, so it can tell which value belongs to which field.

OCR vs. Document Intelligence

Capability | Basic OCR (Azure AI Vision) | Document Intelligence
Extract text from image | Yes | Yes
Identify text position | Yes | Yes
Extract key-value pairs | No | Yes ("Invoice Number: 12345")
Extract tables | No | Yes (rows, columns, cells)
Understand document structure | No | Yes (headers, paragraphs, sections)
Pre-built document models | No | Yes (invoices, receipts, IDs)
Custom document models | No | Yes (train on your documents)
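
To make the key-value row concrete, the sketch below contrasts the two output shapes. The sample data is hand-written for illustration, not real service output. The pair layout (a `key` and a `value` object, each with `content`, plus a `confidence`) follows the shape the analyze result uses for key-value pairs as far as I know, so check the field names against the API version you target. `pairs_to_dict` is a hypothetical helper.

```python
# Illustrative only: hand-written sample data, not real service output.

# Basic OCR returns text lines with no relationship between them.
ocr_lines = ["Invoice Number: 12345", "Date: 2026-03-01", "Total: $410.00"]

# Document Intelligence pairs each key with its value and scores the pairing.
key_value_pairs = [
    {"key": {"content": "Invoice Number:"}, "value": {"content": "12345"}, "confidence": 0.98},
    {"key": {"content": "Date:"}, "value": {"content": "2026-03-01"}, "confidence": 0.95},
    {"key": {"content": "Total:"}, "value": {"content": "$410.00"}, "confidence": 0.32},
]


def pairs_to_dict(pairs, min_confidence=0.5):
    """Hypothetical helper: turn key-value pairs into a plain dict,
    skipping pairs with no value or with low confidence."""
    result = {}
    for pair in pairs:
        value = pair.get("value")
        if value is None or pair.get("confidence", 0.0) < min_confidence:
            continue
        # Drop the trailing colon that usually appears in the printed label.
        key = pair["key"]["content"].rstrip(": ")
        result[key] = value["content"]
    return result


fields = pairs_to_dict(key_value_pairs)
print(fields)
```

With OCR alone you would have to parse `ocr_lines` yourself to work out that "12345" is the invoice number. With key-value pairs, the association is already part of the result.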

Pre-Built Models

Document Intelligence provides pre-trained models for common document types:

Pre-Built Model | What It Extracts | Example Fields
Invoice | Invoice data | Invoice number, date, total, line items, vendor
Receipt | Receipt data | Merchant name, date, total, items, tax
ID Document | Identity card data | Name, date of birth, address, document number
W-2 | Tax form data | Employee info, wages, tax withholdings
Business Card | Contact info | Name, company, phone, email, address
Health Insurance Card | Insurance data | Insurer, member ID, group number, plan

How Pre-Built Models Work

  1. Submit a document (image or PDF) to the pre-built model endpoint
  2. The model analyzes the document structure and content
  3. Receive extracted fields with confidence scores

No training is needed — the models are pre-trained by Microsoft on millions of documents.
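
The three steps above can be sketched with the standard library alone. This is a minimal sketch, not a full client. It assumes the v4.0 REST route (`/documentintelligence/documentModels/{modelId}:analyze`) and API version `2024-11-30`; the endpoint, key, and sample response are placeholders invented for illustration. `build_analyze_request` and `confident_fields` are hypothetical helper names, and the code only describes the request without sending it.

```python
import json
from urllib.parse import quote

API_VERSION = "2024-11-30"  # assumption: v4.0 GA; adjust to the version you use


def build_analyze_request(endpoint, model_id, document_url, api_key):
    """Step 1: describe the POST that submits a document to a model."""
    url = (
        f"{endpoint.rstrip('/')}/documentintelligence/documentModels/"
        f"{quote(model_id)}:analyze?api-version={API_VERSION}"
    )
    headers = {
        "Ocp-Apim-Subscription-Key": api_key,
        "Content-Type": "application/json",
    }
    body = json.dumps({"urlSource": document_url})
    return url, headers, body


# Step 2 happens on the service side. Analysis is asynchronous: the POST is
# accepted and the caller polls the URL from the Operation-Location response
# header until the status is "succeeded".


def confident_fields(result, threshold=0.8):
    """Step 3: keep extracted fields whose confidence meets the threshold."""
    kept = {}
    for document in result.get("analyzeResult", {}).get("documents", []):
        for name, field in document.get("fields", {}).items():
            if field.get("confidence", 0.0) >= threshold:
                kept[name] = field.get("content")
    return kept


# Hand-written sample shaped like a finished invoice analysis (illustrative).
sample_result = {
    "status": "succeeded",
    "analyzeResult": {
        "documents": [
            {
                "docType": "invoice",
                "fields": {
                    "InvoiceId": {"type": "string", "content": "12345", "confidence": 0.97},
                    "InvoiceTotal": {"type": "currency", "content": "$410.00", "confidence": 0.93},
                    "VendorName": {"type": "string", "content": "Contoso", "confidence": 0.41},
                },
            }
        ]
    },
}

url, headers, body = build_analyze_request(
    "https://example.cognitiveservices.azure.com/",  # placeholder endpoint
    "prebuilt-invoice",
    "https://example.com/invoice.pdf",  # placeholder document URL
    "<your-key>",
)
print(url)
print(confident_fields(sample_result))
```

Filtering on confidence is a common design choice: fields below the threshold can be routed to human review instead of being trusted automatically.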

Custom Models

When pre-built models don't cover your document types, you can train custom models:

Custom Template Models

  • Trained on a specific document template/layout
  • Works best when documents have a consistent format
  • Requires 5+ sample documents per template
  • Ideal for: standardized forms, purchase orders, specific contracts

Custom Neural Models

  • Uses deep learning for more flexible extraction
  • Handles documents with varying layouts
  • Requires more training data (10+ documents)
  • Ideal for: documents with varied formats, mixed layouts
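
The choice between the two custom model types can be written as a small decision rule. The sketch below uses the sample-count thresholds stated in this section. `choose_build_mode` is a hypothetical helper, and the returned strings are meant to correspond to the `template` and `neural` build modes, which you should confirm against the API version you use.

```python
MIN_TEMPLATE_SAMPLES = 5   # from this section: 5+ sample documents per template
MIN_NEURAL_SAMPLES = 10    # from this section: 10+ documents


def choose_build_mode(consistent_layout: bool, sample_count: int) -> str:
    """Hypothetical helper: pick a custom model type from the layout
    consistency of the documents and the number of labeled samples."""
    if consistent_layout:
        mode, minimum = "template", MIN_TEMPLATE_SAMPLES
    else:
        mode, minimum = "neural", MIN_NEURAL_SAMPLES
    if sample_count < minimum:
        raise ValueError(
            f"{mode} model needs at least {minimum} samples, got {sample_count}"
        )
    return mode


# A standardized purchase order form with 6 labeled samples.
print(choose_build_mode(consistent_layout=True, sample_count=6))
# Contracts from many vendors with varying layouts and 25 labeled samples.
print(choose_build_mode(consistent_layout=False, sample_count=25))
```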

The Layout API

The Layout API extracts document structure without requiring a specific document model:

Extracted Element | Description
Text | All text content with positions
Tables | Table structure with rows, columns, and cells
Selection marks | Checkboxes and radio buttons (selected/unselected)
Paragraphs | Text grouped into paragraphs
Sections | Document sections and subsections
Headers/Footers | Page headers and footers
Page numbers | Page number identification
Barcodes | 1D and 2D barcodes (QR codes, Code 128, etc.)
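
As an illustration of working with two of these elements, the sketch below rebuilds a table as a list of rows and counts checked selection marks. The sample is hand-written, not real output. It assumes the layout result lists table cells with `rowIndex`, `columnIndex`, and `content`, and lists selection marks per page with a `state` of `selected` or `unselected`. `table_to_grid` and `count_selected` are hypothetical helpers.

```python
def table_to_grid(table):
    """Place each cell's text at its row and column position."""
    grid = [[""] * table["columnCount"] for _ in range(table["rowCount"])]
    for cell in table["cells"]:
        grid[cell["rowIndex"]][cell["columnIndex"]] = cell["content"]
    return grid


def count_selected(layout):
    """Count selection marks (checkboxes, radio buttons) that are checked."""
    return sum(
        1
        for page in layout.get("pages", [])
        for mark in page.get("selectionMarks", [])
        if mark.get("state") == "selected"
    )


# Hand-written sample shaped like a layout result (illustrative only).
sample_layout = {
    "pages": [
        {
            "pageNumber": 1,
            "selectionMarks": [
                {"state": "selected", "confidence": 0.95},
                {"state": "unselected", "confidence": 0.90},
            ],
        }
    ],
    "tables": [
        {
            "rowCount": 2,
            "columnCount": 2,
            "cells": [
                {"rowIndex": 0, "columnIndex": 0, "content": "Item"},
                {"rowIndex": 0, "columnIndex": 1, "content": "Price"},
                {"rowIndex": 1, "columnIndex": 0, "content": "Widget"},
                {"rowIndex": 1, "columnIndex": 1, "content": "$4.10"},
            ],
        }
    ],
}

for row in table_to_grid(sample_layout["tables"][0]):
    print(row)
print(count_selected(sample_layout))
```

Cells that span several rows or columns would need extra handling. This sketch writes each cell only to its top-left position.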

On the Exam: The Layout API is the most general Document Intelligence capability — it works on any document without training. Pre-built models are for specific document types. Custom models are for your organization's unique document formats. Know which to recommend for each scenario.
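
The scenario-matching rule above can be sketched as a lookup. `recommend_model` is a hypothetical helper, and the mapping covers only a subset of pre-built models. The model IDs are given as I understand them; verify them against the model list for your API version.

```python
# Assumption: model IDs as published for recent API versions; verify before use.
PREBUILT_MODEL_IDS = {
    "invoice": "prebuilt-invoice",
    "receipt": "prebuilt-receipt",
    "id document": "prebuilt-idDocument",
    "w-2": "prebuilt-tax.us.w2",
    "health insurance card": "prebuilt-healthInsuranceCard.us",
}


def recommend_model(document_type=None, need_named_fields=True):
    """Hypothetical helper mirroring the exam guidance:
    layout for structure only, pre-built for common types, custom otherwise."""
    if not need_named_fields:
        # Only text, tables, and structure are needed: no training required.
        return "prebuilt-layout"
    key = (document_type or "").strip().lower()
    if key in PREBUILT_MODEL_IDS:
        return PREBUILT_MODEL_IDS[key]
    # No pre-built model covers this type: train a custom model.
    return "custom"


print(recommend_model("Invoice"))
print(recommend_model("purchase order"))
print(recommend_model(need_named_fields=False))
```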

Test Your Knowledge

How does Azure AI Document Intelligence differ from basic OCR?

A company processes thousands of invoices from different vendors. They want to automatically extract invoice numbers, dates, totals, and line items. Which Document Intelligence approach should they use?

When should you train a custom Document Intelligence model instead of using a pre-built model?
