5.3 Azure AI Document Intelligence

Key Takeaways

  • Azure AI Document Intelligence (formerly Form Recognizer) extracts structured data from documents including text, key-value pairs, tables, and specific fields.
  • Prebuilt models handle common document types: invoices, receipts, business cards, ID documents, W-2 forms, health insurance cards, and more.
  • Custom models can be trained for domain-specific documents — either template-based (fixed layout) or neural (varying layouts).
  • Composed models combine multiple custom models into a single endpoint that automatically classifies incoming documents and routes them to the appropriate model.
  • The Layout model extracts text, tables, selection marks, and document structure (headings, paragraphs, sections) without any training.

Last updated: March 2026

Quick Answer: Document Intelligence (formerly Form Recognizer) extracts structured data from documents. Use prebuilt models for common documents (invoices, receipts, IDs), custom models for domain-specific documents, and composed models to route different document types automatically. The Layout model extracts text, tables, and structure without training.

Document Intelligence Models

Prebuilt Models

| Model | Document Type | Extracted Fields |
|---|---|---|
| Invoice | Invoices | Vendor, customer, amounts, line items, tax, total |
| Receipt | Receipts | Merchant, date, items, subtotal, tax, total, tip |
| ID Document | IDs, passports, driver licenses | Name, DOB, address, document number, expiration |
| Business Card | Business cards | Name, title, company, phone, email, address |
| W-2 | US tax form W-2 | Employee info, employer info, wages, taxes |
| Health Insurance | Health insurance cards | Insurer, member ID, group number, plan |
| Contract | Contracts | Parties, dates, terms |
| US Tax Forms | 1040, 1098, 1099 variants | All relevant tax fields |
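
Each prebuilt model is selected by passing its model ID string to the analyze call. The lookup below is a minimal sketch of that mapping, not part of the SDK: the dictionary keys and the helper name `prebuilt_model_id` are hypothetical, and the ID strings should be checked against the model list for the API version you target, since availability varies by version.

```python
# Hypothetical lookup from a document kind to a prebuilt model ID.
# Verify each ID against the model list for your API version.
PREBUILT_MODEL_IDS = {
    "invoice": "prebuilt-invoice",
    "receipt": "prebuilt-receipt",
    "id_document": "prebuilt-idDocument",
    "w2": "prebuilt-tax.us.w2",
    "health_insurance": "prebuilt-healthInsuranceCard.us",
    "contract": "prebuilt-contract",
}


def prebuilt_model_id(kind: str, fallback: str = "prebuilt-layout") -> str:
    """Return the model ID for a known kind, else a general-purpose fallback."""
    return PREBUILT_MODEL_IDS.get(kind.lower(), fallback)


print(prebuilt_model_id("Invoice"))
print(prebuilt_model_id("purchase_order"))
```

Falling back to the Layout model is only one possible design choice; it still returns text and tables when no prebuilt model fits the document.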

General Models

ModelPurposeTraining Required
ReadExtract text and language from documentsNo
LayoutExtract text, tables, selection marks, structureNo
General DocumentExtract key-value pairs from any documentNo

Using Prebuilt Models

from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.core.credentials import AzureKeyCredential

client = DocumentIntelligenceClient(
    endpoint="https://my-doc-intel.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>")
)

# Analyze an invoice
with open("invoice.pdf", "rb") as f:
    poller = client.begin_analyze_document(
        model_id="prebuilt-invoice",
        body=f
    )
result = poller.result()

for document in result.documents:
    vendor = document.fields.get("VendorName")
    if vendor:
        print(f"Vendor: {vendor.content} "
              f"(confidence: {vendor.confidence:.2f})")

    invoice_total = document.fields.get("InvoiceTotal")
    if invoice_total:
        print(f"Total: {invoice_total.content} "
              f"(confidence: {invoice_total.confidence:.2f})")

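    # Access line items: "Items" is an array field whose elements are object fields.
    # In azure-ai-documentintelligence these are read through value_array and
    # value_object (there is no generic .value attribute on a field).
    items = document.fields.get("Items")
    if items and items.value_array:
        for item in items.value_array:
            item_fields = item.value_object or {}
            description = item_fields.get("Description")
            amount = item_fields.get("Amount")
            # Either sub-field can be missing on a given line item
            description_text = description.content if description else "n/a"
            amount_text = amount.content if amount else "n/a"
            print(f"  Item: {description_text} = {amount_text}")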
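
Because every extracted field carries a confidence score, a common follow-up is to accept high-confidence values and send the rest to a human reviewer. The sketch below illustrates that idea with a stand-in `Field` class rather than the SDK's own field type; the 0.80 threshold, the helper name `split_by_confidence`, and the sample values are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Field:
    """Stand-in for an extracted field: only content and confidence are modeled."""
    content: str
    confidence: Optional[float]


def split_by_confidence(fields, threshold=0.80):
    """Partition a name -> Field mapping into (accepted, needs_review) dicts."""
    accepted, needs_review = {}, {}
    for name, field in fields.items():
        # Treat a missing confidence as 0.0 so the value goes to review
        score = field.confidence or 0.0
        target = accepted if score >= threshold else needs_review
        target[name] = field.content
    return accepted, needs_review


# Made-up sample values, not real service output
sample = {
    "VendorName": Field("Contoso Ltd.", 0.97),
    "InvoiceTotal": Field("$1,250.00", 0.62),
}
accepted, needs_review = split_by_confidence(sample)
print("Accepted:", accepted)
print("Needs review:", needs_review)
```

The right threshold depends on the document quality and on how costly a wrong value is, so it is worth tuning against a labeled sample.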

Layout Model

The Layout model extracts document structure without any training:

# Extract layout (text, tables, structure)
with open("document.pdf", "rb") as f:
    poller = client.begin_analyze_document(
        model_id="prebuilt-layout",
        body=f
    )
result = poller.result()

# Extract text by page
for page in result.pages:
    print(f"Page {page.page_number}:")
    for line in page.lines or []:  # lines can be absent on a page
        print(f"  Line: {line.content}")

# Extract tables (result.tables is None when no tables are found)
for table in result.tables or []:
    print(f"Table ({table.row_count} rows x {table.column_count} columns):")
    for cell in table.cells:
        print(f"  [{cell.row_index},{cell.column_index}]: {cell.content}")

# Extract selection marks (checkboxes); absent on pages without any
for page in result.pages:
    for mark in page.selection_marks or []:
        print(f"  Checkbox at ({mark.polygon}): {mark.state}")
        # state: "selected" or "unselected"
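
Table cells come back as a flat list, each with its own row and column index, so downstream code often rebuilds a two-dimensional grid first. The following is a small sketch of that step using stand-in objects in place of real cells; `cells_to_grid` is a hypothetical helper, the sample cells are made up, and merged cells (row or column spans) are deliberately not handled.

```python
from types import SimpleNamespace


def cells_to_grid(row_count, column_count, cells):
    """Place each cell's content at [row_index][column_index]; gaps stay empty."""
    grid = [["" for _ in range(column_count)] for _ in range(row_count)]
    for cell in cells:
        grid[cell.row_index][cell.column_index] = cell.content
    return grid


# Made-up cells standing in for table.cells
cells = [
    SimpleNamespace(row_index=0, column_index=0, content="Item"),
    SimpleNamespace(row_index=0, column_index=1, content="Qty"),
    SimpleNamespace(row_index=1, column_index=0, content="Widget"),
    SimpleNamespace(row_index=1, column_index=1, content="3"),
]
grid = cells_to_grid(2, 2, cells)
for row in grid:
    print(" | ".join(row))
```

With a real result, the call would be `cells_to_grid(table.row_count, table.column_count, table.cells)`.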

Custom Models

Template Models (Fixed Layout)

  • Train on documents with a consistent layout
  • Best for: Forms, applications, structured questionnaires
  • Minimum: 5 training documents

Neural Models (Varying Layout)

  • Handle documents with varying layouts and formats
  • Best for: Contracts, letters, documents with unpredictable structures
  • Minimum: 5 training documents (recommended: 15+)
  • Better generalization than template models

Training a Custom Model

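from azure.ai.formrecognizer import DocumentModelAdministrationClient

# Training is an administration operation, so it needs an administration client.
# This keyword-style call matches the azure-ai-formrecognizer package.
admin_client = DocumentModelAdministrationClient(
    endpoint="https://my-doc-intel.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>")
)

# Start custom model training
poller = admin_client.begin_build_document_model(
    build_mode="template",  # or "neural"
    blob_container_url="https://storage.blob.core.windows.net/training-data?<SAS>",
    description="Custom purchase order model"
)

model = poller.result()
print(f"Model ID: {model.model_id}")
for doc_type_name, doc_type in model.doc_types.items():
    print(f"Doc type {doc_type_name} fields: {list(doc_type.field_schema)}")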

Composed Models

Composed models combine multiple custom models into a single endpoint:

# Create a composed model from existing models.
# admin_client is a DocumentModelAdministrationClient (azure-ai-formrecognizer),
# not the DocumentIntelligenceClient used for analysis.
poller = admin_client.begin_compose_document_model(
    component_model_ids=["invoice-model", "receipt-model", "po-model"],
    description="Unified document processing model"
)

composed_model = poller.result()
print(f"Composed model ID: {composed_model.model_id}")

# When you analyze a document with the composed model,
# it automatically classifies and routes to the correct component model

How Composed Models Work

  1. A document is submitted to the composed model endpoint
  2. The composed model classifies the document type
  3. The appropriate component model processes the document
  4. Results include the document type and extracted fields
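
The four steps above can be pictured as a classify-then-dispatch loop. The toy sketch below mimics that flow locally with plain functions; it is not how the service is implemented, and the keyword classifier, the extractor functions, and the sample text are all hypothetical.

```python
def classify(text: str) -> str:
    """Toy classifier: pick a document type from a keyword (step 2)."""
    lowered = text.lower()
    if "invoice" in lowered:
        return "invoice"
    if "receipt" in lowered:
        return "receipt"
    return "unknown"


def extract_invoice(text: str) -> dict:
    """Toy component model: treat the first line as the vendor name."""
    return {"VendorName": text.splitlines()[0]}


def extract_receipt(text: str) -> dict:
    """Toy component model: treat the first line as the merchant name."""
    return {"MerchantName": text.splitlines()[0]}


# Hypothetical component models keyed by document type
COMPONENTS = {"invoice": extract_invoice, "receipt": extract_receipt}


def analyze_composed(text: str) -> dict:
    """Submit (step 1), classify (2), dispatch (3), return type and fields (4)."""
    doc_type = classify(text)
    component = COMPONENTS.get(doc_type)
    fields = component(text) if component else {}
    return {"doc_type": doc_type, "fields": fields}


result = analyze_composed("Contoso Ltd.\nInvoice 1001")
print(result)
```

The caller makes one call and receives both the detected type and the fields, which mirrors the single-endpoint behavior described in the steps.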

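On the Exam: Composed models are the answer when a scenario describes processing multiple document types through a single endpoint. The composed model handles routing at analysis time, so the caller does not make a separate pre-classification request. Note that in the v4.0 API (2024-11-30) you supply a trained classifier when composing, whereas earlier API versions classify implicitly.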

Document Classification

Document classification models categorize documents without extracting fields:

| Feature | Description |
|---|---|
| Purpose | Sort documents into categories before processing |
| Training | Provide labeled document examples per category |
| Output | Document type classification with confidence score |
| Use case | Mail sorting, document routing, triage |

# Classify a document
with open("document.pdf", "rb") as f:
    poller = client.begin_classify_document(
        classifier_id="my-document-classifier",
        body=f
    )
result = poller.result()

for document in result.documents or []:
    print(f"Type: {document.doc_type}")
    print(f"Confidence: {document.confidence:.2f}")
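
For the routing and triage use case, the classifier's document type and confidence can drive a simple dispatch rule. The sketch below assumes hypothetical queue names, a made-up 0.70 cutoff, and stand-in objects in place of real classification results; it only illustrates the decision logic.

```python
from types import SimpleNamespace

# Hypothetical mapping from classified document type to a processing queue
ROUTES = {"invoice": "accounts-payable", "contract": "legal"}


def route(document, min_confidence=0.70, default_queue="manual-triage"):
    """Pick a queue from doc_type, falling back when confidence is low or the type is unmapped."""
    confidence = document.confidence or 0.0
    if confidence < min_confidence:
        return default_queue
    return ROUTES.get(document.doc_type, default_queue)


# Stand-ins for classified documents
print(route(SimpleNamespace(doc_type="invoice", confidence=0.93)))
print(route(SimpleNamespace(doc_type="invoice", confidence=0.41)))
print(route(SimpleNamespace(doc_type="memo", confidence=0.99)))
```

Sending low-confidence and unmapped documents to the same fallback queue keeps the rule small; a real pipeline might separate the two cases.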

Test Your Knowledge

  1. What was Azure AI Document Intelligence previously called?
  2. Which model should you use to extract vendor name, invoice total, and line items from an invoice?
  3. What is the purpose of a composed model in Document Intelligence?
  4. What is the key difference between template and neural custom models?