10.2 Document Intelligence and Compliance Review Lab

Key Takeaways

  • Document intelligence projects should separate capture, extraction, classification, summarization, human review, retention, and audit evidence.
  • Amazon Textract, Amazon Comprehend, Amazon Bedrock, Amazon A2I, S3, KMS, Macie, CloudTrail, and Audit Manager can each play a different role in a compliant workflow.
  • Generated summaries should assist reviewers, not replace accountable compliance or legal decisions in high-impact document workflows.
  • Failure modes include poor scan quality, wrong document type classification, missing tables, leaked PII, unsupported conclusions, and unclear reviewer authority.
  • A strong lab requires a decision log, exception queue, sample evidence package, and review prompts for privacy, accuracy, and operational ownership.

Last updated: May 2026

Lab scenario: contract and policy review intake

A financial services operations team receives vendor contracts, policy attestations, insurance certificates, and audit questionnaires. Reviewers manually open PDFs, copy fields into a case system, flag missing clauses, and write a short compliance summary. The team wants AI assistance to extract fields, identify document type, summarize key obligations, and route exceptions. The chief compliance officer says the AI may assist reviewers, but it may not make final legal or compliance decisions.

Start by splitting the workflow into stages. Document upload and storage are not the same as text extraction. Extraction is not the same as legal interpretation. Summarization is not the same as approval. A practitioner should map each stage to the service that fits the job and the risk control that keeps the workflow auditable. This prevents the common mistake of asking one foundation model to read everything, decide everything, and store everything with no review trail.

Workflow stage | AWS fit | Practitioner control question
Secure intake | Amazon S3 with KMS encryption and strict IAM | Who can upload, read, tag, delete, and restore documents?
Sensitive data discovery | Amazon Macie for S3 data discovery | Are PII and sensitive buckets known before documents enter AI workflows?
OCR and forms | Amazon Textract | Are forms, tables, signatures, and low-quality scans tested separately?
Classification and entities | Amazon Comprehend or Bedrock classification prompts | Are document type and field extraction accurate enough for routing?
Summary and clause review | Amazon Bedrock with a controlled prompt and citations to extracted text | Does the model state uncertainty and avoid unsupported legal conclusions?
Human review | Amazon A2I or an internal review queue | Which exceptions require a person, and what evidence do they see?
Audit evidence | CloudTrail, CloudWatch, AWS Config, Audit Manager, and ticket history | Can the organization prove who accessed and approved each document?

The pilot should use non-sensitive sample documents or approved redacted copies. Store them in a dedicated S3 bucket with default encryption using AWS KMS, bucket policies that deny public access, and least-privilege IAM roles. Enable object versioning if the retention plan requires it. Use tags such as document type, business unit, review status, data classification, and retention category. If Macie flags unexpected sensitive data in a location that should hold redacted samples only, stop and fix the intake process before expanding the lab.
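
A minimal boto3 sketch of those intake controls follows. The bucket name, KMS key ARN, and tag set are placeholders for values your organization would approve, not prescribed names.

    import boto3

    s3 = boto3.client("s3")
    BUCKET = "doc-intel-intake-lab"            # hypothetical bucket name
    KMS_KEY_ARN = "arn:aws:kms:REGION:ACCOUNT:key/KEY-ID"  # placeholder ARN

    # Default encryption with the team's approved KMS key.
    s3.put_bucket_encryption(
        Bucket=BUCKET,
        ServerSideEncryptionConfiguration={
            "Rules": [{
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": KMS_KEY_ARN,
                }
            }]
        },
    )

    # Block every public access path.
    s3.put_public_access_block(
        Bucket=BUCKET,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )

    # Versioning, if the retention plan requires it.
    s3.put_bucket_versioning(
        Bucket=BUCKET, VersioningConfiguration={"Status": "Enabled"}
    )

    # Upload a redacted sample with the tags the workflow expects.
    with open("vendor-contract-001.pdf", "rb") as f:
        s3.put_object(
            Bucket=BUCKET,
            Key="samples/vendor-contract-001.pdf",
            Body=f,
            Tagging="document_type=contract&review_status=pending"
                    "&data_classification=redacted-sample",
        )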

Use Amazon Textract when the task is extracting printed or handwritten text, key-value pairs, and tables from documents. Test different formats: a clean PDF, a scanned image, a document with rotated pages, and a form with multiple tables. Expected observations include text blocks, detected forms, table cells, confidence values, and pages that need review. Failure modes include missing small print, merged table cells, low-confidence handwriting, and signatures being treated as ordinary marks. Do not hide confidence values from reviewers if they are important to the decision.
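
As a sketch, the synchronous AnalyzeDocument call below requests forms and tables and keeps confidence visible. The 90 percent threshold is a hypothetical starting point to tune against labeled samples, and multi-page scanned PDFs would need the asynchronous StartDocumentAnalysis API instead.

    import boto3

    textract = boto3.client("textract")

    # Synchronous analysis suits single-page images and PDFs; multi-page
    # scanned PDFs need the asynchronous StartDocumentAnalysis API.
    response = textract.analyze_document(
        Document={"S3Object": {"Bucket": "doc-intel-intake-lab",
                               "Name": "samples/vendor-contract-001.pdf"}},
        FeatureTypes=["FORMS", "TABLES"],
    )

    # Keep confidence visible: surface every block under a review threshold.
    REVIEW_THRESHOLD = 90.0  # hypothetical; tune on labeled samples
    for block in response["Blocks"]:
        confidence = block.get("Confidence")  # PAGE blocks carry no confidence
        if confidence is not None and confidence < REVIEW_THRESHOLD:
            print(block["BlockType"], block.get("Text", ""), round(confidence, 1))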

Add classification and summarization as separate steps. Amazon Comprehend can help with entity extraction and classification patterns for text, while Bedrock can generate a structured summary from extracted text. The prompt should ask for a concise summary, missing required fields, cited source snippets, and a list of uncertainties. It should not ask the model to decide whether a contract is legally acceptable. For high-impact reviews, route low-confidence extraction, missing clauses, conflicting clauses, or disallowed document types to human reviewers.
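
A hedged sketch of the two-step pattern might look like the following. The model ID is a placeholder for whatever model the organization has approved, and the prompt wording is illustrative rather than prescriptive.

    import boto3

    comprehend = boto3.client("comprehend")
    bedrock = boto3.client("bedrock-runtime")

    extracted_text = "..."  # text assembled from Textract blocks in the prior step

    # Entity extraction as a separate, inspectable step.
    entities = comprehend.detect_entities(
        Text=extracted_text[:4000],  # stay well under the service's size limit
        LanguageCode="en",
    )

    # The prompt asks for citations and uncertainty, never legal judgment.
    prompt = (
        "Summarize the key obligations in the contract text below. List any "
        "required fields that are missing, quote the exact source snippet that "
        "supports each claim, and flag anything you are uncertain about. Do not "
        "state whether the contract is legally acceptable.\n\n" + extracted_text
    )

    response = bedrock.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",  # placeholder: use an approved model
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"temperature": 0.0},
    )
    summary = response["output"]["message"]["content"][0]["text"]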

Decision log example:

  • Intake bucket: dedicated S3 bucket with KMS encryption, blocked public access, and restricted roles.
  • Extraction service: Textract for OCR, forms, and tables because the source is scanned PDFs.
  • Summarization service: Bedrock with a prompt that requires citations to extracted text and uncertainty flags.
  • Human review trigger: missing signature, low extraction confidence, unsupported summary claim, sensitive data mismatch, or high-value vendor category (encoded as the routing sketch after this list).
  • Evidence owner: compliance operations owns review notes, while cloud operations owns access logs and storage controls.
  • Retention rule: documents and outputs follow approved records policy, not model developer preference.
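
The human review triggers above can live in explicit code rather than tribal knowledge. A minimal routing sketch follows; the field names, allowed document types, and confidence floor are assumptions to adapt, not a prescribed schema.

    from dataclasses import dataclass, field

    ALLOWED_TYPES = {"contract", "attestation", "insurance_certificate", "questionnaire"}
    CONFIDENCE_FLOOR = 90.0  # hypothetical threshold

    @dataclass
    class ExtractionResult:
        document_type: str
        has_signature: bool
        min_field_confidence: float
        unsupported_summary_claims: list = field(default_factory=list)
        sensitive_data_mismatch: bool = False
        vendor_category: str = "standard"

    def review_reasons(result: ExtractionResult) -> list:
        """Return escalation reasons; an empty list means fields may auto-populate."""
        reasons = []
        if result.document_type not in ALLOWED_TYPES:
            reasons.append("disallowed document type")
        if not result.has_signature:
            reasons.append("missing signature")
        if result.min_field_confidence < CONFIDENCE_FLOOR:
            reasons.append("low extraction confidence")
        if result.unsupported_summary_claims:
            reasons.append("summary claim without cited source text")
        if result.sensitive_data_mismatch:
            reasons.append("sensitive data mismatch")
        if result.vendor_category == "high_value":
            reasons.append("high-value vendor category")
        return reasons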

A responsible review queue shows the original document, extracted fields, confidence indicators, generated summary, citations, and reason for escalation. Reviewers should be able to approve, reject, correct fields, request more information, or mark the document out of scope. Their decision becomes the business record. The generated summary is supporting evidence, not the authority. If the team cannot explain how a reviewer changed or accepted AI output, the workflow is not ready for regulated use.
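
One way to make the reviewer decision a durable business record is a structured decision object that captures what the reviewer did with the AI output. The shape below is a hypothetical sketch, not a prescribed schema.

    from dataclasses import dataclass, field
    from datetime import datetime, timezone

    @dataclass(frozen=True)
    class ReviewDecision:
        document_id: str
        reviewer_id: str
        action: str              # approve, reject, correct_fields, request_info, out_of_scope
        escalation_reason: str   # why the document reached the queue
        corrected_fields: dict   # reviewer edits to AI-extracted values
        summary_accepted: bool   # the generated summary is evidence, not the authority
        decided_at: str = field(
            default_factory=lambda: datetime.now(timezone.utc).isoformat()
        )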

Compliance failure modes deserve explicit testing. Upload a document with a missing signature and confirm it routes to review. Upload a document with a table that Textract partially reads and confirm the low-confidence field is visible. Include a prompt injection inside a document, such as instructions telling the model to ignore policy, and verify that guardrails and prompt design keep the model from following it. Include a document with customer PII and confirm masking, access, logging, and retention behavior match the data classification.
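
The routing rules from the earlier sketch can be unit tested directly; the prompt injection and PII checks need end-to-end fixtures, but the pattern is the same. A pytest-style sketch, reusing the hypothetical ExtractionResult and review_reasons from above:

    def test_missing_signature_routes_to_review():
        result = ExtractionResult(document_type="contract",
                                  has_signature=False,
                                  min_field_confidence=97.0)
        assert "missing signature" in review_reasons(result)

    def test_low_confidence_field_routes_to_review():
        result = ExtractionResult(document_type="contract",
                                  has_signature=True,
                                  min_field_confidence=71.5)
        assert "low extraction confidence" in review_reasons(result)

    def test_sensitive_data_mismatch_routes_to_review():
        result = ExtractionResult(document_type="contract",
                                  has_signature=True,
                                  min_field_confidence=99.0,
                                  sensitive_data_mismatch=True)
        assert "sensitive data mismatch" in review_reasons(result)

    def test_clean_document_may_auto_populate():
        result = ExtractionResult(document_type="contract",
                                  has_signature=True,
                                  min_field_confidence=99.0)
        assert review_reasons(result) == []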

Review prompts before the quiz:

  • Which documents are safe sample inputs, and who approved their use?
  • Which extracted fields can be auto-populated, and which require reviewer confirmation?
  • What evidence would an auditor need to reconstruct a review decision?
  • How are low confidence, missing clauses, and sensitive data mismatches routed?
  • What would make this workflow inappropriate for AI assistance until governance improves?

Test Your Knowledge

A team needs to extract text, tables, and key-value pairs from scanned vendor forms before summarization. Which AWS service is the best fit for the extraction stage?

Test Your Knowledge

A generated contract summary says a required clause is present, but no extracted text supports the claim. What should the workflow do?

Test Your Knowledge

Which control best addresses unexpected sensitive data appearing in the document intake S3 bucket?
