7.2 SDK and REST API Patterns

Key Takeaways

  • Every Azure AI SDK follows the same shape: create a client from endpoint + credential, call a method, process the typed response.
  • Key auth uses the Ocp-Apim-Subscription-Key header; Entra ID auth uses Authorization: Bearer <token>; managed identity via DefaultAzureCredential is the production-preferred path.
  • Long-running operations (training, batch analyze, custom-model builds) return HTTP 202 with an Operation-Location header in REST, or a poller object in the SDK whose .result() blocks until done.
  • Memorize the distinct base URLs: most services use <resource>.cognitiveservices.azure.com, but Translator uses api.cognitive.microsofttranslator.com, AI Search uses <service>.search.windows.net, and Azure OpenAI uses <resource>.openai.azure.com.
  • Handle status codes deliberately: 401 (bad key/token), 403 (RBAC), 404 (endpoint), 429 (rate limit, back off using Retry-After), and content_filter errors from Azure OpenAI.
Last updated: June 2026

Quick Answer: Client = endpoint + credential, then call a method, then read a typed response. Key auth over REST uses the Ocp-Apim-Subscription-Key header; Entra ID uses Authorization: Bearer <token>. Async work returns 202 + Operation-Location (REST) or a poller (SDK). Handle 401, 403, 404, 429, and content_filter.

Universal SDK Pattern

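The AI-102 exam expects you to read code, not write it from scratch. Every SDK uses the same three steps: create a client from an endpoint and a credential, call a method, and process the typed response.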

from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential
from azure.identity import DefaultAzureCredential

# Key-based auth
client = TextAnalyticsClient(
    endpoint="https://my-res.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<key>"))

# Production-preferred: managed identity / Entra ID
client = TextAnalyticsClient(
    endpoint="https://my-res.cognitiveservices.azure.com/",
    credential=DefaultAzureCredential())

result = client.analyze_sentiment(["I love this product"])
for doc in result:
    print(doc.sentiment, doc.confidence_scores)

Authentication Headers (REST)

Method         | Header                                                   | Notes
API key        | Ocp-Apim-Subscription-Key: <key>                         | Simplest; avoid in production
Entra ID token | Authorization: Bearer <token>                            | Needs an RBAC role (e.g., Cognitive Services User)
Translator     | Ocp-Apim-Subscription-Key + Ocp-Apim-Subscription-Region | Region header is required for multi-service keys
Azure OpenAI   | api-key: <key> or Authorization: Bearer <token>          | OpenAI uses api-key, not the Ocp-Apim header

Trap: Translator with a multi-service resource fails with 401 unless you also send Ocp-Apim-Subscription-Region. Azure OpenAI key auth uses the lowercase api-key header, not Ocp-Apim-Subscription-Key.
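The header rules above can be captured in one small function. This is a minimal sketch, not part of any Azure SDK: the name build_auth_headers and the service labels "translator" and "openai" are hypothetical, chosen only for illustration.

```python
from typing import Optional


def build_auth_headers(
    service: str,
    key: Optional[str] = None,
    token: Optional[str] = None,
    region: Optional[str] = None,
) -> dict:
    """Return REST auth headers (hypothetical helper, not an SDK API)."""
    if token is not None:
        # Entra ID: the bearer header is the same for every service
        return {"Authorization": f"Bearer {token}"}
    if key is None:
        raise ValueError("provide either a key or a token")
    if service == "openai":
        # Azure OpenAI key auth uses api-key, not the Ocp-Apim header
        return {"api-key": key}
    headers = {"Ocp-Apim-Subscription-Key": key}
    if service == "translator" and region is not None:
        # Required when the key belongs to a multi-service resource
        headers["Ocp-Apim-Subscription-Region"] = region
    return headers
```

Passing a token always wins over a key here, which mirrors the guidance that Entra ID is preferred whenever it is available.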

Service-Specific Endpoint Patterns

Service                                      | Base URL pattern
AI Vision / Language / Document Intelligence | https://<resource>.cognitiveservices.azure.com/
AI Translator                                | https://api.cognitive.microsofttranslator.com/
AI Speech                                    | https://<region>.api.cognitive.microsoft.com/
AI Search                                    | https://<service>.search.windows.net/
Azure OpenAI                                 | https://<resource>.openai.azure.com/openai/

Knowing which base URL belongs to which service is directly tested: Translator, Speech, Search, and OpenAI each break the common <resource>.cognitiveservices.azure.com pattern.
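As a memory aid, the patterns from the table can be written as a lookup. The sketch below is illustrative only: the dictionary keys are made-up labels rather than official service identifiers, and base_url is a hypothetical helper.

```python
# URL templates copied from the endpoint table; keys are illustrative labels.
BASE_URLS = {
    "vision": "https://{name}.cognitiveservices.azure.com/",
    "language": "https://{name}.cognitiveservices.azure.com/",
    "document_intelligence": "https://{name}.cognitiveservices.azure.com/",
    "translator": "https://api.cognitive.microsofttranslator.com/",
    "speech": "https://{name}.api.cognitive.microsoft.com/",  # name = region
    "search": "https://{name}.search.windows.net/",           # name = service
    "openai": "https://{name}.openai.azure.com/openai/",
}


def base_url(service: str, name: str = "") -> str:
    """Fill in the resource, region, or service name for a service label."""
    return BASE_URLS[service].format(name=name)
```

Translator is the only entry with no placeholder, which is why it ignores the name argument.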

Long-Running Operations

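# SDK: begin_* returns a poller. Here client is a DocumentIntelligenceClient
# (azure.ai.documentintelligence), built from endpoint + credential as above.
poller = client.begin_analyze_document(
    "prebuilt-invoice", body=file_bytes)
# Blocks until the operation finishes; raises HttpResponseError if it failed
result = poller.result()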

In REST, the initial POST returns 202 Accepted with an Operation-Location header. You then GET that URL on an interval until status is succeeded (or failed). Operations that use this pattern include Document Intelligence analyze, Language custom-model training, batch document translation, and Vision OCR for large files.
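The REST polling loop described above might look like the following sketch. It assumes a hypothetical get_json callable that performs the authenticated GET and returns the parsed JSON body; only the succeeded and failed status values come from the text, and any other status is treated as still in progress.

```python
import time
from typing import Callable


def poll_operation(
    get_json: Callable[[str], dict],
    operation_url: str,
    interval_seconds: float = 2.0,
    max_attempts: int = 30,
) -> dict:
    """GET operation_url until the status is terminal, then return the body."""
    for _ in range(max_attempts):
        body = get_json(operation_url)  # hypothetical authenticated GET
        status = body.get("status")
        if status == "succeeded":
            return body
        if status == "failed":
            raise RuntimeError(f"operation failed: {body}")
        time.sleep(interval_seconds)  # still running; wait before the next GET
    raise TimeoutError("operation did not finish within max_attempts polls")
```

The operation_url argument is the value of the Operation-Location header from the 202 response. The SDK poller does the equivalent work for you, which is why the exam usually shows poller.result() rather than a hand-written loop.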

Error Handling and Status Codes

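import logging
from time import sleep

from azure.core.exceptions import (
    HttpResponseError, ClientAuthenticationError, ResourceNotFoundError)

try:
    result = client.analyze_text(request)
# The specific exceptions subclass HttpResponseError, so catch them first
except ClientAuthenticationError:
    logging.error("401: bad key or token")
except ResourceNotFoundError:
    logging.error("404: wrong endpoint or deployment")
except HttpResponseError as e:
    if e.status_code == 429:
        # Honor the server's Retry-After hint (seconds); default to 5
        sleep(int(e.response.headers.get("Retry-After", 5)))
    else:
        raise  # do not silently swallow other HTTP errors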

Code    | Meaning          | Action
200     | OK               | Process response
202     | Accepted (async) | Poll Operation-Location
400     | Bad request      | Fix payload/parameters
401     | Unauthorized     | Check key or token
403     | Forbidden        | Grant the RBAC role
404     | Not found        | Check endpoint/deployment name
429     | Rate limited     | Exponential backoff, honor Retry-After
500/503 | Service error    | Retry with backoff

On the Exam: A 429 is never solved by rotating the key — it means quota/TPM exhaustion, so back off or raise the quota. A content_filter finish reason from Azure OpenAI is not an error to retry blindly; it means the prompt or completion was blocked by the safety system.
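The "back off, honor Retry-After" advice can be sketched without any Azure dependency. Everything named here is hypothetical: RateLimited stands in for an HTTP 429 error, and the base delay, cap, and retry count are arbitrary illustration values.

```python
import time
from typing import Callable, Optional


class RateLimited(Exception):
    """Stand-in for an HTTP 429 error (hypothetical; not an Azure SDK type)."""

    def __init__(self, retry_after: Optional[float] = None):
        super().__init__("429 Too Many Requests")
        self.retry_after = retry_after


def backoff_delay(attempt: int, retry_after: Optional[float],
                  base: float = 1.0, cap: float = 30.0) -> float:
    """Prefer the server's Retry-After; otherwise capped exponential backoff."""
    if retry_after is not None:
        return retry_after
    return min(cap, base * (2 ** attempt))


def call_with_retry(call: Callable[[], object], max_retries: int = 5,
                    sleep: Callable[[float], None] = time.sleep):
    """Run call(), retrying only on RateLimited, up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimited as err:
            if attempt == max_retries:
                raise  # backing off did not help; the quota needs raising
            sleep(backoff_delay(attempt, err.retry_after))
```

Note that the key is never touched in the retry path: the only levers are waiting longer or, once retries are exhausted, raising the quota.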

Choosing Key Auth vs Entra ID vs Managed Identity

The SDK accepts any credential object, but the exam wants you to choose the right one for the scenario. Key auth (AzureKeyCredential) is fine for a quick prototype or a local script, but it embeds a long-lived secret that must be rotated and protected. Entra ID auth uses a short-lived bearer token and requires the caller to hold an RBAC role such as Cognitive Services User on the resource — choose it when the requirement mentions centralized identity, auditing, or eliminating shared keys.

Managed identity, surfaced through DefaultAzureCredential, is Entra ID with no secret at all: Azure issues and rotates the token for the app's system- or user-assigned identity. For any "production", "no secrets in code", or "App Service / Functions / VM" scenario, managed identity is the answer. DefaultAzureCredential is also convenient because it tries managed identity in Azure and falls back to developer credentials locally, so the same code works in both places.

Reading Responses and Confidence Scores

Many code questions hinge on what the response object contains, not on how the client was built. Language sentiment returns an overall label plus confidence_scores for positive, neutral, and negative that sum to 1.0; the exam may ask you to pick the label with the highest score. NER and PII return each entity with a category, a text offset and length, and a confidence_score you can threshold. Document Intelligence returns fields keyed by name, each with a value, a content string, and a confidence — production code typically routes anything below a configured confidence to human review.

Knowing that these scores exist, and that they are per-item rather than per-document, lets you eliminate options that claim a service returns only a single yes/no result.
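Both response-reading tasks, picking the top sentiment label and routing low-confidence fields to review, reduce to a few lines. In this sketch plain dictionaries stand in for the SDK's typed response objects, the sample values are invented, and the 0.8 threshold is an arbitrary choice.

```python
def top_sentiment(confidence_scores: dict) -> str:
    """Return the label with the highest confidence score."""
    return max(confidence_scores, key=confidence_scores.get)


def fields_for_review(fields: dict, threshold: float = 0.8) -> list:
    """Return the names of fields whose confidence is below threshold."""
    return sorted(
        name for name, field in fields.items()
        if field["confidence"] < threshold
    )


# Invented sample data shaped like the responses described above
scores = {"positive": 0.91, "neutral": 0.07, "negative": 0.02}
invoice_fields = {
    "VendorName": {"content": "Contoso", "confidence": 0.97},
    "InvoiceTotal": {"content": "$1,204.50", "confidence": 0.62},
}

print(top_sentiment(scores))              # positive
print(fields_for_review(invoice_fields))  # ['InvoiceTotal']
```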

Batch, Streaming, and Pagination Patterns

Azure AI APIs expose three throughput shapes you should recognize. Synchronous single-call APIs (sentiment on a short batch, a chat completion) return the full result immediately. Long-running operations (custom-model training, batch document translation, large-file OCR) return 202 and a poller, as covered above. Streaming applies to Azure OpenAI chat: setting stream=true returns server-sent event chunks so a UI can render tokens as they arrive instead of waiting for the whole completion.
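To make the streaming shape concrete, here is a sketch of reading server-sent event lines from a chat completion stream. It assumes each event is a "data: <json>" line whose chunks carry text in choices[].delta.content, and that the stream ends with a "data: [DONE]" line; iter_stream_text is a hypothetical helper, not an SDK function.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from 'data: ...' server-sent event lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:  # role-only or empty deltas carry no text
                yield text
```

A UI would append each yielded fragment as it arrives; joining all fragments reproduces the full completion that a non-streaming call would have returned in one piece.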

Finally, list endpoints such as AI Search query results and Document Intelligence model lists are paginated — the SDK exposes an iterator that fetches the next page transparently, so you iterate rather than assuming one response holds every record. Matching the throughput shape to the workload is a recurring distractor theme.
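The "iterate rather than assume one response" idea can be sketched as a generator that follows continuation tokens. The fetch_page callable and its (items, next_token) return shape are assumptions made for illustration; the SDK's own iterators hide this loop from you.

```python
from typing import Callable, Iterator, Optional, Tuple


def iter_all_items(
    fetch_page: Callable[[Optional[str]], Tuple[list, Optional[str]]],
) -> Iterator:
    """Yield every item across pages, following continuation tokens."""
    token: Optional[str] = None  # None requests the first page
    while True:
        items, token = fetch_page(token)
        yield from items
        if token is None:
            break  # no further pages
```

Because this is a generator, later pages are fetched only when the caller actually iterates that far.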

Test Your Knowledge

Which header authenticates a request to Azure AI services using an API key for a non-OpenAI service such as Azure AI Language?

A REST call to start Document Intelligence analysis returns HTTP 202 with an Operation-Location header. What should the client do next?

Which Azure AI service uses the base URL api.cognitive.microsofttranslator.com rather than the standard <resource>.cognitiveservices.azure.com pattern?
