7.2 SDK and REST API Patterns
Key Takeaways
- Every Azure AI SDK follows the same shape: create a client from endpoint + credential, call a method, process the typed response.
- Key auth uses the Ocp-Apim-Subscription-Key header; Entra ID auth uses Authorization: Bearer <token>; managed identity via DefaultAzureCredential is the production-preferred path.
- Long-running operations (training, batch analyze, custom-model builds) return HTTP 202 with an Operation-Location header in REST, or a poller object in the SDK whose .result() blocks until done.
- Memorize the distinct base URLs: most services use <resource>.cognitiveservices.azure.com, but Translator uses api.cognitive.microsofttranslator.com, AI Search uses <service>.search.windows.net, and Azure OpenAI uses <resource>.openai.azure.com.
- Handle status codes deliberately: 401 (bad key/token), 403 (RBAC), 404 (endpoint), 429 (rate limit, back off using Retry-After), and content_filter errors from Azure OpenAI.
Quick Answer: Client = endpoint + credential, then call a method, then read a typed response. REST uses the Ocp-Apim-Subscription-Key header; Entra ID uses Authorization: Bearer. Async work returns 202 + Operation-Location (REST) or a poller (SDK). Handle 401, 403, 404, 429, and content_filter.
Universal SDK Pattern
The AI-102 exam expects you to read code, not write it from scratch. Every SDK uses the same three steps: create the client, call a method, and read the typed response.
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential
from azure.identity import DefaultAzureCredential

# Key-based auth
client = TextAnalyticsClient(
    endpoint="https://my-res.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<key>"))

# Production-preferred: managed identity / Entra ID
client = TextAnalyticsClient(
    endpoint="https://my-res.cognitiveservices.azure.com/",
    credential=DefaultAzureCredential())

result = client.analyze_sentiment(["I love this product"])
for doc in result:
    print(doc.sentiment, doc.confidence_scores)
Authentication Headers (REST)
| Method | Header | Notes |
|---|---|---|
| API key | Ocp-Apim-Subscription-Key: <key> | Simplest; avoid in production |
| Entra ID token | Authorization: Bearer <token> | Needs an RBAC role (e.g., Cognitive Services User) |
| Translator | Ocp-Apim-Subscription-Key + Ocp-Apim-Subscription-Region | Region header is required for multi-service keys |
| Azure OpenAI | api-key: <key> or Authorization: Bearer | Note: OpenAI uses api-key, not the Ocp-Apim header |
Trap: Translator with a multi-service resource fails with 401 unless you also send Ocp-Apim-Subscription-Region. Azure OpenAI key auth uses the lowercase api-key header, not Ocp-Apim-Subscription-Key.
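The header rules in the table can be sketched as one small helper. This is only an illustration: the function name and the service labels ("language", "translator", "openai") are hypothetical, and token auth is shown in its basic form only, since some services may expect extra headers alongside the bearer token.

```python
from typing import Dict, Optional


def build_auth_headers(
    service: str,
    key: Optional[str] = None,
    token: Optional[str] = None,
    region: Optional[str] = None,
) -> Dict[str, str]:
    """Return REST auth headers following the table above.

    `service` is a hypothetical label chosen for this sketch,
    not an identifier defined by Azure.
    """
    if token is not None:
        # Entra ID: a short-lived bearer token is sent instead of a key
        return {"Authorization": f"Bearer {token}"}
    if key is None:
        raise ValueError("provide either a key or a token")
    if service == "openai":
        # Azure OpenAI key auth uses api-key, not the Ocp-Apim header
        return {"api-key": key}
    headers = {"Ocp-Apim-Subscription-Key": key}
    if service == "translator" and region is not None:
        # Required when the key comes from a multi-service resource
        headers["Ocp-Apim-Subscription-Region"] = region
    return headers
```

Calling `build_auth_headers("translator", key="<key>", region="westus")` yields both Translator headers, while `build_auth_headers("openai", key="<key>")` yields only `api-key`.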
Service-Specific Endpoint Patterns
| Service | Base URL pattern |
|---|---|
| AI Vision / Language / Document Intelligence | https://<resource>.cognitiveservices.azure.com/ |
| AI Translator | https://api.cognitive.microsofttranslator.com/ |
| AI Speech | https://<region>.api.cognitive.microsoft.com/ |
| AI Search | https://<service>.search.windows.net/ |
| Azure OpenAI | https://<resource>.openai.azure.com/openai/ |
Knowing which base URL belongs to which service is directly tested: Translator, Speech, Search, and OpenAI each break the common cognitiveservices.azure.com pattern.
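As a memory aid, the table can be written as a lookup. The URL templates are copied from the table; the dictionary keys and the `base_url` helper are hypothetical names used only for this sketch.

```python
# URL templates copied from the table above; the keys are hypothetical labels.
BASE_URL_TEMPLATES = {
    "vision": "https://{name}.cognitiveservices.azure.com/",
    "language": "https://{name}.cognitiveservices.azure.com/",
    "document_intelligence": "https://{name}.cognitiveservices.azure.com/",
    "translator": "https://api.cognitive.microsofttranslator.com/",
    "speech": "https://{name}.api.cognitive.microsoft.com/",
    "search": "https://{name}.search.windows.net/",
    "openai": "https://{name}.openai.azure.com/openai/",
}


def base_url(service: str, name: str = "") -> str:
    """Build the base URL for a service.

    `name` is the resource name, the search service name, or (for Speech)
    the region. Translator uses a global URL, so it needs no name.
    """
    try:
        template = BASE_URL_TEMPLATES[service]
    except KeyError:
        raise ValueError(f"unknown service label: {service!r}") from None
    if "{name}" in template and not name:
        raise ValueError(f"{service!r} needs a resource, service, or region name")
    return template.format(name=name)
```

For example, `base_url("search", "my-search")` returns `https://my-search.search.windows.net/`, and `base_url("translator")` returns the global Translator URL.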
Long-Running Operations
# SDK: begin_* returns a poller
poller = client.begin_analyze_document(
    "prebuilt-invoice", body=file_bytes)
result = poller.result()  # blocks until the operation finishes; raises if it failed
In REST, the initial POST returns 202 Accepted with an Operation-Location header. You then GET that URL on an interval until status is succeeded (or failed). Operations that use this pattern include Document Intelligence analyze, Language custom-model training, batch document translation, and Vision OCR for large files.
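The REST polling loop described above might look roughly like the sketch below. The HTTP call is injected as `get_json` so the loop stays independent of any HTTP library; that parameter, the function name, and the default interval and attempt limit are all assumptions made for illustration.

```python
import time
from typing import Any, Callable, Dict


def poll_operation(
    get_json: Callable[[str], Dict[str, Any]],
    operation_url: str,
    interval_seconds: float = 1.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """GET the Operation-Location URL until the status is terminal.

    `get_json` is a caller-supplied function (hypothetical) that performs
    an authenticated GET and returns the parsed JSON body.
    """
    for attempt in range(max_attempts):
        body = get_json(operation_url)
        status = str(body.get("status", "")).lower()
        if status == "succeeded":
            return body
        if status == "failed":
            raise RuntimeError(f"operation failed: {body.get('error')}")
        # Any other status is treated as still in progress
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"operation not finished after {max_attempts} polls")
```

The SDK poller hides this loop behind `.result()`, which is why exam code samples for the SDK rarely show explicit polling.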
Error Handling and Status Codes
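import logging
import time

from azure.core.exceptions import (
    HttpResponseError, ClientAuthenticationError, ResourceNotFoundError)

try:
    result = client.analyze_text(request)
except ClientAuthenticationError:
    logging.error("401: bad key or token")
except ResourceNotFoundError:
    logging.error("404: wrong endpoint or deployment")
except HttpResponseError as e:
    # Catch the base class last: the two exceptions above are subclasses of it
    if e.status_code == 429:
        # Honor the server's Retry-After hint (seconds); default to 5
        time.sleep(int(e.response.headers.get("Retry-After", 5)))
    else:
        raise  # do not silently swallow other HTTP errors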
| Code | Meaning | Action |
|---|---|---|
| 200 | OK | Process response |
| 202 | Accepted (async) | Poll Operation-Location |
| 400 | Bad request | Fix payload/parameters |
| 401 | Unauthorized | Check key or token |
| 403 | Forbidden | Grant the RBAC role |
| 404 | Not found | Check endpoint/deployment name |
| 429 | Rate limited | Exponential backoff, honor Retry-After |
| 500/503 | Service error | Retry with backoff |
On the Exam: A 429 is never solved by rotating the key; it means quota/TPM exhaustion, so back off or raise the quota. A content_filter finish reason from Azure OpenAI is not an error to retry blindly; it means the prompt or completion was blocked by the safety system.
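The "exponential backoff, honor Retry-After" action for 429 can be sketched without any SDK. `RateLimitedError` below is a hypothetical stand-in for a 429 response, and the retry count and base delay are arbitrary defaults chosen for the example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimitedError(Exception):
    """Hypothetical stand-in for an HTTP 429 response."""

    def __init__(self, retry_after: Optional[float] = None):
        super().__init__("429: rate limited")
        self.retry_after = retry_after  # seconds from Retry-After, if sent


def call_with_backoff(
    call: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on rate limiting, waiting longer after each failure."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimitedError as err:
            if attempt == max_retries:
                raise  # out of retries: surface the 429 to the caller
            if err.retry_after is not None:
                delay = err.retry_after  # the server's hint wins
            else:
                delay = base_delay * 2 ** attempt  # 1s, 2s, 4s, ...
            sleep(delay)
    raise RuntimeError("max_retries must be zero or greater")
```

Only the rate-limit case is retried here; authentication and not-found errors would propagate immediately, which matches the point that a 429 is a quota problem rather than a credential problem.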
Choosing Key Auth vs Entra ID vs Managed Identity
The SDK accepts any credential object, but the exam wants you to choose the right one for the scenario. Key auth (AzureKeyCredential) is fine for a quick prototype or a local script, but it embeds a long-lived secret that must be rotated and protected. Entra ID auth uses a short-lived bearer token and requires the caller to hold an RBAC role such as Cognitive Services User on the resource — choose it when the requirement mentions centralized identity, auditing, or eliminating shared keys.
Managed identity, surfaced through DefaultAzureCredential, is Entra ID with no secret at all: Azure issues and rotates the token for the app's system- or user-assigned identity. For any "production", "no secrets in code", or "App Service / Functions / VM" scenario, managed identity is the answer. DefaultAzureCredential is also convenient because it tries managed identity in Azure and falls back to developer credentials locally, so the same code works in both places.
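The fallback idea behind DefaultAzureCredential can be illustrated with a toy chain. This is a conceptual sketch only, not how azure-identity is implemented: the function, the source names, and the token strings are all hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple

# A token source returns a token string, or None if it is unavailable here.
TokenSource = Callable[[], Optional[str]]


def first_available_token(
    sources: Sequence[Tuple[str, TokenSource]],
) -> Tuple[str, str]:
    """Try each named source in order and return the first token found."""
    for name, get_token in sources:
        token = get_token()
        if token is not None:
            return name, token
    raise RuntimeError("no credential in the chain produced a token")


# In Azure, the managed identity source succeeds first.
in_azure = [
    ("managed_identity", lambda: "mi-token"),
    ("developer_cli", lambda: "dev-token"),
]
# On a laptop, managed identity is unavailable, so the chain falls through.
on_laptop = [
    ("managed_identity", lambda: None),
    ("developer_cli", lambda: "dev-token"),
]
```

The same calling code gets a token in both environments, which is the property that makes a chained credential convenient across local development and production.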
Reading Responses and Confidence Scores
Many code questions hinge on what the response object contains, not on how the client was built. Language sentiment returns an overall label plus confidence_scores for positive, neutral, and negative that sum to 1.0; the exam may ask you to pick the label with the highest score. NER and PII return each entity with a category, a text offset and length, and a confidence_score you can threshold. Document Intelligence returns fields keyed by name, each with a value, a content string, and a confidence — production code typically routes anything below a configured confidence to human review.
Knowing that these scores exist, and that they are per-item rather than per-document, lets you eliminate options that claim a service returns only a single yes/no result.
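Both response-reading tasks mentioned above, picking the highest-scoring label and routing low-confidence fields to review, can be sketched on plain dictionaries. The dictionaries stand in for SDK response objects, and the helper names and the 0.8 threshold are assumptions for illustration.

```python
from typing import Any, Dict, Tuple


def top_label(confidence_scores: Dict[str, float]) -> str:
    """Return the label with the highest confidence score."""
    return max(confidence_scores, key=lambda label: confidence_scores[label])


def split_by_confidence(
    fields: Dict[str, Dict[str, Any]],
    threshold: float = 0.8,
) -> Tuple[Dict[str, Dict[str, Any]], Dict[str, Dict[str, Any]]]:
    """Split extracted fields into (accepted, needs_review) by confidence."""
    accepted: Dict[str, Dict[str, Any]] = {}
    needs_review: Dict[str, Dict[str, Any]] = {}
    for name, field in fields.items():
        # A missing confidence is treated as 0.0, so the field goes to review
        confidence = field.get("confidence") or 0.0
        target = accepted if confidence >= threshold else needs_review
        target[name] = field
    return accepted, needs_review
```

With scores of 0.91 positive, 0.07 neutral, and 0.02 negative, `top_label` returns `"positive"`; a field with confidence 0.42 lands in the review bucket at the default threshold.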
Batch, Streaming, and Pagination Patterns
Azure AI APIs expose three throughput shapes you should recognize. Synchronous single-call APIs (sentiment on a short batch, a chat completion) return the full result immediately. Long-running operations (custom-model training, batch document translation, large-file OCR) return 202 and a poller, as covered above. Streaming applies to Azure OpenAI chat: setting stream=true returns server-sent event chunks so a UI can render tokens as they arrive instead of waiting for the whole completion.
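To make the streaming shape concrete, here is a minimal parser for server-sent event lines. The chunk layout it assumes (a `data:` prefix, JSON with `choices[].delta.content`, and a final `data: [DONE]` line) reflects the chat completions streaming format as commonly documented; treat those field names as assumptions rather than a guaranteed contract.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from server-sent event lines as they arrive."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            # Some chunks carry only a role or no delta content at all
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

A UI would append each yielded fragment to the display immediately; joining all fragments reproduces the full completion that a non-streaming call would have returned in one response.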
Finally, list endpoints such as AI Search query results and Document Intelligence model lists are paginated — the SDK exposes an iterator that fetches the next page transparently, so you iterate rather than assuming one response holds every record. Matching the throughput shape to the workload is a recurring distractor theme.
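What the SDK iterator does behind the scenes can be approximated as follows. The page shape used here (`value` for items, `nextLink` for the continuation) and the `fetch_page` callback are assumptions for the sketch; individual services name their continuation fields differently.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_items(
    fetch_page: Callable[[Optional[str]], Dict[str, Any]],
) -> Iterator[Any]:
    """Yield every item across pages, fetching the next page on demand.

    `fetch_page(None)` returns the first page; `fetch_page(link)` returns
    the page at a continuation link. Both are hypothetical for this sketch.
    """
    next_link: Optional[str] = None
    while True:
        page = fetch_page(next_link)
        yield from page.get("value", [])
        next_link = page.get("nextLink")
        if not next_link:
            break  # no continuation link means this was the last page
```

Because the function is a generator, later pages are requested only when the caller keeps iterating, which is why code that reads just the first response can miss records.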
Which header authenticates a request to Azure AI services using an API key for a non-OpenAI service such as Azure AI Language?
A REST call to start Document Intelligence analysis returns HTTP 202 with an Operation-Location header. What should the client do next?
Which Azure AI service uses the base URL api.cognitive.microsofttranslator.com rather than the standard <resource>.cognitiveservices.azure.com pattern?