4.5 Azure AI Translator Service

Key Takeaways

  • Azure AI Translator supports text translation across 135 languages, document translation that preserves formatting, and Custom Translator for domain-specific models.
  • Translator uses a dedicated global endpoint (api.cognitive.microsofttranslator.com) and requires the Ocp-Apim-Subscription-Region header in addition to the key.
  • The /translate operation accepts a 'to' array for multi-target translation in one call and can auto-detect the source when 'from' is omitted.
  • Transliteration converts between scripts of the SAME language; it does not translate meaning.
  • Custom Translator trains on parallel sentences (≈10,000 minimum), is evaluated with BLEU score, and is invoked through a custom category ID.
Last updated: June 2026

Quick Answer: Azure AI Translator provides text translation across 135 languages, document translation that preserves formatting, and Custom Translator for domain models. It uses a unique global endpoint and, unlike most Azure AI services, requires a region header in addition to the key. The /translate operation takes a to array for multi-language output in one request.

The Translator Endpoint Is Different

Most Azure AI services use https://<resource>.cognitiveservices.azure.com with only Ocp-Apim-Subscription-Key. Translator uses the global host https://api.cognitive.microsofttranslator.com and additionally requires Ocp-Apim-Subscription-Region. Omitting the region header is the single most common Translator failure (returns 401).

import requests, uuid
endpoint = "https://api.cognitive.microsofttranslator.com"
params = {"api-version": "3.0", "from": "en", "to": ["fr", "de", "es"]}
headers = {
  "Ocp-Apim-Subscription-Key": "<key>",
  "Ocp-Apim-Subscription-Region": "eastus",          # REQUIRED
  "Content-Type": "application/json",
  "X-ClientTraceId": str(uuid.uuid4())}
body = [{"text": "Hello, how are you?"}]
r = requests.post(endpoint + "/translate", params=params, headers=headers, json=body)
for t in r.json()[0]["translations"]:
    print(t["to"], t["text"])      # fr ... / de ... / es ...

If you omit from, Translator auto-detects the source and returns a detectedLanguage object with a confidence score.

Key API Operations

OperationPathPurpose
Translate/translatetext → one or more target languages
Detect/detectidentify source language + score
Transliterate/transliterateconvert script (same language)
Dictionary Lookup/dictionary/lookupalternative translations of a word
Dictionary Examples/dictionary/examplesexample sentences for a pair
Languages/languageslist supported languages (no auth)

Useful translate parameters

ParameterEffect
textTypeplain (default) or html (skips tags)
profanityActionNoAction, Marked, or Deleted
suggestedFromfallback source if detection fails
categorycustom model category ID

Transliteration vs. Translation

Transliteration rewrites text in a different script while keeping the same language — e.g. Japanese kanji こんにちは → Latin konnichiha. It does not change meaning. Translation changes language; transliteration changes alphabet. Exam items frequently swap these definitions.

params = {"api-version": "3.0", "language": "ja",
          "fromScript": "jpan", "toScript": "latn"}
body = [{"text": "こんにちは"}]   # -> "konnichiha"

Document Translation

Document translation preserves layout for .docx, .xlsx, .pptx, .pdf, .html, .txt/.csv, .rtf, .md and runs as an async batch:

  1. Upload source files to a Blob source container.
  2. Create a target container.
  3. Generate SAS tokens (or use managed identity) for both.
  4. POST a batch job specifying source/target containers + target language.
  5. Poll job status until Succeeded.
  6. Download translated files (formatting intact) from the target container.

Custom Translator

When industry terminology must translate consistently, train a custom model on parallel text (aligned source-target sentence pairs).

AspectDetail
Document typestraining, tuning, testing, phrase dictionary, sentence dictionary
Minimum data~10,000 parallel sentences for meaningful gains
Quality metricBLEU (Bilingual Evaluation Understudy) score
Invocationpass category=<custom-category-id> on /translate

A phrase dictionary forces exact replacements (brand names) without retraining; a sentence dictionary enforces whole-sentence translations. Both can supplement a trained model.

Common Trap: BLEU (not F1, not accuracy) evaluates translation quality. And remember the region header — exam scenarios describing a 401 from Translator while "the key is correct" point at the missing Ocp-Apim-Subscription-Region. Choosing Document Translation vs. text /translate hinges on whether formatting must be preserved across whole files.

Versioning, Batching, and Limits

The Translator REST surface is versioned with the api-version=3.0 query parameter, and forgetting it returns a 400. Requests are batched: the body is a JSON array of text objects, and you may send up to 100 array elements per call, with a combined limit of roughly 50,000 characters across the whole request. Each translation response mirrors that array, so the first input maps to the first result. Designing around these batch limits is a real-world concern the exam reflects in capacity-planning style questions, where the correct answer chunks a large document set rather than sending one giant payload.

Language detection deserves a precise mental model because three different services touch it. The Translator /detect operation returns the language plus a score and alternatives; Azure AI Language has its own detect_language feature; and the /translate operation auto-detects when from is omitted. They overlap but are billed and surfaced differently, so a question that says "detect the language and translate it in a single call" is pointing at omitting from on /translate, not at calling /detect first.

Choosing Translation Approach

Decide among the three Translator capabilities by the artifact and the consistency need. Plain strings or chat messages use text /translate; whole files where layout, tables, and styles must survive use Document Translation through Blob containers; and domain content where terminology must render consistently uses a Custom Translator model invoked by category ID. Layer a phrase dictionary on top when only a handful of brand terms must never change, since that avoids the ten-thousand-sentence training threshold entirely.

These distinctions, plus the mandatory region header and the BLEU evaluation metric, are the highest-yield Translator facts to carry into the exam.

Test Your Knowledge

Beyond the subscription key, which header must accompany Azure AI Translator calls and is a frequent cause of 401 errors when missing?

A
B
C
D
Test Your Knowledge

Converting the Japanese text こんにちは into the Latin string "konnichiha" while keeping it Japanese is an example of what?

A
B
C
D
Test Your Knowledge

A team trains a Custom Translator model on aligned bilingual sentences. Which metric reports its translation quality?

A
B
C
D