4.5 Azure AI Translator Service
Key Takeaways
- Azure AI Translator provides text translation (130+ languages), document translation (preserving formatting), and custom translator (domain-specific models).
- Text translation supports real-time translation of text strings via REST API with automatic language detection.
- Document translation preserves the original document formatting (Word, PDF, PowerPoint, Excel, HTML) while translating content.
- Custom Translator enables training domain-specific translation models using parallel text (source and target language pairs).
- Transliteration converts text from one script to another (e.g., Japanese kanji to romaji) without translating the language.
Quick Answer: Azure AI Translator provides text translation (130+ languages), document translation (preserving formatting), and custom translator (domain-specific models). The REST API supports real-time translation, transliteration, dictionary lookup, and automatic language detection.
Text Translation
Basic Translation API Call
import requests
import uuid

endpoint = "https://api.cognitive.microsofttranslator.com"
path = "/translate"
params = {
    "api-version": "3.0",
    "from": "en",
    "to": ["fr", "de", "es"],  # requests sends a list as repeated "to" parameters
}
headers = {
    "Ocp-Apim-Subscription-Key": "<your-key>",
    "Ocp-Apim-Subscription-Region": "eastus",  # region of your Translator resource
    "Content-Type": "application/json",
    "X-ClientTraceId": str(uuid.uuid4()),  # optional ID for tracing the request
}
body = [{"text": "Hello, how are you?"}]
response = requests.post(
    endpoint + path,
    params=params,
    headers=headers,
    json=body,
)
response.raise_for_status()  # fail early on HTTP errors such as 401 or 403
for translation in response.json()[0]["translations"]:
    print(f"{translation['to']}: {translation['text']}")
# Example output (exact wording may vary):
# fr: Bonjour, comment allez-vous?
# de: Hallo, wie geht es Ihnen?
# es: Hola, ¿cómo estás?
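On the Exam: Note that the Translator API uses a different endpoint pattern than other Azure AI services. It uses the global endpoint api.cognitive.microsofttranslator.com with a subscription key header (Ocp-Apim-Subscription-Key) and a region header (Ocp-Apim-Subscription-Region); the region header is required for regional and multi-service resources. Also note that the to parameter accepts multiple values, so a single call can translate into several languages.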
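As a rough illustration of how the to parameter carries several languages, the sketch below builds the query string with the standard library only. It assumes the list is sent as repeated to parameters, which is what the requests library does with a list value; nothing is sent over the network.

```python
from urllib.parse import urlencode

# Same parameters as the basic translation call above.
params = {"api-version": "3.0", "from": "en", "to": ["fr", "de", "es"]}

# doseq=True expands the list into one "to" parameter per target language.
query = urlencode(params, doseq=True)
print(query)  # api-version=3.0&from=en&to=fr&to=de&to=es
```

Each target language then comes back as one entry in the translations array of the response.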
Key API Operations
| Operation | Endpoint | Description |
|---|---|---|
| Translate | /translate | Translate text to one or more languages |
| Detect | /detect | Detect language of input text |
| Transliterate | /transliterate | Convert text from one script to another |
| Dictionary Lookup | /dictionary/lookup | Get alternative translations for a word |
| Dictionary Examples | /dictionary/examples | Get example sentences for a translation |
| Languages | /languages | List all supported languages |
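The operations in the table share the same endpoint, headers, and JSON body shape, so one small helper can build a request for any of the POST operations. The following is a minimal sketch using only the standard library; build_request and best_detection are hypothetical helper names, and the sample response is an illustrative payload in the shape the Detect operation is expected to return, not captured output.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://api.cognitive.microsofttranslator.com"

def build_request(operation, body, key, region, params=None):
    """Build (but do not send) a POST request for a Translator operation."""
    query = urllib.parse.urlencode(
        {"api-version": "3.0", **(params or {})}, doseq=True
    )
    return urllib.request.Request(
        url=f"{ENDPOINT}{operation}?{query}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Ocp-Apim-Subscription-Region": region,
            "Content-Type": "application/json",
        },
        method="POST",
    )

def best_detection(payload):
    """Return (language, score) for the first input text of a /detect response."""
    first = payload[0]
    return first["language"], first["score"]

# Illustrative response shape (assumed, not real service output).
sample = [{"language": "fr", "score": 1.0,
           "isTranslationSupported": True,
           "isTransliterationSupported": False}]

req = build_request("/detect", [{"text": "Bonjour tout le monde"}],
                    "<your-key>", "eastus")
print(req.full_url)
print(best_detection(sample))
```

To actually call the service, the prepared request could be passed to urllib.request.urlopen with a real key and region.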
Transliteration
# Convert Japanese text from Japanese script to Latin script (romaji)
params = {
    "api-version": "3.0",
    "language": "ja",
    "fromScript": "Jpan",
    "toScript": "Latn",
}
body = [{"text": "こんにちは"}]
response = requests.post(
    endpoint + "/transliterate",
    params=params,
    headers=headers,
    json=body,
)
print(response.json()[0]["text"])
# Example output: Kon'nichiwa (exact romanization may vary)
Document Translation
Document translation translates entire documents while preserving formatting:
Supported Formats
| Format | Extensions |
|---|---|
| Microsoft Office | .docx, .xlsx, .pptx |
| PDF | .pdf |
| HTML | .html, .htm |
| Text | .txt, .csv, .tsv |
| Rich Text | .rtf |
| Markdown | .md |
Document Translation Workflow
- Upload source documents to Azure Blob Storage (source container)
- Create a target container for translated documents
- Generate SAS tokens for both containers
- Submit a batch translation request
- Poll for status until translation completes
- Download translated documents from the target container
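The submit and poll steps of this workflow can be sketched as follows. This is a minimal outline, not a definitive client: build_batch_request and poll_until_done are hypothetical helpers, the body field names (inputs, source, targets, sourceUrl, targetUrl) and the status values follow my understanding of the Document Translation REST API and should be checked against the current reference, and the SAS URLs are placeholders.

```python
import json
import time

# Statuses assumed to mean the batch job has finished (verify against the docs).
TERMINAL_STATUSES = {"Succeeded", "Failed", "Cancelled", "ValidationFailed"}

def build_batch_request(source_sas_url, targets, source_lang=None):
    """Build the JSON body for one batch: a source container and one
    target container (SAS URL) per target language."""
    source = {"sourceUrl": source_sas_url}
    if source_lang:
        source["language"] = source_lang
    return {
        "inputs": [
            {
                "source": source,
                "targets": [
                    {"targetUrl": url, "language": lang}
                    for lang, url in targets.items()
                ],
            }
        ]
    }

def poll_until_done(get_status, interval_seconds=5.0, max_attempts=60,
                    sleep=time.sleep):
    """Call get_status() until it returns a terminal status or attempts run out."""
    for _ in range(max_attempts):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(interval_seconds)
    raise TimeoutError("translation job did not finish in time")

request_body = build_batch_request(
    "https://<storage-account>.blob.core.windows.net/source?<sas-token>",
    {"fr": "https://<storage-account>.blob.core.windows.net/target-fr?<sas-token>"},
    source_lang="en",
)
print(json.dumps(request_body, indent=2))
```

In a real client, get_status would issue a GET against the status URL returned when the batch is submitted and read the status field from the JSON response.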
Custom Translator
Custom Translator improves translation quality for domain-specific content:
| Feature | Description |
|---|---|
| Parallel text | Provide aligned source-target sentence pairs |
| Document types | Training, tuning, testing, phrase dictionary, sentence dictionary |
| Minimum data | 10,000 parallel sentences for meaningful improvement |
| Evaluation | BLEU score measures translation quality |
| Deployment | Deployed custom model is referenced in API calls by its category ID |
Using a Custom Model
params = {
    "api-version": "3.0",
    "from": "en",
    "to": "fr",
    "category": "my-custom-category-id",  # category ID of the deployed custom model
}
On the Exam: Custom Translator questions typically test whether you know the minimum data requirements (10,000 parallel sentences), how to deploy a custom model (category ID), and how to evaluate quality (BLEU score).
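To give a feel for what a BLEU score measures, here is a simplified sentence-level sketch: clipped n-gram precision for n = 1 to 4, combined by geometric mean and multiplied by a brevity penalty, reported on a 0 to 100 scale. It assumes a single reference, whitespace tokenization, and no smoothing, so it is an illustration of the idea rather than the computation Custom Translator performs internally.

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def simple_bleu(candidate, reference, max_n=4):
    """Simplified single-reference BLEU on a 0-100 scale (no smoothing)."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, any zero precision gives 0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    return 100 * brevity_penalty * math.exp(sum(log_precisions) / max_n)

reference = "the cat sat on the mat"
print(simple_bleu("the cat sat on the mat", reference))        # identical text
print(simple_bleu("the cat sat on the mat today", reference))  # partial overlap
```

A higher score means the candidate shares more word sequences with the reference translation; an exact match scores 100 and a candidate with no shared words scores 0.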
Which header is required for Azure AI Translator API calls in addition to the subscription key?
What is transliteration?
What metric does Custom Translator use to evaluate translation quality?