4.2 Conversational Language Understanding (CLU)
Key Takeaways
- CLU (the successor to LUIS) enables building custom natural language understanding models that predict user intents and extract entities from utterances.
- A CLU model requires defining intents (what the user wants to do), entities (key information to extract), and utterances (example phrases for training).
- CLU supports multiple entity types: learned (ML-based), list (exact match), prebuilt (dates, numbers), and regex (pattern-based).
- The training process uses labeled utterances to build a model that can generalize to new, unseen utterances.
- CLU models are trained and deployed through Language Studio or the REST API, and called via a prediction endpoint.
Conversational Language Understanding (CLU)
Quick Answer: CLU (replacing LUIS) lets you build custom NLU models that predict user intents and extract entities from natural language text. Define intents, entities, and example utterances, then train and deploy the model. CLU supports learned, list, prebuilt, and regex entity types.
CLU vs. LUIS
| Feature | CLU (Current) | LUIS (Deprecated) |
|---|---|---|
| Platform | Azure AI Language | Standalone service |
| Portal | Language Studio | LUIS.ai portal |
| Entity types | Learned, List, Prebuilt, Regex | Machine-learned, List, Prebuilt, Regex, Composite |
| Multilingual | Native multilingual support | Separate model per language |
| Training | Standard and Advanced | Standard |
| Max utterances | 50,000 per project | 15,000 per application |
| Status | Generally Available | Deprecated — migrate to CLU |
On the Exam: LUIS is deprecated. All questions about conversational NLU will reference CLU. If you see "LUIS" in exam materials, the answer likely involves migrating to CLU.
Core Concepts
Intents
Intents represent the user's goal or purpose. Each utterance is mapped to exactly one intent.
| Intent | Example Utterances |
|---|---|
| BookFlight | "Book a flight to Paris", "I need to fly to London next week" |
| CheckWeather | "What's the weather like?", "Will it rain tomorrow in Seattle?" |
| OrderFood | "I'd like to order a pizza", "Can I get a large coffee?" |
| None | Built-in fallback for unrecognized utterances |
Entities
Entities are key pieces of information to extract from utterances.
| Entity Type | Description | Example |
|---|---|---|
| Learned | ML-based, trained from labeled examples | "Paris" as Destination |
| List | Exact match from a defined list | "Economy", "Business", "First" as SeatClass |
| Prebuilt | Pre-trained for common types | DateTime, Number, Temperature |
| Regex | Pattern matching with regular expressions | Flight numbers like "AA1234" |
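To illustrate what the regex entity type does at prediction time, the flight-number pattern from the table can be sketched with Python's `re` module. The exact pattern is an assumption for illustration (two uppercase letters followed by one to four digits); adjust it to your actual flight-number format:

```python
import re

# Hypothetical pattern for flight numbers like "AA1234":
# two uppercase airline letters followed by 1-4 digits.
FLIGHT_NUMBER = re.compile(r"\b[A-Z]{2}\d{1,4}\b")

def extract_flight_numbers(utterance: str) -> list[str]:
    """Return every substring matching the flight-number pattern."""
    return FLIGHT_NUMBER.findall(utterance)

print(extract_flight_numbers("Is AA1234 to Denver delayed? What about UA88?"))
# -> ['AA1234', 'UA88']
```

Unlike learned entities, a regex entity needs no training examples: any text matching the pattern is extracted.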
Utterances
Utterances are example phrases labeled with intents and entities:
```
Utterance: "Book a flight to [Paris](Destination) for [next Monday](DateTime)"
Intent:    BookFlight
Entities:  Destination = "Paris", DateTime = "next Monday"
```
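When labeling utterances through the REST API, entities are specified by character offset and length, which is easy to miscount by hand. A small helper (hypothetical, not part of any CLU SDK) can compute them from the utterance text:

```python
def label_entity(text: str, entity_text: str, category: str) -> dict:
    """Build a CLU-style entity label by locating entity_text inside text."""
    offset = text.find(entity_text)
    if offset == -1:
        raise ValueError(f"{entity_text!r} not found in {text!r}")
    return {"category": category, "offset": offset, "length": len(entity_text)}

print(label_entity("Book a flight to Paris", "Paris", "Destination"))
# -> {'category': 'Destination', 'offset': 17, 'length': 5}
```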
Building a CLU Model
Step 1: Create a Project
```python
import requests

endpoint = "https://my-language.cognitiveservices.azure.com"
api_key = "<your-key>"

# Create a CLU project
project_name = "FlightBooking"
url = f"{endpoint}/language/authoring/analyze-conversations/projects/{project_name}?api-version=2023-04-01"
headers = {
    "Ocp-Apim-Subscription-Key": api_key,
    "Content-Type": "application/json"
}
body = {
    "projectName": project_name,
    "language": "en",
    "projectKind": "Conversation",
    "description": "Flight booking assistant",
    "multilingual": True
}
response = requests.put(url, headers=headers, json=body)
```
Step 2: Define Intents and Entities
```python
# Add an intent
intent_name = "BookFlight"
intent_url = f"{endpoint}/language/authoring/analyze-conversations/projects/{project_name}/intents/{intent_name}?api-version=2023-04-01"
requests.put(intent_url, headers=headers)

# Add a learned entity
entity_name = "Destination"
entity_url = f"{endpoint}/language/authoring/analyze-conversations/projects/{project_name}/entities/{entity_name}?api-version=2023-04-01"
entity_body = {
    "category": "Destination",
    "compositionSetting": "combineComponents",
    "list": None,
    "prebuilts": None
}
requests.put(entity_url, headers=headers, json=entity_body)
```
Step 3: Add Labeled Utterances
```python
# Add labeled utterances
utterances_url = f"{endpoint}/language/authoring/analyze-conversations/projects/{project_name}/utterances?api-version=2023-04-01"
utterances = [
    {
        "text": "Book a flight to Paris",
        "language": "en",
        "intent": "BookFlight",
        "entities": [
            {
                "category": "Destination",
                "offset": 17,  # "Paris" starts at character 17
                "length": 5
            }
        ]
    },
    {
        "text": "I need to fly to London next week",
        "language": "en",
        "intent": "BookFlight",
        "entities": [
            {
                "category": "Destination",
                "offset": 17,  # "London" starts at character 17
                "length": 6
            }
        ]
    }
]
requests.post(utterances_url, headers=headers, json=utterances)
```
Step 4: Train the Model
```python
# Start training
train_url = f"{endpoint}/language/authoring/analyze-conversations/projects/{project_name}/:train?api-version=2023-04-01"
train_body = {
    "modelLabel": "v1",
    "trainingMode": "standard"  # or "advanced"
}
response = requests.post(train_url, headers=headers, json=train_body)
```
Step 5: Deploy and Call the Model
```python
# Deploy the trained model
deployment_name = "production"  # name for the deployment slot
deploy_url = f"{endpoint}/language/authoring/analyze-conversations/projects/{project_name}/deployments/{deployment_name}?api-version=2023-04-01"
deploy_body = {
    "trainedModelLabel": "v1"
}
requests.put(deploy_url, headers=headers, json=deploy_body)

# Call the prediction endpoint
predict_url = f"{endpoint}/language/:analyze-conversations?api-version=2023-04-01"
predict_body = {
    "kind": "Conversation",
    "analysisInput": {
        "conversationItem": {
            "id": "1",
            "text": "I want to fly to Tokyo tomorrow",
            "participantId": "user1"
        }
    },
    "parameters": {
        "projectName": project_name,
        "deploymentName": deployment_name
    }
}
result = requests.post(predict_url, headers=headers, json=predict_body).json()
```
Prediction Response
```json
{
  "kind": "ConversationResult",
  "result": {
    "prediction": {
      "topIntent": "BookFlight",
      "intents": [
        {"category": "BookFlight", "confidenceScore": 0.95},
        {"category": "CheckWeather", "confidenceScore": 0.02},
        {"category": "None", "confidenceScore": 0.03}
      ],
      "entities": [
        {
          "category": "Destination",
          "text": "Tokyo",
          "offset": 18,
          "length": 5,
          "confidenceScore": 0.92
        }
      ]
    }
  }
}
```
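Once the response is parsed into a Python dict (for example via `response.json()`), extracting the top intent and entities is a matter of walking the nested structure. This sketch reuses the sample response values shown above:

```python
# Sample prediction response, shaped like the output of response.json()
result = {
    "kind": "ConversationResult",
    "result": {
        "prediction": {
            "topIntent": "BookFlight",
            "intents": [
                {"category": "BookFlight", "confidenceScore": 0.95},
                {"category": "CheckWeather", "confidenceScore": 0.02},
                {"category": "None", "confidenceScore": 0.03},
            ],
            "entities": [
                {"category": "Destination", "text": "Tokyo",
                 "offset": 18, "length": 5, "confidenceScore": 0.92},
            ],
        }
    },
}

prediction = result["result"]["prediction"]
top_intent = prediction["topIntent"]
# Look up the confidence score of the top intent in the intents list
confidence = next(i["confidenceScore"] for i in prediction["intents"]
                  if i["category"] == top_intent)
entities = {e["category"]: e["text"] for e in prediction["entities"]}

print(top_intent, confidence, entities)
# -> BookFlight 0.95 {'Destination': 'Tokyo'}
```

A typical bot would branch on `top_intent` (rejecting low-confidence predictions) and pass the extracted entities to the fulfillment logic.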
Training Best Practices
| Practice | Recommendation |
|---|---|
| Utterances per intent | Minimum 10, recommended 25-50 |
| Balanced intents | Similar number of utterances per intent |
| Diverse examples | Vary phrasing, word order, and vocabulary |
| Entity coverage | Include examples with and without entities |
| None intent | Add 10-20% of utterances to the None intent |
| Evaluation | Review confusion matrix and per-intent metrics |
On the Exam: The None intent is critical. It captures utterances that don't match any defined intent. Without sufficient None examples, the model may incorrectly classify random text into defined intents with high confidence.
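The balance guidelines above can be sanity-checked before training. This sketch (thresholds taken from the table; the helper itself is hypothetical) counts utterances per intent on a local list shaped like the labeling payload from Step 3:

```python
from collections import Counter

def check_balance(utterances: list[dict], min_per_intent: int = 10,
                  none_share: tuple[float, float] = (0.10, 0.20)) -> list[str]:
    """Return warnings about intent balance in the training data."""
    counts = Counter(u["intent"] for u in utterances)
    total = sum(counts.values())
    warnings = []
    for intent, n in counts.items():
        if intent != "None" and n < min_per_intent:
            warnings.append(f"{intent}: only {n} utterances (min {min_per_intent})")
    none_frac = counts.get("None", 0) / total
    if not none_share[0] <= none_frac <= none_share[1]:
        warnings.append(f"None intent is {none_frac:.0%} of data "
                        f"(aim for {none_share[0]:.0%}-{none_share[1]:.0%})")
    return warnings

data = [{"intent": "BookFlight"}] * 12 + [{"intent": "CheckWeather"}] * 3
for warning in check_balance(data):
    print(warning)
```

Here the check flags both the under-represented CheckWeather intent and the missing None examples.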
Review Questions
- What has replaced LUIS (Language Understanding Intelligent Service) in Azure?
- Which CLU entity type would you use to extract flight numbers that follow the pattern "AA1234"?
- What is the purpose of the "None" intent in a CLU model?
- In a CLU prediction response, which field contains the most likely user intent?