4.2 Conversational Language Understanding (CLU)

Key Takeaways

  • CLU is the successor to LUIS and builds custom models that predict a top intent and extract entities from utterances.
  • A CLU project requires intents, entities, and labeled utterances; the None intent is the built-in fallback for unmatched input.
  • Entity components include Learned (ML), List (exact/synonyms), Prebuilt (Quantity, DateTime, etc.), and Regex, and can be combined with combineComponents.
  • Training modes are Standard (fast, free) and Advanced (slower, higher quality, billed); deployments are named slots you can swap.
  • The prediction response exposes topIntent plus a ranked intents array and an entities array with confidence scores and offsets.
Last updated: June 2026

Quick Answer: Conversational Language Understanding (CLU) replaces Language Understanding Intelligent Service (LUIS). You define intents (goals), entities (data to extract), and labeled utterances, then train and deploy a model that returns a topIntent and extracted entities. CLU supports Learned, List, Prebuilt, and Regex entity components and is authored in Language Studio as part of Azure AI Language.

CLU vs. LUIS

Feature        | CLU (current)             | LUIS (retired)
Platform       | Azure AI Language         | Standalone service
Portal         | Language Studio           | luis.ai
Multilingual   | Single multilingual model | One model per language
Training modes | Standard + Advanced       | Standard only
Max utterances | 50,000 / project          | 15,000 / app
Status         | Generally Available       | Retired; migrate to CLU

On the Exam: LUIS portal authoring was retired in 2025. Any question that mentions LUIS expects "migrate to CLU" as the answer (or "orchestration workflow" when the scenario combines intents with Q&A). Do not pick LUIS as a new build target.

Core Concepts

Intents map each utterance to exactly one goal. Entities are the data slots inside an utterance. CLU entities are built from one or more components:

Component | Behavior                      | Example
Learned   | ML-trained from labeled spans | "Paris" as Destination
List      | Exact text + synonyms list    | "economy"/"coach" → SeatClass
Prebuilt  | Microsoft-trained types       | Quantity, DateTime, Email
Regex     | Pattern match                 | Flight code [A-Z]{2}\d{4}
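
The Regex row can be tried out locally before the pattern goes into a project. The following is a minimal sketch in plain Python, not a CLU call: the helper name find_flight_codes is hypothetical, and the \b word boundaries are an added assumption so the pattern does not match inside longer tokens.

```python
import re

# Same pattern as the Regex row: two uppercase letters, then four digits.
# The \b boundaries are an assumption added for this sketch.
FLIGHT_CODE = re.compile(r"\b[A-Z]{2}\d{4}\b")


def find_flight_codes(utterance: str) -> list:
    """Return each match with the text/offset/length fields an entity carries."""
    return [
        {"text": m.group(), "offset": m.start(), "length": len(m.group())}
        for m in FLIGHT_CODE.finditer(utterance)
    ]


if __name__ == "__main__":
    print(find_flight_codes("Is AA1234 on time?"))
```

Returning offset and length alongside the text mirrors the entity fields shown in the prediction response, which makes local results easy to compare with what the service returns.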

A single entity may combine components — combineComponents: true returns one merged prediction when components overlap; false returns each component separately. This is a frequent exam distractor.
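
To make the merged-versus-separate distinction concrete, here is an illustrative model of the behavior described above. It is a sketch under stated assumptions, not the service's actual algorithm: the span dictionaries, the resolve function, and the rule "merge any spans that overlap into their union" are all hypothetical simplifications.

```python
def merge_overlapping(spans):
    """Merge spans that overlap in the utterance into one span each."""
    merged = []
    for span in sorted(spans, key=lambda s: s["offset"]):
        start = span["offset"]
        end = start + span["length"]
        if merged and start < merged[-1]["end"]:
            # Overlaps the previous span: extend it and record the component.
            merged[-1]["end"] = max(merged[-1]["end"], end)
            merged[-1]["components"].append(span["component"])
        else:
            merged.append({"start": start, "end": end,
                           "components": [span["component"]]})
    return merged


def resolve(spans, combine_components):
    """Return one merged prediction per overlap, or every component's span."""
    if combine_components:
        return merge_overlapping(spans)
    return [{"start": s["offset"], "end": s["offset"] + s["length"],
             "components": [s["component"]]} for s in spans]


if __name__ == "__main__":
    # "fly to New York City": Learned tags "New York City", List tags "New York".
    spans = [
        {"component": "Learned", "offset": 7, "length": 13},
        {"component": "List", "offset": 7, "length": 8},
    ]
    print(resolve(spans, combine_components=True))
    print(resolve(spans, combine_components=False))
```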

Building and Deploying a CLU Model

The authoring lifecycle is: create project → add intents/entities → label utterances → train → evaluate → deploy → predict. Training and deployment are separate steps; deploying does not retrain.

import requests

# Placeholders (assumptions): substitute your own resource values.
ep = "https://<your-resource>.cognitiveservices.azure.com"
proj = "<your-project>"
headers = {"Ocp-Apim-Subscription-Key": "<your-key>"}

# 1) Train (choose mode). Training runs as an asynchronous job,
#    so wait for it to finish before deploying.
train_body = {"modelLabel": "v1", "trainingMode": "standard"}  # or "advanced"
requests.post(f"{ep}/.../projects/{proj}/:train?api-version=2023-04-01",
              headers=headers, json=train_body)

# 2) Deploy a named slot that points at the trained model label
requests.put(f"{ep}/.../projects/{proj}/deployments/production?api-version=2023-04-01",
             headers=headers, json={"trainedModelLabel": "v1"})

# 3) Predict (runtime endpoint, NOT the authoring endpoint)
predict_body = {
    "kind": "Conversation",
    "analysisInput": {"conversationItem": {"id": "1",
        "text": "I want to fly to Tokyo tomorrow", "participantId": "u1"}},
    "parameters": {"projectName": proj, "deploymentName": "production"}}
result = requests.post(f"{ep}/language/:analyze-conversations?api-version=2023-04-01",
                       headers=headers, json=predict_body).json()

Training modes

Mode     | Speed | Quality                                 | Cost
Standard | Fast  | Good baseline                           | Free training
Advanced | Slow  | Higher accuracy, better generalization  | Billed per training hour

Prediction Response Shape

{"result": {"prediction": {
  "topIntent": "BookFlight",
  "intents": [{"category": "BookFlight", "confidenceScore": 0.95},
               {"category": "None", "confidenceScore": 0.03}],
  "entities": [{"category": "Destination", "text": "Tokyo",
                 "offset": 17, "length": 5, "confidenceScore": 0.92}]}}}

Read the most likely intent from topIntent; the full ranking is in intents[]. Never read intents[0] blindly — order is by score but topIntent is the contract.
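
As a rough illustration of reading the response this way, the sketch below takes the intent from topIntent and looks its score up in intents[] by name rather than by position. The function name summarize_prediction and the shape of its return value are assumptions made for this example; the sample payload is the one shown above.

```python
def summarize_prediction(response: dict) -> dict:
    """Pull the top intent, its score, and the entities out of a CLU response."""
    prediction = response["result"]["prediction"]
    # Index scores by category so the lookup does not depend on array order.
    scores = {i["category"]: i["confidenceScore"] for i in prediction["intents"]}
    top = prediction["topIntent"]
    entities = [
        {"category": e["category"], "text": e["text"],
         "confidenceScore": e["confidenceScore"]}
        for e in prediction["entities"]
    ]
    return {"intent": top, "confidence": scores.get(top), "entities": entities}


SAMPLE = {"result": {"prediction": {
    "topIntent": "BookFlight",
    "intents": [{"category": "BookFlight", "confidenceScore": 0.95},
                {"category": "None", "confidenceScore": 0.03}],
    "entities": [{"category": "Destination", "text": "Tokyo",
                  "offset": 17, "length": 5, "confidenceScore": 0.92}]}}}

if __name__ == "__main__":
    print(summarize_prediction(SAMPLE))
```

Keeping the intent score and each entity score as separate fields lets calling code apply different thresholds to each.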

Training Best Practices and Traps

  • Utterances per intent: at least 10, ideally 25-50; keep intents balanced.
  • None intent: seed it with 10-20% of total utterances or unrelated input gets force-classified with false confidence.
  • Diversity over volume: vary phrasing, word order, and entity placement.
  • Confusion checks: review the per-intent precision/recall and the confusion matrix after evaluation; two overlapping intents (e.g. CheckStatus vs. CancelOrder) signal you should merge or relabel.
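
The first two guidelines in this list can be checked mechanically before training. The following is a minimal sketch, assuming the labeled data is available locally as (utterance, intent) pairs; the function name check_training_set and its parameters are hypothetical, and the thresholds simply restate the numbers above.

```python
from collections import Counter


def check_training_set(labeled, min_per_intent=10, none_share=(0.10, 0.20)):
    """Return warnings for thin intents and an out-of-range None share."""
    counts = Counter(intent for _, intent in labeled)
    total = sum(counts.values())
    warnings = []
    for intent, n in sorted(counts.items()):
        if n < min_per_intent:
            warnings.append(
                f"{intent}: only {n} utterances (need at least {min_per_intent})")
    share = counts.get("None", 0) / total if total else 0.0
    low, high = none_share
    if not low <= share <= high:
        warnings.append(
            f"None: {share:.0%} of utterances (target {low:.0%}-{high:.0%})")
    return warnings


if __name__ == "__main__":
    data = ([("book a flight", "BookFlight")] * 20
            + [("cancel my order", "CancelOrder")] * 20
            + [("where is my order", "CheckStatus")] * 5)
    for warning in check_training_set(data):
        print(warning)
```

The repeated utterances in the demo data are only there to produce counts; real training data should follow the diversity guideline above.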

Common Trap: Calling the authoring endpoint for predictions returns 404 or authorization errors; predictions use the runtime :analyze-conversations path with projectName and deploymentName. Also remember that CLU is a single multilingual model: you label in one language and predict in many, unlike LUIS's per-language apps.

Evaluation and Iteration in Practice

After training, Language Studio reports an overall score plus per-intent and per-entity precision, recall, and F1, along with a confusion matrix. Reading this correctly is what separates a passing answer from a guess. If two intents constantly steal each other's utterances in the matrix, the underlying problem is usually that their training phrases overlap in meaning; the remedy is to either merge the intents or add sharply contrasting examples that teach the boundary. If a single entity has low recall, you need more labeled spans showing that entity in varied positions and surrounding contexts.
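
To see how the per-intent numbers relate to the confusion matrix, the sketch below derives precision, recall, and F1 from raw counts. It uses the standard definitions and is not Language Studio's own code; the nested-dictionary layout confusion[actual][predicted] and the function name per_intent_metrics are assumptions for illustration, and the counts are made up.

```python
def per_intent_metrics(confusion):
    """Compute precision, recall, and F1 per intent.

    confusion[actual][predicted] holds the number of test utterances
    labeled `actual` that the model predicted as `predicted`.
    """
    labels = list(confusion)
    metrics = {}
    for label in labels:
        tp = confusion[label].get(label, 0)
        fn = sum(confusion[label].values()) - tp
        fp = sum(confusion[other].get(label, 0)
                 for other in labels if other != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


if __name__ == "__main__":
    # Hypothetical counts: the two intents steal utterances from each other.
    confusion = {
        "CheckStatus": {"CheckStatus": 8, "CancelOrder": 2},
        "CancelOrder": {"CheckStatus": 4, "CancelOrder": 6},
    }
    for intent, m in per_intent_metrics(confusion).items():
        print(intent, {k: round(v, 2) for k, v in m.items()})
```

In this made-up matrix, four CancelOrder utterances land in CheckStatus, which lowers CheckStatus precision and CancelOrder recall at the same time; that paired drop is the signature of overlapping intents.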

A subtle but exam-relevant point is the difference between intent confidence and entity confidence. The top intent can be highly confident while an entity inside the same utterance is missed entirely, because intents are classified from the whole sentence whereas entities are extracted span by span. So a flight-booking utterance can correctly resolve to the BookFlight intent yet fail to tag the destination if that city never appeared in training. Treat them as two separate quality dials.

Deployment slots also matter operationally. Because a deployment points at a specific trained model label, you can train a new candidate model, deploy it to a staging slot, validate it against real traffic, and only then repoint the production slot — all without changing client code, since clients address the deployment name rather than a model version. This staging pattern, together with the runtime-versus-authoring endpoint distinction, is the kind of lifecycle detail AI-102 likes to probe with subtly wrong answer choices.
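
The staging pattern can be sketched as an ordered list of deployment requests. This example only builds the requests and sends nothing. The function name promotion_requests is hypothetical, and authoring_base is a caller-supplied stand-in for the authoring URL prefix, which this section does not spell out; the PUT verb, the deployments/{slot} suffix, and the trainedModelLabel body follow the deploy step shown earlier.

```python
def promotion_requests(authoring_base, project, model_label,
                       api_version="2023-04-01"):
    """Build (method, url, body) tuples: staging first, then production."""
    def deploy(slot):
        url = (f"{authoring_base}/projects/{project}"
               f"/deployments/{slot}?api-version={api_version}")
        return ("PUT", url, {"trainedModelLabel": model_label})

    # Validate the staging deployment before sending the second request.
    return [deploy("staging"), deploy("production")]


if __name__ == "__main__":
    # "https://example.invalid/authoring" is a placeholder, not a real endpoint.
    for method, url, body in promotion_requests(
            "https://example.invalid/authoring", "FlightBot", "v2"):
        print(method, url, body)
```

Because both requests carry the same model label and differ only in slot name, clients that address "production" pick up the new model without any change on their side.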

Test Your Knowledge

Which service is the supported replacement for the retired LUIS for building custom intent-and-entity models?

You need to extract flight codes that always match two letters followed by four digits (e.g. "AA1234"). Which CLU entity component is most appropriate?

After training, random off-topic phrases are being classified into your defined intents with high confidence. What is the most likely fix?
