Domain 5: Implement natural language processing solutions (15-20%)
This domain covers analyzing text and speech using the Azure AI Language and Azure AI Speech services. At 15-20% of the exam, it ties with Domains 1 and 6 as the highest-weighted areas, so prioritize it.
5.1 Language Service (Analyze Text)
Pre-built Text Analytics Features
| Feature | What It Does | Exam Trigger Phrase |
|---|---|---|
| Sentiment Analysis | Returns positive/negative/neutral at document + sentence level | "customer feedback analysis", "sentiment per sentence" |
| Key Phrase Extraction | Identifies the main topics in a document | "summarize key topics" |
| NER | Extracts people, organizations, locations, dates | "extract entities from text" |
| Entity Linking | Disambiguates named entities using Wikipedia (e.g., "Mercury" → planet or element) | "resolve entity ambiguity" |
| PII Detection | Detects and can redact personal data (emails, SSN, phone numbers) | "remove sensitive data before storage" |
| Language Detection | Identifies the language of text | "detect language automatically" |
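As a rough illustration of how the features in the table map onto a single REST operation, the sketch below builds the JSON body for an Analyze Text call. The helper name `build_analyze_text_body`, the short feature keys, and the sample sentence are hypothetical. The `kind` strings and the body layout are assumed from the Analyze Text REST API and should be checked against the API version you target.

```python
import json

# Map short feature keys (hypothetical) to the request "kind" values.
# These kind strings are an assumption based on the Analyze Text REST API.
FEATURE_KINDS = {
    "sentiment": "SentimentAnalysis",
    "key_phrases": "KeyPhraseExtraction",
    "ner": "EntityRecognition",
    "entity_linking": "EntityLinking",
    "pii": "PiiEntityRecognition",
    "language_detection": "LanguageDetection",
}


def build_analyze_text_body(feature, texts, language="en"):
    """Build the JSON body for one Analyze Text request (hypothetical helper)."""
    kind = FEATURE_KINDS[feature]
    documents = []
    for index, text in enumerate(texts, start=1):
        doc = {"id": str(index), "text": text}
        # Language detection takes no language field: that is what it returns.
        if feature != "language_detection":
            doc["language"] = language
        documents.append(doc)
    return {
        "kind": kind,
        "parameters": {"modelVersion": "latest"},
        "analysisInput": {"documents": documents},
    }


body = build_analyze_text_body(
    "sentiment", ["The room was clean but the staff were rude."]
)
print(json.dumps(body, indent=2))
```

Switching the feature key is the only change needed to request a different analysis, which is why exam scenarios usually hinge on picking the right feature rather than a different service.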
Custom Language Models
| Model | Use Case | Training Data Needed |
|---|---|---|
| Custom NER | Extract domain-specific entities (legal terms, medical codes) | Labeled examples with entity spans |
| Custom Text Classification | Categorize documents into your own labels | Labeled documents per category |
NER vs Entity Linking
NER identifies entities and their type (Person, Location, etc.). Entity Linking disambiguates entities by connecting them to a known knowledge base entry. Both can be used together.
5.2 Conversational Language Understanding (CLU)
CLU replaces the older LUIS service. It turns spoken or typed utterances into structured intents + entities.
Core Concepts
| Concept | Definition | Example |
|---|---|---|
| Utterance | What the user says | "Set an alarm for 7am tomorrow" |
| Intent | The user's goal | SetAlarm |
| Entity | A parameter extracted from the utterance | Time = 7am, Date = tomorrow |
| Confidence Score | 0 to 1 probability that the model assigned the correct intent | Use a threshold (e.g., > 0.7) to reject low-confidence calls |
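The concepts in the table can be seen together in a minimal sketch that reads a CLU-style prediction and applies the confidence threshold. The function name `route_prediction` and the sample scores are made up for illustration, and the response layout (`result.prediction.topIntent`, `intents`, `entities`) is an assumption about the CLU runtime response rather than a guaranteed schema.

```python
def route_prediction(response, threshold=0.7):
    """Return (intent, entities), or (None, {}) when confidence is too low.

    The response layout is an assumed CLU-style shape, not a guaranteed schema.
    """
    prediction = response["result"]["prediction"]
    top_intent = prediction["topIntent"]
    scores = {i["category"]: i["confidenceScore"] for i in prediction["intents"]}
    # Accept only scores strictly above the threshold (the "> 0.7" rule).
    if scores.get(top_intent, 0.0) <= threshold:
        return None, {}
    entities = {e["category"]: e["text"] for e in prediction["entities"]}
    return top_intent, entities


# Illustrative response for the utterance in the table; the scores are invented.
SAMPLE_RESPONSE = {
    "kind": "ConversationResult",
    "result": {
        "query": "Set an alarm for 7am tomorrow",
        "prediction": {
            "topIntent": "SetAlarm",
            "projectKind": "Conversation",
            "intents": [
                {"category": "SetAlarm", "confidenceScore": 0.93},
                {"category": "None", "confidenceScore": 0.04},
            ],
            "entities": [
                {"category": "Time", "text": "7am", "confidenceScore": 1.0},
                {"category": "Date", "text": "tomorrow", "confidenceScore": 1.0},
            ],
        },
    },
}

intent, entities = route_prediction(SAMPLE_RESPONSE)
print(intent, entities)
```

Returning `None` for a low-confidence prediction lets the calling app ask the user to rephrase instead of acting on a guess.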
CLU Lifecycle
Design → Label utterances → Train → Test → Deploy → Consume via SDK/REST

- Deployment slots: Maintain separate `production` and `staging` deployments so you can test a new model version before promoting it.
- Export/Import: The model definition is exported as JSON, which is useful for version control or migrating between projects.
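To make the "Consume via SDK/REST" step and the role of deployment names concrete, here is a small sketch that builds the request body for a CLU prediction call. The helper `build_clu_request` and the project name `alarm-bot` are hypothetical, and the body layout is assumed from the analyze-conversations REST operation, so treat it as a starting point rather than a reference.

```python
def build_clu_request(utterance, project_name, deployment_name="production"):
    """Build a CLU prediction request body (assumed layout, hypothetical helper)."""
    return {
        "kind": "Conversation",
        "analysisInput": {
            "conversationItem": {
                "id": "1",
                "participantId": "user",
                "text": utterance,
            }
        },
        "parameters": {
            "projectName": project_name,
            # The deployment name selects which trained model answers the call.
            "deploymentName": deployment_name,
        },
    }


# Same utterance, sent to the staging deployment first and production later.
staging_request = build_clu_request(
    "Set an alarm for 7am tomorrow", "alarm-bot", deployment_name="staging"
)
production_request = build_clu_request("Set an alarm for 7am tomorrow", "alarm-bot")
print(staging_request["parameters"], production_request["parameters"])
```

Because only `deploymentName` differs between the two requests, client code can be pointed at a new model version by changing one setting.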
CLU vs Custom QA
CLU maps an utterance to an intent + entities (structured output). Use it when your app needs to take action based on what the user wants. Custom QA matches a question to a stored answer (Q&A pair output). Use it when your app needs to return information from a knowledge base.
5.3 Custom Question Answering
Custom QA (formerly QnA Maker) builds a knowledge base of Q&A pairs and returns the best matching answer to a user question.
Knowledge Base Sources
| Source Type | Example |
|---|---|
| URLs | Public FAQ web pages, product documentation |
| Files | PDF manuals, Word documents, Excel/TSV Q&A spreadsheets |
| Manual entry | Directly authored Q&A pairs |
Key Features
| Feature | What It Does |
|---|---|
| Multi-turn conversations | Adds follow-up prompts to an answer (e.g., "Did that help?" → "Yes/No" branches) |
| Alternate phrasing | Add synonym questions to improve match rate |
| Chit-chat | Pre-built personality sets (Professional, Friendly, Witty) for small talk |
| Active Learning | Surfaces low-confidence questions for human review and improvement |
| Confidence threshold | Rejects answers below a set score, which prevents wrong answers from being returned |
Deployment
- Create a project in Language Studio or AI Foundry
- Add sources and label/edit Q&A pairs
- Train and test
- Deploy to a named endpoint
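- Consume via REST API (`POST {endpoint}/language/:query-knowledgebases`, with `projectName` and `deploymentName` as query parameters; the legacy QnA Maker route was `POST /knowledgebases/{kbId}/generateAnswer`)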
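The final step above can be sketched as a request builder plus a small answer picker. This assumes the Language-service route `:query-knowledgebases` with API version `2021-10-01`, and the field names `confidenceScoreThreshold`, `answers`, and `dialog.prompts` are taken from that API as I understand it. The helper names, the endpoint host, the project name `faq`, and the sample answer are all hypothetical.

```python
from urllib.parse import urlencode


def build_qa_query(endpoint, project_name, deployment_name, question,
                   threshold=0.5, top=3):
    """Return (url, body) for a Custom QA query (assumed route and fields)."""
    query = urlencode({
        "projectName": project_name,
        "deploymentName": deployment_name,
        "api-version": "2021-10-01",
    })
    url = f"{endpoint.rstrip('/')}/language/:query-knowledgebases?{query}"
    body = {
        "question": question,
        "top": top,
        # Ask the service to drop answers scoring below this value.
        "confidenceScoreThreshold": threshold,
    }
    return url, body


def best_answer(response, fallback="Sorry, I don't have an answer for that."):
    """Pick the highest-scoring answer and its multi-turn follow-up prompts."""
    answers = response.get("answers", [])
    if not answers:
        return fallback, []
    top_answer = max(answers, key=lambda a: a["confidenceScore"])
    prompts = [p["displayText"]
               for p in top_answer.get("dialog", {}).get("prompts", [])]
    return top_answer["answer"], prompts


url, body = build_qa_query(
    "https://example.cognitiveservices.azure.com/", "faq", "production",
    "How do I reset my password?",
)
# Invented response used only to exercise best_answer.
sample = {"answers": [{
    "answer": "Use the 'Forgot password' link.",
    "confidenceScore": 0.91,
    "dialog": {"prompts": [{"displayText": "Did that help?"}]},
}]}
answer, prompts = best_answer(sample)
print(url)
print(answer, prompts)
```

The fallback string covers the case where every candidate falls below the threshold and the service returns no usable answer.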
Multi-turn Q&A
The exam phrase "multi-turn Q&A from documents" maps to Custom Question Answering, not CLU. CLU is for intent detection; Custom QA is for retrieving stored answers.
5.4 Speech Services
Core Capabilities
| Capability | Description | SDK Class |
|---|---|---|
| Speech-to-Text (STT) | Real-time or batch audio transcription | SpeechRecognizer |
| Text-to-Speech (TTS) | Synthesize neural voices with natural prosody | SpeechSynthesizer |
| Speech Translation | Real-time STT + translation in one pass | TranslationRecognizer |
| Intent Recognition | Detect CLU intents directly from spoken audio | IntentRecognizer with CLU model |
| Keyword Recognition | Local on-device detection of a wake word | KeywordRecognizer |
Intent Recognition vs Keyword Recognition
Intent Recognition understands what the user wants (requires a CLU model and a cloud call). Keyword Recognition detects a specific activation word (e.g., "Hey Cortana") and runs locally on the device, with no cloud call needed.
The exam uses "recognize spoken intent / commands" → Intent Recognition + Speech SDK. The exam uses "wake word / offline activation" → Keyword Recognition.
Generative Speaking (GenAI Speaking)
Combines Azure OpenAI with Speech TTS to produce expressive, context-aware spoken responses, used in AI agents that speak to users in real time.
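For the Text-to-Speech row in the capabilities table, one common way to control the voice and prosody is to pass SSML instead of plain text. The sketch below only builds and validates the SSML string and does not call the Speech service. The helper `build_ssml` is hypothetical, and the voice name `en-US-JennyNeural` is used as an assumed example of a neural voice.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

SSML_NAMESPACE = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text, voice="en-US-JennyNeural", lang="en-US", rate="medium"):
    """Wrap text in SSML that selects a voice and a speaking rate."""
    return (
        f'<speak version="1.0" xmlns="{SSML_NAMESPACE}" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>"
        # escape() protects characters such as & and < in the spoken text.
        f"<prosody rate={quoteattr(rate)}>{escape(text)}</prosody>"
        "</voice></speak>"
    )


ssml = build_ssml("Your order for fish & chips is ready.", rate="slow")
root = ET.fromstring(ssml)  # raises ParseError if the markup is malformed
print(ssml)
```

In the Speech SDK, a string like this would typically be handed to a `SpeechSynthesizer` through its SSML entry point rather than its plain-text one; check the SDK reference for the exact method in your language.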
Batch Transcription
For large audio files that cannot be processed in real-time:
- Submit files stored in Azure Blob Storage
- Use an async pattern: submit → poll status → retrieve transcript
- Returns word-level timestamps and speaker diarization
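The submit → poll → retrieve pattern above can be sketched without any network code by injecting the status check as a function. The helper names are hypothetical. The job fields (`contentUrls`, `wordLevelTimestampsEnabled`, `diarizationEnabled`) and the status strings are assumed from the batch transcription REST API and may differ by API version.

```python
import time

# Status values assumed from the batch transcription REST API.
TERMINAL_STATES = {"Succeeded", "Failed"}


def build_transcription_job(content_urls, locale="en-US", display_name="batch-job"):
    """Build the body used to submit audio files held in Blob Storage."""
    return {
        "contentUrls": list(content_urls),
        "locale": locale,
        "displayName": display_name,
        "properties": {
            "wordLevelTimestampsEnabled": True,
            "diarizationEnabled": True,
        },
    }


def poll_until_done(get_status, interval_seconds=30.0, max_attempts=120,
                    sleep=time.sleep):
    """Call get_status() until it reports a terminal state or attempts run out."""
    for _ in range(max_attempts):
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        sleep(interval_seconds)
    raise TimeoutError("transcription did not finish in time")


# Simulated service: the job moves through three states.
fake_statuses = iter(["NotStarted", "Running", "Succeeded"])
final_status = poll_until_done(
    lambda: next(fake_statuses), interval_seconds=0, sleep=lambda seconds: None
)
print(final_status)
```

In real use, `get_status` would issue a GET against the transcription's status URL, and the transcript files would be downloaded only after the status reads `Succeeded`.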
5.5 Translator Service
| Feature | Description |
|---|---|
| Text Translation | Translate text across 100+ languages in a single API call |
| Transliterate | Convert script without changing language (e.g., Japanese Kanji → Romaji) |
| Detect | Auto-detect the source language |
| Dictionary Lookup | Returns alternate translations and examples |
| Document Translation | Async translation of entire Word/PDF files, preserving layout |
| Custom Translator | Train a domain-specific translation model using parallel corpora (TMX/XLIFF files) |
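To show how the Text Translation and Detect rows combine in one call, this sketch builds a Translator v3 request for several target languages. The helper `build_translate_request` is hypothetical. The global endpoint, the `api-version=3.0` parameter, and the `Text` body field are assumed from the Translator REST API.

```python
from urllib.parse import urlencode

# Global Translator endpoint (assumed; regional endpoints also exist).
TRANSLATOR_ENDPOINT = "https://api.cognitive.microsofttranslator.com"


def build_translate_request(texts, to_languages, from_language=None):
    """Return (url, body) for a text translation request."""
    params = [("api-version", "3.0")]
    # Omitting "from" asks the service to auto-detect the source language.
    if from_language:
        params.append(("from", from_language))
    # Repeating "to" requests several target languages in a single call.
    params.extend(("to", lang) for lang in to_languages)
    url = f"{TRANSLATOR_ENDPOINT}/translate?{urlencode(params)}"
    body = [{"Text": text} for text in texts]
    return url, body


translate_url, translate_body = build_translate_request(
    ["Hello, world"], ["fr", "de"]
)
print(translate_url)
print(translate_body)
```

A real request would also need the resource key and region headers, which are left out here because they are deployment-specific.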
Document Translation Pattern
Document Translation is async: submit → get the Operation-Location header → poll until complete → download translated files. It follows the same 202 → GET pattern as the Read API and batch operations.
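The first two steps of this pattern can be sketched as a body builder plus a check on the submit response. Both helpers are hypothetical, the container URLs are placeholders, and the `inputs`/`source`/`targets` layout is assumed from the Document Translation REST API.

```python
def build_document_translation_body(source_container_url, target_container_url,
                                    target_language):
    """Build the submit body for one source container and one target language."""
    return {
        "inputs": [
            {
                "source": {"sourceUrl": source_container_url},
                "targets": [
                    {"targetUrl": target_container_url, "language": target_language}
                ],
            }
        ]
    }


def status_url_from_submit(status_code, headers):
    """Return the URL to poll, taken from the Operation-Location header."""
    if status_code != 202:
        raise RuntimeError(f"job was not accepted (HTTP {status_code})")
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered["operation-location"]


doc_body = build_document_translation_body(
    "https://example.blob.core.windows.net/source",
    "https://example.blob.core.windows.net/target-fr",
    "fr",
)
# Simulated 202 response; the operation URL is a placeholder.
poll_url = status_url_from_submit(
    202, {"Operation-Location": "https://example.invalid/batches/123"}
)
print(doc_body)
print(poll_url)
```

Once the status URL is known, the remaining steps are a GET loop on that URL followed by reading the translated files from the target container.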