Domain 1: Plan and manage an Azure AI solution (20-25%)
This domain covers the foundational infrastructure of Azure AI: resource selection, deployment strategy, security, responsible AI, and monitoring. At 20-25%, it is the highest-weighted domain on the exam.
1.1 Select and Deploy Resources
Azure AI Foundry Architecture
| Resource | Role | Exam Signal |
|---|---|---|
| Azure AI Services | The underlying platform APIs (Vision, Language, Speech, etc.) | Any time you provision an API endpoint |
| AI Foundry Hub | Shared infrastructure: compute, connections, role assignments, security for multiple teams/projects | "shared compute", "manage multiple teams" |
| AI Foundry Project | Your workspace inside a Hub: build, test, and deploy models and agents | "build a chatbot", "deploy a model" |
Hub vs Project (Exam Trap)
Hub = shared infrastructure for multiple teams (compute, keys, network policy). Project = your individual workspace inside a Hub (models, datasets, prompt flows).
The exam uses "set up shared compute for multiple teams" → Hub. "Build a specific solution" → Project.
Service Selection
| Capability | Service to Use |
|---|---|
| Generative AI (LLMs) | Azure OpenAI / Model Catalog in AI Foundry |
| Vision (images, video) | Azure AI Vision / Video Indexer |
| NLP (sentiment, NER, Q&A) | Azure AI Language |
| Speech (STT, TTS, translation) | Azure AI Speech |
| Knowledge Mining | Azure AI Search |
| Form / Document extraction | Document Intelligence |
| Multimodal pipelines | Content Understanding |
Deployment Options
| Option | Characteristics | Exam Signal |
|---|---|---|
| Standard (Pay-as-you-go) | Shared infrastructure, variable latency, billed per token/call | Default choice |
| Provisioned (PTU) | Reserved capacity, consistent latency, higher fixed cost | "predictable latency", "guaranteed throughput" |
| Docker Container | Runs on-premises or at the edge; requires connectivity to Azure for billing/metering | "data residency", "offline", "edge deployment" |
| Global deployment | Routes to nearest Azure region, higher rate limits | "scale globally" |
PTU vs Standard
Standard = pay-per-token, latency varies with load. PTU (Provisioned Throughput) = reserved capacity, fixed cost, consistent sub-second latency.
The exam signals PTU with phrases like "predictable latency", "guaranteed throughput", or "high-volume production workload".
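The trade-off between the two billing models is simple arithmetic. A minimal sketch of the break-even calculation, where the function name and all prices are illustrative placeholders, not real Azure list prices:

```python
def breakeven_tokens_per_month(ptu_monthly_cost: float, standard_price_per_1k: float) -> float:
    """Monthly token volume above which a reserved PTU commitment costs less
    than pay-as-you-go billing at standard_price_per_1k per 1,000 tokens.
    All prices here are placeholders, not real Azure rates."""
    return ptu_monthly_cost / standard_price_per_1k * 1_000

# e.g. a hypothetical $10,000/month PTU commitment vs $0.002 per 1K tokens
volume = breakeven_tokens_per_month(10_000, 0.002)  # ~5 billion tokens/month
```

Below that volume, Standard is cheaper; above it, PTU wins on both cost and latency consistency.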
1.2 Manage Costs
- Billing models: AI services charge per transaction (API call), per page (Document Intelligence), or per token (OpenAI models).
- Azure Cost Management: Set budgets and configure alerts when spending approaches thresholds.
- Single-service vs. multi-service resources: use single-service resources when you need per-service billing (e.g., department-level chargebacks); a multi-service resource consolidates multiple APIs behind one endpoint, key, and bill.
- Token monitoring: Track input/output token counts in generative apps to prevent runaway costs; use AI Foundry metrics dashboards.
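Because generative models bill input and output tokens at different rates, per-call cost tracking is a small calculation. A hedged sketch with placeholder prices (not real Azure rates):

```python
def estimate_call_cost(input_tokens: int, output_tokens: int,
                       input_price_per_1k: float, output_price_per_1k: float) -> float:
    """Rough cost of one generative API call from its token counts.
    Prices are per 1,000 tokens and are placeholders for this sketch."""
    return (input_tokens / 1_000) * input_price_per_1k \
         + (output_tokens / 1_000) * output_price_per_1k

# e.g. 1,200 prompt tokens + 400 completion tokens at hypothetical rates
cost = estimate_call_cost(1_200, 400, 0.01, 0.03)
```

Summing this per request in application telemetry is one way to feed the budget alerts described above.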
1.3 Manage Security and Authentication
Authentication Methods
| Method | When to Use | Exam Signal |
|---|---|---|
| Managed Identity + DefaultAzureCredential() | Production apps, CI/CD, key rotation avoidance | "avoid hardcoded keys", "keyless auth" |
| Subscription Key | Quick testing, scripts, development | "simple testing" |
| RBAC Roles | Granular access control, audit trail | "least privilege", "cross-service access" |
DefaultAzureCredential
DefaultAzureCredential() from the Azure Identity SDK tries authentication sources in order: environment variables → managed identity → Visual Studio → CLI → browser. In production, use Managed Identity so no credentials are stored in code.
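The chain behaves as first-match-wins over an ordered list of sources. A toy sketch of that fallback behavior (this is not the real azure-identity implementation, which probes live auth sources rather than a list):

```python
def resolve_credential(sources):
    """First-match-wins over an ordered list of (name, token_or_None) pairs,
    mimicking how a chained credential walks its sources in order.
    Toy model only: names and tokens here are illustrative."""
    for name, token in sources:
        if token is not None:
            return name  # first source that can authenticate wins
    raise RuntimeError("credential chain exhausted: no source available")

# In production, typically only managed identity is available:
chain = [("environment", None), ("managed_identity", "mi-token"), ("azure_cli", "cli-token")]
winner = resolve_credential(chain)  # "managed_identity"
```

This is why the same code works unchanged on a developer laptop (CLI login wins) and in Azure (managed identity wins).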
Key Storage
- Store subscription keys in Azure Key Vault; never hardcode them in source code or config files.
- Use the Key Vault SDK to retrieve keys at runtime.
Network Security
| Control | Purpose |
|---|---|
| Private Endpoints | Access services over private IP (no public internet) |
| Firewall / IP Rules | Restrict to specific IP ranges or VNETs |
| VNet Integration | Services communicate over private network |
RBAC Roles (Common Exam Roles)
| Role | Permissions |
|---|---|
| Cognitive Services Contributor | Create and manage resources |
| Cognitive Services User | Call APIs, read keys |
| Cognitive Services OpenAI Contributor | Deploy and manage OpenAI models |
| Azure AI Developer | Access Foundry projects, deploy models |
1.4 Implement Responsible AI and Content Safety
Microsoft's 6 Responsible AI Principles
- Fairness: mitigate bias in model outputs
- Reliability & Safety: consistent performance, error handling
- Privacy & Security: protect data throughout the AI lifecycle
- Inclusiveness: accessible to all users
- Transparency: explainable AI decisions
- Accountability: human oversight of AI systems
Azure AI Content Safety
A standalone service for moderating user-generated content:
| Category | Severity Levels |
|---|---|
| Hate | Safe / Low / Medium / High |
| Self-Harm | Safe / Low / Medium / High |
| Violence | Safe / Low / Medium / High |
| Sexual | Safe / Low / Medium / High |
- Blocklists: Custom lists of terms to always reject regardless of category severity.
Content Safety vs Content Filters
Azure AI Content Safety = standalone service, used for moderating user-generated content in any application. Azure OpenAI Content Filters = built into the OpenAI model deployment, applied to model inputs and outputs.
They are separate. The exam tests this: "user content moderation" → Content Safety. "Model output filtering" → OpenAI Content Filters.
Generative AI Safeguards
| Safeguard | Purpose |
|---|---|
| Prompt Shields | Detects and blocks prompt injection attacks (users trying to override system instructions) |
| Grounded responses | RAG pattern ensures model answers from provided context, reducing hallucination |
| Content Filters | Applied to Azure OpenAI model inputs AND outputs |
1.5 Monitor and Troubleshoot
| Tool | Use Case |
|---|---|
| Azure Monitor / Diagnostic Settings | Route logs to Log Analytics, Storage, or Event Hubs |
| Alerts | Fire on 429 (rate limit exceeded) or 5xx (server errors) |
| Foundry Tracing | Debug LLM call chains: see every step, tool call, and latency |
| AI Foundry Metrics | Track token usage, request counts, error rates |
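Alerting on 429s is the monitoring side; client code should also back off and retry. A minimal exponential-backoff sketch, where `send` is any caller-supplied function returning a `(status_code, body)` pair (an assumption of this sketch):

```python
import time

def call_with_backoff(send, max_retries: int = 4, base_delay: float = 1.0, sleep=time.sleep):
    """Retry `send` on HTTP 429 (rate limit), doubling the delay each attempt.
    Any other status is returned immediately; `sleep` is injectable for testing."""
    for attempt in range(max_retries):
        status, body = send()
        if status != 429:
            return status, body
        sleep(base_delay * (2 ** attempt))  # back off before retrying
    return send()  # final attempt, result returned as-is
```

Real SDK clients often honor the `Retry-After` header instead of a fixed schedule; this sketch shows only the exam-relevant pattern of not hammering a throttled endpoint.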