AIF-C01: Study Notes
Comprehensive study notes for the AIF-C01 AWS Certified AI Practitioner certification.
Domain 1: Fundamentals of AI and ML (20%)
1.1: AI, ML, and DL Hierarchy
┌─────────────────────────────────────┐
│ Artificial Intelligence (AI) │ Broadest: Machines that mimic human intelligence
│ ┌───────────────────────────────┐ │
│ │ Machine Learning (ML) │ │ Subset: Learn from data without explicit programming
│ │ ┌─────────────────────────┐ │ │
│ │ │ Deep Learning (DL) │ │ │ Subset: Neural networks with multiple layers
│ │ └─────────────────────────┘ │ │
│ └───────────────────────────────┘ │
└─────────────────────────────────────┘
Why This Matters
Exam questions test whether you understand the hierarchy: All DL is ML, all ML is AI, but not all AI is ML.
Types of Machine Learning
1. Supervised Learning
Definition: Learn from labeled data
Components:
- Features (X): Input variables
- Labels (y): Known output/target
- Model: Learns mapping from X to y
Types:
- Classification: Predict categories (spam/not spam)
- Regression: Predict continuous values (house price)
Examples:
- Email spam detection
- Image classification (cat vs dog)
- House price prediction
- Customer churn prediction
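The exam is code-free, but a minimal sketch makes the features/labels/model vocabulary concrete. This uses scikit-learn on synthetic data (our choice of library, not something the exam mandates):

```python
# Minimal supervised-learning sketch: labeled data (X, y) trains a model
# that learns the mapping from features to labels.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = LogisticRegression().fit(X_train, y_train)  # learn X -> y
print(model.predict(X_test[:5]))                    # predicted labels
print(model.score(X_test, y_test))                  # accuracy on unseen data
```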
2. Unsupervised Learning
Definition: Learn patterns from unlabeled data
Types:
- Clustering: Group similar items
- Dimensionality Reduction: Reduce features while preserving information
- Anomaly Detection: Find outliers
Examples:
- Customer segmentation
- Anomaly detection in network traffic
- Product recommendations
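Likewise, a minimal clustering sketch (scikit-learn again, synthetic data) shows learning structure with no labels at all:

```python
# Minimal unsupervised sketch: k-means groups unlabeled points into clusters.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)  # labels discarded
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(kmeans.labels_[:10])      # cluster assigned to each point
print(kmeans.cluster_centers_)  # learned group centers
```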
3. Reinforcement Learning
Definition: Learn through trial and error with rewards
Components:
- Agent: Learns and makes decisions
- Environment: What agent interacts with
- Actions: What agent can do
- Rewards: Feedback for actions
Examples:
- Game playing (AlphaGo)
- Robotics
- Autonomous vehicles
- Resource optimization
| Learning Type | Data | Goal | Example |
|---|---|---|---|
| Supervised | Labeled | Predict labels | Spam detection |
| Unsupervised | Unlabeled | Find patterns | Customer groups |
| Reinforcement | Rewards/penalties | Maximize rewards | Game AI |
ML Model Performance Metrics
Classification Metrics
Confusion Matrix:
| | Predicted Positive | Predicted Negative |
|---|---|---|
| Actual Positive | TP (True Positive) | FN (False Negative) |
| Actual Negative | FP (False Positive) | TN (True Negative) |
Key Metrics:
Accuracy = (TP + TN) / Total
- Good when classes are balanced
Precision = TP / (TP + FP)
- "Of all predicted positives, how many are actually positive?"
- Use when false positives are costly
Recall = TP / (TP + FN)
- "Of all actual positives, how many did we catch?"
- Use when false negatives are costly
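These formulas translate directly into code; a quick sketch with made-up counts:

```python
# The three metrics above, computed from raw confusion-matrix counts.
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    total = tp + fp + fn + tn
    return {
        "accuracy":  (tp + tn) / total,  # correct predictions / all predictions
        "precision": tp / (tp + fp),     # of predicted positives, fraction correct
        "recall":    tp / (tp + fn),     # of actual positives, fraction caught
    }

print(classification_metrics(tp=80, fp=10, fn=20, tn=90))
# {'accuracy': 0.85, 'precision': 0.888..., 'recall': 0.8}
```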
Exam Scenario
- Medical diagnosis: Prefer high recall (catch every disease case, even at the cost of some false positives)
- Spam filter: Prefer high precision (don't mark important emails as spam)
Regression Metrics
- MAE (Mean Absolute Error): Average absolute difference between predictions and actual values
- MSE (Mean Squared Error): Average squared difference; penalizes large errors more heavily
- RMSE (Root Mean Squared Error): Square root of MSE; same units as the target variable
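A worked toy example of all three (the values are invented):

```python
# MAE, MSE, and RMSE computed side by side (toy values).
import math

actual    = [3.0, 5.0, 2.5, 7.0]
predicted = [2.5, 5.0, 4.0, 8.0]

errors = [p - a for p, a in zip(predicted, actual)]
mae  = sum(abs(e) for e in errors) / len(errors)  # average absolute difference
mse  = sum(e * e for e in errors) / len(errors)   # average squared difference
rmse = math.sqrt(mse)                             # back in the target's units

print(mae, mse, rmse)  # 0.75, 0.875, ~0.935
```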
Overfitting vs. Underfitting
| Issue | Description | Performance | Solution |
|---|---|---|---|
| Underfitting | Model too simple | Poor on training AND test | More complex model, more features |
| Good Fit | Just right | Good on both | ✅ This is the goal |
| Overfitting | Memorized training data | Great on training, poor on test | More data, regularization, simpler model |
Visual:
Underfitting: 🙁 Training: 70% 🙁 Test: 65%
Good Fit: ✅ Training: 92% ✅ Test: 90%
Overfitting: ✅ Training: 99% 🙁 Test: 72%
ML Development Lifecycle
1. Business Problem Definition
- What are we trying to predict/optimize?
- What's the success metric?
2. Data Collection and Preparation
- Gather data
- Clean data (handle missing values, outliers)
- Split data (train/validation/test)
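A quick sketch of the three-way split just mentioned, calling scikit-learn's train_test_split twice (library choice is ours):

```python
# Three-way split: 60% train / 20% validation / 20% test.
import numpy as np
from sklearn.model_selection import train_test_split

X, y = np.arange(100).reshape(50, 2), np.arange(50)
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)
print(len(X_train), len(X_val), len(X_test))  # 30 10 10
```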
3. Feature Engineering
- Create useful features
- Transform variables
- Encode categorical variables
4. Model Training
- Choose algorithm
- Train on training data
- Tune hyperparameters
5. Model Evaluation
- Test on validation data
- Check metrics (accuracy, precision, recall)
- Compare multiple models
6. Model Deployment
- Deploy to production
- Create API endpoint
- Monitor performance
7. Model Monitoring and Maintenance
- Track model drift
- Retrain when needed
- Update as data changes
Exam Focus
The exam tests whether you understand that ML is iterative, not one-and-done. Models need monitoring and retraining.
Domain 2: Fundamentals of Generative AI (24%)
2.1: What is Generative AI?
Traditional ML: Analyzes and classifies (discriminative)
- Input: Email → Output: Spam/Not Spam
Generative AI: Creates new content (generative)
- Input: "Write a poem about clouds" → Output: Original poem
Key Concepts
Foundation Models (FMs)
- Large models trained on vast amounts of data
- Can perform multiple tasks without task-specific training
- Examples: GPT-4, Claude, Llama, Stable Diffusion
Large Language Models (LLMs)
- Foundation models specifically for text
- Trained on text from books, websites, articles
- Understand and generate human-like text
Transformers
- Architecture behind modern LLMs
- Key innovation: Attention mechanism
- Allows model to focus on relevant parts of input
- Enables understanding of context
Tokens
- Basic units of text for LLMs
- Roughly: 1 token ≈ 0.75 words
- Example: "Hello world" = 2 tokens
Context Window
- Maximum tokens a model can process at once
- Includes both input (prompt) and output
- Example: Claude 2 has a 100K-token context window
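A rough fit check built on the 0.75-words-per-token heuristic above; real tokenizers vary by model, so treat this as an estimate only:

```python
# Rough context-window fit check using the 0.75 words-per-token heuristic.
# Real tokenizers differ by model, so this is an estimate only.
def estimated_tokens(text: str) -> int:
    return round(len(text.split()) / 0.75)

def fits_context(prompt: str, max_output_tokens: int, window: int = 100_000) -> bool:
    # The window must hold the prompt AND the generated output.
    return estimated_tokens(prompt) + max_output_tokens <= window

print(fits_context("Summarize this quarterly report ...", max_output_tokens=1_000))  # True
```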
Embeddings
- Numerical representations of text
- Similar meanings = similar embeddings
- Used for semantic search, RAG, clustering
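A sketch of "similar meanings = similar embeddings", using Titan Embeddings via Bedrock plus plain cosine similarity. The model ID and request/response shape assume Titan Embeddings G1; verify against current Bedrock docs:

```python
# Paraphrases should score noticeably higher than unrelated sentences.
import json
import math

import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def embed(text: str) -> list[float]:
    resp = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v1",  # assumed Titan Embeddings model ID
        body=json.dumps({"inputText": text}),
    )
    return json.loads(resp["body"].read())["embedding"]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

q = embed("How do I reset my password?")
print(cosine(q, embed("I forgot my login credentials")))  # higher
print(cosine(q, embed("The weather is sunny today")))     # lower
```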
2.2: Capabilities and Limitations
Capabilities of Generative AI
Text Generation:
- ✅ Write articles, stories, emails
- ✅ Summarize documents
- ✅ Translate languages
- ✅ Answer questions
Code Generation:
- ✅ Write code from descriptions
- ✅ Debug and explain code
- ✅ Generate unit tests
Reasoning:
- ✅ Multi-step problem solving
- ✅ Chain-of-thought reasoning
- ✅ Mathematical calculations (with limitations)
Few-Shot Learning:
- ✅ Learn from just a few examples in prompt
- ✅ Adapt to new tasks without retraining
Multimodal:
- ✅ Process text + images
- ✅ Generate images from text
- ✅ Understand and describe images
Limitations of Generative AI
Critical Exam Concept: Hallucinations
Hallucination: When model generates false or nonsensical information presented as fact.
Why it happens:
- Model is predicting next token, not accessing knowledge database
- Trained to be helpful and complete responses
- No built-in fact-checking
Mitigation:
- Use RAG to provide factual context
- Implement guardrails
- Human review for critical applications
Other Limitations:
Training Data Cutoff
- Models don't know events after training
- Solution: Use RAG for current data
Context Window Limits
- Can't process extremely long documents
- Solution: Chunking, summarization
Computational Cost
- Expensive to train and run
- Larger models = higher cost
Bias in Training Data
- Reflects biases in training data
- Solution: Careful data curation, testing
No Real-Time Data
- Can't access internet or databases
- Solution: RAG, function calling
Lack of True Understanding
- Pattern matching, not reasoning
- Can make logical errors
2.3: AWS Generative AI Services
Amazon Bedrock
Fully managed service to access foundation models via API
Key Features:
- Multiple model providers (Anthropic, AI21, Cohere, Meta, Stability AI, Amazon)
- Pay-per-use pricing
- No infrastructure management
- Data privacy (your data doesn't train models)
Models Available:
| Provider | Model | Strengths |
|---|---|---|
| Anthropic | Claude | Long context (200K), reasoning, code |
| Amazon | Titan Text | Cost-effective, AWS-optimized |
| Amazon | Titan Embeddings | Text embeddings for search |
| Amazon | Titan Image Generator | Image generation |
| AI21 Labs | Jurassic | Multilingual text |
| Cohere | Command | Text generation, summarization |
| Meta | Llama 2 | Open-source, customizable |
| Stability AI | Stable Diffusion | Image generation |
Use Cases:
- Chatbots and virtual assistants
- Content generation
- Document summarization
- Code generation
- Image generation
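A minimal invocation sketch with boto3. The model ID is just an example and the request body assumes Claude's messages schema on Bedrock; other providers use different schemas, so check the docs for whichever model you pick:

```python
# Minimal Bedrock invocation via boto3 (request schema is provider-specific).
import json

import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.invoke_model(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # example model ID
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 200,
        "messages": [{"role": "user", "content": "Summarize RAG in two sentences."}],
    }),
)
result = json.loads(response["body"].read())
print(result["content"][0]["text"])  # the generated text
```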
Amazon Q
AI-powered assistant for business
Capabilities:
- Answer questions about your business data
- Generate content
- Summarize documents
- Connect to enterprise data sources
Use Cases:
- Internal knowledge base queries
- Customer support
- Employee assistance
- Business intelligence
Amazon CodeWhisperer
AI coding companion
Features:
- Real-time code suggestions
- Supports multiple languages (Python, Java, JavaScript, etc.)
- Security scanning
- Reference tracking (avoids IP issues)
Use Cases:
- Accelerate development
- Learn new APIs
- Generate boilerplate code
- Unit test generation
Domain 3: Applications of Foundation Models (28%)
3.1: Design Considerations
Cost Optimization
Factors Affecting Cost:
- Input tokens: Text sent to model
- Output tokens: Text generated by model
- Model size: Larger models cost more
- Request frequency
Cost Optimization Strategies:
- ✅ Choose appropriate model size
- Don't use Claude Opus for simple tasks
- ✅ Optimize prompts
- Be concise, avoid unnecessary context
- ✅ Cache responses
- Store common queries
- ✅ Use smaller models when possible
- Titan for simple tasks, Claude for complex
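Back-of-envelope cost arithmetic shows why these strategies matter; the per-token prices below are hypothetical placeholders, not real Bedrock pricing:

```python
# Back-of-envelope cost model. Prices are HYPOTHETICAL placeholders;
# always check current Bedrock pricing for your chosen model.
PRICE_PER_1K_INPUT  = 0.0008  # hypothetical $ per 1K input tokens
PRICE_PER_1K_OUTPUT = 0.0024  # hypothetical $ per 1K output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens / 1000) * PRICE_PER_1K_INPUT \
         + (output_tokens / 1000) * PRICE_PER_1K_OUTPUT

# 1M requests/day at 500 input + 200 output tokens each:
print(f"${1_000_000 * request_cost(500, 200):,.2f}/day")  # $880.00/day
```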
Latency Considerations
Factors Affecting Latency:
- Model size (larger = slower)
- Output length
- Context window usage
- Concurrent requests
Optimization:
- Use smaller models for real-time apps
- Implement streaming responses (see the sketch after this list)
- Asynchronous processing for batch jobs
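A streaming sketch using boto3's invoke_model_with_response_stream; the chunk format shown assumes Claude's streaming events and differs per model:

```python
# Streaming sketch: print tokens as they arrive instead of waiting for the
# full response (improves perceived latency for chat UIs).
import json

import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.invoke_model_with_response_stream(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # example model ID
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 300,
        "messages": [{"role": "user", "content": "Explain embeddings briefly."}],
    }),
)

for event in response["body"]:
    chunk = json.loads(event["chunk"]["bytes"])
    if chunk.get("type") == "content_block_delta":  # Claude text-delta events
        print(chunk["delta"]["text"], end="", flush=True)
```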
3.2: Choosing the Right Foundation Model
Decision Table: Model Selection
| Use Case | Recommended Model | Why |
|---|---|---|
| Long document analysis | Anthropic Claude | 200K context window |
| Cost-effective text generation | Amazon Titan Text | AWS-optimized, lower cost |
| Image generation | Stable Diffusion XL | High-quality images |
| Code generation | Claude or CodeWhisperer | Strong coding abilities |
| Multilingual content | Cohere or Jurassic | Better language support |
| Embeddings for RAG | Amazon Titan Embeddings | Optimized for search |
| General chatbot | Claude or Titan | Good balance |
Model Comparison Factors
Task Type
- Text generation, summarization, Q&A, code, images
Context Window
- Short tasks (<4K tokens): Any model
- Long documents (>8K tokens): Claude, GPT-4
Language Support
- English: All models
- Multiple languages: Cohere, Jurassic
Cost
- Budget-friendly: Titan
- Premium: Claude, GPT-4
Latency
- Real-time: Smaller models
- Batch: Any model
3.3: Prompt Engineering
Zero-Shot Prompting
No examples, just the instruction
Prompt: "Translate to French: Hello, how are you?"
Output: "Bonjour, comment allez-vous?"Few-Shot Prompting
Provide examples to guide the model
Prompt:
Sentiment analysis examples:
"I love this!" → Positive
"Terrible experience" → Negative
"It was okay" → Neutral
Now classify: "Best purchase ever!"
Output: Positive
Chain-of-Thought Prompting
Ask model to show its reasoning
Prompt: "If apples cost $2/lb and I buy 3.5 lbs, how much do I pay?
Show your work."
Output:
Let me calculate:
- Price per pound: $2
- Pounds purchased: 3.5
- Total: $2 × 3.5 = $7
Answer: $7
Prompt Engineering Best Practices
✅ Be Specific
- ❌ "Write about dogs"
- ✅ "Write a 200-word informative article about dog breeds suitable for apartment living"
✅ Provide Context
- ❌ "Summarize this"
- ✅ "You are a financial analyst. Summarize this quarterly report for executives."
✅ Use Formatting
- Use bullet points, sections
- Clear instructions
✅ Specify Output Format
- "Respond in JSON format"
- "Use bullet points"
✅ Add Constraints
- "In 100 words or less"
- "Using simple language"
Fine-Tuning vs. Prompt Engineering
| Approach | When to Use | Cost | Effort | Flexibility |
|---|---|---|---|---|
| Prompt Engineering | First choice | Low | Low | High |
| RAG | Need current/private data | Medium | Medium | High |
| Fine-Tuning | Domain-specific tasks | High | High | Low |
Exam Decision Pattern
- Start with prompt engineering (cheapest, fastest)
- Use RAG if you need external knowledge
- Fine-tune only if above methods insufficient
3.4: Retrieval Augmented Generation (RAG)
What is RAG?
Problem: LLMs don't know your private data or recent events
Solution: RAG provides relevant context to the model
How RAG Works:
1. User asks question
↓
2. Convert question to embedding (vector)
↓
3. Search vector database for similar content
↓
4. Retrieve relevant documents
↓
5. Provide documents + question to LLM
↓
6. LLM generates answer with context
RAG Architecture Components
1. Document Ingestion
- Split documents into chunks
- Generate embeddings for each chunk
- Store in vector database
2. Vector Database
- Stores embeddings
- Enables semantic search
- AWS Options: OpenSearch, Bedrock Knowledge Bases
3. Retrieval
- Convert query to embedding
- Find similar documents
- Return top K results
4. Generation
- Send retrieved docs + query to LLM
- LLM generates contextual answer
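To make the retrieval and generation steps concrete, here is a toy sketch in plain Python. The embeddings are tiny made-up vectors; a real system would produce them with an embedding model and store them in a vector database:

```python
# Toy retrieval: rank chunks by cosine similarity to the query embedding,
# keep the top K, and build a grounded prompt from them.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def retrieve(query_vec, chunks, k=3):
    # chunks: (text, embedding) pairs produced during ingestion
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(question, context_chunks):
    context = "\n\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

docs = [("Refunds are accepted within 30 days.", [0.9, 0.1, 0.0]),
        ("Standard shipping takes 5 days.",      [0.1, 0.8, 0.1]),
        ("We are hiring engineers.",             [0.0, 0.1, 0.9])]
print(build_prompt("What is the refund policy?",
                   retrieve([0.8, 0.2, 0.0], docs, k=1)))
```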
Benefits of RAG
✅ Reduces Hallucinations
- Grounds responses in factual documents
✅ Up-to-Date Information
- Access to current data
✅ Domain-Specific Knowledge
- Use your proprietary data
✅ Citations
- Can reference source documents
✅ Cost-Effective
- No fine-tuning needed
Amazon Bedrock Knowledge Bases
Managed RAG solution
Features:
- Automatic document ingestion
- Vector database management
- Built-in embeddings (Titan Embeddings)
- Simple API integration
Supported Data Sources:
- Amazon S3
- Web crawlers
- Confluence
- SharePoint
- Salesforce
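A minimal query sketch using the retrieve_and_generate API on the bedrock-agent-runtime client, which runs retrieval and generation in one call; the knowledge base ID and model ARN are placeholders:

```python
# One-call RAG against a Bedrock Knowledge Base.
import boto3

agent = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = agent.retrieve_and_generate(
    input={"text": "What is our refund policy?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "YOUR_KB_ID",  # placeholder
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
        },
    },
)
print(response["output"]["text"])  # grounded answer
print(response["citations"])      # references to the source chunks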
Domain 4: Guidelines for Responsible AI (14%)
4.1: Responsible AI Principles
1. Fairness
Avoid bias and discrimination
Types of Bias:
- Training Data Bias: Unrepresentative data
- Selection Bias: Biased sampling
- Algorithmic Bias: Model amplifies existing bias
Mitigation:
- Diverse training data
- Test across demographic groups
- Regular bias audits
- Use SageMaker Clarify
2. Explainability
Understand how models make decisions
Why It Matters:
- Build trust
- Debug errors
- Meet regulatory requirements
- Identify bias
Techniques:
- Feature importance
- SHAP values
- Attention visualization
- Model cards
3. Privacy
Protect sensitive data
Best Practices:
- Data encryption (at rest, in transit)
- Access controls (IAM)
- Data anonymization
- Differential privacy
- Secure data deletion
4. Safety
Prevent harmful outputs
Risks:
- Toxic content generation
- Misinformation
- Dangerous instructions
- Manipulation
Guardrails:
- Content filtering
- Human review (Amazon A2I)
- Output validation
- Rate limiting
5. Transparency
Document capabilities and limitations
Model Cards Should Include:
- Intended use cases
- Training data description
- Known limitations
- Performance metrics
- Bias testing results
4.2: AWS Tools for Responsible AI
Amazon SageMaker Clarify
Detect bias in data and models
Features:
- Pre-training bias detection
- Post-training bias detection
- Feature importance
- Explainability reports
Bias Metrics:
- Class imbalance
- Disparate impact
- Difference in positive proportions
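SageMaker Clarify computes these for you, but a hand-rolled sketch with made-up per-group predictions shows what two of the metrics mean:

```python
# Disparate impact and difference in positive proportions, by hand.
def positive_rate(outcomes):  # outcomes: list of 0/1 model predictions
    return sum(outcomes) / len(outcomes)

group_a = [1, 1, 0, 1, 0, 1, 1, 0]  # e.g., one demographic group
group_b = [1, 0, 0, 0, 1, 0, 0, 0]  # another demographic group

p_a, p_b = positive_rate(group_a), positive_rate(group_b)
print("Difference in positive proportions:", p_a - p_b)  # 0.375
print("Disparate impact (ratio):", p_b / p_a)  # 0.4; the "80% rule" flags < 0.8
```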
Amazon SageMaker Model Monitor
Continuous monitoring
Monitors:
- Data quality
- Model quality (drift)
- Bias drift
- Feature attribution drift
Alerts:
- CloudWatch alarms
- SNS notifications
Amazon Augmented AI (A2I)
Human review workflows
Use Cases:
- Review low-confidence predictions
- Audit high-stakes decisions
- Compliance requirements
- Continuous improvement
Domain 5: Security, Compliance, and Governance (14%)
5.1: Securing AI Systems
Data Security
Encryption:
- ✅ At Rest: AWS KMS, S3 encryption
- ✅ In Transit: TLS/SSL, HTTPS
Access Control:
- IAM policies for fine-grained permissions
- VPC for network isolation
- Security groups and NACLs
Input Validation:
- Sanitize user inputs
- Limit input size
- Check for injection attacks
Output Filtering:
- Content moderation
- PII detection and redaction
- Toxicity filtering
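A PII-redaction sketch using Amazon Comprehend's detect_pii_entities API via boto3; the exact entity types returned depend on the input text:

```python
# Detect PII spans with Amazon Comprehend, then redact them by type.
import boto3

comprehend = boto3.client("comprehend", region_name="us-east-1")

def redact_pii(text: str) -> str:
    entities = comprehend.detect_pii_entities(Text=text, LanguageCode="en")["Entities"]
    # Replace spans right-to-left so earlier offsets stay valid.
    for e in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        text = text[:e["BeginOffset"]] + f"[{e['Type']}]" + text[e["EndOffset"]:]
    return text

print(redact_pii("Contact Jane Doe at jane@example.com"))
# e.g. "Contact [NAME] at [EMAIL]"
```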
Model Security
Adversarial Attacks:
- Prompt Injection: Malicious prompts to bypass safety
- Data Poisoning: Corrupt training data
- Model Inversion: Extract training data
Protection:
- Input sanitization
- Rate limiting
- Monitoring for unusual patterns
- Regular security audits
5.2: Governance and Compliance
Model Governance
Version Control:
- SageMaker Model Registry
- Track model lineage
- Associate models with data/code
Approval Workflows:
- Manual approval before production
- Automated testing gates
- Change management
Audit Trails:
- CloudTrail logs all API calls
- SageMaker logs training jobs
- Model deployment history
Compliance
Data Residency:
- Choose appropriate AWS Region
- Keep data in specific geographic location
Industry Regulations:
- HIPAA: Healthcare data (use HIPAA-eligible services)
- GDPR: EU data privacy (right to explanation)
- SOC 2: Security controls
- PCI DSS: Payment card data
Documentation Requirements:
- Model cards
- Training data sources
- Performance metrics
- Bias testing results
Quick Reference: AWS AI/ML Services
When to Use Which Service?
| Need | Service |
|---|---|
| Access foundation models | Amazon Bedrock |
| Business assistant | Amazon Q |
| Code suggestions | Amazon CodeWhisperer |
| End-to-end ML platform | Amazon SageMaker |
| Image/video analysis | Amazon Rekognition |
| Extract text from documents | Amazon Textract |
| Sentiment analysis | Amazon Comprehend |
| Translation | Amazon Translate |
| Speech-to-text | Amazon Transcribe |
| Text-to-speech | Amazon Polly |
| Chatbot | Amazon Lex |
| Recommendations | Amazon Personalize |
| Fraud detection | Amazon Fraud Detector |
| Intelligent search | Amazon Kendra |
| Human review | Amazon Augmented AI (A2I) |
| Vector database for RAG | Amazon OpenSearch Service |
| Detect bias | SageMaker Clarify |
| Monitor models | SageMaker Model Monitor |