Domain 1: Fundamentals of AI and ML (20%)
1.1: AI, ML, and DL Hierarchy
┌───────────────────────────────────┐
│ Artificial Intelligence (AI)      │ ← Broadest: machines that mimic human intelligence
│ ┌───────────────────────────────┐ │
│ │ Machine Learning (ML)         │ │ ← Subset: learns from data without explicit programming
│ │ ┌─────────────────────────┐   │ │
│ │ │ Deep Learning (DL)      │   │ │ ← Subset: neural networks with multiple layers
│ │ └─────────────────────────┘   │ │
│ └───────────────────────────────┘ │
└───────────────────────────────────┘
Why This Matters
Exam questions test whether you understand the hierarchy: all DL is ML, and all ML is AI, but not all AI is ML.
Types of Machine Learning
1. Supervised Learning
Definition: Learn from labeled data
Components:
- Features (X): Input variables
- Labels (y): Known output/target
- Model: Learns mapping from X to y
Types:
- Classification: Predict categories (spam/not spam)
- Regression: Predict continuous values (house price)
Examples:
- Email spam detection
- Image classification (cat vs dog)
- House price prediction
- Customer churn prediction
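The mapping from features (X) to labels (y) can be sketched with a minimal supervised classifier. This is a toy 1-nearest-neighbor model on invented spam data (the feature, label values, and function name are illustrative, not from any particular library):

```python
# Minimal supervised learning sketch: a 1-nearest-neighbor classifier.
# The "model" memorizes the labeled training data and predicts the
# label of the closest training point.

def predict_1nn(X_train, y_train, x):
    """Return the label of the training point nearest to x."""
    distances = [abs(xi - x) for xi in X_train]
    nearest = distances.index(min(distances))
    return y_train[nearest]

# Toy data: feature = count of suspicious words, label = spam or ham.
X_train = [0, 1, 7, 9]
y_train = ["ham", "ham", "spam", "spam"]

print(predict_1nn(X_train, y_train, 2))  # -> ham
print(predict_1nn(X_train, y_train, 8))  # -> spam
```

In practice you would use a library such as scikit-learn, but the idea is the same: labeled examples in, a learned X-to-y mapping out.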
2. Unsupervised Learning
Definition: Learn patterns from unlabeled data
Types:
- Clustering: Group similar items
- Dimensionality Reduction: Reduce features while preserving information
- Anomaly Detection: Find outliers
Examples:
- Customer segmentation
- Anomaly detection in network traffic
- Product recommendations
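One of the listed types, anomaly detection, fits in a few lines. This sketch uses a simple z-score rule; the threshold of 2 standard deviations and the traffic numbers are illustrative assumptions:

```python
# Unsupervised anomaly detection sketch: flag values more than
# `threshold` standard deviations from the mean (z-score rule).
# No labels are needed; the pattern comes from the data itself.
import statistics

def find_outliers(values, threshold=2.0):
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return [v for v in values if abs(v - mean) / stdev > threshold]

# Network traffic in requests/sec; one burst stands out.
traffic = [100, 102, 98, 101, 99, 100, 103, 97, 500]
print(find_outliers(traffic))  # -> [500]
```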
3. Reinforcement Learning
Definition: Learn through trial and error with rewards
Components:
- Agent: Learns and makes decisions
- Environment: What agent interacts with
- Actions: What agent can do
- Rewards: Feedback for actions
Examples:
- Game playing (AlphaGo)
- Robotics
- Autonomous vehicles
- Resource optimization
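The agent/environment/action/reward loop can be sketched with an epsilon-greedy agent on a two-armed bandit. The arm probabilities, exploration rate, and step count below are arbitrary illustrative choices:

```python
# Reinforcement learning sketch: epsilon-greedy on a two-armed bandit.
# Agent = the value estimates; Environment = the arms; Actions = which
# arm to pull; Rewards = 1 on a win, 0 otherwise.
import random

random.seed(0)

ARM_PROBS = [0.3, 0.8]   # environment: win probability of each arm
values = [0.0, 0.0]      # agent's estimated value of each action
counts = [0, 0]
EPSILON = 0.1            # exploration rate (illustrative choice)

for _ in range(2000):
    if random.random() < EPSILON:
        action = random.randrange(2)           # explore: random action
    else:
        action = values.index(max(values))     # exploit: best known action
    reward = 1.0 if random.random() < ARM_PROBS[action] else 0.0
    counts[action] += 1
    # Incremental average: nudge the estimate toward the observed reward.
    values[action] += (reward - values[action]) / counts[action]

print(values)  # estimates converge toward the true arm probabilities
```

After enough trials the agent's estimates approach 0.3 and 0.8, so exploitation settles on the better arm, learning purely from reward feedback.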
| Learning Type | Data | Goal | Example |
|---|---|---|---|
| Supervised | Labeled | Predict labels | Spam detection |
| Unsupervised | Unlabeled | Find patterns | Customer groups |
| Reinforcement | Rewards/penalties | Maximize rewards | Game AI |
ML Model Performance Metrics
Classification Metrics
Confusion Matrix:
                    Predicted
                    Positive   Negative
Actual  Positive    TP         FN
        Negative    FP         TN
Key Metrics:
Accuracy = (TP + TN) / Total
- Good when classes are balanced
Precision = TP / (TP + FP)
- "Of all predicted positives, how many are actually positive?"
- Use when false positives are costly
Recall = TP / (TP + FN)
- "Of all actual positives, how many did we catch?"
- Use when false negatives are costly
Exam Scenario
- Medical diagnosis: prefer high recall (catch all diseases, even at the cost of some false positives).
- Spam filter: prefer high precision (don't mark important emails as spam).
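The three formulas above reduce to a few lines of arithmetic over the confusion-matrix counts. The counts in this example are hypothetical:

```python
# Key classification metrics computed from confusion-matrix counts.

def classification_metrics(tp, fp, fn, tn):
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),  # of predicted positives, how many are right?
        "recall": tp / (tp + fn),     # of actual positives, how many did we catch?
    }

# Hypothetical spam filter on 1000 emails: 80 TP, 5 FP, 20 FN, 895 TN.
m = classification_metrics(tp=80, fp=5, fn=20, tn=895)
print(m)  # accuracy 0.975, precision ~0.941, recall 0.80
```

Note how accuracy looks excellent (97.5%) even though recall is only 80%, which is exactly why accuracy alone is misleading on imbalanced classes.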
Regression Metrics
- MAE (Mean Absolute Error): average absolute difference between predictions and actual values
- MSE (Mean Squared Error): average squared difference; penalizes large errors more heavily
- RMSE (Root Mean Squared Error): square root of MSE, in the same units as the target
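The three regression metrics side by side, computed on a small set of invented house-price predictions:

```python
# MAE, MSE, and RMSE for a set of predictions (stdlib only).
import math

def regression_metrics(y_true, y_pred):
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse)}

# Hypothetical house prices (in $1000s) vs. model predictions.
y_true = [200, 300, 250, 400]
y_pred = [210, 290, 240, 430]
print(regression_metrics(y_true, y_pred))
# -> mae 15.0, mse 300.0, rmse ~17.32
```

The single $30k miss dominates MSE (900 vs. 100 for the others), illustrating how squaring punishes large errors.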
Overfitting vs. Underfitting
| Issue | Description | Performance | Solution |
|---|---|---|---|
| Underfitting | Model too simple | Poor on training AND test | More complex model, more features |
| Good Fit | Just right | Good on both | ✅ This is the goal |
| Overfitting | Memorized training data | Great on training, poor on test | More data, regularization, simpler model |
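The table's diagnosis logic can be expressed as a rough heuristic. The thresholds here (a 0.75 score floor, a 0.10 train/test gap) are illustrative assumptions, not standard values:

```python
# Rough heuristic for reading train/test scores. Thresholds are
# illustrative, not standard: floor = minimum acceptable score,
# max_gap = largest train-minus-test gap before we call overfitting.

def diagnose_fit(train_score, test_score, floor=0.75, max_gap=0.10):
    if train_score < floor and test_score < floor:
        return "underfitting"    # poor on both
    if train_score - test_score > max_gap:
        return "overfitting"     # great on training, poor on test
    return "good fit"

print(diagnose_fit(0.70, 0.65))  # -> underfitting
print(diagnose_fit(0.92, 0.90))  # -> good fit
print(diagnose_fit(0.99, 0.72))  # -> overfitting
```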
Visual:
Underfitting:  Training: 70%   Test: 65%   (poor on both)
Good Fit:      Training: 92%   Test: 90%
Overfitting:   Training: 99%   Test: 72%   (great on training, poor on test)
ML Development Lifecycle
1. Business Problem Definition
- What are we trying to predict/optimize?
- What's the success metric?
2. Data Collection and Preparation
- Gather data
- Clean data (handle missing values, outliers)
- Split data (train/validation/test)
3. Feature Engineering
- Create useful features
- Transform variables
- Encode categorical variables
4. Model Training
- Choose algorithm
- Train on training data
- Tune hyperparameters
5. Model Evaluation
- Test on validation data
- Check metrics (accuracy, precision, recall)
- Compare multiple models
6. Model Deployment
- Deploy to production
- Create API endpoint
- Monitor performance
7. Model Monitoring and Maintenance
- Track model drift
- Retrain when needed
- Update as data changes
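The train/validation/test split from the data-preparation step can be sketched as below. The 70/15/15 ratios are a common convention, not a fixed rule, and the function name is illustrative:

```python
# Sketch of the data-split step: shuffle, then carve out
# train / validation / test sets (70/15/15 here).
import random

def split_data(rows, train_frac=0.7, val_frac=0.15, seed=42):
    rows = rows[:]                      # copy so the caller's list is untouched
    random.Random(seed).shuffle(rows)   # shuffle to avoid ordering bias
    n_train = int(len(rows) * train_frac)
    n_val = int(len(rows) * val_frac)
    train = rows[:n_train]
    val = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]
    return train, val, test

train, val, test = split_data(list(range(100)))
print(len(train), len(val), len(test))  # -> 70 15 15
```

Keeping the test set untouched until the very end is what makes the evaluation step an honest estimate of production performance.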
Exam Focus
The exam tests whether you understand that ML is iterative, not one-and-done. Models need monitoring and retraining.