AWS Certification
Generative AI Developer — Professional
AIP-C01 Cheat Sheet
5-Minute Master Reference · All Five Domains · Key Services · Decision Trees
Pass: 750 / 1000

65 scored · 10 unscored · 170 min

Compensatory scoring model

D1 · 31% · FM Integration, Data & Compliance
D2 · 26% · Implementation & Integration
D3 · 20% · AI Safety, Security & Governance
D4 · 12% · Operational Efficiency & Optimization
D5 · 11% · Testing, Validation & Troubleshooting
D1 · Amazon Bedrock Essentials
  • Bedrock = managed API to invoke FMs — no infra to manage
  • FMs available: Anthropic Claude, Meta Llama, Cohere, Amazon Titan, Stability AI, Mistral
  • Inference params: temperature (creativity), top-P/K (sampling), max_tokens, stop_sequences
  • API calls: InvokeModel (sync) · InvokeModelWithResponseStream (streaming)
  • Pricing modes: On-demand = pay per token · Provisioned = reserved capacity, needed for custom models · Batch = async -50% cost
  • Embeddings: Amazon Titan Embeddings, Cohere Embed — convert text → vectors for RAG
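The sync InvokeModel call above can be sketched with boto3. The model ID and the Claude message-body shape are assumptions for illustration; check the target model's current request schema before use.

```python
import json

def build_claude_body(prompt, temperature=0.5, top_p=0.9, max_tokens=512):
    """Build the JSON request body for an Anthropic model behind InvokeModel.

    temperature / top_p / max_tokens are the inference params listed above.
    """
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",  # assumed schema version
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
        "messages": [{"role": "user", "content": prompt}],
    })

def invoke(prompt):
    # boto3 imported lazily so the payload helper runs without AWS installed
    import boto3
    client = boto3.client("bedrock-runtime")  # sync runtime client
    resp = client.invoke_model(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",  # example model ID
        body=build_claude_body(prompt),
    )
    return json.loads(resp["body"].read())
```

For streaming, `invoke_model_with_response_stream` takes the same body and returns chunks as they are generated.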
D1 · RAG Pipeline (highest exam weight)
Docs / S3 (source) → Chunk (split) → Embed (vectorize) → Vector DB (store) → Retrieve (kNN) → Augment (prompt) → FM (generate)
  • Knowledge Bases for Bedrock = fully managed RAG (no code)
  • Vector stores: OpenSearch Serverless · Aurora pgvector · Pinecone · Redis
  • Chunking: fixed-size (simple) · semantic (quality) · hierarchical (docs)
  • Hybrid search = semantic + keyword (BM25) → better recall
  • Metadata filters narrow retrieved chunks before similarity search
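The Chunk step above can be sketched as fixed-size chunking with overlap (the "simple" strategy); sizes here are arbitrary illustrations, and real pipelines usually chunk by tokens rather than characters.

```python
def chunk_fixed(text, size=200, overlap=50):
    """Split text into fixed-size character chunks with overlap.

    Overlap keeps content that straddles a boundary retrievable
    from at least one chunk.
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks
```

Semantic and hierarchical chunking trade this simplicity for better retrieval quality on long or structured documents.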
D1 · Customization Decision Tree
Scenario → Use
  • Need fresh/dynamic data → RAG
  • Teach model a new style/format → Fine-tune
  • Domain-specific vocabulary adaptation → Continued pre-training
  • Quick start, no training data → RAG
  • Consistent tone in outputs → Fine-tune
  • Latency-critical, no retrieval lag → Fine-tune
  • Fine-tune data format: JSONL with prompt/completion pairs
  • Custom models require provisioned throughput to invoke (not on-demand)
  • Model Distillation: large teacher → small student, same quality cheaper
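The JSONL fine-tuning format above can be sketched as follows; the prompt/completion pairs are invented for illustration.

```python
import json

# Hypothetical training pairs; the fine-tuning job reads one JSON object
# per line (JSONL) from S3.
pairs = [
    {"prompt": "Summarize: Q3 revenue rose 12% on cloud growth.",
     "completion": "Q3 revenue up 12%, driven by cloud."},
    {"prompt": "Summarize: The outage lasted 40 minutes.",
     "completion": "40-minute outage."},
]

def to_jsonl(records):
    """Serialize records as JSON Lines: one object per line."""
    return "\n".join(json.dumps(r) for r in records)
```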
D2 · Prompt Engineering Patterns
  • Zero-shot — no examples, rely on model knowledge · use when the task is simple & the model knows the domain
  • Few-shot — 3–5 examples in prompt · use when format/style must be controlled precisely
  • Chain-of-Thought (CoT) — "think step by step" · use for math, logic, multi-step reasoning
  • ReAct = Reason + Act → basis of Bedrock Agents loop
  • Prompt chaining — output of step N becomes input to step N+1
  • System prompt sets persona/constraints · Human/Assistant turns = conversation
  • Prompt injection — attacker embeds malicious instructions → mitigate with Guardrails
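The few-shot pattern above can be sketched as a simple prompt builder; the separator and "Input:/Output:" labels are an assumed convention, not a fixed API.

```python
def few_shot_prompt(task, examples, query):
    """Assemble a few-shot prompt: instruction, worked examples, then the query."""
    parts = [task]
    for inp, out in examples:
        parts.append(f"Input: {inp}\nOutput: {out}")
    parts.append(f"Input: {query}\nOutput:")  # model completes the final Output
    return "\n\n".join(parts)
```

Prompt chaining then simply feeds the model's completion of one such prompt into the `query` (or `task`) of the next.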
D2 · Bedrock Agents (ReAct Loop)
Agent Orchestration Cycle
User Input → FM (Reason) → Decide Tool → Lambda Action → Observe Result → Loop or Answer
  • Action Group = Lambda fn + OpenAPI schema → what the agent can do
  • Knowledge Base attached to agent → automatic RAG retrieval
  • Return of Control (RoC) — agent pauses, awaits human approval
  • Multi-agent: Supervisor agent delegates to Sub-agents by specialty
  • Agent memory: session (within convo) · cross-session (persisted)
  • Bedrock Flows = visual drag-and-drop workflow builder (low-code)
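An Action Group's OpenAPI schema tells the agent what the backing Lambda can do; the agent reads the `description` fields to decide when to call it. A minimal sketch for a hypothetical order-status action (path, names, and descriptions invented):

```python
# Hypothetical Action Group schema; Bedrock pairs this with a Lambda handler.
action_group_schema = {
    "openapi": "3.0.0",
    "info": {"title": "Order API", "version": "1.0.0"},
    "paths": {
        "/orders/{orderId}": {
            "get": {
                "operationId": "getOrderStatus",
                # The agent matches user intent against this description
                "description": "Look up the shipping status of an order",
                "parameters": [{
                    "name": "orderId",
                    "in": "path",
                    "required": True,
                    "schema": {"type": "string"},
                }],
                "responses": {"200": {"description": "Order status"}},
            }
        }
    },
}
```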
D2 · Right Service, Right Job
NeedService
Custom FM app, full controlAmazon Bedrock
Managed RAG chatbot over company docsAmazon Q Business
Enterprise keyword + semantic searchAmazon Kendra
Sentiment, entities, PII from textAmazon Comprehend
Extract data from PDFs / formsAmazon Textract
Train/deploy custom ML modelsAmazon SageMaker
Image / video understandingAmazon Rekognition
Multi-service pipeline orchestrationAWS Step Functions
D3 · Guardrails & Responsible AI
  • Content Filters: hate, violence, sexual, insults · configurable severity
  • Topic Denial: block competitor mentions, legal advice, etc.
  • PII Redaction: SSN, credit card, email, phone → ANONYMIZE or BLOCK
  • Grounding Check: verify the answer is supported by retrieved context (anti-hallucination)
  • Word Filters: custom deny-list · profanity filter
  • Responsible AI pillars: Fairness · Explainability · Privacy · Robustness · Transparency
  • Model invocation logging → S3 / CloudWatch (required for audit)
  • Encryption: KMS at rest · TLS in transit · VPC endpoints for private access
  • IAM: identity-based + resource-based policies · least-privilege per Lambda
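PII redaction itself is handled by the managed Guardrails service; as a purely local illustration of what ANONYMIZE mode does, a regex sketch (patterns deliberately simplified, not production-grade, and far narrower than the managed feature):

```python
import re

# Simplified patterns for illustration only.
PII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}

def anonymize(text):
    """Replace matched PII with a {TYPE} placeholder, mimicking ANONYMIZE mode."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub("{" + label + "}", text)
    return text
```

BLOCK mode, by contrast, rejects the whole request or response instead of masking matches.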
D4 · Cost & Performance Optimization
Strategy · Cost ↓ · Latency ↓
  • Batch Inference API · ✓ -50% · ✗ async
  • Prompt Caching · ✓ · ✓
  • Streaming responses · ~ · ✓ perceived
  • Smaller model · ✓ · ✓
  • Provisioned throughput · ~ (high vol) · ✓ consistent
  • Token compression · ✓ · ~
  • Prompt caching caches the system prompt portion (static, repeated context)
  • Batch inference = async jobs, for nightly/bulk processing, not real-time
  • CloudWatch metrics: InputTokenCount · OutputTokenCount · InvocationLatency · ThrottledRequests
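Token-based pricing makes cost estimation simple arithmetic over the InputTokenCount / OutputTokenCount metrics above. The per-1K rates below are hypothetical; real Bedrock prices vary by model and region.

```python
# Hypothetical on-demand prices (USD per 1,000 tokens) -- check the
# current Bedrock price list for real figures.
PRICE_PER_1K = {"input": 0.003, "output": 0.015}

def estimate_cost(input_tokens, output_tokens, batch=False):
    """Estimate one request's cost; batch inference is ~50% cheaper per the sheet."""
    cost = (input_tokens / 1000) * PRICE_PER_1K["input"] \
         + (output_tokens / 1000) * PRICE_PER_1K["output"]
    return cost * 0.5 if batch else cost
```

Output tokens typically cost several times more than input tokens, which is why capping max_tokens and compressing prompts both show up as cost levers.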
D5 · Testing, Evaluation & Troubleshooting
  • Model Evaluation jobs (Bedrock built-in): auto metrics or human review
  • Auto metrics: ROUGE (summarization) · BERTScore (semantic) · accuracy
  • RAG-specific (RAGAS): Context Precision · Context Recall · Answer Relevancy · Groundedness
  • Hallucination → use grounding check in Guardrails + citations in response
  • Agent debugging: enable Trace in Bedrock console → shows each ReAct step
  • X-Ray tracing for end-to-end latency across Lambda + Bedrock calls
  • ThrottlingException → implement exponential backoff + jitter, or upgrade to provisioned throughput
  • Common failures: retrieval miss (bad chunking) · context window overflow · prompt injection · format errors
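The ThrottlingException remedy above (exponential backoff plus jitter) can be sketched as a generic retry wrapper; the parameter values are illustrative defaults.

```python
import random
import time

def with_backoff(call, max_retries=5, base=0.5, cap=8.0):
    """Retry `call`, sleeping up to base * 2**attempt (full jitter) between tries."""
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:  # in practice, catch botocore's ThrottlingException
            if attempt == max_retries - 1:
                raise  # budget exhausted; surface the error
            delay = min(cap, base * (2 ** attempt))
            time.sleep(random.uniform(0, delay))  # full jitter avoids thundering herd
```

If throttling persists despite backoff, the sustained-traffic fix is provisioned throughput rather than more retries.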
Exam Tips
  • D1+D2 = 57% · master RAG & Agents first
  • No penalty for guessing · always answer all 75 questions
  • "Managed / no-code" in stem → usually Q Business or Knowledge Bases
  • Custom model inference always needs Provisioned Throughput
  • Bulk/async workloads → Batch Inference API
  • Competitor blocking → Topic Denial in Guardrails
  • Compensatory scoring · weak on D5 is OK if D1/D2 are strong