
AIP-C01: Study Notes


Study Strategy

These notes cover all exam domains. Use the Quick Refresher for last-minute review; Domain 1 has its own dedicated notes (Domain 1 →).


Domain 1: FM Integration, Data Management, and Compliance (~31%)

See dedicated notes: Domain 1: FM Integration & Data Management →


Domain 2: Implementation and Integration (~26%)

2.1 Agentic AI & Bedrock Agents

Overview: Amazon Bedrock Agents enable multi-step reasoning workflows by orchestrating FM calls, API actions, and knowledge base retrievals.

Key Concepts:

  1. Action Groups

    • Definition: Sets of actions, typically backed by Lambda functions, that an Agent can invoke to interact with external systems
    • Use case: Querying a database, calling a REST API, writing to S3
    • Example: An HR Agent calls a Lambda to fetch employee leave balance
  2. Knowledge Base Integration

    • Definition: RAG pipeline attached to an Agent to retrieve relevant context
    • Use case: Document Q&A, policy lookup, customer support
    • Example: Agent retrieves relevant docs from OpenSearch before answering
  3. Orchestration Trace

    • Definition: Step-by-step trace of Agent reasoning and tool calls
    • Use case: Debugging agentic behavior
    • Example: enableTrace: true in InvokeAgent API call
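The trace flag above can be sketched as the request an application would send to the InvokeAgent API. This is a minimal sketch: the agent ID, alias ID, and session ID below are placeholders, not real resources.

```python
# Sketch: request parameters for the Bedrock InvokeAgent API with the
# orchestration trace enabled. IDs are placeholders.

def build_invoke_agent_kwargs(agent_id, agent_alias_id, session_id, prompt):
    """Assemble keyword arguments for bedrock-agent-runtime's invoke_agent."""
    return {
        "agentId": agent_id,
        "agentAliasId": agent_alias_id,
        "sessionId": session_id,
        "inputText": prompt,
        "enableTrace": True,  # emit step-by-step reasoning and tool-call events
    }

kwargs = build_invoke_agent_kwargs("AGENT123", "ALIAS456", "session-1",
                                   "What is my leave balance?")
# In a real application:
#   client = boto3.client("bedrock-agent-runtime")
#   response = client.invoke_agent(**kwargs)
#   for event in response["completion"]: ...  # stream of chunk/trace events
```

With `enableTrace` set, the response stream interleaves trace events with output chunks, which is what makes agent debugging possible.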

2.2 RAG Architecture & Chunking Strategies

Overview: Retrieval-Augmented Generation combines FM inference with context retrieval from a vector store to ground responses in real data.

Chunking Strategies:

| Strategy | Description | Best For |
|---|---|---|
| Fixed-size | Split by token count (e.g., 300 tokens) | Simple docs, consistent structure |
| Semantic | Split by topic/meaning | Long-form content, varied structure |
| Hierarchical | Parent + child chunk structure | Complex docs with sections |

When to Use:

  • Use Fixed-size when: documents have uniform structure
  • Use Semantic when: documents vary widely in structure
  • Use Hierarchical when: you need both broad context and fine-grained retrieval
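The fixed-size strategy above can be sketched in a few lines. This approximates tokens by whitespace-separated words for simplicity; real pipelines use a tokenizer, and the overlap size is a tunable assumption.

```python
# Minimal sketch of fixed-size chunking with overlap. Words stand in for
# tokens; consecutive chunks share `overlap` words so context is not cut
# mid-thought at chunk boundaries.

def fixed_size_chunks(text, chunk_size=300, overlap=30):
    """Split text into word-count chunks with a sliding overlap."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks

doc = " ".join(f"word{i}" for i in range(700))
chunks = fixed_size_chunks(doc, chunk_size=300, overlap=30)
print(len(chunks))  # 3 chunks: words 0-299, 270-569, 540-699
```

Overlap trades a little storage and embedding cost for better retrieval at chunk boundaries.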

2.3 API Integration Patterns

Decision Tree:

Which Bedrock API to call?
├─ Need a complete, synchronous response? → InvokeModel
├─ Need streaming / low-latency UX? → InvokeModelWithResponseStream
└─ Multi-step agentic workflow? → InvokeAgent
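A useful point the decision tree implies: the first two branches take the same request, only the API call differs. The sketch below builds one request body; the model ID and payload shape follow Anthropic's Messages format on Bedrock and are illustrative.

```python
import json

# Sketch: one request body serves both InvokeModel (synchronous) and
# InvokeModelWithResponseStream (streaming). Model ID is illustrative.

body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 512,
    "messages": [{"role": "user", "content": "Summarize our return policy."}],
})

request = {"modelId": "anthropic.claude-3-haiku-20240307-v1:0", "body": body}
# Complete, synchronous response:
#   client.invoke_model(**request)
# Streaming, token-by-token for low-latency UX:
#   client.invoke_model_with_response_stream(**request)
```

Streaming does not change the token cost, only the perceived latency, which is why the tree routes on UX requirements rather than price.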

Domain 3: AI Safety, Security, and Governance (~20%)

3.1 Guardrails for Amazon Bedrock

Overview: Guardrails apply content filters and safety controls to both inputs (prompts) and outputs (responses) from foundation models.

Step-by-Step: Creating a Guardrail

  1. Define Content Filters

    • What: Configure sensitivity levels for harmful categories (hate, violence, sexual, insults)
    • Why: Prevent inappropriate content from being generated or passed through
  2. Configure PII Redaction

    • What: Detect and mask/block PII (names, emails, SSNs, credit cards)
    • Why: Compliance with data privacy regulations
  3. Set Denied Topics

    • What: Specify topics the FM should refuse to discuss (e.g., competitor products)
    • Why: Business policy enforcement
  4. Apply to Inference

    • What: Pass guardrailIdentifier and guardrailVersion in API calls
    • Why: Guardrails are only active when explicitly applied
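Step 4 above is the one candidates forget: a guardrail does nothing until it is referenced on the call. A minimal sketch, assuming placeholder guardrail identifier and version values:

```python
import json

# Sketch: attaching a Bedrock guardrail at inference time. The guardrail
# identifier and version are placeholders; both must be passed explicitly
# on every invoke_model call, or the guardrail is not applied.

def guarded_invoke_kwargs(model_id, prompt, guardrail_id, guardrail_version):
    """Build invoke_model kwargs that apply a guardrail to input and output."""
    return {
        "modelId": model_id,
        "guardrailIdentifier": guardrail_id,
        "guardrailVersion": guardrail_version,
        "body": json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 256,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }

kwargs = guarded_invoke_kwargs("anthropic.claude-3-haiku-20240307-v1:0",
                               "Tell me about employee benefits.",
                               "gr-abc123", "1")
# client.invoke_model(**kwargs) — when the guardrail intervenes, the
# response carries an intervention indicator instead of raw model output.
```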

3.2 IAM & VPC Security

Service/Tool Comparison:

| Method | Primary Use | Key Feature | When to Use |
|---|---|---|---|
| IAM Policies | Access control | Least-privilege resource access | Always — required baseline |
| VPC Endpoints (PrivateLink) | Private connectivity | No public internet routing | Compliance, data residency |
| Resource Policies | Cross-account access | Bedrock model access from other accounts | Multi-account architectures |
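What "least-privilege baseline" looks like in practice: a policy that allows invoking one specific model and nothing else. The model ARN below is illustrative (foundation-model ARNs have no account ID field).

```python
import json

# Illustrative least-privilege IAM policy: InvokeModel on a single
# foundation model only. Region and model ID are placeholders.

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "bedrock:InvokeModel",
            "bedrock:InvokeModelWithResponseStream",
        ],
        "Resource": "arn:aws:bedrock:us-east-1::foundation-model/"
                    "anthropic.claude-3-haiku-20240307-v1:0",
    }],
}
print(json.dumps(policy, indent=2))
```

Scoping `Resource` to a specific model (rather than `*`) is the exam-relevant habit: it prevents an application role from silently invoking more expensive or unapproved models.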

Domain 4: Operational Efficiency and Optimization (~12%)

4.1 Cost Optimization Strategies

Provisioned Throughput vs On-Demand:

Use Provisioned Throughput when:

  • Traffic is predictable and consistent
  • You need guaranteed model units (MUs) available
  • Running 24/7 workloads where PTU commitment is cheaper than on-demand

Avoid Provisioned Throughput when:

  • Traffic is sporadic or unpredictable
  • Testing or development workloads
  • Short-lived experiments

Token Efficiency Best Practices:

  • ✅ DO: Keep system prompts concise — every token costs money
  • ✅ DO: Use streaming to improve perceived latency without changing cost
  • ✅ DO: Set maxTokens explicitly to prevent runaway responses
  • ❌ DON'T: Repeat the full conversation history when only recent context is needed
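The last DON'T above can be sketched as a history-trimming helper. The window size is a tunable assumption, not a Bedrock requirement:

```python
# Sketch: send only the most recent turns instead of the full conversation
# history. Every resent turn is billed as input tokens, so trimming is a
# direct cost lever.

def trim_history(messages, max_turns=4):
    """Return the last `max_turns` messages, preserving order."""
    return messages[-max_turns:]

history = [{"role": "user", "content": f"turn {i}"} for i in range(10)]
recent = trim_history(history, max_turns=4)
print([m["content"] for m in recent])  # ['turn 6', 'turn 7', 'turn 8', 'turn 9']
```

Production systems often combine this with a running summary of older turns, so long-range context survives without resending it verbatim.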

4.2 Monitoring & Troubleshooting

Common Issues:

| Issue | Cause | Solution |
|---|---|---|
| ThrottlingException | Exceeded on-demand TPS | Switch to Provisioned Throughput or reduce request rate |
| High latency | Synchronous InvokeModel | Switch to InvokeModelWithResponseStream |
| Poor RAG retrieval | Low relevance scores | Tune chunking strategy or embedding model |
| Guardrail blocking valid content | Filter sensitivity too high | Reduce filter strength or update denied topics |
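"Reduce request rate" for throttling usually means retrying with exponential backoff and jitter. A generic sketch follows; the exception class is a stand-in (boto3 surfaces throttling as a botocore `ClientError` with error code `ThrottlingException`).

```python
import random
import time

class ThrottlingError(Exception):
    """Stand-in for a throttled request; in boto3 this would be a
    botocore ClientError with code "ThrottlingException"."""

def with_backoff(call, max_attempts=5, base_delay=0.5):
    """Retry `call` with jittered exponential backoff on throttling."""
    for attempt in range(max_attempts):
        try:
            return call()
        except ThrottlingError:
            if attempt == max_attempts - 1:
                raise  # exhausted retries: surface the error
            # delay doubles each attempt; jitter avoids synchronized retries
            time.sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.5))

# Demo: a call that is throttled twice, then succeeds.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ThrottlingError()
    return "ok"

print(with_backoff(flaky, base_delay=0.01))  # ok
```

Jitter matters when many clients are throttled at once: without it, they all retry at the same instants and re-trigger the throttle.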

Quick Reference

Key Acronyms

| Acronym | Full Form | Meaning |
|---|---|---|
| FM | Foundation Model | Large pre-trained AI model (Claude, Llama, Titan, etc.) |
| RAG | Retrieval-Augmented Generation | FM + context retrieval from a vector store |
| PTU | Provisioned Throughput Unit | Reserved model capacity for predictable performance |
| PII | Personally Identifiable Information | Data that can identify an individual |
| MU | Model Unit | Unit of Bedrock Provisioned Throughput |

Important Bedrock Limits & Notes

| Resource | Key Detail |
|---|---|
| Claude context window | Up to 200k tokens |
| PTU commitment period | 1 month or 6 months |
| OpenSearch Serverless vector dimensions | Up to 16,000 dimensions |
| Bedrock Knowledge Base chunk size | Configurable (default ~300 tokens) |

← Back to Overview | Quick Refresher → | Exam Tips →