Domain 2: Implementation and Integration (26%)
Exam Tip
This domain tests your ability to implement GenAI applications end-to-end. Know when to use Bedrock Agents vs. a simple Knowledge Base call, understand the three Bedrock APIs, and be able to explain chunking strategy trade-offs. Agents + Domain 1 together = 57% of the exam.
2.1 Agentic AI & Amazon Bedrock Agents
What is an Agent?
Amazon Bedrock Agents enable multi-step, autonomous reasoning by orchestrating FM calls, tool invocations, and knowledge base retrievals to complete a user goal.
Core Components:
| Component | Role |
|---|---|
| Agent | The orchestrator: receives the user request and plans the steps |
| Action Groups | Lambda functions the Agent can invoke to interact with external systems |
| Knowledge Bases | RAG pipeline the Agent can query for document-based context |
| Orchestration Trace | Step-by-step log of the Agent's reasoning and tool calls |
Action Groups
Action Groups define what external actions an Agent can take:
- Each action group is backed by a Lambda function
- The Agent decides whether and when to call an action based on the user's request
- Actions are described with an OpenAPI schema, which the Agent uses to understand what parameters to pass
Examples:
- Look up a customer's order status in a database
- Send a confirmation email via an external API
- Write a record to an S3 bucket
Knowledge Base Integration
Agents can be connected to a Knowledge Base to retrieve document context:
- The Agent automatically decides when to query the Knowledge Base vs. call an Action Group
- Knowledge Bases use the RAG pipeline: S3 → chunking → embeddings → OpenSearch Serverless
Orchestration Trace
Enable with enableTrace: true in the InvokeAgent API call:
- Shows the Agent's step-by-step reasoning chain (which step, which tool, which decision)
- Critical for debugging unexpected Agent behavior
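When enableTrace is on, the InvokeAgent response is a single event stream that interleaves trace events with chunks of the final answer. The sketch below separates the two; the boto3 call is commented out because it needs AWS credentials and a deployed agent (the IDs shown are placeholders), while the parsing helper is plain Python.

```python
# import boto3
# client = boto3.client("bedrock-agent-runtime")
# response = client.invoke_agent(
#     agentId="AGENT_ID", agentAliasId="ALIAS_ID",   # placeholder IDs
#     sessionId="session-1", inputText="Where is order 42?",
#     enableTrace=True,
# )
# events = list(response["completion"])

def split_completion(events):
    """Split the completion event stream into answer text and trace steps."""
    answer, traces = [], []
    for event in events:
        if "chunk" in event:          # a piece of the final answer
            answer.append(event["chunk"]["bytes"].decode("utf-8"))
        elif "trace" in event:        # a reasoning / tool-call step
            traces.append(event["trace"])
    return "".join(answer), traces

# Simulated events for illustration:
fake_events = [
    {"trace": {"step": "decided to call the order-lookup action group"}},
    {"chunk": {"bytes": b"Order 42 has shipped."}},
]
text, steps = split_completion(fake_events)
```

Inspecting `steps` during development is the quickest way to see why an Agent chose one tool over another.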
Exam Trap
Bedrock Agents vs. Knowledge Bases (RAG only):
- Use a Knowledge Base alone when: the task is document Q&A with no external action needed
- Use Bedrock Agents when: the task requires multi-step reasoning, calling external APIs, or taking actions beyond retrieval
The exam will present scenarios: choose Agents when action or orchestration is involved.
2.2 Knowledge Base Architecture
End-to-End RAG Flow
1. Data Source: S3 bucket (PDFs, Word, HTML, Markdown, CSV)
2. Parser: Extract and clean text from documents
3. Chunker: Split text into smaller pieces (fixed, semantic, hierarchical)
4. Embedder: Titan Text Embeddings / Cohere Embed → convert text to vectors
5. Vector Store: Amazon OpenSearch Serverless (or Aurora pgvector)
6. At inference: embed query → vector search → top-K chunks → inject into FM prompt
Supported Data Sources
- Amazon S3: primary ingestion source (most common exam scenario)
- Web Crawler: Bedrock Knowledge Bases can crawl websites
- Salesforce, Confluence, SharePoint: native connectors
Chunking Strategies
| Strategy | How It Works | Best For | Trade-off |
|---|---|---|---|
| Fixed-size | Split by token count (e.g., 300 tokens) | Uniform documents | May break mid-sentence or mid-context |
| Fixed-size + overlap | Tokens overlap between adjacent chunks | Preserving cross-boundary context | Higher storage and retrieval cost |
| Semantic | Split by meaning/topic using NLP | Long-form, varied content | Higher processing complexity |
| Hierarchical | Parent chunk + child chunk structure | Complex docs needing broad + fine retrieval | More complex retrieval logic |
TIP
Hierarchical chunking is best when you need both broad context and fine-grained retrieval. Fixed-size with overlap is the simplest option for preserving context across chunk boundaries.
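The fixed-size-with-overlap strategy in the table can be sketched in a few lines. This toy version splits by words rather than tokens (real Knowledge Bases count tokens), but it shows how the overlap repeats the tail of one chunk at the head of the next to preserve cross-boundary context.

```python
def chunk_fixed_overlap(words, chunk_size=300, overlap=60):
    """Yield word chunks of chunk_size, each sharing `overlap` words
    with the previous chunk so cross-boundary context is preserved."""
    step = chunk_size - overlap          # how far the window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        if start + chunk_size >= len(words):
            break                        # last window reached the end
    return chunks

text = "the quick brown fox jumps over the lazy dog".split()
parts = chunk_fixed_overlap(text, chunk_size=4, overlap=1)
# parts -> [['the', 'quick', 'brown', 'fox'],
#           ['fox', 'jumps', 'over', 'the'],
#           ['the', 'lazy', 'dog']]
```

Note the storage trade-off from the table: every overlapping word is embedded and stored twice.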
Sync and Ingestion
- After updating S3 content, you must trigger a sync to update the vector store
- The Knowledge Base does NOT auto-update when S3 changes
- Ingestion status is visible in the Bedrock console
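A sync can also be triggered programmatically. The sketch below assumes the bedrock-agent API's start_ingestion_job operation (the call behind the console's Sync button); the network call is commented out because it requires AWS credentials and real resource IDs.

```python
def sync_request(kb_id, data_source_id):
    """Build the parameters for start_ingestion_job (IDs are placeholders)."""
    return {"knowledgeBaseId": kb_id, "dataSourceId": data_source_id}

# import boto3
# client = boto3.client("bedrock-agent")
# job = client.start_ingestion_job(**sync_request("KB_ID", "DS_ID"))
# print(job["ingestionJob"]["status"])
```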
2.3 API Integration Patterns
The Three Core Bedrock Runtime APIs
| API | Use Case | Response Model |
|---|---|---|
| InvokeModel | Synchronous: get a complete response in one call | Blocking, full response at once |
| InvokeModelWithResponseStream | Streaming: receive the response token by token | Non-blocking, chunk-by-chunk delivery |
| InvokeAgent | Multi-step agentic workflow | Streaming trace + final response |
When to Use Each
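The streaming variant in the table can be consumed as shown below. The event payload shape assumed here is the Anthropic Messages streaming format (other model families emit different chunk schemas); the boto3 call is commented out, while the accumulation loop runs on simulated events.

```python
import json

# import boto3
# client = boto3.client("bedrock-runtime")
# stream = client.invoke_model_with_response_stream(
#     modelId="anthropic.claude-3-sonnet-20240229-v1:0", body=request_body
# )["body"]

def collect_text(stream):
    """Accumulate text deltas as they arrive, chunk by chunk."""
    pieces = []
    for event in stream:
        payload = json.loads(event["chunk"]["bytes"])
        if payload.get("type") == "content_block_delta":
            pieces.append(payload["delta"]["text"])
    return "".join(pieces)

# Simulated stream for illustration:
fake_stream = [
    {"chunk": {"bytes": json.dumps(
        {"type": "content_block_delta", "delta": {"text": "Hel"}}).encode()}},
    {"chunk": {"bytes": json.dumps(
        {"type": "content_block_delta", "delta": {"text": "lo"}}).encode()}},
]
```

In a UI, each delta would be flushed to the client immediately, which is what produces the low-latency "typing" feel.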
User needs a complete answer in one shot?
└─ InvokeModel (synchronous, simple)
User interface needs a low-latency "typing" feel?
└─ InvokeModelWithResponseStream (streaming)
Task requires tool calls, external APIs, or multi-step planning?
└─ InvokeAgent
InvokeModel Request Body (Simplified)
```json
{
  "modelId": "anthropic.claude-3-sonnet-20240229-v1:0",
  "body": {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1024,
    "messages": [
      { "role": "user", "content": "Summarize this document..." }
    ]
  }
}
```
Key parameters:
- modelId: the specific FM to invoke
- max_tokens: cap on output tokens, controlling cost and response length
- temperature: controls creativity vs. determinism (0.0–1.0)
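A minimal sketch of assembling and sending this request with boto3. The body-building helper is plain Python; the network call is commented out because it requires AWS credentials and Bedrock model access, and the response-parsing line assumes the Anthropic Messages response shape.

```python
import json

def build_body(prompt, max_tokens=1024, temperature=0.5):
    """Serialize the Anthropic Messages request body for InvokeModel."""
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,          # caps output length and cost
        "temperature": temperature,        # 0.0 = deterministic, 1.0 = creative
        "messages": [{"role": "user", "content": prompt}],
    })

# import boto3
# client = boto3.client("bedrock-runtime")
# resp = client.invoke_model(
#     modelId="anthropic.claude-3-sonnet-20240229-v1:0",
#     body=build_body("Summarize this document..."),
# )
# answer = json.loads(resp["body"].read())["content"][0]["text"]
```

Note that modelId is a parameter of the API call itself, not part of the serialized body.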
2.4 Retrieval Configuration & Tuning
Top-K Retrieval
- At query time, the embedding of the user's question is compared against all vectors in the store
- The top-K most similar chunks are retrieved and injected into the FM prompt
- Higher K: More context but longer prompts (higher cost, risk of irrelevant chunks)
- Lower K: Focused context but may miss relevant information
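The retrieval step above amounts to ranking stored embeddings by similarity to the query embedding and keeping the K best. A toy exact-search sketch (real vector stores like OpenSearch Serverless use approximate nearest-neighbor indexes; the example embeddings are made up):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec, chunks, k=2):
    """chunks: list of (text, embedding). Return the k most similar texts."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

store = [("refund policy",  [1.0, 0.0]),
         ("shipping times", [0.0, 1.0]),
         ("returns process", [0.9, 0.1])]
print(top_k([1.0, 0.0], store, k=2))   # ['refund policy', 'returns process']
```

Raising k here would pull in "shipping times" too, illustrating the higher-K trade-off: more context, but a greater chance of irrelevant chunks in the prompt.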
Metadata Filtering
- Attach metadata to chunks during ingestion (e.g., department: "legal", year: 2024)
- At query time, filter retrieval to only return chunks matching specific metadata
- Allows scoped retrieval without maintaining separate vector stores per category
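Conceptually, the filter narrows the candidate set before similarity ranking. In Bedrock this is expressed declaratively in the Retrieve API's retrieval configuration; the equivalent logic in plain Python looks like this (chunk contents are made up for illustration):

```python
def filter_chunks(chunks, **criteria):
    """Keep only chunks whose metadata matches every key/value in criteria."""
    return [c for c in chunks
            if all(c["metadata"].get(k) == v for k, v in criteria.items())]

chunks = [
    {"text": "NDA template", "metadata": {"department": "legal",   "year": 2024}},
    {"text": "Q3 budget",    "metadata": {"department": "finance", "year": 2024}},
]
legal_only = filter_chunks(chunks, department="legal")
# legal_only contains only the "NDA template" chunk
```

One vector store, many scopes: the same index serves every department, which is the advantage over maintaining separate stores per category.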