GCP-GAIL: Cheatsheet
Exam Day Reference
Print this or review 5 minutes before the exam.
Vertex AI Components
| Component | Purpose |
|---|---|
| Model Garden | Browse and select AI models |
| Vertex AI Studio | Test prompts, adjust parameters |
| Vector Search | High-scale embeddings search (RAG) |
| AutoSxS | Compare model outputs side-by-side |
| Embeddings API | Generate text embeddings |
Gemini Models
| Model | Context Window | Best For |
|---|---|---|
| Gemini 1.5 Pro | 1M+ tokens | Long documents, complex reasoning |
| Gemini Pro | 32K tokens | General tasks |
| Gemini Flash | Varies | Fast, low latency, simple tasks |
| Gemini Ultra | Varies | Maximum capability |
Key: Gemini is multimodal (text, images, video, audio)
Core Parameters
| Parameter | Effect | Use Case |
|---|---|---|
| Temperature | Controls randomness | High (0.8+) for creative writing; Low (0.1) for technical data. |
| Top-K | Limits sampling to the K most likely tokens | Prevents the model from picking highly unlikely "long tail" tokens. |
| Top-P | Dynamic cutoff based on cumulative probability | Samples from the smallest set of tokens whose cumulative probability reaches P. |
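To make the three parameters concrete, here is a minimal sketch on a toy four-token distribution. The function name `filter_distribution`, the example logits, and the order of operations (temperature, then Top-K, then Top-P) are assumptions for illustration, not a description of how Vertex AI implements sampling.

```python
import math

def filter_distribution(logits, temperature=1.0, top_k=None, top_p=None):
    """Return the renormalized token distribution left after each filter.

    Hypothetical helper: real services may apply these steps differently.
    """
    # Temperature: divide logits before softmax. Low values sharpen the
    # distribution; high values flatten it.
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    probs = sorted(
        ((tok, e / total) for tok, e in exps.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )

    # Top-K: keep only the K most likely tokens.
    if top_k is not None:
        probs = probs[:top_k]

    # Top-P: keep the smallest prefix whose cumulative probability reaches P.
    if top_p is not None:
        kept, cumulative = [], 0.0
        for tok, p in probs:
            kept.append((tok, p))
            cumulative += p
            if cumulative >= top_p:
                break
        probs = kept

    # Renormalize so the surviving tokens sum to 1.
    norm = sum(p for _, p in probs)
    return {tok: p / norm for tok, p in probs}

# Toy logits, invented for this example.
toy_logits = {"the": 2.0, "a": 1.0, "cat": 0.0, "zebra": -3.0}
```

Calling `filter_distribution(toy_logits, top_k=2)` drops the two least likely tokens, and lowering `temperature` concentrates probability on the top token.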
Model Garden Providers
| Provider | Models | Type |
|---|---|---|
| Google | Gemini, Imagen, Codey, Chirp | First-party |
| Open-source | Llama, Mistral, Gemma | Open models |
| Third-party | Claude (Anthropic) | Partner models |
Gemma = Google's open-weight model (can download and run anywhere)
Customization Ladder (3 Steps)
Prompt Design (cheapest, fastest)
- Zero-shot: No examples
- Few-shot: 3-5 examples
Grounding/RAG (current data)
- Google Search
- BigQuery
- Document AI
Fine-Tuning (100+ examples)
- Supervised Fine-Tuning (SFT)
- Dataset in Cloud Storage (JSONL)
Always try in this order!
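JSONL means one JSON object per line. The sketch below serializes a tiny dataset that way. The field names `input_text` and `output_text` and the helper `to_jsonl` are hypothetical placeholders; check the tuning documentation for the schema your model expects. Uploading the file to Cloud Storage is not shown.

```python
import json

# Hypothetical training examples; field names are placeholders only.
examples = [
    {"input_text": "Classify sentiment: The setup was painless.", "output_text": "positive"},
    {"input_text": "Classify sentiment: The app keeps crashing.", "output_text": "negative"},
]

def to_jsonl(records):
    """Serialize records as JSONL: one compact JSON object per line."""
    return "\n".join(json.dumps(record) for record in records) + "\n"

jsonl_text = to_jsonl(examples)
```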
RAG Workflow (4 Steps)
- Generate embeddings (Vertex AI Embeddings API)
- Store in Vector Search (Matching Engine)
- Retrieve similar documents
- Ground LLM with context
Analogy: Library where similar books are on same shelf
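The four steps can be sketched end to end with toy stand-ins. Here `embed` is a bag-of-words counter standing in for the Embeddings API, and a plain Python list stands in for Vector Search; both are assumptions for illustration, as are the sample documents and function names.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embeddings API: bag-of-words term counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, index, k=1):
    """Step 3: return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_grounded_prompt(query, docs):
    """Step 4: place the retrieved text in the prompt as context."""
    context = "\n".join(f"- {doc}" for doc in docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

documents = [
    "refunds are processed within 5 business days",
    "our office is closed on public holidays",
    "shipping is free for orders over 50 dollars",
]
# Steps 1-2: embed each document and store it in the in-memory index.
index = [(doc, embed(doc)) for doc in documents]
```

A call such as `build_grounded_prompt(q, retrieve(q, index))` produces the text that would be sent to the model.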
Grounding Sources (3)
- Google Search: current events, web data
- BigQuery: structured data, analytics
- Document AI: your own documents/PDFs
Purpose: Connect model to verifiable source of truth
Vertex AI Studio Tabs
| Tab | Purpose |
|---|---|
| Language | Test prompts, adjust temperature, export code |
| Vision | Generate images (Imagen 2), visual Q&A |
| Speech | Text-to-speech, speech-to-text (Chirp) |
AutoSxS (Evaluation)
Problem: How to know if Model A > Model B?
Solution: Objective "judge" model compares outputs
Use for: Model selection, prompt comparison
Deployment & Monitoring
| Concept | Meaning |
|---|---|
| Endpoint | Deployed model accessible via API |
| Model Drift | Performance degrading over time |
| Monitoring | Track drift, hallucination rates |
Key: Models must be deployed to Endpoint for production use
Data Privacy (3 Rules)
- Google does NOT train foundation models on customer data
- Data stays in your GCP project
- Respects IAM permissions
Compliance: SOC 2, ISO 27001, GDPR
Limits & Quotas
| Resource | Default |
|---|---|
| Max Output Tokens | Varies by model (2048-8192) |
| Context Window | Gemini 1.5 Pro: 1M+ tokens |
| Gemini Capabilities | Text, images, video, audio |
Prompt Engineering Tips
Zero-shot: "Summarize this."
Few-shot: Give 3-5 input/output examples
Chain-of-Thought: "Think step by step"
Best Practices:
- Be specific
- Provide context
- Specify output format
- Add constraints
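The patterns above combine naturally in a small prompt builder. This is a minimal sketch; the function name `build_prompt`, the `Input:`/`Output:` labels, and the sample reviews are assumptions, not a required format.

```python
def build_prompt(instruction, new_input, examples=(), chain_of_thought=False):
    """Assemble a zero-shot or few-shot prompt as plain text.

    With no examples this is zero-shot; with 3-5 (input, output) pairs
    it becomes few-shot. Labels are illustrative only.
    """
    parts = [instruction]
    for example_input, example_output in examples:
        parts.append(f"Input: {example_input}\nOutput: {example_output}")
    if chain_of_thought:
        parts.append("Think step by step.")
    # End with the open slot the model is expected to complete.
    parts.append(f"Input: {new_input}\nOutput:")
    return "\n\n".join(parts)

few_shot_examples = [
    ("The setup was painless.", "positive"),
    ("The app keeps crashing.", "negative"),
    ("Delivery arrived on time.", "positive"),
]
prompt = build_prompt(
    "Classify each review as positive or negative. Reply with one word.",
    "Support never answered my ticket.",
    examples=few_shot_examples,
)
```

The instruction line carries the specificity, output format, and constraints; the examples carry the context.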
Common Mistakes to Avoid
- Fine-tuning for current data (use grounding)
- Fine-tuning for simple tasks (use prompting)
- Ignoring context window limits
- Not using grounding for factual accuracy
- Forgetting data privacy guarantees
Decision Trees
"Which customization?"
Simple task → Zero-shot
Custom format → Few-shot
Current data → Grounding
100+ examples → Fine-Tuning
"Which component?"
Explore → Model Garden
Test → Vertex AI Studio
RAG → Vector Search
Compare → AutoSxS
"Which model?"
Multimodal → Gemini
Long context → Gemini 1.5 Pro
Fast → Gemini Flash
Open-source → Llama/Mistral
Google open → Gemma
Last Updated: 2026-02-05