Skip to content

GCP-GAIL: Study Notes

← Back to Overview


Domain 1: Vertex AI Architecture

Model Garden

The "App Store" for AI models. It includes:

  • First-party models: Gemini, Imagen, Codey, Chirp.
  • Open-source models: Llama, Mistral, Gemma (Google's open-weight model).
  • Third-party models: Anthropic (Claude).

Vertex AI Studio

A low-code environment for rapid prototyping.

  • Language: Test prompts, adjust temperature, and export code to Python.
  • Vision: Generate images (Imagen 2) or perform visual Q&A.
  • Speech: Convert text to life-like speech or transcribe audio with Chirp.

Domain 2: Data Implementation & RAG

Embeddings

To perform RAG, you must convert text into numerical vectors.

  • Service: Vertex AI Embeddings API.
  • Analogy: A library where books with similar themes are placed on the same shelf.

Formerly known as Matching Engine.

  • Function: A high-scale, low-latency database for searching through millions of embeddings.
  • Integration: Essential for grounding LLMs in your own corporate data (PDFs, Databases).

Domain 3: Model Customization Techniques

1. Prompt Design (Zero/Few-Shot)

  • Zero-shot: "Summarize this." (No examples).
  • Few-shot: Giving 3-5 examples of input/output to "show" the model what you want.

2. Supervised Fine-Tuning (SFT)

  • When to use: When you have 100+ high-quality examples of how the model should behave.
  • Requirement: A dataset stored in Cloud Storage in JSONL format.

3. Grounding

  • Definition: Connecting the model to a verifiable source of truth.
  • Sources: Google Search, BigQuery, or your own Document AI index.

Domain 4: Operations (GenAIOps)

Evaluation (AutoSxS)

  • Problem: How do you know if Model A is better than Model B?
  • Solution: Vertex AI AutoSxS (Automatic Side-by-Side). An objective "judge" model compares the outputs of two models based on your criteria.

Deployment & Monitoring

  • Endpoints: Models must be deployed to an Endpoint to be accessible via API.
  • Safety: Monitoring for "Model Drift" or "Hallucination rates" over time.

Quick Reference: Limits & Quotas

ResourceDefault Logic
Max Output TokensVaries by model (e.g., 2048 or 8192).
Context WindowGemini 1.5 Pro supports up to 1M+ tokens.
Data PrivacyGoogle does not train foundation models on customer data.

← Back to Overview | Quick Refresher →

Happy Studying! 🚀 • We use privacy-friendly analytics (no cookies, no personal data) • Privacy Policy