GCP AI and ML Services

Scope

GCP managed AI and ML services covered here:

Vertex AI -- Model Garden (150+ models), generative AI studio, Gemini API (Gemini 1.5 Pro and Flash), custom model training, managed endpoints, pipelines, feature store, vector search, tuning (supervised and RLHF), grounding with Google Search and enterprise data
Google AI Studio -- rapid prototyping with Gemini models
Document AI -- form parsing, OCR, specialized processors for invoices, receipts, lending, and procurement
Vision AI -- image classification, object detection, OCR
Speech-to-Text and Text-to-Speech
Natural Language AI -- entity analysis, sentiment, syntax, classification
Recommendations AI -- retail recommendations
Translation AI
AlloyDB AI -- pgvector integration for vector search with PostgreSQL
BigQuery ML -- in-database ML (train and predict using SQL)

Does not cover TPU/GPU instance types; see patterns/ai-ml-infrastructure.md.

Checklist

Why This Matters

GCP's Gemini models offer some of the largest context windows in the industry (1M+ tokens for Gemini 1.5 Pro), enabling use cases that are impractical with smaller windows, such as processing entire codebases, lengthy legal documents, or video content in a single request. However, the distinction between Vertex AI and the direct Gemini API matters: Vertex AI provides enterprise controls (VPC Service Controls, customer-managed encryption keys, IAM, audit logging) that the direct Gemini API does not. Organizations that prototype in Google AI Studio must re-architect for Vertex AI when moving to production if enterprise security controls are required.
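To make the Vertex AI versus direct Gemini API distinction concrete, the sketch below builds the REST target for the same `generateContent` call on each surface. It is illustrative only: the endpoint shapes and the `x-goog-api-key` header reflect the public REST documentation as commonly described and should be checked against the current references, and the model, project, location, and credential values in the usage note are hypothetical placeholders. No network call is made.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeminiRequestTarget:
    """Where a generateContent request is sent and how it is authenticated."""
    url: str
    headers: dict


def developer_api_target(model: str, api_key: str) -> GeminiRequestTarget:
    # Direct Gemini API: one global host, authenticated with an API key.
    # There is no project- or region-scoped resource path.
    url = (
        "https://generativelanguage.googleapis.com/v1beta/"
        f"models/{model}:generateContent"
    )
    return GeminiRequestTarget(url, {"x-goog-api-key": api_key})


def vertex_ai_target(
    model: str, project: str, location: str, access_token: str
) -> GeminiRequestTarget:
    # Vertex AI: regional host plus a project/location resource path,
    # authenticated with an OAuth 2.0 access token tied to an IAM identity.
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project}/locations/{location}/"
        f"publishers/google/models/{model}:generateContent"
    )
    return GeminiRequestTarget(url, {"Authorization": f"Bearer {access_token}"})
```

For example, `vertex_ai_target("gemini-1.5-flash", "example-project", "us-central1", token)` yields a URL scoped to a project and region. That scoping is what lets controls such as IAM and audit logging apply per project, whereas the API-key target carries no such scope.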

GCP's AI portfolio is deeply integrated with its data stack: BigQuery ML lets data analysts build models without leaving SQL, AlloyDB AI adds vector search to PostgreSQL workloads, and Vertex AI Feature Store shares features between training and serving. This integration is GCP's differentiator for organizations already invested in BigQuery and Cloud Storage for their data platform. Document AI processors are pre-trained for document understanding and often outperform general-purpose LLMs on structured extraction tasks (invoices, forms, IDs) at a fraction of the cost.
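As a rough illustration of "building models without leaving SQL", the sketch below assembles the two BigQuery ML statements a typical workflow uses: `CREATE MODEL` to train and `ML.PREDICT` to score. The helper names and the dataset, table, and column names are hypothetical, and logistic regression is an arbitrary choice of model type. The functions only build SQL strings; running them would require a BigQuery client and a real dataset.

```python
def create_model_sql(model: str, training_table: str, label_column: str) -> str:
    """Build a BigQuery ML training statement (logistic regression)."""
    # Identifiers are interpolated directly, so this sketch assumes they are
    # trusted constants, not user input.
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = 'LOGISTIC_REG', "
        f"input_label_cols = ['{label_column}']) AS\n"
        f"SELECT * FROM `{training_table}`"
    )


def predict_sql(model: str, input_table: str) -> str:
    """Build a BigQuery ML batch prediction statement."""
    return (
        f"SELECT * FROM ML.PREDICT(MODEL `{model}`, "
        f"(SELECT * FROM `{input_table}`))"
    )


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(create_model_sql("demo.churn_model", "demo.customers_train", "churned"))
    print(predict_sql("demo.churn_model", "demo.customers_new"))
```

Both statements run inside BigQuery, so the training data never leaves the warehouse. That is the trade-off against Vertex AI training, which offers fuller MLOps tooling at the cost of moving data and managing pipelines.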

Common Decisions (ADR Triggers)

Vertex AI vs direct Gemini API -- enterprise controls vs simplicity
Gemini Pro vs Flash vs open-source models from Model Garden -- quality vs cost vs control
Vertex AI Search vs AlloyDB AI (pgvector) vs custom vector database for RAG -- managed retrieval vs SQL-native vectors vs self-managed
Grounding with Google Search vs enterprise-only grounding -- real-time factual accuracy vs data privacy
Document AI pre-trained processors vs Gemini for document understanding -- structured extraction vs flexible LLM
BigQuery ML vs Vertex AI training for tabular ML workloads -- SQL-native vs full MLOps
Supervised fine-tuning vs adapter tuning vs distillation -- model customization approach
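For the RAG retrieval decision in the list above, every option ultimately ranks stored embeddings by their distance to a query embedding. The sketch below is a minimal, pure-Python version of that ranking using cosine distance, the metric behind pgvector's `<=>` operator. It is a brute-force illustration with made-up two-dimensional vectors, not how any managed service implements search; real deployments use approximate indexes and much higher-dimensional embeddings.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query: list[float], docs: dict[str, list[float]], k: int) -> list[str]:
    """Return the ids of the k documents closest to the query embedding."""
    # Equivalent in spirit to: ORDER BY embedding <=> query LIMIT k
    ranked = sorted(docs, key=lambda doc_id: cosine_distance(query, docs[doc_id]))
    return ranked[:k]


if __name__ == "__main__":
    # Toy embeddings, for illustration only.
    corpus = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    print(top_k([1.0, 0.0], corpus, k=2))
```

The choice between the three options is mostly about who operates the index and where the vectors live, since the ranking logic itself is the same in each case.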

Reference Links

Vertex AI documentation -- model garden, managed endpoints, pipelines, feature store, and vector search
Gemini API documentation -- Gemini model family, prompting, tuning, and grounding
Google AI Studio -- rapid prototyping with Gemini models
Document AI documentation -- form parsing, OCR, and specialized processors
AlloyDB AI documentation -- pgvector integration and Vertex AI embeddings for vector search
BigQuery ML documentation -- in-database ML training and prediction using SQL