Vertex AI Backend

Route any compatible CLI runtime through Google Cloud Vertex AI instead of direct provider APIs.

Why

Single-vendor billing and audit trail (Cloud Logging captures every call)
EU data residency by default (europe-west1)
IAM auth instead of per-provider API keys (no ANTHROPIC_API_KEY to rotate)
Model Garden access: Anthropic Claude + Google Gemini + Llama + Mistral + others, all under one auth
Compliance: SOC 2, ISO 27001, HIPAA, and EU GDPR, which your security team has probably already approved

Setup

# 1. Authenticate (any one of these)
gcloud auth application-default login        # interactive
# or, in CI / Cloud Run:
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

# 2. Tell agentspec which project to use
export GOOGLE_CLOUD_PROJECT=my-project

# 3. (Optional) override the region; the default is europe-west1
export GOOGLE_CLOUD_LOCATION=europe-west4

That's it. AgentSpec auto-detects the configuration. Verify with the command below; the sample output assumes step 3 was skipped, so it shows the default region:

agentspec resolve my-agent.agent
# decisions:
#   Vertex AI detected: vertex-ai (project=my-project, region=europe-west1)
#   selected claude/claude-sonnet-4-6 via claude-code (Vertex AI: europe-west1)
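
If the resolver output does not mention Vertex AI, a quick preflight can narrow down which setup step is missing. The sketch below is a hypothetical helper, not part of agentspec. It assumes a Linux or macOS shell, where `gcloud auth application-default login` writes its credentials to `~/.config/gcloud/application_default_credentials.json`, and it only checks `GOOGLE_CLOUD_PROJECT`, as in step 2. Environments that obtain credentials from a metadata server will fail this check even though they are configured correctly.

```shell
# Hypothetical preflight helper (an illustration, not an agentspec command).
vertex_preflight() {
  local adc_default="${HOME}/.config/gcloud/application_default_credentials.json"
  local status=0

  # Step 1: credentials. Accept either an explicit key file or the file
  # written by `gcloud auth application-default login`.
  if [ -n "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
    if [ ! -r "${GOOGLE_APPLICATION_CREDENTIALS}" ]; then
      echo "credentials file not readable: ${GOOGLE_APPLICATION_CREDENTIALS}" >&2
      status=1
    fi
  elif [ ! -r "${adc_default}" ]; then
    echo "no application-default credentials found" >&2
    status=1
  fi

  # Step 2: project.
  if [ -z "${GOOGLE_CLOUD_PROJECT:-}" ]; then
    echo "GOOGLE_CLOUD_PROJECT is not set" >&2
    status=1
  fi

  return "$status"
}
```

Calling `vertex_preflight && agentspec resolve my-agent.agent` runs the resolver only when both checks pass.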

Per-runtime mapping

When AgentSpec routes through Vertex AI, it injects the right env vars for each CLI:

| CLI | Mechanism | Env vars set |
|---|---|---|
| claude-code | Anthropic's official Vertex mode | `CLAUDE_CODE_USE_VERTEX=1`, `ANTHROPIC_VERTEX_PROJECT_ID`, `CLOUD_ML_REGION` |
| gemini-cli | Google's official Vertex mode | `GOOGLE_GENAI_USE_VERTEXAI=true`, `GOOGLE_CLOUD_PROJECT`, `GOOGLE_CLOUD_LOCATION` |
| aider | Via LiteLLM Vertex provider | `VERTEX_PROJECT`, `VERTEX_LOCATION` |
| opencode | Standard GCP env vars | `GOOGLE_CLOUD_PROJECT`, `GOOGLE_CLOUD_LOCATION` |
| codex-cli | ❌ OpenAI not on Vertex Model Garden | (uses OpenAI direct API) |
| ollama | ❌ Local model | (no Vertex routing) |
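
To reproduce a Vertex-routed session by hand, the mapping above can be written out as a small shell function. This is a sketch of the table only, not AgentSpec's actual injection code; the function name `vertex_env_for` and its argument order are made up for illustration.

```shell
# Hypothetical helper mirroring the per-runtime table above.
# Usage: vertex_env_for <cli> <project> <region>
vertex_env_for() {
  local cli="$1" project="$2" region="$3"
  case "$cli" in
    claude-code)
      printf 'export CLAUDE_CODE_USE_VERTEX=1\n'
      printf 'export ANTHROPIC_VERTEX_PROJECT_ID=%s\n' "$project"
      printf 'export CLOUD_ML_REGION=%s\n' "$region"
      ;;
    gemini-cli)
      printf 'export GOOGLE_GENAI_USE_VERTEXAI=true\n'
      printf 'export GOOGLE_CLOUD_PROJECT=%s\n' "$project"
      printf 'export GOOGLE_CLOUD_LOCATION=%s\n' "$region"
      ;;
    aider)
      printf 'export VERTEX_PROJECT=%s\n' "$project"
      printf 'export VERTEX_LOCATION=%s\n' "$region"
      ;;
    opencode)
      printf 'export GOOGLE_CLOUD_PROJECT=%s\n' "$project"
      printf 'export GOOGLE_CLOUD_LOCATION=%s\n' "$region"
      ;;
    *)
      # codex-cli, ollama, and unknown CLIs have no Vertex routing.
      return 1
      ;;
  esac
}
```

Running `vertex_env_for claude-code my-project europe-west1` prints the three export lines from the claude-code row; a non-zero exit status means the CLI has no Vertex mode in the table.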

Routing precedence

When both Vertex AI and direct provider API keys are configured, Vertex AI wins for routable providers (claude, anthropic, gemini, google).

For non-routable providers (openai, local), AgentSpec uses the direct API even when Vertex is configured.
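
The routable/non-routable split comes down to the provider prefix of the model ID. A minimal sketch of that check, assuming model IDs have the `provider/model` form used elsewhere on this page; `is_vertex_routable` is an illustrative name, not an agentspec command:

```shell
# Hypothetical check: succeeds when the model's provider prefix is one of
# the routable providers listed above.
is_vertex_routable() {
  # "${1%%/*}" keeps everything before the first "/",
  # e.g. "claude" from "claude/claude-sonnet-4-6".
  case "${1%%/*}" in
    claude|anthropic|gemini|google) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_vertex_routable claude/claude-sonnet-4-6` succeeds, while `is_vertex_routable openai/o3` and `is_vertex_routable local/llama3:70b` fail.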

Region selection

Default: europe-west1 (Belgium), the EU region with the broadest model coverage and the primary choice for GDPR-bound workloads.

Other useful EU regions:

| Region | Notes |
|---|---|
| `europe-west1` | Belgium. Default. Best Gemini + Claude availability. |
| `europe-west4` | Netherlands. Alternative when residency requirements call for a different EU region. |
| `europe-southwest1` | Madrid. Latency-friendly for ES/PT workloads. |

Set explicitly when needed:

export GOOGLE_CLOUD_LOCATION=europe-west4
# or
export AGENTSPEC_VERTEX_LOCATION=europe-west4

Env var precedence (project)

Highest to lowest:

AGENTSPEC_VERTEX_PROJECT (explicit, agentspec-specific)
GOOGLE_CLOUD_PROJECT (standard GCP env)
VERTEX_PROJECT (some tooling uses this)

Same for location: AGENTSPEC_VERTEX_LOCATION > GOOGLE_CLOUD_LOCATION > VERTEX_LOCATION > europe-west1 (default).
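
The two precedence chains above map directly onto nested shell parameter expansion. The sketch below is an illustration of the lookup order, not AgentSpec's resolver; the function names are hypothetical, and it assumes that an empty variable counts as unset (which is what `:-` does).

```shell
# Hypothetical helpers showing the lookup order described above.
resolve_vertex_project() {
  # First non-empty value wins; prints an empty line if none is set.
  printf '%s\n' "${AGENTSPEC_VERTEX_PROJECT:-${GOOGLE_CLOUD_PROJECT:-${VERTEX_PROJECT:-}}}"
}

resolve_vertex_location() {
  # Same chain for the region, ending in the documented default.
  printf '%s\n' "${AGENTSPEC_VERTEX_LOCATION:-${GOOGLE_CLOUD_LOCATION:-${VERTEX_LOCATION:-europe-west1}}}"
}
```

With only `GOOGLE_CLOUD_PROJECT=my-project` set, `resolve_vertex_project` prints `my-project` and `resolve_vertex_location` prints `europe-west1`.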

What about Model Garden Anthropic models?

Vertex Model Garden serves Anthropic Claude models on EU-resident infrastructure. AgentSpec recognizes provider prefixes claude/ and anthropic/ and routes them to claude-code with Vertex env vars. The actual model you request must be available in your region.

Check availability: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models

Disabling

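Unset the project env vars. If you set the project through AGENTSPEC_VERTEX_PROJECT or VERTEX_PROJECT (see Env var precedence), unset those as well:

unset AGENTSPEC_VERTEX_PROJECT GOOGLE_CLOUD_PROJECT VERTEX_PROJECT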

Or set a non-routable model in your .agent file:

model:
  preferred:
    - openai/o3              # uses codex-cli direct, never Vertex
    - local/llama3:70b       # uses ollama, never Vertex

Verifying it actually went through Vertex

After running an agent, check Cloud Logging:

gcloud logging read 'resource.type="aiplatform.googleapis.com/Endpoint"' \
  --project="$GOOGLE_CLOUD_PROJECT" --limit=5

You should see your model invocations there. If you don't, AgentSpec fell back to the direct API. Check the resolver decisions with agentspec resolve --output json and look at auth_source.
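
The shape of the resolver's JSON output is not described on this page, so the sketch below does not assume where `auth_source` sits in the document. It simply greps the field by name from whatever is piped in. The helper name `extract_auth_source` is hypothetical, and the pattern assumes the value is a JSON string.

```shell
# Hypothetical filter: reads JSON on stdin and prints every
# "auth_source": "<value>" pair it finds, regardless of nesting.
extract_auth_source() {
  grep -o '"auth_source"[[:space:]]*:[[:space:]]*"[^"]*"'
}
```

A possible invocation, assuming the agent file is passed the same way as in the Setup section, is `agentspec resolve my-agent.agent --output json | extract_auth_source`. A non-zero exit status means no `auth_source` string was found in the output.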