Prerequisites

Before configuring Claude Code with Vertex AI, ensure you have:

  • A Google Cloud Platform (GCP) account with billing enabled
  • A GCP project with the Vertex AI API enabled
  • Access to the desired Claude models (e.g., Claude Sonnet 4)
  • The Google Cloud SDK (gcloud) installed and configured
  • Quota allocated in your desired GCP region

Vertex AI may not support the Claude Code default models in regions other than us-east5. Either use us-east5 and make sure you have quota allocated there, or switch to models that are supported in your region.

Setup

1. Enable Vertex AI API

Enable the Vertex AI API in your GCP project:

# Set your project ID
gcloud config set project YOUR-PROJECT-ID

# Enable Vertex AI API
gcloud services enable aiplatform.googleapis.com
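
To confirm that the API is actually enabled before moving on, a check along these lines may help. This is a minimal sketch: vertex_api_enabled is a hypothetical helper name, and the filter assumes the service is listed under config.name in the output of gcloud services list.

```shell
# Hypothetical helper: succeeds only if the Vertex AI API shows up
# among the enabled services of the currently configured project.
vertex_api_enabled() {
  gcloud services list --enabled \
    --filter="config.name=aiplatform.googleapis.com" \
    --format="value(config.name)" | grep -qx "aiplatform.googleapis.com"
}
```

For example, `vertex_api_enabled && echo "Vertex AI API is enabled"` prints the message only when the service is listed.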

2. Request model access

Request access to Claude models in Vertex AI:

  1. Navigate to the Vertex AI Model Garden
  2. Search for “Claude” models
  3. Request access to desired Claude models (e.g., Claude Sonnet 4)
  4. Wait for approval (may take 24-48 hours)

3. Configure GCP credentials

Claude Code uses standard Google Cloud authentication.

For more information, see Google Cloud authentication documentation.

When authenticating, Claude Code will automatically use the project ID from the ANTHROPIC_VERTEX_PROJECT_ID environment variable. To override this, set one of these environment variables: GCLOUD_PROJECT, GOOGLE_CLOUD_PROJECT, or GOOGLE_APPLICATION_CREDENTIALS.
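
For local development, user credentials are commonly created with gcloud auth application-default login. The sketch below reports which local credential source is likely to be picked up. find_adc_source is a hypothetical helper, and the file path assumes the default gcloud configuration directory on Linux or macOS (it differs on Windows or when CLOUDSDK_CONFIG is set).

```shell
# Hypothetical helper: report which local credential source is present.
# "none" only means no local file was found; credentials may still come
# from the metadata server when running on Google Cloud compute.
find_adc_source() {
  # Assumed default location of the file written by
  # `gcloud auth application-default login` on Linux and macOS.
  adc_file="$HOME/.config/gcloud/application_default_credentials.json"
  if [ -n "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
    if [ -f "$GOOGLE_APPLICATION_CREDENTIALS" ]; then
      echo "key-file"
    else
      echo "key-file-missing"
    fi
  elif [ -f "$adc_file" ]; then
    echo "user-adc"
  else
    echo "none"
  fi
}
```

A result of key-file-missing usually points to a stale GOOGLE_APPLICATION_CREDENTIALS path, which is worth fixing before debugging anything else.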

4. Configure Claude Code

Set the following environment variables:

# Enable Vertex AI integration
export CLAUDE_CODE_USE_VERTEX=1
export CLOUD_ML_REGION=us-east5
export ANTHROPIC_VERTEX_PROJECT_ID=YOUR-PROJECT-ID

# Optional: Disable prompt caching if needed
export DISABLE_PROMPT_CACHING=1

# Optional: Override regions for specific models
export VERTEX_REGION_CLAUDE_3_5_HAIKU=us-central1
export VERTEX_REGION_CLAUDE_3_5_SONNET=us-east5
export VERTEX_REGION_CLAUDE_3_7_SONNET=us-east5
export VERTEX_REGION_CLAUDE_4_0_OPUS=europe-west4
export VERTEX_REGION_CLAUDE_4_0_SONNET=us-east5
export VERTEX_REGION_CLAUDE_4_1_OPUS=europe-west4

Prompt caching is supported automatically when you specify the cache_control ephemeral flag. To disable it, set DISABLE_PROMPT_CACHING=1. For higher rate limits, contact Google Cloud support.

When using Vertex AI, the /login and /logout commands are disabled since authentication is handled through Google Cloud credentials.
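
Before launching Claude Code, it can be useful to confirm that the required variables from this step are set in the current shell. The following is a minimal sketch; check_vertex_env is a hypothetical helper and not part of Claude Code.

```shell
# Hypothetical preflight check for the three required variables.
# Prints one "missing: NAME" line per unset variable and returns non-zero
# if any are missing; otherwise prints a one-line summary.
check_vertex_env() {
  missing=0
  for name in CLAUDE_CODE_USE_VERTEX CLOUD_ML_REGION ANTHROPIC_VERTEX_PROJECT_ID; do
    # Read the variable whose name is stored in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  if [ "$missing" -eq 0 ]; then
    echo "ok: project=$ANTHROPIC_VERTEX_PROJECT_ID region=$CLOUD_ML_REGION"
  fi
  return "$missing"
}
```

One way to use it is `check_vertex_env && claude`, so that Claude Code starts only when all three variables are present.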

5. Model configuration

Claude Code uses these default models for Vertex AI:

Model type          Default value
Primary model       claude-sonnet-4@20250514
Small/fast model    claude-3-5-haiku@20241022

To customize models:

export ANTHROPIC_MODEL='claude-opus-4-1@20250805'
export ANTHROPIC_SMALL_FAST_MODEL='claude-3-5-haiku@20241022'
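
The model IDs in this guide use an @ before the date stamp, whereas Anthropic API model IDs use a hyphen there (for example, claude-sonnet-4-20250514), so pasting the wrong form is an easy mistake. The sketch below flags IDs that do not look like the ones shown here. is_vertex_model_id is a hypothetical helper, and the name@YYYYMMDD pattern is inferred from the examples in this guide rather than from a documented rule.

```shell
# Hypothetical helper: succeeds if the ID looks like claude-<name>@<8 digits>.
# The pattern is an assumption based on the IDs shown in this guide.
is_vertex_model_id() {
  case $1 in
    claude-*@[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Warn about any override that is set but does not match the pattern.
for id in "${ANTHROPIC_MODEL:-}" "${ANTHROPIC_SMALL_FAST_MODEL:-}"; do
  if [ -n "$id" ] && ! is_vertex_model_id "$id"; then
    echo "unexpected model ID format: $id"
  fi
done
```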

IAM configuration

Assign the required IAM permissions to your account. The predefined roles/aiplatform.user role includes them:

  • aiplatform.endpoints.predict - Required for model invocation
  • aiplatform.endpoints.computeTokens - Required for token counting

For more restrictive permissions, create a custom role with only the permissions above.

For details, see Vertex IAM documentation.
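
The two options above could be expressed with gcloud roughly as follows. This sketch only prints the commands so they can be reviewed before running. print_iam_commands is a hypothetical helper, and the project ID, the member, and the custom role ID claudeCodeVertex are placeholders.

```shell
# Hypothetical helper: print (do not run) gcloud commands for either
# granting the predefined role or creating a narrower custom role.
print_iam_commands() {
  project_id=$1   # e.g. YOUR-PROJECT-ID
  member=$2       # e.g. user:you@example.com
  # Option 1: grant the predefined Vertex AI User role.
  echo "gcloud projects add-iam-policy-binding $project_id --member=$member --role=roles/aiplatform.user"
  # Option 2: create a custom role with only the two permissions listed above.
  echo "gcloud iam roles create claudeCodeVertex --project=$project_id --title='Claude Code Vertex' --permissions=aiplatform.endpoints.predict,aiplatform.endpoints.computeTokens"
}

print_iam_commands YOUR-PROJECT-ID user:you@example.com
```

If you take the custom-role route, the new role would still need to be bound to the member, using a role name of the form projects/YOUR-PROJECT-ID/roles/claudeCodeVertex.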

We recommend creating a dedicated GCP project for Claude Code to simplify cost tracking and access control.

Troubleshooting

If you encounter quota issues:

  • Check your current quotas or request a quota increase through the Cloud Console

If you encounter “model not found” 404 errors:

  • Verify you have access to the specified region
  • Confirm the model is Enabled in Model Garden

If you encounter 429 errors:

  • Ensure the primary model and small/fast model are supported in your selected region

Additional resources