GCP Vertex AI SLA Credits & Refunds Guide

How the GCP Vertex AI SLA works: uptime tiers, exclusions, claim windows, and how to recover the credits you're owed when Vertex AI goes down.

Google Cloud publishes a service-specific SLA for Vertex AI that describes exactly when an AI/ML workload qualifies for credits — and the thresholds are stricter than most teams realize. This guide breaks down the Vertex AI commitment, what Google considers a downtime period, and how to file a financial credit request through the Cloud Console.

What this guide covers

  • The official GCP Vertex AI uptime commitment and credit tiers
  • Which incidents qualify (and which exclusions silently disqualify claims)
  • How to file a Vertex AI credit request inside the GCP claim window
  • Why manual claim recovery typically leaves money on the table

Frequently asked questions about GCP Vertex AI SLAs

What is the typical SLA uptime guarantee for GCP Vertex AI?

Vertex AI uptime commitments vary by sub-service. Google's published commitments are 99.9% monthly uptime for online prediction endpoints serving generally available models, 99.5% for the Vertex AI training service, and separate tiers for Matching Engine, Generative AI APIs (Gemini), and Pipelines. Batch prediction and Workbench are generally not covered by an uptime SLA. Check the per-service line in Google's SLA index for the exact figure on the surface you are using.

How do I claim GCP Vertex AI SLA credits after an outage?

File a Financial Credit Request through Google Cloud Support within 30 days of the end of the affected billing month. That deadline is shorter than the equivalent AWS and Azure windows, which catches a lot of teams out. Include your Project ID, the affected Vertex AI resources, the downtime intervals (with timezone), supporting evidence from Cloud Monitoring or your own observability stack, and a calculation showing where Monthly Uptime Percentage fell below the SLA threshold. Google issues approved credits against your billing account, not as cash refunds.
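As a rough illustration of the calculation a claim needs, the sketch below computes a minute-based uptime percentage for one month and the last day to file. It rests on assumptions that are not taken from Google's SLA text: uptime is treated as (total minutes minus downtime minutes) divided by total minutes, the billing month is treated as a calendar month in UTC, and the function names are hypothetical. Google's per-service definition of Monthly Uptime Percentage may differ, so check it before relying on these numbers.

```python
import calendar
from datetime import datetime, timedelta, timezone


def monthly_uptime_percentage(year, month, downtime_intervals):
    """Minute-based uptime for one calendar month (assumed formula).

    downtime_intervals: list of (start, end) timezone-aware datetimes.
    Assumes the intervals do not overlap and all fall inside the month.
    """
    days_in_month = calendar.monthrange(year, month)[1]
    total_minutes = days_in_month * 24 * 60
    down_minutes = sum(
        (end - start).total_seconds() / 60 for start, end in downtime_intervals
    )
    return 100.0 * (total_minutes - down_minutes) / total_minutes


def claim_deadline(year, month, window_days=30):
    """Last moment to file: end of the month plus the claim window.

    Assumes the billing month is a calendar month that ends at 23:59:59 UTC.
    """
    days_in_month = calendar.monthrange(year, month)[1]
    month_end = datetime(year, month, days_in_month, 23, 59, 59, tzinfo=timezone.utc)
    return month_end + timedelta(days=window_days)


# Example: a single 90-minute outage in June 2024.
outage = (
    datetime(2024, 6, 10, 14, 0, tzinfo=timezone.utc),
    datetime(2024, 6, 10, 15, 30, tzinfo=timezone.utc),
)
uptime = monthly_uptime_percentage(2024, 6, [outage])
print(f"Uptime: {uptime:.4f}%  (99.9% threshold breached: {uptime < 99.9})")
print(f"File by: {claim_deadline(2024, 6).isoformat()}")
```

In a 30-day month a 99.9% commitment leaves about 43 minutes of allowable downtime, so under these assumptions the 90-minute outage in the example falls below the threshold.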

What exclusions apply to the GCP Vertex AI SLA?

For Vertex AI specifically, three categories are explicitly excluded: preview and experimental models, custom training jobs that fail because of your own code or container image, and quota-exhaustion errors (such as exceeding requests-per-minute on Gemini APIs). Only platform-level unavailability of GA endpoints counts.
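One way to keep excluded traffic out of a claim is to filter request records before counting failures. The sketch below is illustrative only: the `RequestRecord` type and its `model_stage` label are hypothetical, and treating HTTP 429 as quota exhaustion and HTTP 5xx as a platform-level failure is an assumption, not wording from Google's SLA.

```python
from dataclasses import dataclass


@dataclass
class RequestRecord:
    status_code: int   # HTTP status returned by the endpoint
    model_stage: str   # hypothetical label: "ga" or "preview"


def counts_toward_downtime(record):
    """Return True only for failures that plausibly qualify under the SLA."""
    if record.model_stage != "ga":
        return False  # preview/experimental models are excluded
    if record.status_code == 429:
        return False  # assumed to mean quota exhaustion, which is excluded
    # Assumption: server-side 5xx responses indicate platform unavailability.
    return 500 <= record.status_code <= 599


records = [
    RequestRecord(503, "ga"),
    RequestRecord(429, "ga"),
    RequestRecord(503, "preview"),
    RequestRecord(200, "ga"),
]
qualifying = [r for r in records if counts_toward_downtime(r)]
print(f"{len(qualifying)} of {len(records)} records count toward downtime")
```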

Why is it difficult to get refunds for Vertex AI outages manually?

AI/ML SLAs are still maturing, and Vertex AI carries some of the most nuanced terms in the cloud catalog. Rate limits, queue depths, and model availability all get measured differently, and the SLA often excludes throttling that the provider deems "expected." Teams that successfully claim Vertex AI credits do so by capturing per-request latency and error-code data and matching it precisely against the published terms.
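To show what matching per-request data against the terms can look like, the sketch below groups request logs into one-minute buckets, flags minutes whose server-error rate exceeds a threshold, and merges consecutive flagged minutes into downtime intervals. The 5% threshold, the `(timestamp, status_code)` log shape, and the function names are all assumptions made for illustration; substitute the error-rate definition from the SLA that applies to your service.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

ERROR_RATE_THRESHOLD = 0.05  # hypothetical value, not taken from the SLA


def bad_minutes(requests, threshold=ERROR_RATE_THRESHOLD):
    """Return the sorted minutes whose 5xx error rate exceeds the threshold.

    requests: iterable of (timestamp, status_code) pairs.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for timestamp, status_code in requests:
        minute = timestamp.replace(second=0, microsecond=0)
        totals[minute] += 1
        if 500 <= status_code <= 599:
            errors[minute] += 1
    return sorted(m for m in totals if errors[m] / totals[m] > threshold)


def merge_into_intervals(minutes):
    """Merge consecutive flagged minutes into (start, end) intervals."""
    intervals = []
    for minute in minutes:
        if intervals and intervals[-1][1] == minute:
            # This minute starts exactly where the last interval ends: extend it.
            intervals[-1] = (intervals[-1][0], minute + timedelta(minutes=1))
        else:
            intervals.append((minute, minute + timedelta(minutes=1)))
    return intervals


base = datetime(2024, 6, 10, 14, 0, tzinfo=timezone.utc)
sample_log = (
    [(base + timedelta(seconds=s), 503) for s in range(0, 120, 12)]     # 14:00-14:01 failing
    + [(base + timedelta(minutes=2, seconds=s), 200) for s in range(10)]  # 14:02 healthy
)
for start, end in merge_into_intervals(bad_minutes(sample_log)):
    print(f"Downtime from {start.isoformat()} to {end.isoformat()}")
```

Intervals in this form, recorded with an explicit timezone, are the kind of evidence a credit request asks for.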

Don't miss GCP's 30-day claim window

GCP's claim deadline for Vertex AI is the shortest of the three major clouds, and most teams miss it for the same reason: nobody owns "file SLA credit requests" as a recurring task. By the time finance closes out the month, the window is already gone.

Next Signal monitors Vertex AI availability, files the Financial Credit Request inside Google's deadline, and tracks the claim through resolution. See how it works or start a free trial.