Cloud Blog: Build with more flexibility: New open models arrive in the Vertex AI Model Garden

Jul 16, 2025

—

Source URL: https://cloud.google.com/blog/products/ai-machine-learning/deepseek-r1-is-available-for-everyone-in-vertex-ai-model-garden/
Source: Cloud Blog
Title: Build with more flexibility: New open models arrive in the Vertex AI Model Garden

Feedly Summary: In our ongoing effort to provide businesses with the flexibility and choice needed to build innovative AI applications, we are expanding the catalog of open models available as Model-as-a-Service (MaaS) offerings in Vertex AI Model Garden. Following the addition of Llama 4 models earlier this year, we are announcing DeepSeek R1 is available for everyone through our Model-as-a-Service (MaaS) offering. This expansion reinforces our commitment to an open AI ecosystem, ensuring our customers can access a diverse range of powerful models to find the one best suited for their specific use case.
Deploying and managing today’s large-scale models presents operational and financial challenges. For instance, a large model such as DeepSeek R1 can require an infrastructure of eight advanced H200 GPUs to run inference. For many organizations, procuring and managing such resources is a major undertaking that can divert focus from core application development.
Vertex AI’s MaaS offering is designed to remove this complexity. By providing these models as fully managed, serverless APIs, we eliminate the need for customers to provision or manage the underlying infrastructure. This allows your teams to bypass the complexities of GPU management and focus directly on building and innovating. With Vertex AI, you benefit from a secure, enterprise-grade platform with built-in data privacy and compliance, all under a flexible, pay-as-you-go pricing model that scales with your needs.


Getting started
Below we provide a step-by-step guide on how you can use open models available on MaaS. We have used DeepSeek R1 on Vertex AI as an example. It can be accessed both via the UI and API.
1. Enable the DeepSeek API Service
Navigate to the DeepSeek API Service from the Vertex AI Model Garden and click on the tile to open the model card. Then, enable access to the DeepSeek API Service. It may take a few minutes for permissions to propagate after enablement.

DeepSeek API Service from the Vertex AI Model Garden
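
Because permissions may take a few minutes to propagate after enablement, the first requests can fail even though the service is enabled. A minimal sketch of a retry-with-backoff wrapper follows, assuming the caller supplies the request function and decides which errors are transient. The names `call_with_backoff` and `is_retryable` are hypothetical and not part of any Vertex AI library.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    request_fn: Callable[[], T],
    is_retryable: Callable[[Exception], bool],
    max_attempts: int = 5,
    initial_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request_fn, retrying with exponential backoff on retryable errors."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return request_fn()
        except Exception as exc:
            # Re-raise on the last attempt, or when the caller does not
            # consider this error transient.
            if attempt == max_attempts or not is_retryable(exc):
                raise
            sleep(delay)
            delay *= 2  # double the wait before the next attempt
```

With the OpenAI client used later in this guide, `is_retryable` could check for `openai.PermissionDeniedError`, on the assumption that a permission that has not yet propagated surfaces as an HTTP 403.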

2. Try out the model via the UI
Navigate to the DeepSeek API Service from the Vertex AI Model Garden and click on the tile to open the model card. You can use the UI in the sidebar to test the service.

DeepSeek API Service with UI sidebar to test the service

3. Try out the model via Vertex AI API
To integrate DeepSeek R1 within your applications, you can use either the REST API or the OpenAI Python API client library. Note: To protect your data, the DeepSeek MaaS endpoint does not have any outbound internet access.
Get Predictions via the REST API
You can make API requests with curl from Cloud Shell or from your own machine with gcloud credentials configured. Remember to replace the placeholders in this code:

    export PROJECT_ID=<ENTER_PROJECT_ID>
    export REGION_ID=<ENTER_REGION_ID>

    curl \
    -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    "https://${REGION_ID}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${REGION_ID}/endpoints/openapi/chat/completions" \
    -d '{
      "model": "deepseek-ai/deepseek-r1-0528-maas",
      "max_tokens": 200,
      "stream": true,
      "messages": [
        {
          "role": "user",
          "content": "which is bigger - 9.11 or 9.9"
        }
      ]
    }'
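
With "stream": true, the response arrives incrementally rather than as a single JSON document. A minimal sketch of reassembling the streamed text follows. It assumes the endpoint uses the OpenAI-style server-sent-events convention: lines prefixed with `data:`, each carrying a JSON chunk with `choices[].delta.content`, and a final `data: [DONE]` line. The helper name `iter_stream_text` is hypothetical, and the sample lines are hand-written for illustration, not captured model output.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragments carried by OpenAI-style streaming lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample lines (illustrative only).
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant", "content": "9.9"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": " is bigger"}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

In practice the lines would come from the HTTP response body, read one line at a time.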

Get Predictions via the OpenAI Python API Client Library
Install the OpenAI Python API Library:

    pip install openai

Initialize the client and configure the endpoint URL. The client needs a short-lived access token to use as its API key. If run from a local machine, Application Default Credentials (for example, a key file referenced by GOOGLE_APPLICATION_CREDENTIALS) can be exchanged for one, or you can paste the output of gcloud auth print-access-token.

    import google.auth
    import google.auth.transport.requests
    import openai

    PROJECT_ID = "ENTER_PROJECT_ID"
    LOCATION = "us-central1"
    MODEL_ID = "deepseek-ai/deepseek-r1-0528-maas"

    # GOOGLE_APPLICATION_CREDENTIALS holds the path to a key file, not a token,
    # so exchange Application Default Credentials for an access token.
    # Requires: pip install google-auth requests
    credentials, _ = google.auth.default(
        scopes=["https://www.googleapis.com/auth/cloud-platform"]
    )
    credentials.refresh(google.auth.transport.requests.Request())
    API_KEY = credentials.token  # or the output of gcloud auth print-access-token

    deepseek_vertex_endpoint_url = (
        f"https://{LOCATION}-aiplatform.googleapis.com/v1beta1/"
        f"projects/{PROJECT_ID}/locations/{LOCATION}/endpoints/openapi"
    )

    client = openai.OpenAI(
        base_url=deepseek_vertex_endpoint_url,
        api_key=API_KEY
    )

Make chat completion requests via the client:

    response = client.chat.completions.create(
        model="deepseek-ai/deepseek-r1-0528-maas",
        messages=[
            {"role": "system", "content": "You are a helpful assistant"},
            {"role": "user", "content": "How many r's are in strawberry ?"},
        ],
        stream=False,
    )

    print(response.choices[0].message.content)

    # The full response object has this shape:
    # ChatCompletion(
    #     id="",
    #     choices=[
    #         Choice(
    #             finish_reason="length",
    #             index=0,
    #             logprobs=None,
    #             message=ChatCompletionMessage(
    #                 content='<think>\nFirst, the question is: "How many r\'s are in strawberry?" I need to count the number of times the letter \'r\' appears in the word "strawberry".\n\nLet me write down the word: S-T-R-A',
    #                 refusal=None,
    #                 role="assistant",
    #                 annotations=None,
    #                 audio=None,
    #                 function_call=None,
    #                 tool_calls=None,
    #             ),
    #         )
    #     ],
    #     created=,
    #     model="deepseek-ai/deepseek-r1-0528-maas",
    #     object="chat.completion",
    #     service_tier=None,
    #     system_fingerprint="",
    #     usage=CompletionUsage(
    #         completion_tokens=50,
    #         prompt_tokens=18,
    #         total_tokens=68,
    #         completion_tokens_details=None,
    #         prompt_tokens_details=None,
    #     ),
    # )
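
The sample response above shows the message content opening with a `<think>` block that holds the model's reasoning before any answer. A minimal sketch of separating the two follows. It assumes the reasoning is wrapped in a single leading `<think>...</think>` pair and that a response cut off by the token limit (finish_reason "length") may lack the closing tag. `split_reasoning` is a hypothetical helper name, and the example string is hand-written.

```python
from typing import Tuple


def split_reasoning(content: str) -> Tuple[str, str]:
    """Return (reasoning, answer) from content that may start with <think>."""
    open_tag, close_tag = "<think>", "</think>"
    text = content.lstrip()
    if not text.startswith(open_tag):
        return "", content.strip()  # no reasoning block present
    body = text[len(open_tag):]
    reasoning, found, answer = body.partition(close_tag)
    if not found:
        # The output was cut off mid-reasoning, so there is no answer yet.
        return reasoning.strip(), ""
    return reasoning.strip(), answer.strip()


# Hand-written example string (illustrative only).
reasoning, answer = split_reasoning(
    "<think>\nCount the r's: s-t-r-a-w-b-e-r-r-y.\n</think>\n\nThere are 3."
)
print(answer)
```

For the truncated sample above, this would return the partial reasoning and an empty answer, which signals that the token limit was reached before the model finished.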

What’s next?
Vertex AI Model Garden opens up new possibilities for building applications that require state-of-the-art foundation models. Here are some next steps:

Review the documentation guides for DeepSeek R1 MaaS and Llama MaaS

Review pricing for both models

Explore the Model Garden: Discover other models available as managed services

Build a proof-of-concept: Start with a small project to understand the model’s capabilities

Join the community: Share your experiences and learn from others in the Google Cloud AI Community

AI Summary and Description: Yes

**Summary:** The text discusses the expansion of Google Cloud’s Vertex AI Model Garden to include the DeepSeek R1 model as part of its Model-as-a-Service (MaaS) offerings. This service aims to simplify the deployment and management of large-scale AI models, reducing operational burdens and enabling businesses to focus on core application development, with a strong emphasis on security and compliance measures.

**Detailed Description:**
The announcement focuses on the following key points regarding the DeepSeek R1 model and its implications for AI application development:

- **Model-as-a-Service (MaaS) Expansion:**
  - Google Cloud is expanding its Vertex AI services by adding DeepSeek R1, following the earlier addition of Llama 4 models.
  - The aim is to create an open AI ecosystem that gives businesses access to a wide array of models for their specific needs.

- **Operational and Financial Challenges:**
  - Large AI models like DeepSeek R1 require substantial infrastructure, such as eight advanced H200 GPUs for inference, which creates resource procurement and management challenges for organizations.
  - Such challenges can divert focus from the development of core applications.

- **Benefits of Vertex AI’s MaaS:**
  - **Serverless APIs:** The models are offered as fully managed services, so users do not provision or manage the underlying infrastructure.
  - **Focus on Innovation:** Teams can concentrate on building and innovating instead of dealing with GPU management.
  - **Security and Compliance:** Vertex AI provides a secure, enterprise-grade platform with built-in data privacy and compliance.
  - **Flexible Pricing Model:** The platform operates on a pay-as-you-go basis that scales with organizational needs.

- **Accessing DeepSeek R1:**
  - Users can access the model through both the UI and the API, with a step-by-step guide provided for API integration.
  - For data security, the DeepSeek MaaS endpoint has no outbound internet access.

- **Next Steps for Users:**
  - Review documentation and pricing for DeepSeek R1 and other models.
  - Engage with the community to share experiences and learn from others.

Overall, this development is significant for AI professionals as it addresses major pain points related to infrastructure management while promoting a secure and compliant environment for deploying AI models. It balances operational efficiency with robust security practices, essential for modern enterprise needs.
