Cloud Blog: How to calculate your AI costs on Google Cloud

Source URL: https://cloud.google.com/blog/topics/cost-management/unlock-the-true-cost-of-enterprise-ai-on-google-cloud/
Source: Cloud Blog
Title: How to calculate your AI costs on Google Cloud

Feedly Summary: What is the true cost of enterprise AI?
As a technology leader and a steward of company resources, understanding these costs isn’t just prudent – it’s essential for sustainable AI adoption. To help, we’ll unveil a comprehensive approach to understanding and managing your AI costs on Google Cloud, ensuring your organization captures maximum value from its AI investments.
Whether you’re just beginning your AI journey or scaling existing solutions, this approach will equip you with the insights needed to make informed decisions about your AI strategy.
Why understanding AI costs matters now
Google Cloud offers a vast and ever-expanding array of AI services, each with its own pricing structure. Without a clear understanding of these costs, you risk budget overruns, stalled projects, and ultimately, a failure to realize the full potential of your AI investments. This isn’t just about saving money; it’s about responsible AI development – building solutions that are both innovative and financially sustainable.
Breaking down the Total Cost of Ownership (TCO) for AI on Google Cloud
Let’s dissect the major cost components of running AI workloads on Google Cloud:

| Cost category | Description | Google Cloud services (examples) |
| --- | --- | --- |
| Model serving cost | The cost of running your trained AI model to make predictions (inference). This is often a per-request or per-unit-of-time cost. | OOTB models available in Vertex AI, Vertex AI Prediction, GKE (if self-managing), Cloud Run functions (for serverless inference) |
| Training and tuning costs | The expense of training your AI model on your data and fine-tuning it for optimal performance. This includes compute resources (GPUs/TPUs) and potentially the cost of the training data itself. | Vertex AI Training, Compute Engine (with GPUs/TPUs), GKE or Cloud Run (with GPUs/TPUs) |
| Cloud hosting costs | The fundamental infrastructure costs for running your AI application, including compute, networking, and storage. | Compute Engine, GKE or Cloud Run, Cloud Storage, Cloud SQL (if your application uses a database) |
| Training data storage and adapter layer costs | The cost of storing your training data and any "adapter layers" (intermediate representations or fine-tuned model components) created during the training process. | Cloud Storage, BigQuery |
| Application layer and setup costs | The expenses associated with any additional cloud services needed to support your AI application, such as API gateways, load balancers, monitoring tools, etc. | Cloud Load Balancing, Cloud Monitoring, Cloud Logging, API Gateway, Cloud Functions (for supporting logic) |
| Operational support cost | The ongoing costs of maintaining and supporting your AI model, including monitoring performance, troubleshooting issues, and potentially retraining the model over time. | Google Cloud Support, internal staff time, potential third-party monitoring tools |


Let’s estimate costs with an example
Let’s illustrate this with a hypothetical, yet realistic, generative AI use case: imagine you’re a retailer deploying an automated customer support chatbot.
Scenario: A medium-sized e-commerce company wants to deploy a chatbot on their website to handle common customer inquiries (order status, returns, product information and more). They plan to use a pre-trained language model (like one available through Vertex AI Model Garden) and fine-tune it on their own customer support data.
Assumptions:

Model: Fine-tuning a low-latency language model (in this case, Gemini 1.5 Flash).

Training data: 1 million customer support conversations (text data).

Traffic: 100K chatbot interactions per day.

Hosting: Vertex AI Prediction for serving the model.

Fine-tuning frequency: Monthly.

Cost estimation
As the retail customer in this example, here’s how you might approach this. 
1. First, discover your model serving cost:

Vertex AI (Gemini 1.5 Flash for chat) uses modality-based pricing; since both input and output are text, the usage unit is characters. Let’s assume an average of 1,000 input characters and 500 output characters per interaction.
Cost per 1M input characters: $0.0375
Cost per 1M output characters: $0.15
Input cost per day: 100,000 interactions * 1,000 characters * $0.0375 / 1,000,000 = $3.75
Output cost per day: 100,000 interactions * 500 characters * $0.15 / 1,000,000 = $7.50
Total model serving cost per day: $11.25
Total model serving cost per month (~30 days): ~$337

Serving cost of the Gemini 1.5 Flash model
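The arithmetic above is easy to capture in a few lines. Here is a minimal Python sketch that reproduces the serving estimate, assuming the per-character rates and per-interaction character counts quoted above (verify current rates on the Vertex AI pricing page before relying on them):

```python
# Hypothetical serving-cost estimate for a text chatbot on Gemini 1.5 Flash.
# Rates are the per-1M-character prices assumed in this example.
INPUT_PRICE_PER_M_CHARS = 0.0375   # USD per 1M input characters
OUTPUT_PRICE_PER_M_CHARS = 0.15    # USD per 1M output characters

interactions_per_day = 100_000
avg_input_chars = 1_000
avg_output_chars = 500

input_cost_per_day = interactions_per_day * avg_input_chars * INPUT_PRICE_PER_M_CHARS / 1_000_000
output_cost_per_day = interactions_per_day * avg_output_chars * OUTPUT_PRICE_PER_M_CHARS / 1_000_000

daily_serving_cost = input_cost_per_day + output_cost_per_day   # $11.25
monthly_serving_cost = daily_serving_cost * 30                  # ~$337.50
print(f"Serving: ${daily_serving_cost:.2f}/day, ${monthly_serving_cost:.2f}/month")
```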

2. Second, identify your training and tuning costs:
In this scenario, we aim to enhance the model’s accuracy and relevance to our specific use case through fine-tuning. This involves inputting a million past chat interactions, enabling the model to deliver more precise and customized interactions.

Cost per 1M training tokens: $8
Cost per 1M training characters: $2 (one token is roughly four characters)
Tuning cost (first month): 1,000,000 conversations (training data) * 1,500 characters (input + output) * $2 / 1,000,000 = $3,000
Tuning cost (subsequent months): 100,000 conversations (new training data) * 1,500 characters (input + output) * $2 / 1,000,000 = $300
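The same tuning formula can be parameterized by the number of conversations. This sketch assumes the $2 per 1M training characters rate and the 1,500-character average conversation length used above:

```python
# Hypothetical fine-tuning cost estimate at $2 per 1M training characters.
TUNING_PRICE_PER_M_CHARS = 2.0          # USD per 1M training characters (assumed rate)
avg_chars_per_conversation = 1_500       # input + output characters per conversation

def tuning_cost(num_conversations: int) -> float:
    """Return the estimated tuning cost in USD for a batch of conversations."""
    total_chars = num_conversations * avg_chars_per_conversation
    return total_chars / 1_000_000 * TUNING_PRICE_PER_M_CHARS

print(tuning_cost(1_000_000))  # first month, 1M historical conversations -> 3000.0
print(tuning_cost(100_000))    # subsequent months, 100K new conversations -> 300.0
```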

3. Third, understand the cloud hosting costs:
Since we’re using Vertex AI Prediction, the underlying infrastructure is managed by Google Cloud and its cost is included in the per-request pricing. If we were self-managing the model on GKE or Compute Engine, we’d need to factor in VM costs, GPU/TPU costs (if applicable), and networking costs. For this example, we assume $0, as infrastructure is included in the Vertex AI cost.
4. Fourth, define the training data storage and adapter layers costs:
The infrastructure costs for deploying machine learning models often raise concerns, but the data storage components can be economical at moderate scales. When implementing a conversational AI system, storing both the training data and the specialized model adapters represents a minor fraction of the overall costs. Let’s break down these storage requirements and their associated expenses.

1M conversations, assuming an average size of 5KB per conversation, would be roughly 5GB of data.
Cloud Storage cost for 5GB is negligible: about $0.10 per month.
Adapter layers (fine-tuned model weights) might add another 1GB of storage. This would still be very inexpensive: $0.02 per month.
Total storage cost per month: < $1/month
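A quick sketch of the storage math, assuming the roughly $0.02 per GB-month rate implied by the figures above (actual Cloud Storage pricing varies by region and storage class):

```python
# Rough storage estimate: training data plus adapter layers in Cloud Storage.
STORAGE_PRICE_PER_GB_MONTH = 0.02        # USD per GB-month (assumed standard-class rate)

training_data_gb = 1_000_000 * 5 / 1_000_000   # 1M conversations * ~5KB each ≈ 5 GB
adapter_layers_gb = 1.0                         # fine-tuned adapter weights

monthly_storage_cost = (training_data_gb + adapter_layers_gb) * STORAGE_PRICE_PER_GB_MONTH
print(f"~${monthly_storage_cost:.2f}/month")    # ~$0.12, well under $1
```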

5. Fifth, consider the application layer and setup costs:
This depends heavily on the specific application. In this example we use Cloud Run functions and Cloud Logging. Cloud Run handles pre- and post-processing of chatbot requests (e.g., formatting, database lookups). With request-based billing, we are only charged while a request is being processed. At 3M requests per month (100K per day * 30 days) and an average execution time of 1 second, this comes to about $14.30.

Cloud Run function cost for request-based billing

Cloud Logging and Cloud Monitoring track chatbot performance and help debug issues. Estimating 100GB of monthly logging volume (on the higher end) retained for three months: about $28.

Cloud Logging costs for storage and retention

Total application layer cost per month: ~$40
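Summing the application-layer pieces is straightforward. The sketch below simply adds the two component estimates quoted above; these are assumed point figures for this example, not values derived from the full Cloud Run and Cloud Logging pricing formulas:

```python
# Application-layer subtotal for the chatbot example (assumed component estimates).
cloud_run_monthly = 14.30   # 3M requests/month, ~1s average execution, request-based billing
logging_monthly = 28.00     # ~100GB/month of logs retained for 3 months

application_layer_monthly = cloud_run_monthly + logging_monthly
print(f"~${application_layer_monthly:.2f}/month")   # ~$42, rounded to ~$40 in the walkthrough
```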
6. Finally, incorporate the Operational support cost:
This is the hardest to estimate, as it depends on the internal team’s size and responsibilities. Let’s assume a conservative estimate of 5 hours per week of an engineer’s time dedicated to monitoring and maintaining the chatbot, at an hourly rate of $100.

Total operational support cost per month: 5 hours/week * 4 weeks/month * $100/hour = $2000
Total estimated monthly cost (first month):
$340 (serving) + $3,000 (training) + $1 (storage) + $40 (application) + $2,000 (operational) = $5,381
Total estimated monthly cost (subsequent months):
$340 (serving) + $300 (training) + $1 (storage) + $40 (application) + $2,000 (operational) = $2,681
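Putting the pieces together, a small helper makes it easy to rerun the total with your own numbers. This uses the rounded component figures from the walkthrough above:

```python
# Hedged monthly TCO estimate for the chatbot example, using rounded figures from above.
def monthly_total(serving: float, tuning: float, storage: float,
                  application: float, operations: float) -> float:
    """Sum the monthly cost components into a single TCO estimate."""
    return serving + tuning + storage + application + operations

first_month = monthly_total(serving=340, tuning=3_000, storage=1,
                            application=40, operations=2_000)    # 5381
subsequent_months = monthly_total(serving=340, tuning=300, storage=1,
                                  application=40, operations=2_000)  # 2681
print(first_month, subsequent_months)
```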

You can find the full cost estimate here. Note that it does not include tuning and operational costs, as these are not yet available in the pricing export.
Once you have a good understanding of your AI costs, it is important to develop an optimization strategy that encompasses infrastructure choices, resource utilization, and monitoring practices to maintain performance while controlling expenses. By understanding the various cost components and leveraging Google Cloud’s tools and resources, you can confidently embark on your AI journey. Cost management isn’t a barrier; it’s an enabler. It allows you to experiment, innovate, and build transformative AI solutions in a financially responsible way. 
Get started

Start understanding your AI costs today: Explore the Google Cloud Pricing Calculator and the Vertex AI Pricing Page.

Learn more at Google Cloud Next: Register for the Google Next session on AI Investment to Impact: Unlocking Sustainable ROI with Google Cloud.

Engage Google Cloud for expert guidance: Get expert help designing cost-effective AI architectures; contact Google Cloud Consulting or PSO.

AI Summary and Description: Yes

Summary: The text delves into the financial aspects of implementing AI solutions within Google Cloud, emphasizing the significance of understanding costs associated with various AI services. It outlines the Total Cost of Ownership (TCO) for running AI workloads, including model serving, training, cloud hosting, data storage, and operational support costs, while presenting a practical example to aid organizations in making informed decisions.

Detailed Description: The text provides a detailed analysis of the economic considerations of enterprise AI deployment on Google Cloud. It emphasizes the need for technology leaders to be attuned to the costs associated with AI to prevent budget overruns and ensure sustainable development of AI solutions. Here are the key points discussed:

– Importance of Cost Awareness in AI:
– Understanding AI costs is crucial for effective budgeting and project management.
– Responsible AI development must balance innovation with financial sustainability.

– Breakdown of Total Cost of Ownership (TCO):
– **Model Serving Cost**:
– Describes the expenses incurred to run an AI model for predictions (inference), with examples from Google Cloud services.
– **Training and Tuning Costs**:
– Costs associated with training and fine-tuning models using computational resources (GPUs/TPUs).
– **Cloud Hosting Costs**:
– Infrastructure costs for running applications, covering compute, storage, and networking expenses.
– **Training Data Storage and Adapter Layers Costs**:
– Expenses related to storing training data and the resulting adapter layers from model training.
– **Application Layer and Setup Costs**:
– Additional costs for services that support AI applications, including monitoring and logging tools.
– **Operational Support Cost**:
– Ongoing costs for maintaining and supporting AI models over time, which can vary significantly based on the team’s size and responsibilities.

– Practical Example:
– The text illustrates a hypothetical scenario involving a retail customer deploying a chatbot using generative AI. It highlights various cost aspects, including serving costs, training, storage, application setup, and operational support.
  – A comprehensive monthly cost estimation is provided, demonstrating a substantial initial investment followed by lower costs in subsequent months.

– Conclusion:
– Emphasizes the importance of understanding AI costs for making informed decisions and optimizing resource utilization.
– Encourages organizations to develop strategies to manage and lower these costs, ultimately positioning cost management not as a hindrance but as an enabler for innovation.

Overall, the text equips professionals in AI, cloud, and infrastructure security with the insights necessary to approach AI investments strategically and financially responsibly, underscoring the complexities involved in AI cost management within Google Cloud.