Tag: Hypercomputer
-
Cloud Blog: Scaling high-performance inference cost-effectively
Source URL: https://cloud.google.com/blog/products/ai-machine-learning/gke-inference-gateway-and-quickstart-are-ga/ Source: Cloud Blog Title: Scaling high-performance inference cost-effectively Feedly Summary: At Google Cloud Next 2025, we announced new inference capabilities with GKE Inference Gateway, including support for vLLM on TPUs, Ironwood TPUs, and Anywhere Cache. Our inference solution is based on AI Hypercomputer, a system built on our experience running models like…
-
Cloud Blog: Google is a Leader in the Gartner® Magic Quadrant for Strategic Cloud Platform Services
Source URL: https://cloud.google.com/blog/products/compute/google-is-a-leader-in-gartner-magic-quadrant-for-scps/ Source: Cloud Blog Title: Google is a Leader in the Gartner® Magic Quadrant for Strategic Cloud Platform Services Feedly Summary: For the eighth consecutive year, Gartner® has named Google a Leader in the Gartner Magic Quadrant™ for Strategic Cloud Platform Services, and this year Google is also now ranked the highest for…
-
Cloud Blog: Understanding Calendar mode for Dynamic Workload Scheduler: Reserve ML GPUs and TPUs
Source URL: https://cloud.google.com/blog/products/compute/dynamic-workload-scheduler-calendar-mode-reserves-gpus-and-tpus/ Source: Cloud Blog Title: Understanding Calendar mode for Dynamic Workload Scheduler: Reserve ML GPUs and TPUs Feedly Summary: Organizations need ML compute resources that can accommodate bursty peaks and periodic troughs. That means the consumption models for AI infrastructure need to evolve to be more cost-efficient, provide term flexibility, and support rapid…