Tag: GKE
-
Cloud Blog: Understanding Calendar mode for Dynamic Workload Scheduler: Reserve ML GPUs and TPUs
Source URL: https://cloud.google.com/blog/products/compute/dynamic-workload-scheduler-calendar-mode-reserves-gpus-and-tpus/
Source: Cloud Blog
Title: Understanding Calendar mode for Dynamic Workload Scheduler: Reserve ML GPUs and TPUs
Feedly Summary: Organizations need ML compute resources that can accommodate bursty peaks and periodic troughs. That means the consumption models for AI infrastructure need to evolve to be more cost-efficient, provide term flexibility, and support rapid…
-
Cloud Blog: Celebrating 10 years of GKE: Incredible customer journeys, amazing AI futures
Source URL: https://cloud.google.com/blog/products/containers-kubernetes/10-years-of-gke-ebook/
Source: Cloud Blog
Title: Celebrating 10 years of GKE: Incredible customer journeys, amazing AI futures
Feedly Summary: The evolution of the cloud has been tremendous over the past decade. Every step of the way, Google Kubernetes Engine (GKE) has been there to meet new challenges. From giving DevOps more scalable foundations to…
-
Cloud Blog: Announcing a new monitoring library to optimize TPU performance
Source URL: https://cloud.google.com/blog/products/compute/new-monitoring-library-to-optimize-google-cloud-tpu-resources/
Source: Cloud Blog
Title: Announcing a new monitoring library to optimize TPU performance
Feedly Summary: For more than a decade, TPUs have powered Google’s most demanding AI training and serving workloads. And there is strong demand from customers for Cloud TPUs as well. When running advanced AI workloads, you need to be…
-
Cloud Blog: Application monitoring in Google Cloud: Bridging manual and AI-assisted troubleshooting
Source URL: https://cloud.google.com/blog/products/management-tools/get-to-know-cloud-observability-application-monitoring/
Source: Cloud Blog
Title: Application monitoring in Google Cloud: Bridging manual and AI-assisted troubleshooting
Feedly Summary: As developers and operators, you know that having access to the right information in the proper context is crucial for effective troubleshooting. This is why organizations invest a lot upfront curating monitoring resources across different business…