Tag: ML inference
-
AWS News Blog: Announcing Amazon EC2 M4 and M4 Pro Mac instances
Source URL: https://aws.amazon.com/blogs/aws/announcing-amazon-ec2-m4-and-m4-pro-mac-instances/
Source: AWS News Blog
Title: Announcing Amazon EC2 M4 and M4 Pro Mac instances
Feedly Summary: AWS has launched new EC2 M4 and M4 Pro Mac instances based on the Apple M4 Mac mini, offering improved performance over previous generations and featuring up to 48 GB of memory and 2 TB of storage for iOS/macOS development workloads.…
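EC2 Mac instances run on Dedicated Hosts rather than shared tenancy, so provisioning is a two-step allocate-then-launch flow. A minimal boto3 sketch of that flow; the instance-type name mac-m4.metal and the AMI ID below are assumptions for illustration, not values from the post (check the announcement for the exact names):

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# EC2 Mac instances require a Dedicated Host; allocate one first.
host = ec2.allocate_hosts(
    AvailabilityZone="us-east-1a",
    InstanceType="mac-m4.metal",  # assumed type name for the M4 Mac mini
    Quantity=1,
)
host_id = host["HostIds"][0]

# Launch a macOS AMI onto that host (the AMI ID is a placeholder).
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder macOS AMI
    InstanceType="mac-m4.metal",
    MinCount=1,
    MaxCount=1,
    Placement={"Tenancy": "host", "HostId": host_id},
)
```

Note that Dedicated Hosts for Mac have a minimum allocation period, so the allocate step is the billing commitment, not the run_instances call.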
-
Cloud Blog: Scaling high-performance inference cost-effectively
Source URL: https://cloud.google.com/blog/products/ai-machine-learning/gke-inference-gateway-and-quickstart-are-ga/
Source: Cloud Blog
Title: Scaling high-performance inference cost-effectively
Feedly Summary: At Google Cloud Next 2025, we announced new inference capabilities with GKE Inference Gateway, including support for vLLM on TPUs, Ironwood TPUs, and Anywhere Cache. Our inference solution is based on AI Hypercomputer, a system built on our experience running models like…
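GKE Inference Gateway routes traffic to model-server pods through the Gateway API Inference Extension's InferencePool resource rather than a plain Service, which is what lets it make model-aware load-balancing decisions. A rough sketch using the kubernetes Python client; the CRD group/version, field names, and labels are assumptions based on the extension's v1alpha2 API and are not confirmed by the post:

```python
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside the cluster

# An InferencePool groups the pods serving a model (here, vLLM on TPUs)
# so the gateway can route and balance across them.
inference_pool = {
    "apiVersion": "inference.networking.x-k8s.io/v1alpha2",  # assumed version
    "kind": "InferencePool",
    "metadata": {"name": "vllm-pool", "namespace": "default"},
    "spec": {
        "selector": {"app": "vllm-tpu"},  # matches the vLLM serving pods
        "targetPortNumber": 8000,         # vLLM's default serving port
        "extensionRef": {"name": "vllm-endpoint-picker"},  # assumed name
    },
}

client.CustomObjectsApi().create_namespaced_custom_object(
    group="inference.networking.x-k8s.io",
    version="v1alpha2",
    namespace="default",
    plural="inferencepools",
    body=inference_pool,
)
```

In practice the same manifest would usually be applied as YAML via kubectl; the Python form is just to keep the sketch self-contained.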
-
Cloud Blog: Announcing smaller machine types for A3 High VMs
Source URL: https://cloud.google.com/blog/products/compute/announcing-smaller-machine-types-for-a3-high-vms/
Source: Cloud Blog
Title: Announcing smaller machine types for A3 High VMs
Feedly Summary: Today, an increasing number of organizations are using GPUs to run inference on their AI/ML models. Since the number of GPUs needed to serve a single inference workload varies, organizations need more granularity in the number of GPUs…
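The point of the smaller shapes is that an inference workload that fits on one or two GPUs no longer has to reserve a full 8-GPU a3-highgpu-8g VM. A sketch with the google-cloud-compute client, assuming the smaller shapes follow the a3-highgpu-1g/-2g/-4g naming pattern (the machine-type name, project, zone, and image below are assumptions or placeholders, not values from the post):

```python
from google.cloud import compute_v1

project, zone = "my-project", "us-central1-a"  # placeholders

instance = compute_v1.Instance(
    name="inference-a3-1g",
    # A 1-GPU A3 High shape for a small inference workload; A3 machine
    # types bundle their GPUs, so no separate accelerator config is needed.
    machine_type=f"zones/{zone}/machineTypes/a3-highgpu-1g",  # assumed name
    disks=[
        compute_v1.AttachedDisk(
            boot=True,
            auto_delete=True,
            initialize_params=compute_v1.AttachedDiskInitializeParams(
                source_image="projects/debian-cloud/global/images/family/debian-12",
            ),
        )
    ],
    network_interfaces=[
        compute_v1.NetworkInterface(network="global/networks/default")
    ],
)

compute_v1.InstancesClient().insert(
    project=project, zone=zone, instance_resource=instance
)
```

Sizing then becomes a matter of picking the machine type whose GPU count matches the model's serving footprint instead of over-provisioning to the 8-GPU shape.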