Tag: inference capabilities
-
Cloud Blog: Scaling high-performance inference cost-effectively
Source URL: https://cloud.google.com/blog/products/ai-machine-learning/gke-inference-gateway-and-quickstart-are-ga/
Source: Cloud Blog
Title: Scaling high-performance inference cost-effectively
Feedly Summary: At Google Cloud Next 2025, we announced new inference capabilities with GKE Inference Gateway, including support for vLLM on TPUs, Ironwood TPUs, and Anywhere Cache. Our inference solution is based on AI Hypercomputer, a system built on our experience running models like…
-
Cloud Blog: How startups can help build — and benefit from — the AI revolution
Source URL: https://cloud.google.com/blog/products/ai-machine-learning/industry-leaders-on-whats-next-for-startups-and-ai/
Source: Cloud Blog
Title: How startups can help build — and benefit from — the AI revolution
Feedly Summary: Startups are at the forefront of generative AI development, pushing current capabilities and unlocking new potential. Building on our Future of AI: Perspectives for Startups 2025 report, several of the AI industry leaders…
-
Enterprise AI Trends: OpenAI’s Open Source Strategy
Source URL: https://nextword.substack.com/p/openai-open-source-strategy-gpt-oss
Source: Enterprise AI Trends
Title: OpenAI’s Open Source Strategy
Feedly Summary: OpenAI assures everyone that they care about enterprise AI
AI Summary: The text primarily discusses OpenAI’s recent release of open-weight models (gpt-oss-120b and gpt-oss-20b) and their implications for AI strategy, enterprise focus, and competitive dynamics in the…
-
Cloud Blog: From LLMs to image generation: Accelerate inference workloads with AI Hypercomputer
Source URL: https://cloud.google.com/blog/products/compute/ai-hypercomputer-inference-updates-for-google-cloud-tpu-and-gpu/
Source: Cloud Blog
Title: From LLMs to image generation: Accelerate inference workloads with AI Hypercomputer
Feedly Summary: From retail to gaming, from code generation to customer care, an increasing number of organizations are running LLM-based applications, with 78% of organizations in development or production today. As the number of generative AI applications…
-
Cloud Blog: Google, Bytedance, and Red Hat make Kubernetes generative AI inference aware
Source URL: https://cloud.google.com/blog/products/containers-kubernetes/google-bytedance-and-red-hat-improve-ai-on-kubernetes/
Source: Cloud Blog
Title: Google, Bytedance, and Red Hat make Kubernetes generative AI inference aware
Feedly Summary: Over the past ten years, Kubernetes has become the leading platform for deploying cloud-native applications and microservices, backed by an extensive community and boasting a comprehensive feature set for managing distributed systems. Today, we are…