Source URL: https://cloud.google.com/blog/products/containers-kubernetes/container-optimized-compute-delivers-autoscaling-for-autopilot/
Source: Cloud Blog
Title: GKE under the hood: Container-optimized compute delivers fast autoscaling for Autopilot
Feedly Summary: The promise of Google Kubernetes Engine (GKE) is the power of Kubernetes with ease of management, including planning and creating clusters, deploying and managing applications, configuring networking, ensuring security, and scaling workloads. However, when it comes to autoscaling workloads, customers tell us the fully managed mode of operation, GKE Autopilot, hasn’t always delivered the speed and efficiency they need. That’s because autoscaling a Kubernetes cluster involves creating and adding new nodes, which can sometimes take several minutes. That’s just not good enough for high-volume, fast-scale applications.
Enter the container-optimized compute platform for GKE Autopilot, a completely reimagined autoscaling stack for GKE that we introduced earlier this year. In this blog, we take a deeper look at autoscaling in GKE Autopilot, and how to start using the new container-optimized compute platform for your workloads today.
Understanding GKE Autopilot and its scaling challenges
With GKE Autopilot, the fully managed mode of GKE, users are primarily responsible for their applications, while GKE takes on the heavy lifting of managing nodes and node pools, creating new nodes, and scaling applications. With traditional Autopilot, if an application needed to scale quickly, GKE first had to provision new nodes onto which the application could scale, which sometimes took several minutes.
To work around this, users often employed techniques like “balloon pods”: dummy pods with low priority whose only job is to hold onto nodes, helping ensure immediate capacity for demanding scaling use cases. However, this approach is costly, since it holds onto actively unused resources, and it is also difficult to maintain.
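The balloon-pod workaround described above can be sketched as follows. This is a minimal illustration, not from the original post: the names (`balloon-priority`, `balloon`), the replica count, and the resource requests are all hypothetical, and the `pause` container is used only because it idles cheaply while reserving its requested capacity.

```shell
# A PriorityClass with negative priority and no preemption, so balloon pods
# are evicted first when real workloads need the capacity they hold.
kubectl apply -f - <<'EOF'
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: balloon-priority
value: -10
preemptionPolicy: Never
globalDefault: false
description: "Low-priority placeholder pods that reserve spare node capacity."
EOF

# Dummy pods that do nothing but hold onto node capacity. When a real pod
# needs room, the scheduler preempts these balloons instead of waiting for
# a new node to be provisioned.
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: balloon
spec:
  replicas: 5
  selector:
    matchLabels: {app: balloon}
  template:
    metadata:
      labels: {app: balloon}
    spec:
      priorityClassName: balloon-priority
      terminationGracePeriodSeconds: 0
      containers:
      - name: pause
        image: registry.k8s.io/pause:3.9
        resources:
          requests: {cpu: "1", memory: 1Gi}
EOF
```

Note the cost trade-off the article points out: those five reserved CPUs are billed whether or not a real workload ever uses them.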
Introducing the container-optimized compute platform
We developed the container-optimized compute platform with a clear mission: to provide you with near-real-time, vertically and horizontally scalable compute capacity precisely when you need it, at optimal price and performance. We achieved this through a fundamental redesign of GKE’s underlying compute stack.
The container-optimized compute platform runs GKE Autopilot nodes on a new family of virtual machines that can be dynamically resized while they are running, starting from fractions of a CPU, all without disrupting workloads. To improve the speed of scaling and resizing, GKE clusters now also maintain a pool of dedicated pre-provisioned compute capacity that can be automatically allocated to workloads in response to increased resource demands. Importantly, because with GKE Autopilot you only pay for the compute capacity you request, this pre-provisioned capacity does not affect your bill.
The result is a flexible compute platform that provides capacity where and when it’s required. Key improvements include:
Up to 7x faster pod scheduling time compared to clusters without container-optimized compute
Significantly improved application response times for applications with autoscaling enabled
Introduction of in-place pod resize in Kubernetes 1.33, allowing for pod resizing without disruption
The container-optimized compute platform also includes a pre-enabled high-performance Horizontal Pod Autoscaler (HPA) profile, which delivers:
Highly consistent horizontal scaling reaction times
Up to 3x faster HPA calculations
Higher resolution metrics, leading to improved scheduling decisions
Accelerated performance for up to 1000 HPA objects
All these features are now available out of the box in GKE Autopilot 1.32 or later.
The power of the new platform is evident in demonstrations where replica counts are rapidly scaled, showcasing how quickly new pods get scheduled.
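A demo of that kind can be reproduced with ordinary kubectl commands. This is a sketch, not the article's own demo: the deployment name `web`, its label, and the replica count are assumptions.

```shell
# Scale a deployment sharply upward to exercise the autoscaling stack.
kubectl scale deployment web --replicas=50

# Watch pods move from Pending to Running. On a container-optimized compute
# cluster, pods should land on the pre-provisioned capacity instead of
# waiting minutes for new nodes to be created.
kubectl get pods -l app=web --watch
```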
How to leverage container-optimized compute
To benefit from these improvements in GKE Autopilot, simply create a new GKE Autopilot cluster based on GKE Autopilot 1.32 or later.
```
gcloud container clusters create-auto <cluster_name> \
    --location=<region> \
    --project=<project_id>
```
If your existing cluster is on an older version, upgrade it to 1.32 or newer to benefit from the container-optimized compute platform’s new features.
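For a cluster that has not yet picked up 1.32 through its release channel, the control plane can be upgraded manually. A hedged sketch, reusing the placeholders from the create command above; the exact patch version available depends on your cluster's release channel:

```shell
# Check the current control-plane version of the cluster.
gcloud container clusters describe <cluster_name> \
    --location=<region> \
    --format="value(currentMasterVersion)"

# Upgrade the control plane to a 1.32+ version; Autopilot then rolls the
# nodes forward automatically.
gcloud container clusters upgrade <cluster_name> \
    --location=<region> \
    --master \
    --cluster-version=1.32
```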
To optimize performance, we recommend that you use the general-purpose compute class for your workload. While the container-optimized compute platform supports various types of workloads, it works best with services that scale gradually and have small resource requests (2 CPU or less), such as web applications.
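A workload matching that recommendation might look like the following. This is an illustrative example, not from the original post: the name `hello-web`, the replica count, and the requests are assumptions (the container image is Google's public `hello-app` sample). In Autopilot, the general-purpose compute class is the default, so no compute-class selector is needed here.

```shell
# A small web service with sub-2-CPU requests, the profile the
# container-optimized compute platform is tuned for.
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-web
spec:
  replicas: 3
  selector:
    matchLabels: {app: hello-web}
  template:
    metadata:
      labels: {app: hello-web}
    spec:
      containers:
      - name: server
        image: us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0
        ports:
        - containerPort: 8080
        resources:
          requests: {cpu: 500m, memory: 512Mi}
EOF
```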
While the container-optimized compute platform is versatile, it is not currently suitable for specific deployment types:
One-pod-per-node deployments, such as those enforced through pod anti-affinity
Batch workloads
The container-optimized compute platform marks a significant leap forward in improving application autoscaling within GKE and will unlock more capabilities in the future. We encourage you to try it out today in GKE Autopilot.
AI Summary and Description: Yes
Summary: The text discusses the advancements made in Google Kubernetes Engine (GKE) Autopilot’s autoscaling capabilities through the introduction of a container-optimized compute platform. This innovation aims to address previous scaling challenges, enhancing speed and efficiency for Kubernetes workloads, particularly in high-demand environments.
Detailed Description: The text outlines the improvements introduced by Google in GKE Autopilot’s autoscaling functionality aimed at optimizing performance and efficiency. Key points include:
– **Background**: GKE Autopilot simplifies Kubernetes management, allowing users to focus on applications while Google manages the underlying infrastructure.
– **Scaling Challenges**: Traditional GKE Autopilot faced delays during autoscaling, as provisioning new nodes could take several minutes. Users resorted to inefficient workarounds (e.g., “balloon pods”) to mitigate delays, which were costly and hard to maintain.
– **Container-Optimized Compute Platform**:
– The new platform allows for dynamic resizing of virtual machines running GKE Autopilot without disrupting workloads, achieving near-real-time scaling.
– Pre-provisioned compute capacity can be allocated automatically in response to increased resource demands, optimizing cost as users only pay for what they use.
– **Key Improvements**:
– Up to 7x faster pod scheduling compared to prior configurations.
– Enhanced application responsiveness for auto-scaling applications.
– Introduction of in-place pod resizing for seamless adjustments.
– High-performance Horizontal Pod Autoscaler (HPA) delivering up to 3x faster calculations with higher resolution metrics.
– **Utilization Instructions**: To harness these enhancements, users can create new clusters with GKE Autopilot 1.32 or later, or upgrade existing clusters.
– **Limitations**: While versatile, the container-optimized compute platform is not ideal for certain deployment types like single-pod per node or batch workloads.
This update is particularly significant for security, privacy, and compliance professionals as it improves operational efficiency and resource utilization in cloud environments, thereby potentially reducing vulnerabilities related to excessive resource allocation or inefficient management practices. Understanding these advancements can help professionals better secure and optimize their cloud infrastructures.