Cloud Blog: Meet the new GKE: Extending Autopilot to all qualifying clusters

Source URL: https://cloud.google.com/blog/products/containers-kubernetes/gke-autopilot-now-available-to-all-qualifying-clusters/
Source: Cloud Blog
Title: Meet the new GKE: Extending Autopilot to all qualifying clusters

Feedly Summary: Autopilot is an operational mode for Google Kubernetes Engine (GKE) that provides a fully managed environment and takes care of operational details, like provisioning compute capacity for your workloads. Autopilot allows you to spend more time on developing your own applications and less time on managing node-level details. This year, we upgraded Autopilot’s autoscaling stack to a fully dynamic container-optimized compute platform that rapidly scales horizontally and vertically to support your workloads. Simply attach a horizontal pod autoscaler (HPA) or vertical pod autoscaler (VPA) to your environment, and experience a fully dynamic platform that can scale rapidly to serve your users.
More and more customers, including Hotspring and Contextual AI, are finding that Autopilot can dramatically simplify Kubernetes cluster operations and enhance resource efficiency for their critical workloads. In fact, in 2024, 30% of active GKE clusters were created in Autopilot mode. The new container-optimized compute platform has also proved popular with customers, who report significant improvements in provisioning time. The faster GKE provisions capacity, the more responsive your workloads become, improving your customers’ experience and optimizing costs.
Today, we are pleased to announce that the best of Autopilot is now available in all qualified GKE clusters, not just dedicated Autopilot ones. Now, you can use Autopilot’s container-optimized compute platform and ease of operation from existing GKE clusters. It’s generally available, starting with clusters enrolled in the Rapid release channel and running GKE version 1.33.1-gke.1107000 or later. Most clusters will qualify for these new features as they roll out to the other release channels; the exceptions are clusters enrolled in the Extended channel and those that use the older routes-based networking. To access these new features, enroll in the Rapid channel and upgrade your cluster version, or wait to be auto-upgraded.
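Eligibility comes down to the cluster's release channel and GKE version. As a rough illustration (not an official tool), a cluster's version string can be checked against the announced minimum with `sort -V`:

```shell
# Illustrative only: check whether a GKE version string meets the
# minimum required for the new built-in Autopilot compute classes.
MIN_VERSION="1.33.1-gke.1107000"

meets_minimum() {
  # sort -V orders version strings; if MIN_VERSION sorts first (or ties),
  # the given version is at least the minimum.
  [ "$(printf '%s\n' "$MIN_VERSION" "$1" | sort -V | head -n1)" = "$MIN_VERSION" ]
}

meets_minimum "1.33.2-gke.1000000" && echo "qualifies"       # prints "qualifies"
meets_minimum "1.32.4-gke.1415000" || echo "upgrade needed"  # prints "upgrade needed"
```

In practice you would feed this the output of `gcloud container clusters describe`, and also confirm the cluster's release channel.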

Say hi to the new GKE: A better default compute for every workload

How to use it
Autopilot features are offered in Standard clusters via compute classes, which are a modern way to group and specify compute requirements for workloads in GKE. GKE now has two built-in compute classes, autopilot and autopilot-spot, that are pre-installed on all qualified clusters running on GKE 1.33.1-gke.1107000 or later and enrolled in the Rapid release channel. Running your workload on Autopilot’s container-optimized compute platform is as easy as specifying the autopilot (or autopilot-spot) compute class, like so:

```yaml
# Minimal sketch (the original snippet was lost in extraction): a Pod opts
# into the container-optimized compute platform by selecting the built-in
# autopilot compute class via a nodeSelector.
apiVersion: v1
kind: Pod
metadata:
  name: autopilot-pod
spec:
  nodeSelector:
    cloud.google.com/compute-class: autopilot
  containers:
  - name: app
    image: us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0
    resources:
      requests:
        cpu: 250m
        memory: 512Mi
```

Better still, you can make the Autopilot container-optimized compute platform the default for a namespace, a great way to save both time and money. You get efficient bin-packing, billing based on your workloads’ resource requests (with the ability to burst), rapid scaling, and freedom from planning node shapes and sizes.
Here’s how to set Autopilot as your default for a namespace:

```shell
NAMESPACE_NAME=your_namespace
kubectl label namespaces $NAMESPACE_NAME cloud.google.com/default-compute-class=autopilot
```
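With that label in place, workloads in the namespace land on the container-optimized compute platform without any per-workload selector. A sketch of what that looks like (the namespace name and image are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  namespace: your_namespace  # the namespace labeled with the default compute class
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      # No nodeSelector needed: the namespace default applies the
      # autopilot compute class to this workload.
      containers:
      - name: web
        image: us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0
        resources:
          requests:
            cpu: 250m
            memory: 512Mi
```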

Pod sizes for the container-optimized compute platform start at 50 milli-CPU (just 5% of one CPU core!) and scale up to 28 vCPU. With the container-optimized compute platform you pay only for the resources your Pods request, so you don’t have to worry about system overhead or empty nodes. Pods that are larger than 28 vCPU or have specific hardware requirements can also run in Autopilot mode on specialized compute with node-based pricing, via customized compute classes.
Run AI workloads on GPUs and TPUs with Autopilot
It’s easy to pair Autopilot’s container-optimized compute platform with specific hardware such as GPUs, TPUs, and high-performance CPUs to run your AI workloads. You can run those workloads in the same cluster, side by side with Pods on the container-optimized compute platform. By choosing Autopilot mode for these AI workloads, you benefit from Autopilot’s managed node properties, where we take a more active role in management. You also get enterprise-grade privileged admission controls that require workloads to run in user space, for better supportability, reliability, and an improved security posture.
Here’s how to define your own customized compute class that runs in Autopilot mode with specific hardware, in this example a G2 machine type with NVIDIA L4s with two priority rules:

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: gpu-l4-ap
spec:
  autopilot:
    enabled: true
  priorities:
  - machineType: g2-standard-48
    spot: true
    gpu:
      type: nvidia-l4
      count: 4
  - machineType: g2-standard-24
    spot: true
    gpu:
      type: nvidia-l4
      count: 2
```
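Once a class like this is applied, a workload opts in by selecting it, and its GPU request should line up with one of the class's priority rules. A hypothetical example (the image name is a placeholder):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: l4-inference
spec:
  nodeSelector:
    cloud.google.com/compute-class: gpu-l4-ap
  containers:
  - name: inference
    image: your-registry/your-inference-image:latest  # placeholder
    resources:
      limits:
        nvidia.com/gpu: 2  # lines up with the g2-standard-24 priority rule
      requests:
        cpu: "8"
        memory: 32Gi
```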

A new way to use compute classes
We’re also making compute classes work better with a new provisioning mode that automatically provisions resources for compute classes, without changing how other workloads are scheduled on existing node pools. This means you can now adopt the new deployment paradigm of compute class (including the new Autopilot-enabled compute classes) at your own pace, without affecting existing workloads and deployment strategies.
Until now, using compute classes in Standard clusters with automatic provisioning required enabling node auto-provisioning for the entire cluster. Node auto-provisioning has been part of GKE for many years, but it was previously an all-or-nothing decision: you couldn’t easily combine a manual node pool with a compute class provisioned by node auto-provisioning without potentially changing how workloads outside the compute class were scheduled. Now you can, with our new automatically provisioned compute classes. All Autopilot compute classes use this system, so it’s easy to run workloads in Autopilot mode side by side with your existing deployments (e.g., on manual node pools). You can also enable this feature on any compute class, starting with clusters in the Rapid channel running GKE version 1.33.3-gke.1136000 or later.
Here’s how:

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: gpu-l4
spec:
  nodePoolAutoCreation:
    enabled: true
  priorities:
  - machineType: g2-standard-48
    spot: true
    gpu:
      type: nvidia-l4
      count: 4
  - machineType: g2-standard-24
    spot: true
    gpu:
      type: nvidia-l4
      count: 2
```
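Because provisioning here is scoped to the compute class, existing scheduling is untouched: a Pod that selects the class triggers node auto-creation under it, while a Pod without the selector schedules onto your existing node pools as before. A sketch of the two side by side (image names are placeholders):

```yaml
# Schedules onto auto-created nodes managed by the gpu-l4 compute class.
apiVersion: v1
kind: Pod
metadata:
  name: new-gpu-workload
spec:
  nodeSelector:
    cloud.google.com/compute-class: gpu-l4
  containers:
  - name: app
    image: your-registry/gpu-app:latest  # placeholder
    resources:
      limits:
        nvidia.com/gpu: 2
---
# No compute-class selector: schedules onto existing node pools,
# unaffected by the compute class above.
apiVersion: v1
kind: Pod
metadata:
  name: existing-workload
spec:
  containers:
  - name: app
    image: your-registry/app:latest  # placeholder
```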

With the Autopilot mode for compute classes in Standard clusters, and the new automatic provisioning mode for all compute classes, you can now introduce compute class as an option to more clusters without impacting how any of your existing workloads are scheduled. Customers we’ve spoken to like this, as they can adopt these new patterns gradually for new workloads and by migrating existing ones, without needing to plan a disruptive switch-over.
Autopilot for all
At Google Cloud, we believe in the power of GKE’s Autopilot mode to simplify operations for your GKE clusters and make them more efficient. Now, those benefits are available to all GKE customers! To learn more about GKE Autopilot and how to enable it for your clusters, check out these resources.

How to run workloads on Autopilot in GKE Standard clusters.

Learn how the container-optimized compute platform works under the hood to drive performance.

Watch the GKE Spotlight from NEXT ‘25, and read the announcement.

AI Summary and Description: Yes

Summary: The text describes the enhancements made to Google Kubernetes Engine (GKE) Autopilot, including new autoscaling capabilities, the introduction of compute classes, and the availability of Autopilot features in standard clusters. These updates aim to streamline Kubernetes operations and improve resource efficiency, particularly for AI workloads, and offer benefits such as simplified management and reduced costs.

Detailed Description:

The content emphasizes the advancements in the Google Kubernetes Engine (GKE) Autopilot and discusses how these enhancements simplify operations and increase efficiency for cloud-based workloads. Key highlights include:

– **Operational Mode Upgrades:**
  – GKE’s Autopilot provides a fully managed environment focused on reducing the operational complexity of provisioning compute capacity.
  – The upgraded autoscaling stack allows for rapid horizontal and vertical scaling, accommodating varying workload demands more effectively.

– **Increased Adoption:**
  – Adoption of Autopilot is growing, with notable customers like Hotspring and Contextual AI recognizing its value over traditional Kubernetes operations. In 2024, 30% of active GKE clusters were created in Autopilot mode.

– **Improved Performance:**
  – The container-optimized compute platform provisions capacity faster, reducing the time for workloads to become operational, improving customer experience, and lowering costs.

– **Wider Availability:**
  – Autopilot features are now available to all qualified GKE clusters, not just dedicated Autopilot environments, extending their utility across deployment types.

– **Compute Classes:**
  – The built-in ‘autopilot’ and ‘autopilot-spot’ compute classes let users specify compute requirements easily and manage workload resources effectively.
  – Dynamic resource allocation with request-based billing caters to diverse workloads without wasting resources.

– **AI Workload Optimization:**
  – AI workloads can run on specialized hardware (GPUs and TPUs) while benefiting from managed node properties for improved operational management.
  – Enhanced security measures, including privileged admission controls, improve supportability and reliability for AI applications.

– **Gradual Adoption:**
  – The new provisioning mode enables compute classes to be adopted without disrupting existing operations, letting companies integrate Autopilot’s features at their own pace.

– **Practical Implementation:**
  – Instructions are provided for incorporating Autopilot features into existing setups, showcasing ease of use and integration.

In summary, the improvements in GKE Autopilot signify a strategic advancement in Kubernetes management, particularly focusing on enhancing resource efficiency and operational simplicity for cloud deployments, and expanding capabilities for AI workloads, which are vital for organizations looking to leverage machine learning and AI in their operations.