Source URL: https://aws.amazon.com/blogs/aws/new-amazon-ec2-p6e-gb200-ultraservers-powered-by-nvidia-grace-blackwell-gpus-for-the-highest-ai-performance/
Source: AWS News Blog
Title: New Amazon EC2 P6e-GB200 UltraServers powered by NVIDIA Grace Blackwell GPUs for the highest AI performance
Feedly Summary: Amazon announces the general availability of EC2 P6e-GB200 UltraServers, powered by NVIDIA Grace Blackwell GB200 superchips that enable up to 72 GPUs with 360 petaflops of computing power for AI training and inference at the trillion-parameter scale.
AI Summary and Description: Yes
**Summary:** The text discusses the launch of Amazon EC2 P6e-GB200 UltraServers, which leverage NVIDIA’s latest technology to provide high-performance computing for AI workloads. This offering is significant for professionals in AI and cloud computing due to its capability to handle complex models and tasks efficiently.
**Detailed Description:**
The announcement of Amazon EC2 P6e-GB200 UltraServers marks a substantial advancement in cloud computing infrastructure aimed at optimizing AI workloads. Here are the key features and implications:
– **High-Performance Components:**
– Built on the NVIDIA GB200 NVL72 design, which links up to 72 Blackwell GPUs into a single NVLink domain for AI training and inference.
– Each GB200 superchip combines two NVIDIA Blackwell Tensor Core GPUs with an Arm-based NVIDIA Grace CPU, increasing overall computational capability.
– **Compute and Memory Capacity:**
– Each Grace Blackwell superchip delivers 10 petaflops of FP8 compute and up to 372 GB of high-bandwidth memory (HBM3e); see the aggregation sketch below.
– Networking bandwidth reaches up to 28.8 Tbps through the fourth-generation Elastic Fabric Adapter (EFAv4).
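To make the aggregate figures concrete, here is a minimal back-of-the-envelope sketch in Python, assuming only the per-superchip numbers quoted above (two GPUs per superchip, 10 petaflops of FP8, up to 372 GB of HBM3e) and a full 72-GPU UltraServer:

```python
# Back-of-the-envelope aggregation for a full 72-GPU UltraServer,
# using only the per-superchip figures quoted in this summary.
GPUS_PER_ULTRASERVER = 72
GPUS_PER_SUPERCHIP = 2            # two Blackwell GPUs share one Grace CPU
FP8_PFLOPS_PER_SUPERCHIP = 10     # FP8 compute per GB200 superchip
HBM3E_GB_PER_SUPERCHIP = 372      # high-bandwidth memory per superchip

superchips = GPUS_PER_ULTRASERVER // GPUS_PER_SUPERCHIP      # 36
total_fp8_pflops = superchips * FP8_PFLOPS_PER_SUPERCHIP     # 360 PFLOPS
total_hbm3e_tb = superchips * HBM3E_GB_PER_SUPERCHIP / 1000  # ~13.4 TB

print(f"{superchips} superchips, {total_fp8_pflops} PFLOPS FP8, "
      f"{total_hbm3e_tb:.1f} TB HBM3e")
```

This reproduces the 360-petaflop figure from the announcement and implies roughly 13.4 TB of HBM3e per UltraServer from the quoted per-superchip capacity.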
– **Infrastructure Connectivity:**
– A dedicated high-bandwidth, low-latency accelerator interconnect (NVLink) links GPUs across the multiple EC2 instances that make up an UltraServer, so distributed AI jobs can treat them as a single system; a minimal collective-communication sketch follows below.
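As an illustration of how such an interconnect is typically consumed by training code, below is a minimal PyTorch/NCCL sketch of a collective operation spanning every GPU in a multi-instance job; this is generic distributed-training code, not a P6e-GB200-specific API, and it assumes a launcher such as torchrun sets the usual rank and rendezvous environment variables:

```python
# Minimal sketch: an NCCL all-reduce spanning every GPU in the job.
# Rank, world size, and master address are assumed to come from the
# launcher (e.g. torchrun); nothing here is specific to P6e-GB200.
import os
import torch
import torch.distributed as dist

def main():
    dist.init_process_group(backend="nccl")  # NCCL selects NVLink/EFA paths
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    torch.cuda.set_device(local_rank)

    # Each rank contributes a tensor; after all_reduce every rank holds the sum.
    x = torch.ones(1024, device="cuda") * dist.get_rank()
    dist.all_reduce(x, op=dist.ReduceOp.SUM)

    if dist.get_rank() == 0:
        print(f"world_size={dist.get_world_size()}, sum element={x[0].item()}")
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```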
– **Use Cases for AI:**
– Targeted primarily at compute-intensive AI workloads such as mixture-of-experts (MoE) models and reasoning models at the trillion-parameter scale.
– Supports the development of generative AI applications including question answering, code generation, image and video generation, and speech recognition.
– **Deployment Flexibility:**
– Available in the Dallas Local Zone through Amazon EC2 Capacity Blocks for ML: capacity is reserved up front for a defined period, which keeps costs stable and predictable; a hedged reservation sketch follows below.
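As a sketch of how such a reservation is typically made, the following uses the EC2 Capacity Blocks for ML APIs through boto3; the instance type string, instance count, and duration below are illustrative assumptions rather than values from the announcement:

```python
# Hedged sketch: reserving capacity with EC2 Capacity Blocks for ML (boto3).
# The instance type below is a placeholder assumption; check the EC2 console
# or documentation for the exact P6e-GB200 identifier and Local Zone name.
import boto3
from datetime import datetime, timedelta, timezone

ec2 = boto3.client("ec2", region_name="us-east-1")  # Dallas Local Zone parents to us-east-1

start = datetime.now(timezone.utc) + timedelta(days=1)
offerings = ec2.describe_capacity_block_offerings(
    InstanceType="p6e-gb200.36xlarge",   # assumed name, for illustration only
    InstanceCount=2,
    CapacityDurationHours=24,
    StartDateRange=start,
    EndDateRange=start + timedelta(days=14),
)

# Pick the first offering and purchase it; the upfront fee makes costs predictable.
offering = offerings["CapacityBlockOfferings"][0]
print("Upfront fee:", offering["UpfrontFee"], offering["CurrencyCode"])

purchase = ec2.purchase_capacity_block(
    CapacityBlockOfferingId=offering["CapacityBlockOfferingId"],
    InstancePlatform="Linux/UNIX",
)
print("Reservation:", purchase["CapacityReservation"]["CapacityReservationId"])
```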
– **Integration with AWS Services:**
– Integrates with services such as Amazon SageMaker, which automates infrastructure management for machine learning workloads, and Amazon EKS for Kubernetes orchestration; a hedged SageMaker sketch follows this list.
– Advanced storage solutions through Amazon FSx for Lustre and Amazon S3 enable high-speed data access and storage management for large-scale AI and HPC workloads.
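To show what the SageMaker integration can look like from code, here is a minimal boto3 sketch that submits a training job; the ml.* instance type, container image, role ARN, and S3 paths are placeholder assumptions, not values confirmed by the post:

```python
# Hedged sketch: submitting a SageMaker training job (boto3). Instance type,
# image, role, and S3 URIs are placeholders for illustration only.
import boto3

sm = boto3.client("sagemaker", region_name="us-east-1")

sm.create_training_job(
    TrainingJobName="p6e-gb200-demo-job",
    AlgorithmSpecification={
        "TrainingImage": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-training:latest",
        "TrainingInputMode": "File",
    },
    RoleArn="arn:aws:iam::123456789012:role/MySageMakerRole",
    ResourceConfig={
        "InstanceType": "ml.p6e-gb200.36xlarge",  # assumed name, check SageMaker docs
        "InstanceCount": 2,
        "VolumeSizeInGB": 500,
    },
    OutputDataConfig={"S3OutputPath": "s3://my-bucket/output/"},
    StoppingCondition={"MaxRuntimeInSeconds": 86400},
)
```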
– **Practical Implications for Professionals:**
– The introduction of EC2 P6e-GB200 UltraServers provides organizations with robust infrastructure options to accelerate development and deployment of advanced AI solutions.
– Security and compliance professionals should note that these instances are built on the AWS Nitro System, which enhances security and reliability at scale, an important consideration for any enterprise adopting these technologies.
Overall, EC2 P6e-GB200 UltraServers significantly raise the bar for cloud infrastructure aimed at artificial intelligence, giving organizations a more direct path to innovation in AI applications.