Cloud Blog: Accelerating AI in healthcare using NVIDIA BioNeMo Framework and Blueprints on GKE

Source URL: https://cloud.google.com/blog/products/ai-machine-learning/accelerate-ai-in-healthcare-nvidia-bionemo-gke/
Source: Cloud Blog
Title: Accelerating AI in healthcare using NVIDIA BioNeMo Framework and Blueprints on GKE

Feedly Summary: The quest to develop new medical treatments has historically been a slow, arduous process, screening billions of molecular compounds across decade-long development cycles. The vast majority of therapeutic candidates do not even make it out of clinical trials. 
Now, AI is poised to dramatically accelerate this timeline. 
As part of our wide-ranging, cross-industry collaboration, NVIDIA and Google Cloud have supported the development of generative AI applications and platforms. NVIDIA BioNeMo is a powerful open-source collection of models specifically tuned to the needs of medical and pharmaceutical researchers.
Medical and biopharma organizations of all sizes are looking closely at predictive modeling and AI foundation models to help disrupt this space. With AI, they’re working on accelerating the identification and optimization of potential drug candidates to significantly shorten development timelines and address unmet medical needs. This has become a significant turning point for analyzing DNA, RNA, and protein sequences, and chemicals, predicting molecular interactions, and designing novel therapeutics at scale. 
With BioNeMo, companies in this space gain a more data-driven approach to developing medicines while reducing reliance on time-consuming experimental methods. But these breakthroughs are not without their own challenges. The shift to generative medicine requires a robust tech stack, including: powerful infrastructure to build, scale, and customize models; efficient resource utilization; agility for faster iteration; fault tolerance; and orchestration of distributed workloads.
Google Kubernetes Engine (GKE) offers a powerful solution for running many of these demanding workloads, and when taken together with NVIDIA BioNeMo, GKE can accelerate work on the platform. With BioNeMo running on GKE, organizations can pursue medical breakthroughs and new research with previously unattainable levels of speed and effectiveness.
In this blog, we’ll show you how to build and customize models and launch reference blueprints using the NVIDIA BioNeMo platform on GKE.


NVIDIA’s BioNeMo platform on GKE 
NVIDIA BioNeMo is a generative AI framework that enables researchers to model and simulate biological sequences and structures. It places major demands on infrastructure: powerful GPUs for compute, scalable systems for handling large datasets and complex models, and robust managed services for storage, networking, and security. 
GKE offers a highly scalable and flexible platform ideal for AI and machine learning — and particularly the demanding workloads found in biopharma research and development. GKE’s autoscaling features ensure efficient resource utilization, while its integration with other Google Cloud services simplifies the AI workflow. 
NVIDIA’s BioNeMo platform offers two synergistic components:
1. BioNeMo Framework: Large-Scale Training Platform for Drug Discovery AI
A scalable, open-source training system for biomolecular AI models like ESM-2 and Evo2. It provides an optimized environment for training and fine-tuning biomolecular AI models. Built on NVIDIA NeMo and PyTorch Lightning, it offers:

Domain-Specific Optimization: Provides performant biomolecular AI architectures that can be scaled to billions of parameters (e.g., BERT, Striped Hyena), along with representative model examples (e.g., ESM-2, Geneformer) built with CUDA-accelerated tooling tailored for drug discovery workflows.
GPU-accelerated performance: Delivers industry-leading speed through native integration with NVIDIA GPUs at scale, reducing training time for large language models and predictive models.
Comprehensive open-source resources: Includes programming tools, libraries, prepackaged datasets, and detailed documentation to support researchers and developers in deploying biomolecular AI solutions.

Explore the preprint here for details.
2. BioNeMo Blueprints: Production-Ready Workflows for Drug Discovery
BioNeMo Blueprints provide ready-to-use reference workflows for tasks such as protein binder design, virtual screening, and molecular docking. These workflows integrate advanced AI models like AlphaFold2, DiffDock 2.0, RFdiffusion, MolMIM, and ProteinMPNN to accelerate drug discovery processes. These blueprints provide solutions to patterns identified across several other industry use cases. Scientific developers can try NVIDIA inference microservices (NIMs) at build.nvidia.com and access them for testing via an NVIDIA developer license.
The following graphic shows the components and features of GKE used by the BioNeMo platform. In this blog, we demonstrate how to deploy these components on GKE, combining NVIDIA’s domain-specific AI tools with Google Cloud’s managed Kubernetes infrastructure for:

Distributed pretraining and fine-tuning of models across NVIDIA GPU clusters
Blueprint-driven workflows using NIMs
Cost-optimized scaling via GKE’s dynamic node pools and preemptible VMs

Figure 1: NVIDIA BioNeMo Framework and BioNeMo Blueprints on GKE

Solution Architecture of BioNeMo framework
Here, we will walk through setting up the BioNeMo framework on GKE to perform ESM2 pretraining and fine-tuning.

Figure 2: BioNeMo framework on GKE

The above diagram shows an architectural overview of deploying the NVIDIA BioNeMo Framework on GKE for AI model pre-training, fine-tuning, and inferencing. Here’s a breakdown from an architectural perspective:

GKE: The core orchestration platform, with the control plane managing the deployment and scaling of the BioNeMo Framework. It is deployed here as a regional cluster, and can optionally be configured as a zonal cluster.

Node Pool: A group of worker nodes within the GKE cluster, specifically configured with NVIDIA GPUs for accelerated AI workloads.

Nodes: Individual machines within the node pool, equipped with NVIDIA GPUs.

NVIDIA BioNeMo Framework: The AI software platform running within GKE, enabling pre-training, fine-tuning, and inferencing of AI models.

Networking:

Virtual Private Cloud (VPC): A logically isolated network within GCP, ensuring secure communication between resources.

Load Balancer: Distributes incoming traffic to the BioNeMo services running in the GKE cluster, enhancing availability and scalability.

Storage:

Filestore (NFS): Provides high-performance network file storage for datasets and model checkpoints.

Cloud Storage: Object storage for storing datasets and other large files.

NVIDIA NGC Image Registry: Provides container images for BioNeMo and related software, ensuring consistent and optimized deployments.

Steps
We have published an example to pre-train, fine-tune, and run inference with an ESM-2 model using the BioNeMo Framework on GKE in the Pretraining and Fine-tuning ESM-2 LLM on GKE using BioNeMo Framework 2.0 GitHub repo. Here is an outline of the steps for pretraining:
1. Create a GKE cluster

gcloud container clusters create "gke-bionemo-esm2" \
    --num-nodes="1" \
    --location="<GCP region / zone>" \
    --machine-type="e2-standard-2" \
    --addons=GcpFilestoreCsiDriver

2. Add node pool with NVIDIA GPUs

gcloud container node-pools create "gke-bionemo-esm2-np" \
    --cluster="gke-bionemo-esm2" \
    --location="<GCP region / zone>" \
    --node-locations="<GCP region / zone>" \
    --num-nodes="1" \
    --machine-type="g2-standard-2" \
    --accelerator="type=nvidia-l4,count=1,gpu-driver-version=LATEST" \
    --placement-type="COMPACT" \
    --disk-type="pd-ssd" \
    --disk-size="300GB"

3. Mount Google Cloud Filestore across all the nodes

kubectl apply -f create-mount-fs.yaml
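The actual manifest lives in the repo as create-mount-fs.yaml; as an illustrative sketch only (the claim name, storage class, and size here are assumptions, not the repo’s real values), a Filestore-backed PersistentVolumeClaim might look like this:

```shell
# Sketch of a Filestore-backed PVC (hypothetical values; the real manifest
# is create-mount-fs.yaml in the linked repo). The storage class name is
# assumed from the Filestore CSI driver's defaults, enabled above via the
# GcpFilestoreCsiDriver addon.
cat > create-mount-fs.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: bionemo-filestore-pvc
spec:
  accessModes:
    - ReadWriteMany           # NFS: mountable by every node in the pool
  storageClassName: standard-rw
  resources:
    requests:
      storage: 1Ti            # Filestore basic tier minimum
EOF
echo "Wrote create-mount-fs.yaml; apply with: kubectl apply -f create-mount-fs.yaml"
```

Because the access mode is ReadWriteMany, the same volume can back dataset and checkpoint paths across all GPU nodes.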

4. Run the pretraining job

kubectl apply -f esm2-pretraining.yaml

5. Visualize results in TensorBoard

kubectl port-forward pod/<pod-bionemo> 8000:6006

Open a web browser to http://localhost:8000/#timeseries to see the loss curves. The details for fine-tuning and inference are laid out in the GitHub repo.
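Before opening the browser, it can help to confirm the forwarded port is actually accepting connections. Here is a minimal sketch in Python; the host and port mirror the port-forward command above, and nothing in it is specific to TensorBoard:

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP port accepts connections, or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means the port-forward is serving.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)
    return False

if __name__ == "__main__":
    # After `kubectl port-forward pod/<pod-bionemo> 8000:6006` is running:
    if wait_for_port("localhost", 8000, timeout=10.0):
        print("TensorBoard forward is up at http://localhost:8000")
    else:
        print("Port-forward not reachable yet")
```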
Solution Architecture of BioNeMo Blueprints
The graphic below shows a BioNeMo Blueprint deployed on GKE for inferencing. From an infrastructure standpoint, the components used across the compute, networking, and storage layers are similar to Figure 2:

NIMs are packaged as a unit with runtime and model-specific weights. Blueprints deploy one or more NIMs using Helm charts. Alternatively, they can be deployed using gcloud or docker commands and configured using kubectl commands. Each NIM needs a minimum of one NVIDIA GPU accessible through a GKE node pool. 

Three NIMs—AlphaFold2, DiffDock, and MolMIM—are deployed as individual Kubernetes deployments. Each deployment uses a GPU and a NIM container image, mounting a persistent volume claim for storing model checkpoints and data. Services expose each application on different ports. The number of GPUs can be configured to a higher value for better scalability.
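Because each Service exposes its application on a different port, a small client-side endpoint map keeps the downstream calls tidy. This is an illustrative sketch: the ports and base URLs are placeholders, not values mandated by the blueprint, and should match whatever your Services or port-forwards actually expose:

```python
# Hypothetical endpoint registry for the three blueprint NIMs.
# Ports are placeholders; align them with your Service definitions.
NIM_ENDPOINTS = {
    "alphafold2": "http://localhost:8081",
    "diffdock": "http://localhost:8082",
    "molmim": "http://localhost:8083",
}

def endpoint(service: str, path: str) -> str:
    """Build a full URL for one NIM service and an API path."""
    base = NIM_ENDPOINTS[service]
    return f"{base}/{path.lstrip('/')}"
```

Centralizing the mapping this way means a port change in the Kubernetes manifests only has to be reflected in one place on the client.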

Figure 3: NIM Blueprint on GKE

Steps
We have published an example of deploying a BioNeMo Blueprint for generative virtual screening in the Generative Virtual Screening for Drug Discovery on GKE GitHub repo. The setup steps, such as creating the GKE cluster and node pool and mounting Filestore, are similar to those for BioNeMo training. The steps below outline deploying the BioNeMo Blueprint and using it for inference:
1. Deploy the BioNeMo blueprint

kubectl create -f nim-bionemo-generative-virtual-screening.yaml

2. Use port forwarding to interact with the pod

kubectl port-forward pod/<molmim-pod> 8010:8000 &

3. Test the MolMIM NIM locally using a curl command. The output will contain the generated molecules.

curl -X POST \
    -H 'Content-Type: application/json' \
    -d '{
      "smi": "CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C",
      "num_molecules": 5,
      "algorithm": "CMA-ES",
      "property_name": "QED",
      "min_similarity": 0.7,
      "iterations": 10
    }' \
    "http://localhost:8011/generate"
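The same request can also be issued from Python. This is a hedged sketch using only the payload fields and URL shown in the curl example above; the response schema is whatever the MolMIM NIM returns and is not assumed here:

```python
import json
import urllib.request

def build_molmim_request(smiles: str, num_molecules: int = 5) -> dict:
    """Assemble the MolMIM /generate payload used in the curl example."""
    return {
        "smi": smiles,               # seed molecule as a SMILES string
        "num_molecules": num_molecules,
        "algorithm": "CMA-ES",       # guided sampling strategy
        "property_name": "QED",      # optimize for drug-likeness
        "min_similarity": 0.7,
        "iterations": 10,
    }

def generate(url: str, payload: dict) -> dict:
    """POST the payload and decode the JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Seed molecule taken from the curl example above.
    seed = "CC1(C2C1C(N(C2)C(=O)C(C(C)(C)C)NC(=O)C(F)(F)F)C(=O)NC(CC3CCNC3=O)C#N)C"
    print(generate("http://localhost:8011/generate", build_molmim_request(seed)))
```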

NVIDIA BioNeMo Blueprints workflows can be adapted to various domain-specific use cases beyond drug discovery. For example, researchers can leverage generative AI models like RFdiffusion and ProteinMPNN in protein engineering to design stable protein binders with high affinity, drastically reducing the experimental iteration cycles. 
By integrating modular NIM microservices with scalable platforms like GKE, industries ranging from biopharma to agriculture can deploy AI-driven solutions tailored to their unique challenges, enabling faster insights and more efficient processes at scale.
Conclusion
As we’ve explored in this blog post, GKE provides a robust and versatile platform for deploying and running both the NVIDIA BioNeMo Framework and NVIDIA BioNeMo Blueprints. By leveraging GKE’s scalability, container orchestration capabilities, and integration with Google Cloud’s ecosystem, you can streamline the development and deployment of AI solutions in the life sciences and other domains. 
Whether you’re accelerating drug discovery with BioNeMo or deploying generative AI models with NIMs, GKE empowers you to harness the power of AI and drive innovation. By leveraging the strengths of both platforms, you can streamline the deployment process, optimize performance, and scale your AI workloads seamlessly. 
Ready to experience the power of NVIDIA BioNeMo on Google Cloud? Get started today by exploring the BioNeMo Framework and NIM catalog, deploying your first generative AI model on GKE, and unlocking new possibilities for your applications.

We’d like to thank the NVIDIA team members who helped contribute to this guide, Juan Pablo Guerra, Solutions Architect, and Kushal Shah, Senior Solutions Architect.

AI Summary and Description: Yes

Summary: The provided text outlines the advancements in drug discovery facilitated by NVIDIA’s BioNeMo platform in collaboration with Google Cloud’s Kubernetes Engine (GKE). It highlights how generative AI tools can streamline the development of new medical treatments, fundamentally altering traditional lengthy processes in biopharma research.

Detailed Description: The text elaborates on the integration of NVIDIA BioNeMo with Google Kubernetes Engine (GKE) as a powerful solution for accelerating drug discovery through generative AI applications. Here are the major points of significance:

– **Generative AI in Biopharma**: AI technologies, particularly generative models, are transforming the landscape of medical research by enabling faster identification of drug candidates and the optimization of therapeutic approaches.

– **NVIDIA BioNeMo Platform**:
  – An open-source framework specifically designed for biopharmaceutical researchers to model and simulate biological sequences.
  – Facilitates a data-driven methodology that reduces dependence on traditional experimental procedures.

– **Infrastructure and Scalability**:
  – The intersection of BioNeMo and GKE allows organizations to allocate massive computational resources efficiently, leveraging NVIDIA GPUs to handle extensive datasets and complex AI models.
  – GKE’s features like autoscaling and comprehensive Google Cloud integration simplify the workflow for researchers and developers.

– **BioNeMo Components**:
  1. **BioNeMo Framework**:
     – This component offers tools for training and fine-tuning biomolecular AI models.
     – It supports domain-specific optimizations and provides prepackaged datasets and libraries relevant to drug discovery.
  2. **BioNeMo Blueprints**:
     – These are reference workflows for tasks such as protein binder design and virtual screening. They encapsulate advanced AI models that streamline the drug discovery process.

– **Steps to Deployment**: The text gives detailed operational steps for setting up the BioNeMo framework and blueprints on GKE, including cluster creation, node pool management, and job execution instructions.

– **Broad Applications**: Beyond drug discovery, the text notes that the modular nature of the NIMs can be adapted to various sectors, including agriculture, showcasing the versatility of the models.

– **Conclusion**: The combination of NVIDIA BioNeMo and Google Cloud’s GKE is positioned to drive innovation in life sciences, facilitating timely medical breakthroughs and optimizing resource utilization in AI workloads.

This detailed overview underscores the practical implications for security and compliance professionals, especially in terms of safeguarding sensitive health data while using scalable cloud infrastructures in AI-driven environments. The transformation of biopharma research through AI indicates future compliance challenges and the necessity for robust security protocols in sensitive data algorithms.