Cloud Blog: Palo Alto Networks’ journey to productionizing gen AI

Source URL: https://cloud.google.com/blog/topics/partners/how-palo-alto-networks-builds-gen-ai-solutions/
Source: Cloud Blog
Title: Palo Alto Networks’ journey to productionizing gen AI

Feedly Summary: At Google Cloud, we empower businesses to accelerate their generative AI innovation cycle by providing a path from prototype to production. Palo Alto Networks, a global cybersecurity leader, partnered with Google Cloud to develop an innovative security posture control solution that can answer complex “how-to" questions on demand, provide deep insights into risk with just a few clicks, and guide users through remediation steps. 
Using advanced AI services, including Google’s Gemini models and managed Retrieval Augmented Generation (RAG) services such as Google Cloud’s Vertex AI Search, Palo Alto Networks had an ideal foundation for building and deploying gen AI-powered solutions. 
The end result was Prisma Cloud Co-pilot, the Palo Alto Networks Prisma Cloud gen AI offering. It helps simplify cloud security management by providing an intuitive, AI-powered interface to help understand and mitigate risks.

Technical challenges and surprises 
The Palo Alto Networks Prisma Cloud Co-pilot journey began in 2023 and launched in October 2024. During this time, Palo Alto Networks witnessed Google’s AI models evolve rapidly, from Text Bison (PaLM) to Gemini Flash 1.5. That rapid pace of innovation meant that each iteration brought new capabilities, necessitating a development process that could quickly adapt to the evolving landscape. 
To effectively navigate the dynamic landscape of evolving gen AI models, Palo Alto Networks established robust processes that proved invaluable to their success:

Prompt engineering and management: Palo Alto Networks used Vertex AI to help manage prompt templates and built a diverse prompt library to generate a wide range of responses. To rigorously test each new model’s capabilities, limitations, and performance across various tasks, Palo Alto Networks and Google Cloud team systematically created and updated prompts for each submodule. Additionally, Vertex AI’s Prompt Optimizer helped streamline the tedious trial-and-error process of prompt engineering.
Intent recognition: Palo Alto Networks used the Gemini Flash 1.5 model to develop an intent recognition module, which efficiently routed user queries to the relevant co-pilot component. This approach provided users with many capabilities through a unified and lightweight user experience.

Input guardrails: Palo Alto Networks created guardrails as a first line of defense against unexpected, malicious, or simply incorrect queries that could compromise the functionality and experience of the chatbot. These guardrails maintain the chatbot’s intended functionality by preventing known prompt injection attacks, such as circumventing system instructions; and restricting chatbot usage to its intended scope. Guardrails were created to detect if user queries are restricted to responses within the predefined domain of general cloud security, risks, and vulnerabilities to prevent unintended use. Any topics outside this scope did not receive a response from the chatbot. Additionally, since the chatbot was designed for proprietary code generation for Palo Alto Networks systems to query internal systems, requests for general-purpose code generation similarly did not receive a response.

Evaluation dataset curation: A robust and representative evaluation dataset serves as a foundation to accurately and quickly assess the performance of gen AI models. The Palo Alto Networks team took great care to choose high-quality evaluation data and keep it relevant by constantly refreshing it with representative questions and expert-validated answers. The accuracy and reliability of the evaluation dataset was sourced and validated directly from Palo Alto Networks subject matter experts.

Automated evaluation: In collaboration with Google Cloud, Palo Alto Networks developed an automated evaluation pipeline using Vertex AI’s gen AI evaluation service. This pipeline allowed Palo Alto Networks to rigorously scale their assessment of different gen AI models, and benchmark those models using custom evaluation metrics while focusing on key performance indicators such as accuracy, latency, and consistency of responses. 

Human evaluator training and red teaming: Palo Alto Networks invested in training their human evaluation team to identify and analyze specific loss patterns and provide detailed answers on a broad set of custom rubrics. This allowed them to pinpoint where a model’s response was inadequate and provide insightful feedback on model performance, which then guided model selection and refinement. The team also conducted red teaming exercises focused on key areas, including:

Manipulating the co-pilot: Can the co-pilot be tricked into giving bad advice by feeding it false information?
Extracting sensitive data: Can the co-pilot be manipulated into revealing confidential information or system details?
Bypassing security controls: Can the co-pilot be used to craft attacks that circumvent existing security measures?

Load testing: To ensure the gen AI solutions met real-time demands, Palo Alto Networks actively load tested them, working within the pre-defined QPM (query per minute) and latency parameters of Gemini models. They simulated user traffic scenarios to find the optimal balance between responsiveness and scalability using provisioned throughput, which helped ensure a smooth user experience even during peak usage.

aside_block
), (‘btn_text’, ‘Get started for free’), (‘href’, ‘https://console.cloud.google.com/freetrial?redirectPath=/welcome’), (‘image’, None)])]>

Operational and business challenges 
Operationalizing gen AI can introduce complex challenges across multiple functions, especially for compliance, legal, and information security. Evaluating ROI for gen AI solutions also requires new metrics. To address these challenges, Palo Alto Networks implemented the following techniques and processes:

Data residency and regional ML processing: Since many Palo Alto Networks customers need a regional approach for ML processing capabilities, we prioritized regional machine learning processing to help enable customer compliance with data residency needs and regional regulations, if applicable. Where Google does not offer an AI data center that matched Prisma Cloud data center locations, customers were able to choose having their data processed in the U.S. before gaining access to the Prisma Cloud Co-pilot. We implemented strict data governance policies and used Google Cloud’s secure infrastructure to help safeguard sensitive information and uphold user privacy.

Deciding KPIs and measuring success for gen AI apps: The dynamic and nuanced nature of gen AI applications demands a bespoke set of metrics tailored to capture its specific characteristics and comprehensively evaluate its efficacy. There are no standard metrics that work for all use cases. The Prisma Cloud AI Co-pilot team relied on technical and business metrics to measure how well the system was operating. 

Technical metrics, such as recall, helped to measure how thoroughly the system fetches relevant URLs when answering questions from documents, and to help increase the accuracy of prompt responses and provide source information for users. 

Customer experience metrics, such as measuring helpfulness, relied on explicit feedback and telemetry data analysis. This provided deeper insights into user experience that resulted in increased productivity and cost savings.

Collaborating with security and legal teams: Palo Alto Networks brought in legal, information security, and other critical stakeholders early in the process to identify risks and create guardrails for issues including, but not limited to: information security requirements, elimination of bias in the dataset, appropriate functionality of the tool, and data usage in compliance with applicable law and contractual obligations. 

Given customer concerns, enterprises must prioritize clear communication around data usage, storage, and protection.  By collaborating  with legal and information security teams early on to create transparency in marketing and product communications, Palo Alto Networks was able to build customer trust and help ensure they have a clear understanding of how and when  their data is being used. 
Ready to get started with Vertex AI ?
The future of generative AI is bright, and with careful planning and execution, enterprises can unlock its full potential. Explore your organization’s AI needs through practical pilots in Vertex AI, and rely on Google Cloud Consulting for expert guidance.

Learn more about Vertex AI customer use cases and stories.

Dive into our generative AI repository and explore tuning notebooks and samples.

AI Summary and Description: Yes

**Summary:** The text highlights a collaboration between Google Cloud and Palo Alto Networks that focuses on enhancing cloud security through generative AI innovations. The introduction of Prisma Cloud Co-pilot represents a significant advancement in simplifying cloud security management, leveraging advanced AI services. Practical insights into managing evolving AI models and ensuring compliance with security and legal standards are also discussed.

**Detailed Description:**

The collaboration between Google Cloud and Palo Alto Networks led to the creation of Prisma Cloud Co-pilot, a generative AI solution aimed at improving security posture and user experience within cloud environments. Key developments in this partnership include:

– **Generative AI Technology:** Utilization of Google’s advanced Gemini models and Retrieval Augmented Generation (RAG) services, facilitating AI-driven solutions for security management.

– **Intuitive Interface:** The Prisma Cloud Co-pilot provides an AI-powered interface to help users understand and mitigate cloud-related risks effectively.

– **Rapid Iteration:** The project began in 2023 and saw significant advancements in Google’s AI capabilities, necessitating an agile development process to adapt to ongoing changes.

– **Prompt Engineering:** Implementation of a prompt management system using Vertex AI, including a diverse prompt library to optimize responses across tasks.

– **Intent Recognition:** The use of Gemini Flash 1.5 to develop a system that efficiently understands and routes user queries.

– **Security Measures:**
– **Input Guardrails:** Establishment of guardrails to prevent unwanted queries from affecting the chatbot’s functionality, addressing concerns like prompt injections.
– **Evaluation Dataset Curation:** Careful design of evaluation datasets to ensure relevant and up-to-date assessments of AI model performance.
– **Automated Evaluation Pipeline:** Development of a rigorous evaluation process to measure key performance indicators, ensuring high accuracy and low latency in responses.

– **Human Evaluation and Red Teaming:** Training of evaluators to identify and analyze loss patterns in the AI model’s responses, leading to improved model accuracy and robustness against manipulation.

– **Load Testing:** Methods to simulate user demand and ensure that generative AI solutions can handle peak usage without sacrificing performance.

– **Operational Challenges:** Addressing the complexities of integrating generative AI while ensuring compliance with data residency regulations and information security protocols. These initiatives included prioritizing regional machine learning for customers with specific compliance needs.

– **Measurement of Success:** Development of tailored metrics beyond traditional KPIs to capture the performance of generative AI applications, focusing on both technical and customer experience criteria.

– **Collaboration with Legal Teams:** Early integration of legal and information security teams to evaluate and mitigate potential risks concerning data usage and compliance, enhancing transparency in communication with customers.

In conclusion, this collaboration not only advances cloud security through generative AI but also underscores the importance of compliance, legal considerations, and user experience in deploying these innovative technologies. The insights provided will serve as a valuable blueprint for security and compliance professionals navigating similar initiatives.