Cloud Blog: Operationalizing generative AI apps with Apigee

Source URL: https://cloud.google.com/blog/products/api-management/using-apigee-api-management-for-ai/
Source: Cloud Blog
Title: Operationalizing generative AI apps with Apigee

Feedly Summary: Generative AI is now well beyond the hype and into the realm of practical application. But while organizations are eager to build enterprise-ready gen AI solutions on top of large language models (LLMs), they face challenges in managing, securing, and scaling these deployments, especially when it comes to APIs. As part of the platform team, you may already be building a unified gen AI platform. Some common questions you might have are:

How do you ensure security and safety for your organization? As with any API, LLM APIs represent an attack vector. What are the LLM-specific considerations you need to worry about?

How do you stay within budget as your LLM adoption grows, while ensuring that each team has the LLM capacity it needs to keep innovating and making your business more productive?

How do you put the right observability capabilities in place to understand your usage patterns, help troubleshoot issues, and capture compliance data? 

How do you give end users of your gen AI applications the best possible experience, i.e., provide responses from the most appropriate models with minimal downtime?

Apigee, Google Cloud’s API management platform, has enabled our customers to address API challenges like these for over a decade. Here is an overview of the AI-powered digital value chain leveraging Apigee API Management.

Figure 1: AI-powered digital value chain

Gen AI, powered by AI agents and LLMs, is changing how customers interact with businesses, creating a large opportunity for any business. Apigee streamlines the integration of gen AI agents into applications by bolstering their security, scalability, and governance through features like authentication, traffic control, analytics, and policy enforcement. It also manages interactions with LLMs, improving security and efficiency. Additionally, Application Integration, an Integration-Platform-as-a-Service solution from Google Cloud, offers pre-built connectors that allow gen AI agents to easily connect with databases and external systems, helping them fulfill user requests.


This blog details how Apigee’s customers have been using the product to address challenges specific to LLM APIs. We’re also releasing a comprehensive set of reference solutions that enable you to get started on addressing these challenges yourself with Apigee. You can also view a webinar on the same topic, complete with product demos.
Apigee as a proxy for agents

AI agents leverage capabilities from LLMs to accomplish tasks for end-users. These agents can be built using a variety of tools — from no-code and low-code platforms, to full-code frameworks like LangChain or LlamaIndex. Apigee acts as an intermediary between your AI application and its agents. It enhances security by allowing you to defend your LLM APIs against the OWASP Top 10 API Security risks, manages user authentication and authorization, and optimizes performance through features like semantic caching. Additionally, Apigee enforces token limits to control costs and can even orchestrate complex interactions between multiple AI agents for advanced use cases.
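To make this concrete, here is a minimal Python sketch of an application calling an LLM through an Apigee-fronted endpoint. The proxy URL, credential header, and response shape are illustrative assumptions, not a prescribed contract; the point is that the agent only ever sees the managed endpoint, while Apigee applies authentication, traffic control, and token limits behind it.

```python
import os
import requests

# Hypothetical Apigee-managed endpoint fronting an LLM provider. The proxy,
# not the client, enforces authentication, rate limits, and safety policies.
APIGEE_PROXY_URL = "https://api.example.com/v1/llm/chat"  # assumed proxy path

def ask_llm(prompt: str) -> str:
    response = requests.post(
        APIGEE_PROXY_URL,
        headers={"x-apikey": os.environ["APIGEE_API_KEY"]},  # app credential issued via Apigee
        json={"prompt": prompt},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["text"]  # assumed: the proxy normalizes the response shape

print(ask_llm("Summarize our Q3 onboarding guide in three bullets."))
```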
Apigee as a gateway between LLM application and models

Depending on the task at hand, your AI agents might need to tap into the power of different LLMs. Apigee simplifies this by intelligently routing and managing failover of requests to the most suitable LLM using Apigee’s flexible configurations and templates. It also streamlines the onboarding of new AI applications and agents while providing robust access control for your LLMs. Beyond LLMs, agents often need to connect with databases and external systems to fully address users’ needs. Apigee’s robust API Management platform enables these interactions via managed APIs, and for more complex integrations, where custom business logic is required, you can leverage Google Cloud’s Application Integration platform. 
It’s important to remember that these patterns aren’t one-size-fits-all. Your specific use cases will influence the architecture pattern for an agent and LLM interaction. For example, you might not always need to route requests to multiple LLMs. In some scenarios, you could connect directly to databases and external systems from the Apigee agent proxy layer. The key is flexibility — Apigee lets you adapt the architecture to match your exact needs. 
Now let’s break down, one by one, the specific areas where Apigee helps:
AI safety
For any API managed with Apigee, you can call out to Model Armor, Google Cloud’s model safety offering, to inspect every prompt and response. This protects against prompt attacks and helps your LLMs respond within the guardrails you set. For example, you can specify that your LLM application does not provide answers about financial or political topics.
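As a rough illustration of this callout pattern, the sketch below screens a prompt with a safety service before forwarding it to the model. The endpoint, payload, and verdict fields are stand-ins invented for illustration; they are not Model Armor’s actual API surface.

```python
import requests

# Stand-in for a Model Armor-style prompt screen; URL and fields are invented.
SAFETY_URL = "https://safety.example.com/v1/screenPrompt"

def is_prompt_allowed(prompt: str, token: str) -> bool:
    resp = requests.post(
        SAFETY_URL,
        headers={"Authorization": f"Bearer {token}"},
        json={"prompt": prompt, "blockedTopics": ["finance", "politics"]},  # assumed config
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json().get("verdict") == "ALLOW"  # assumed response field

# In the proxy flow, a disallowed prompt would be rejected with a 4xx
# response before it ever reaches the LLM.
```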
Latency and cost
Model response latency continues to be a major factor when building LLM-powered applications, and this will only get worse as more reasoning happens during inference. With Apigee, you can implement a semantic cache that allows you to cache responses from any model for semantically similar questions. This dramatically reduces the time end users need to wait for a response.
In this solution, Vertex AI Vector Search and Vertex AI Embeddings API process your prompts and help you identify similar prompts for which you can then retrieve a response from Apigee’s Cache. See Semantic Cache in Apigee reference solution to get started.
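The sketch below shows the caching logic in miniature. It swaps the Vertex AI Embeddings API and Vector Search for a toy in-memory embedding and a linear scan, so the embedding function and the similarity threshold are assumptions; the pattern of embed, compare, and return a cached answer on a close match is what the reference solution implements with managed services.

```python
import numpy as np

# Toy in-process semantic cache. The hash-based embedding and linear scan
# below stand in for the Vertex AI Embeddings API and Vector Search used by
# the reference solution; the 0.9 threshold is an illustrative assumption.
_cache: list[tuple[np.ndarray, str]] = []  # (prompt embedding, cached response)

def _embed(text: str) -> np.ndarray:
    vec = np.zeros(64)
    for word in text.lower().split():
        vec[hash(word) % 64] += 1.0  # crude stand-in for a real embedding
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def lookup(prompt: str, threshold: float = 0.9) -> str | None:
    query = _embed(prompt)
    for emb, response in _cache:
        if float(np.dot(query, emb)) >= threshold:  # cosine similarity of unit vectors
            return response  # a semantically similar prompt was already answered
    return None

def store(prompt: str, response: str) -> None:
    _cache.append((_embed(prompt), response))

store("What is our refund policy?", "Refunds are issued within 14 days.")
print(lookup("What is our refund policy?"))   # cache hit
print(lookup("How do I reset my password?"))  # miss: forward to the LLM
```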
Performance
Different models are good at different things. For example, Gemini Pro models provide the highest quality answers, while Gemini Flash models excel at speed and efficiency. You can route users’ prompts to the best model for the job, depending on the use case or application.
You can decide which model to use by specifying it in your API call and Apigee routes it to your desired model while keeping a consistent API contract. See this reference solution to get started.
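A minimal sketch of that routing idea follows, with illustrative aliases and backend URLs; in Apigee this is expressed with proxy configuration and target servers rather than application code.

```python
# Sketch of alias-based routing behind a stable contract. In Apigee this is
# proxy configuration and target servers; aliases and URLs here are assumed.
MODEL_ROUTES = {
    "quality": "https://llm.example.com/gemini-pro",   # highest answer quality
    "fast": "https://llm.example.com/gemini-flash",    # lowest latency and cost
}
DEFAULT_ALIAS = "fast"

def resolve_backend(request_body: dict) -> str:
    """Map the caller's 'model' field to a backend while the client-facing
    request and response shapes stay identical across models."""
    alias = request_body.get("model", DEFAULT_ALIAS)
    if alias not in MODEL_ROUTES:
        raise ValueError(f"unknown model alias: {alias}")
    return MODEL_ROUTES[alias]

print(resolve_backend({"model": "quality", "prompt": "Explain API quotas."}))
```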
Distribution and usage limits
With Apigee you can create a unified portal with self-service access to all the models in your organization. You can also set usage limits for individual apps and developers to maintain capacity for those who need it, while also controlling overall costs. See how you can set up usage limits in Apigee using LLM token counts here.
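To illustrate the mechanics, here is a sketch of a per-app token quota over a sliding window. The app IDs and limits are assumptions; in Apigee you would express this with Quota policies driven by the token counts the model reports.

```python
import time
from collections import defaultdict

# Sketch of a per-app token quota over a sliding window. Limits and app IDs
# are assumptions; Apigee's Quota policy plays this role in the real setup.
WINDOW_SECONDS = 60
TOKEN_LIMITS = {"team-support-bot": 50_000, "team-research": 200_000}
DEFAULT_LIMIT = 10_000  # assumed limit for apps without an explicit grant

_usage: dict[str, list[tuple[float, int]]] = defaultdict(list)  # app -> [(time, tokens)]

def try_consume(app_id: str, tokens: int) -> bool:
    """Return True and record usage, or False if the window limit is hit."""
    now = time.time()
    # Drop entries that have aged out of the window.
    window = [(t, n) for (t, n) in _usage[app_id] if now - t < WINDOW_SECONDS]
    used = sum(n for _, n in window)
    if used + tokens > TOKEN_LIMITS.get(app_id, DEFAULT_LIMIT):
        _usage[app_id] = window
        return False  # caller should surface HTTP 429 to the app
    window.append((now, tokens))
    _usage[app_id] = window
    return True

print(try_consume("team-support-bot", 1_200))  # True while under the limit
```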
Availability
Due to the high computational demands of LLM inference, model providers regularly restrict the number of tokens you can use in a given time window. If you reach a model limit, requests from your applications will be throttled, which could lock your end users out of the model. To prevent this, you can implement a circuit breaker in Apigee so that requests are re-routed to a model with available capacity. See this reference solution to get started.
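A minimal sketch of the failover behavior, assuming two interchangeable model backends and a provider that signals exhaustion with HTTP 429:

```python
import requests

# Sketch of the re-routing pattern: if the primary model signals that its
# token limit is exhausted (HTTP 429 here, an assumption), fall through to a
# secondary model with spare capacity. Backend URLs are illustrative.
BACKENDS = [
    "https://llm-primary.example.com/generate",
    "https://llm-fallback.example.com/generate",
]

def generate_with_failover(prompt: str) -> str:
    for url in BACKENDS:
        resp = requests.post(url, json={"prompt": prompt}, timeout=30)
        if resp.status_code == 429:
            continue  # this model is at capacity; try the next one
        resp.raise_for_status()
        return resp.json()["text"]  # assumed normalized response shape
    raise RuntimeError("every configured model is at capacity")
```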
Reporting
As a platform team, you need visibility into usage of the various models you support, as well as which apps are consuming how many tokens. You might want to use this data for internal cost reporting or for optimization. Whatever your motivation, with Apigee you can build dashboards that show usage based on actual token counts, the currency of LLM APIs. This way you can see the true usage volume across your applications. See this reference solution to get started.
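As an example of where those token counts come from, the sketch below pulls usage fields out of a Gemini-style response body (other providers use different field names) and emits a record; printing stands in for populating Apigee’s analytics dimensions.

```python
# Sketch of harvesting token counts for dashboards. The usageMetadata field
# names follow Gemini-style responses; other providers differ. Printing
# stands in for writing custom analytics dimensions in Apigee.
def extract_usage(app_id: str, llm_response: dict) -> dict:
    usage = llm_response.get("usageMetadata", {})
    record = {
        "app": app_id,
        "prompt_tokens": usage.get("promptTokenCount", 0),
        "response_tokens": usage.get("candidatesTokenCount", 0),
        "total_tokens": usage.get("totalTokenCount", 0),
    }
    print(record)  # stand-in for an analytics sink
    return record

extract_usage(
    "team-support-bot",
    {"usageMetadata": {"promptTokenCount": 42, "candidatesTokenCount": 128, "totalTokenCount": 170}},
)
```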
Auditing and troubleshooting
Perhaps you need to log all interactions with LLMs (prompts, responses, RAG data) to meet compliance or troubleshooting requirements. Or perhaps you want to analyze response quality to continue to improve your LLM applications. With Apigee you can safely log any LLM interaction with Cloud Logging, de-identify it, and inspect it from a familiar interface. Get started here.
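The sketch below captures the shape of that step: redact obvious identifiers from prompts and responses, then write a structured log entry. The regex-based redaction is a stand-in for managed de-identification, and Python’s standard logger stands in for Cloud Logging.

```python
import json
import logging
import re

# Sketch of audit logging with naive redaction. The regex pass is a stand-in
# for managed de-identification, and the standard logger for Cloud Logging.
logging.basicConfig(level=logging.INFO)
log = logging.getLogger("llm-audit")

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def log_interaction(app_id: str, prompt: str, response: str) -> None:
    entry = {
        "app": app_id,
        "prompt": EMAIL.sub("[REDACTED_EMAIL]", prompt),
        "response": EMAIL.sub("[REDACTED_EMAIL]", response),
    }
    log.info(json.dumps(entry))  # one structured entry per LLM interaction

log_interaction("team-support-bot", "Email jane@example.com the report", "Done.")
```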
Security
With APIs increasingly seen as an attack surface, security is paramount to any API program. Apigee can act as a secure gateway for LLM APIs, allowing you to control access with API keys, OAuth 2.0, and JWT validation. This helps you enforce enterprise security standards when authenticating the users and applications that interact with your models. Apigee can also help prevent abuse and overload by enforcing rate limits and quotas, safeguarding LLMs from malicious attacks and unexpected traffic spikes.
In addition to these security controls, you can also use Apigee to control which model providers and models can be used. You do this by creating policies that define which models can be accessed by which users or applications. For example, you could create a policy that only allows certain users to access your most powerful LLMs, or one that only allows certain applications to use your LLMs for specific tasks. This gives you granular control over how your LLMs are used, so they serve only their intended purposes.
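A minimal sketch of that allowlisting idea, with hypothetical app IDs and model names; in Apigee this check would be enforced by policies attached to the consumer’s credentials.

```python
# Sketch of per-consumer model allowlisting with hypothetical app IDs and
# model names; in Apigee this check would be a policy raising a 403 fault.
MODEL_ALLOWLIST = {
    "team-research": {"gemini-pro", "gemini-flash"},
    "team-support-bot": {"gemini-flash"},  # limited to the fast model only
}

def authorize_model(app_id: str, model: str) -> None:
    if model not in MODEL_ALLOWLIST.get(app_id, set()):
        raise PermissionError(f"{app_id} is not allowed to call {model}")

authorize_model("team-research", "gemini-pro")       # passes silently
# authorize_model("team-support-bot", "gemini-pro")  # would raise PermissionError
```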
Apigee offers even more advanced protection through its Advanced API Security functionality, which helps you defend your LLM APIs against the OWASP Top 10 API Security risks.
By integrating Apigee with your LLM architecture, you create a secure and reliable environment for your AI applications to thrive.
Ready to unlock the full potential of gen AI? 
Explore Apigee’s comprehensive capabilities for operationalizing AI and start building secure, scalable, and efficient gen AI solutions today! Visit our Apigee generative AI samples page to learn more and get started, watch a webinar with more details, or contact us here!


AI Summary and Description: Yes

Summary: The text discusses the practical application of generative AI, particularly how Google’s Apigee API management platform helps organizations manage and secure APIs that leverage large language models (LLMs). It highlights the challenges organizations face when deploying these technologies and how Apigee addresses security, scalability, and operational-efficiency concerns in this evolving domain.

Detailed Description:
The text provides valuable insights for security and compliance professionals involved with AI, cloud, and IT infrastructure, particularly highlighting how generative AI applications pose unique security considerations due to their reliance on LLM APIs. It discusses the strategic role of Apigee, Google Cloud’s API management platform, in overcoming these challenges. Key points include:

– **Security Risks and Management**:
  – LLM APIs introduce potential attack vectors that organizations must mitigate.
  – Apigee can act as a secure gateway, facilitating user authentication and authorization while adhering to security best practices.
  – It protects against common API security vulnerabilities as identified by OWASP.

– **Operational Efficiency**:
  – Organizations are encouraged to manage API costs effectively as LLM adoption scales. Apigee provides mechanisms to control access and usage limits to maintain budgetary efficiency.
  – The platform enables observability tooling for tracing usage patterns and troubleshooting issues, essential for compliance and governance.

– **User Experience and Performance**:
  – Routing requests to the most appropriate LLMs enhances user experience; Apigee optimizes this through intelligent routing based on user requests.
  – Scalability is reinforced via features that ensure minimal downtime and high-quality interactions with end users.

– **Integration with External Systems**:
  – Apigee simplifies connecting AI agents to databases and other external systems, helping them respond to user requests effectively.
  – Architecture patterns should remain flexible and tailored to individual use cases.

– **Logging, Auditing, and Compliance**:
  – Logging all interactions with LLMs supports compliance and quality assessment.
  – Apigee offers detailed reporting and auditing features, crucial for organizations meeting regulatory requirements.

– **Best Practices for LLM Deployment**:
  – Organizations can leverage Apigee’s capabilities to build a robust, secure API environment that underpins their generative AI strategies.
  – The integration of advanced security controls positions Apigee as a critical component of safe and effective LLM API implementations.

Overall, the text presents a compelling case for leveraging Apigee in the context of generative AI applications, positioning it as a comprehensive solution for maintaining security, compliance, and optimal performance. Security and compliance professionals should take note of these strategies to enhance their practices in the deployment of AI technologies.