Cloud Blog: Announcing Vertex AI Agent Engine Memory Bank available for everyone in preview

Jul 8, 2025

—

Source URL: https://cloud.google.com/blog/products/ai-machine-learning/vertex-ai-memory-bank-in-public-preview/
Source: Cloud Blog
Title: Announcing Vertex AI Agent Engine Memory Bank available for everyone in preview

Feedly Summary: Developers are racing to productize agents, but a common limitation is the absence of memory. Without memory, agents treat each interaction as the first, asking repetitive questions and failing to recall user preferences. This lack of contextual awareness makes it difficult for an agent to personalize its assistance, and it leaves developers frustrated.
How we normally mitigate memory problems: So far, a common approach to this problem has been to leverage the LLM’s context window. However, directly inserting entire session dialogues into an LLM’s context window is both expensive and computationally inefficient, leading to higher inference costs and slower response times. Also, as the amount of information fed into an LLM grows, especially with irrelevant or misleading details, the quality of the model’s output significantly declines, leading to issues like “lost in the middle” and “context rot”.
How we can solve it now: Today, we’re excited to announce the public preview of Memory Bank, the newest managed service of the Vertex AI Agent Engine, to help you build highly personalized conversational agents to facilitate more natural, contextual, and continuous engagements. Memory Bank helps us address memory problems in four ways:

Personalize interactions: Go beyond generic scripts. Remember user preferences, key events, and past choices to tailor every response.

Maintain continuity: Pick up conversations seamlessly where they left off across multiple sessions, even if days or weeks have passed.

Provide better context: Arm your agent with the necessary background on a user, leading to more relevant, insightful, and helpful responses.

Improve user experience: Eliminate the frustration of repeating information and create more natural, efficient, and engaging conversations.


Where you can access it: Memory Bank is integrated with the Agent Development Kit (ADK) and Agent Engine Sessions. You can define an agent using ADK and enable Agent Engine Sessions to store and manage conversation history within individual sessions. You can now also enable Memory Bank to give agents long-term memory, so they can store, retrieve, and manage relevant information across multiple sessions. Memory Bank can also manage memories for other agent frameworks, including LangGraph and CrewAI.
Here’s how Memory Bank works

It understands and extracts memories from interactions: Using Gemini models, Memory Bank can analyze a user’s conversation history with the agent (stored in Agent Engine Sessions) to extract key facts, preferences, and context to generate new memories. This happens asynchronously in the background, without you needing to build complex extraction pipelines.

It stores and updates memories intelligently: Key information, such as “My preferred temperature is 71 degrees” or “I prefer aisle seats on flights,” is stored persistently and organized by your defined scope, such as user ID. When new information arises, Memory Bank (using Gemini) can consolidate it with existing memories, resolving contradictions and keeping the memories up to date.

It recalls relevant information: When a user starts a new conversation (session), the agent can retrieve these stored memories. This can be a simple retrieval of all facts or a more advanced similarity search (using embeddings) to find the memories most relevant to the current topic, ensuring the agent is always equipped with the right context.
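
The store-and-recall steps above can be sketched in a few lines of Python. This is only an illustrative toy, not the Memory Bank API or its actual algorithm: `ToyMemoryStore`, `upsert`, `search`, the topic labels, and the 68-degree statement are all hypothetical, the "embedding" is a plain word count rather than a learned model, and consolidation is reduced to "the newest fact on a topic wins" instead of model-based merging.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyMemoryStore:
    """Hypothetical in-memory store: facts grouped by scope, one fact per topic."""

    def __init__(self) -> None:
        self._by_scope: dict[str, dict[str, str]] = {}

    def upsert(self, scope: str, topic: str, fact: str) -> None:
        # Consolidation stand-in: a newer fact on the same topic replaces the old one.
        self._by_scope.setdefault(scope, {})[topic] = fact

    def all_facts(self, scope: str) -> list[str]:
        # "Simple retrieval": return every stored fact for the scope.
        return list(self._by_scope.get(scope, {}).values())

    def search(self, scope: str, query: str, top_k: int = 1) -> list[str]:
        # "Similarity search": rank facts by cosine similarity to the query.
        q = embed(query)
        ranked = sorted(self.all_facts(scope),
                        key=lambda fact: cosine(q, embed(fact)),
                        reverse=True)
        return ranked[:top_k]


store = ToyMemoryStore()
# A hypothetical older statement, later contradicted by the user.
store.upsert("user-123", "thermostat", "My preferred temperature is 68 degrees")
store.upsert("user-123", "thermostat", "My preferred temperature is 71 degrees")
store.upsert("user-123", "seating", "I prefer aisle seats on flights")

print(store.all_facts("user-123"))
print(store.search("user-123", "Which seats should I book on flights?"))
```

Keying memories by scope keeps one user's facts from leaking into another user's context. Replacing the toy `embed` function with a real embedding model would let the same ranking logic match paraphrases, not just shared words.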

A diagram illustrating how an AI agent uses conversation history from Agent Engine Sessions to generate and retrieve persistent memories about the user from Memory Bank.

This entire process is grounded in a novel method from Google Research (accepted by ACL 2025), which enables an intelligent, topic-based approach to how agents learn and recall information, setting a new standard for agent memory performance.
Let’s take an example. Imagine you’re a retailer in the beauty industry. You have a personal beauty companion equipped with memory that recommends products and skincare routines.

As shown in the illustration, the agent remembers the user’s skin type (maintaining context), even as it changes over time, and uses it to make personalized recommendations. This is the power of an agent with long-term memory.
Get started today with Memory Bank
You can integrate Memory Bank into your agent in two primary ways:

Develop an agent with Google Agent Development Kit (ADK) for an out-of-the-box experience

Develop an agent that orchestrates API calls to Memory Bank if you are building your agent with any other framework.
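
For the second, framework-agnostic path, the orchestration pattern might look roughly like the sketch below. Everything here is an assumption made for illustration: `MemoryClient`, `retrieve`, `generate_from_session`, `FakeMemoryClient`, and `run_session` are hypothetical names, not the real Memory Bank API, and the keyword-based "extraction" stands in for the Gemini-based extraction the service performs. The sketch also extracts memories synchronously at the end of a session, whereas the service does this asynchronously in the background.

```python
from typing import Callable, Protocol


class MemoryClient(Protocol):
    """Hypothetical interface an agent could code against (not the real API)."""

    def retrieve(self, scope: str, query: str) -> list[str]: ...

    def generate_from_session(self, scope: str, transcript: list[str]) -> None: ...


class FakeMemoryClient:
    """In-process stand-in so the sketch runs without any cloud service."""

    def __init__(self) -> None:
        self._facts: dict[str, list[str]] = {}

    def retrieve(self, scope: str, query: str) -> list[str]:
        # Simplest retrieval mode: return all facts stored for the scope.
        return list(self._facts.get(scope, []))

    def generate_from_session(self, scope: str, transcript: list[str]) -> None:
        # Stand-in for model-based extraction: keep user lines stating a preference.
        facts = self._facts.setdefault(scope, [])
        for line in transcript:
            if line.startswith("user:") and "prefer" in line.lower():
                fact = line.removeprefix("user:").strip()
                if fact not in facts:
                    facts.append(fact)


def build_prompt(memories: list[str], user_message: str) -> str:
    """Prepend recalled memories to the user's message."""
    bullet_lines = [f"- {m}" for m in memories] if memories else ["- (none yet)"]
    return "\n".join(["Known facts about this user:", *bullet_lines,
                      f"User: {user_message}"])


def run_session(memory: MemoryClient, scope: str, user_messages: list[str],
                llm: Callable[[str], str]) -> list[str]:
    """Run one session: recall before each turn, extract memories at the end."""
    transcript: list[str] = []
    for message in user_messages:
        memories = memory.retrieve(scope, message)
        reply = llm(build_prompt(memories, message))
        transcript += [f"user: {message}", f"agent: {reply}"]
    memory.generate_from_session(scope, transcript)
    return transcript


prompts: list[str] = []


def recording_llm(prompt: str) -> str:
    # Placeholder model call that records the prompt it was given.
    prompts.append(prompt)
    return "ok"


memory = FakeMemoryClient()
run_session(memory, "user-123", ["I prefer aisle seats on flights"], recording_llm)
run_session(memory, "user-123", ["Book me a flight to Tokyo"], recording_llm)
print(prompts[-1])  # the second session's prompt includes the remembered preference
```

Coding the agent against a small interface like this keeps the framework (LangGraph, CrewAI, or anything else) independent of the memory backend, so the fake client can be swapped for real service calls without touching the turn loop.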

To get started, please refer to the official user guide and the developer blog. For hands-on examples, the Google Cloud Generative AI repository on GitHub offers a variety of sample notebooks, including integration with ADK and deployment to the Agent Engine runtime. For those wishing to try Memory Bank with third-party frameworks, we also provide notebook samples for LangGraph and CrewAI.

Memory Bank with ADK agent

Memory Bank with CrewAI agent

Memory Bank with LangGraph agent

If you’re a developer using Agent Development Kit (ADK) but have never used Google Cloud before, you can still start by using our new express mode registration for Agent Engine Sessions and Memory Bank. Here’s how it works:

Use the key to access Agent Engine Sessions and Memory Bank

Build and test your agent within the free tier usage quotas

Seamlessly upgrade to a full Google Cloud project when you are ready for production

If you want to know more about Memory Bank, join the Vertex AI Google Cloud community to share your experiences, ask questions, and collaborate on new projects.

AI Summary and Description: Yes

**Summary:**
The text discusses the introduction of Memory Bank, a new service in the Vertex AI Agent Engine that addresses the limitation of memory in conversational AI agents. This service enhances agent personalization, maintains continuity across interactions, and improves the overall user experience by intelligently storing and recalling user preferences and context.

**Detailed Description:**
The announcement highlights the challenges developers face when creating conversational agents due to their lack of memory. Memory Bank aims to solve these issues through various innovative features:

– **Personalized Interactions:**
– Agents can remember user preferences and past interactions, moving beyond generic responses to create tailored experiences.

– **Continuity Maintenance:**
– The service enables agents to continue conversations seamlessly across multiple sessions, regardless of the time elapsed since the last interaction.

– **Contextual Awareness:**
– Agents receive relevant background information, allowing for more insightful and helpful responses based on the user’s previous experiences.

– **Enhanced User Experience:**
– By removing repetitive questioning, Memory Bank creates a more engaging and efficient interaction for users.

Key functionalities of Memory Bank include:

– **Understanding and Extracting Memories:**
– Using Gemini models, Memory Bank analyzes previous interactions to extract important user details such as preferences and key facts. Extraction runs asynchronously in the background, so developers do not need to build their own extraction pipelines.

– **Intelligent Memory Storage and Updates:**
– It organizes memories by a developer-defined scope, such as user ID, and consolidates new information with existing memories, resolving contradictions.

– **Effective Information Recall:**
– The system facilitates the retrieval of relevant memories during new interactions, providing agents with contextual information tailored to current conversations.

Furthermore, the announcement positions Memory Bank as setting a new standard for agent memory performance, supported by research from Google accepted by ACL 2025. One example illustrates its application: a personalized beauty advice agent for a retailer.

The service can be integrated through:

1. **Agent Development Kit (ADK):**
– Offers a streamlined, out-of-the-box experience for developers.

2. **API Orchestration:**
– Allows for integration with third-party frameworks for a custom approach.

Memory Bank reflects a significant advancement in how conversational agents can operate by addressing foundational limitations in memory and contextualization, making it highly relevant for AI practitioners focused on enhancing user interactions.
