Source URL: https://cloud.google.com/blog/topics/developers-practitioners/google-cloud-and-spring-ai-10/
Source: Cloud Blog
Title: Google Cloud and Spring AI 1.0
Feedly Summary: A big thank you to Fran Hinkelmann and Aaron Wanjala for their contributions and support in making this blog post happen. After a period of intense development, Spring AI 1.0 has officially landed, bringing a robust and comprehensive solution for AI engineering right to your Java ecosystem. This isn't just another library; it's a strategic move to position Java and Spring at the forefront of the AI revolution. With an overwhelming number of enterprises already running on Spring Boot, the path to integrating AI into existing business logic and data has never been smoother. Spring AI 1.0 empowers developers to seamlessly connect their applications with cutting-edge AI models, unlocking new possibilities without the typical integration headaches. Get ready to level up your JVM applications with intelligent capabilities!
Spring AI provides support for various AI models and technologies:
Image models can generate images from text prompts that are provided to them.
Transcription models can take audio files and convert them to text.
Embedding models can convert arbitrary data into vectors, which are data types that are optimized for semantic similarity search.
Chat models should be familiar! You’ve no doubt even had a brief conversation with one somewhere.
They’re versatile. You can get them to help you correct a document or write a poem or seemingly anything. They’re awesome, but they have some drawbacks.
Chat models are open-minded and given to distraction. You can help manage the chat model by providing additional capabilities to do the following:
Keep chat models on track: use system prompts to define their behavior.
Give them a sense of memory: implement memory to track conversational context.
Let AI models access external functions: enable tool calling.
Provide relevant information directly in the request: use prompt stuffing for private data.
Fetch and utilize specific enterprise data: leverage vector stores for retrieval augmented generation (RAG).
Ensure accuracy: evaluation uses another model to validate outputs.
Let AI applications connect with other services: use the Model Context Protocol (MCP). MCP works regardless of the application’s programming language, so that you can build agentic workflows for complex tasks.
Spring AI integrates smoothly with Spring Boot, offering familiar abstractions and starter dependencies from the Spring Initializr, giving you the convention-over-configuration setup that you expect. You can easily connect your existing logic and data to AI models within Spring Boot applications.
With Spring AI, you can leverage robust solutions to make chat models more effective and to deeply integrate these models into larger systems.
Prerequisites
To make a call to a Gemini model in Vertex AI, you will need to obtain credentials for the service that you want to use and then configure your local development environment.
You start by setting up your Google Cloud environment:
In the Google Cloud console, create or select a Google Cloud project.
Ensure that Billing is enabled for your project. Generative AI models usually require a billed account.
In the console, enable the Vertex AI API.
In your terminal, install the gcloud CLI, which is essential for managing resources and setting up authentication for local development where you run your application.
Set up your application default credentials by running the following commands:
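For example, a typical sequence, assuming the gcloud CLI is installed and <your_project_id> stands in for your own project ID:

# authenticate and select your project (<your_project_id> is a placeholder)
gcloud auth login
gcloud config set project <your_project_id>

# create Application Default Credentials for local development
gcloud auth application-default login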
You’re in. Now you get to build.
The Build
You start by building a simple Google Cloud-optimized Spring AI application:
Go to Spring Initializr.
In the Artifact field, enter google.
For the Project type, select Apache Maven (used in this blog) or Gradle.
Click Add Dependencies, and then select these dependencies: Spring Web, Spring Boot Actuator, GraalVM Native Support, Spring Data JDBC, Vertex AI Gemini, Vertex AI Embeddings, PGvector Vector Database, MCP Client, and Docker Compose Support.
Select the Java version that you want to use. We recommend that you use the latest available version. This blog uses GraalVM, a distribution of OpenJDK with some extra utilities that let you compile your code into native images specific to the OS and architecture of the machine. These images run in a fraction of the RAM and start up in a fraction of the time compared to regular JRE-based applications (see the native-image note after these steps).
If you're using sdkman.io, you can install GraalVM on your local machine by using the following command:
sdk install java 24-graalce
Click Generate and then save the .zip file that you can open in your IDE.
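Because you selected GraalVM Native Support, the generated build already includes Spring Boot's native build tooling. As a sketch, assuming GraalVM (installed above) is your active JDK, you could later produce a native executable like this; the binary lands in target/, named after the artifact:

# compile the application ahead-of-time into a native executable (this can take a few minutes)
./mvnw -Pnative native:compile

# run the resulting binary
./target/google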
Add the required dependencies
Extract the .zip file that you downloaded, and then open the pom.xml file.
In the pom.xml file, at the end of the <dependencyManagement> section, add the following lines:
<dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>spring-cloud-gcp-dependencies</artifactId>
    <version>${spring-cloud-gcp.version}</version>
    <type>pom</type>
    <scope>import</scope>
</dependency>
<dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>libraries-bom</artifactId>
    <version>26.60.0</version>
    <type>pom</type>
    <scope>import</scope>
</dependency>
Configuration
Next, you'll need to update the configuration values in src/main/resources/application.properties:
# set application name
spring.application.name=google

# configure actuators supported in the app
management.endpoints.web.exposure.include=*

# docker compose configuration
spring.docker.compose.lifecycle-management=start_only

# configure the chat and embedding models
# vertex embedding
spring.ai.vertex.ai.embedding.project-id=<your_project_id>
spring.ai.vertex.ai.embedding.location=us-central1
# vertex chat
spring.ai.vertex.ai.gemini.project-id=<your_project_id>
spring.ai.vertex.ai.gemini.location=us-central1
spring.ai.vertex.ai.gemini.chat.options.model=gemini-2.5-pro-preview-05-06

# initialize the schema in the vector store
spring.ai.vectorstore.pgvector.initialize-schema=true

# database connection parameters
spring.datasource.password=secret
spring.datasource.username=myuser
spring.datasource.url=jdbc:postgresql://localhost/mydatabase
The following are more details about the sections that you updated:
configure actuators supported in the app: tells Spring Boot’s observability integration, the Spring Boot Actuator, to expose all Actuator endpoints.
docker compose configuration: leverages the built-in Docker Compose support. Spring Boot automatically detects the compose.yml file in the root of the project and runs the Docker image for you before the application starts. The Spring Initializr generated the compose.yml file for you. But you don't want Spring Boot to restart the container each time, because PostgreSQL isn't serverless! This setting tells Spring Boot to start the container only if it isn't already running.
configure the chat and embedding models: the vertex embedding and vertex chat values configure which Gemini chat and embedding models to use from Vertex AI on Google Cloud. This example uses two models that are suitable for the use case.
initialize the schema in the vector store: configures the app to use PostgreSQL loaded with a plugin that supports the vector type. Spring AI has an abstraction called VectorStore that handles writing data to various vector stores (see the sketch after this list). This configuration ensures that Spring AI initializes the storage that's required to treat PostgreSQL as a vector store.
database connection parameters: configures the connection to the SQL database. Do you need this, strictly speaking? No. The Docker Compose support in Spring Boot will automatically connect to the SQL database. It can be handy for reference, though.
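To make the VectorStore abstraction concrete, here's a minimal, hypothetical sketch of the two operations this app relies on later: writing documents and searching for semantically similar ones. The document text and query string are made up for illustration; vectorStore is an injected VectorStore, and Document is org.springframework.ai.document.Document.

// write: embed the document and persist its vector in PostgreSQL
vectorStore.add(List.of(new Document("A silly, goofy dog named Prancer.")));

// read: embed the query and return the most semantically similar documents
List<Document> results = vectorStore.similaritySearch("silly dogs");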
The Database
You have a SQL database with no data. Imagine! If a database has no data, is it really a database? Or is it just a base? You can get Spring Boot to initialize the database by running some SQL commands and installing some data on startup. Two files will run on startup: src/main/resources/schema.sql and src/main/resources/data.sql. First, the schema.sql file:
drop table if exists dog;
create table if not exists dog(
    id serial primary key,
    name text not null,
    description text not null,
    owner text
);
Simple enough. It defines a dog table. The src/main/resources/data.sql file gets some dogs in there:
...
INSERT INTO dog(id, name, description) values (45, 'Prancer', 'A silly, goofy dog who slobbers all over everyone.');
Nice! The database will be initialized on every restart. This configuration avoids duplicate data by dropping the table on every restart and re-inserting the same rows. If this were a real database, you might use an upsert instead, which PostgreSQL supports with its insert ... on conflict ... do syntax.
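As a sketch, an upsert for the dog table, keyed on the primary key, could look like this (values are illustrative):

INSERT INTO dog(id, name, description)
VALUES (45, 'Prancer', 'A silly, goofy dog who slobbers all over everyone.')
ON CONFLICT (id) DO UPDATE
    SET name = EXCLUDED.name,
        description = EXCLUDED.description;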
To make short work of building a data access repository and an entity, you use Spring Data JDBC. You'll create an entity called Dog to model and map to the data in the repository.
In the GoogleApplication.java file, after the GoogleApplication class, add the following lines:
interface DogRepository extends ListCrudRepository<Dog, Integer> {}

record Dog(@Id int id, String name, String owner, String description) {}
The Assistant
Now for the meat… or the dog bone of the matter, as our pal Prancer might say!
Add Chat clients
First, you build an AssistantController class by adding the following, again at the end of the GoogleApplication.java file:
@Controller
@ResponseBody
class AssistantController {

    private final ChatClient ai;

    private final Map<String, PromptChatMemoryAdvisor> advisors = new ConcurrentHashMap<>();

    AssistantController(ChatClient.Builder ai) {
        var system = """
                You are an AI powered assistant to help people adopt a dog from the adoption
                agency named Pooch Palace with locations in Mountain View, Seoul, Tokyo,
                Singapore, Paris, Mumbai, New Delhi, Barcelona, San Francisco, and London.
                Information about the dogs available will be presented below. If there is no
                information, then return a polite response suggesting we don't have any dogs
                available.
                """;
        this.ai = ai
                .defaultSystem(system)
                .build();
    }
}
The ChatClient that’s added in the preceding code is your one-stop-shop for all your chat model interactions. It in turn depends on the (autoconfigured) ChatModel that talks to, in this case, Google’s legendary Gemini. You typically only have one ChatModel configured in the application, but you might reasonably have many ChatClients, with different defaults and scenarios configured appropriately. You create a new ChatClient by using the ChatClient.Builder, which the preceding code injects into the constructor.
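To illustrate that idea, here's a minimal, hypothetical sketch (not part of this app) of two ChatClients built over the same autoconfigured ChatModel, each with its own defaults; chatModel is assumed to be injected, and the prompts are made up:

// two ChatClients over one ChatModel, each with a different default system prompt
var adoptionAssistant = ChatClient.builder(chatModel)
        .defaultSystem("You help people adopt dogs from Pooch Palace.")
        .build();

var schedulingAssistant = ChatClient.builder(chatModel)
        .defaultSystem("You schedule dog pickup appointments.")
        .build();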
Set the HTTP endpoint
Next, you set up an HTTP endpoint, /{user}/inquire. When a request comes in, you use a system prompt to ensure that the model acts like it’s an actual employee at our fictitious dog adoption agency, Pooch Palace.
Add the following method to the AssistantController class:
@GetMapping("/{user}/inquire")
String inquire(@PathVariable String user,
               @RequestParam String question) {
    var memory = MessageWindowChatMemory
            .builder()
            .chatMemoryRepository(new InMemoryChatMemoryRepository())
            .build();
    var advisor = this.advisors
            .computeIfAbsent(user, _ -> PromptChatMemoryAdvisor.builder(memory).build());
    return this.ai
            .prompt()
            .advisors(advisor)
            .user(question)
            .call()
            .content();
}
This method defines a PromptChatMemoryAdvisor that keeps track of everything that’s said between a particular user and the model. The method then transmits that transcript to the model on every subsequent request to remind it.
Test the endpoint
Try the following requests out:
http :8080/lee/inquire question=="my name is Lee."
and:
http :8080/lee/inquire question=="what's my name?"
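These examples use the httpie CLI. If you'd rather use curl, an equivalent for that last request (letting curl handle the URL encoding) might be:

curl -G http://localhost:8080/lee/inquire --data-urlencode "question=what's my name?"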
It should confirm that your name is Lee. And it might even try to keep you on track in adopting a dog. Let’s see what else it can do. Ask it:
http :8080/lee/inquire question=="do you have any silly dogs?"
It will respond that it doesn’t have any information about dogs in any Pooch Palace location, and it might even encourage you to check the local listings.
Source data
The problem is that the model doesn't have access to the data in the database. Should we just give it all of the data? We could, perhaps. There are only 18 or so records in this particular database. That would fit easily into the token limit for most (if not all?) LLMs, and it would certainly fit into a request made to one of Google's large-context-window models.
All LLMs have this concept of tokens: an approximation of the amount of data consumed and produced by a model. Google's Gemini 2.5 Pro has a very large context window. If you're using a local model like the open Gemma model, which you can run on your own hardware, then the only cost of sending more tokens is complexity and CPU. If you run a hosted model like Gemini, then there's also a dollars-and-cents cost.
So, even though you could send all of the data along with your request to one of these models, you should try to limit the data that you send. It's the principle of the thing! Instead, store everything in a vector store, find the things that might be germane to a given request, and transmit only that data to the model for consideration. This process of sourcing data from a database and then using that data to inform the response produced by a model is called retrieval augmented generation (RAG).
Add parameters of type VectorStore vectorStore, DogRepository repository, and JdbcClient db to the class constructor. You'll read all of the data from the dog table and then write it all out to the VectorStore implementation that's backed by PostgreSQL. Add the following to the constructor, at the very top:
if (db.sql("select count(*) from vector_store").query(Integer.class).single().equals(0)) {
    repository.findAll().forEach(d -> {
        var dogument = new Document("id: %s, name: %s, description: %s".formatted(
                d.id(), d.name(), d.description()));
        vectorStore.add(List.of(dogument));
    });
}
The only thing to do now is to tell the ChatClient to first consult the VectorStore for relevant data to include in the body of the request to the model. You do this with an advisor. Change the definition of the ChatClient:
this.ai = ai
        .defaultAdvisors(new QuestionAnswerAdvisor(vectorStore))
        ...
Test again and refine the solution
Run the program and ask again:
http :8080/lee/inquire question=="do you have any silly dogs?"
This time, it should respond that indeed there’s a silly dog named Prancer (hi, buddy!) in the shelter who might be just the dog for us!
Well, naturally, the next thing you'll want to do is adopt this dog. But when might you stop by to pick it up? You'll need to connect your LLM to the patent-pending, class-leading scheduling algorithm.
Add the DogAdoptionsScheduler class to the GoogleApplication.java file:
@Component
class DogAdoptionsScheduler {

    @Tool(description = "schedule an appointment to pickup or adopt " +
            "a dog at a Pooch Palace location")
    String scheduleAppointment(
            @ToolParam(description = "the id of the dog") String dogId,
            @ToolParam(description = "the name of the dog") String dogName) throws Exception {
        var when = Instant
                .now()
                .plus(3, ChronoUnit.DAYS)
                .toString();
        System.out.println("scheduled appointment for " + when +
                " for dog " + dogName + " with id " + dogId + ".");
        return when;
    }
}
You've annotated the method with Spring AI's @Tool and @ToolParam annotations. These annotations furnish descriptions that the model uses, along with the shape of the methods, to intuit whether they might be of use. You'll want to tell the model that these tools are available, too. Inject the newly defined DogAdoptionsScheduler scheduler into the constructor of the controller in the GoogleApplication.java file, and then add the following to the definition of the ChatClient:
this.ai = ai
        ...
        .defaultTools(scheduler)
        ...
Restart the program and try it out. Ask about Prancer again:
http :8080/lee/inquire question=="do you have any silly dogs?"
The response has information now, so ask how you can get Prancer:
http :8080/lee/inquire question=="fantastic. When can I schedule an appointment to pick up Prancer from the San Francisco location?"
Did it confirm that it's going to schedule an adoption for three days hence? Congratulations! You've successfully given the model access to the data and business logic of the service.
Introducing Model Context Protocol
This service is all written in Spring and it uses Spring AI, but of course there are other languages and technology stacks out there, and those stacks might want to leverage this patent-pending, industry-leading scheduling algorithm. You’ll want to extract that functionality and make it a tool that’s available to all interactions with the LLM by using Model Context Protocol. This protocol was first designed by Anthropic and it provides an easy way for any LLM to assimilate tools into their toolbox, no matter what programming language the tools were written in.
Go back to Spring Initializr and set up a new project:
1. Generate a new project named scheduler.
2. Add Spring Web and MCP Server as dependencies.
3. Choose the latest version of Java and Maven, as before.
4. Unzip the resulting .zip file and open it in your IDE.
5. Transplant (cut and paste for the win!) the DogAdoptionsScheduler component from the google project to the SchedulerApplication.java file in this new project.
6. Configure the new project to run on a different port: add server.port=8081 to application.properties.
7. Define the following bean in SchedulerApplication:
@Bean
MethodToolCallbackProvider dogMethodToolCallbackProvider(DogAdoptionsScheduler scheduler) {
    return MethodToolCallbackProvider
            .builder()
            .toolObjects(scheduler)
            .build();
}
8. Start the application.
9. Return to the assistant and then modify it to work with a remote service instead of the component that it used before.
10. Add the following bean definition to the GoogleApplication.java class:
@Bean
McpSyncClient mcpSyncClient() {
    var mcp = McpClient
            .sync(HttpClientSseClientTransport.builder("http://localhost:8081").build())
            .build();
    mcp.initialize();
    return mcp;
}
11. Change the AssistantController to inject an McpSyncClient client instead of the DogAdoptionsScheduler that was there before. Modify the ai definition accordingly:
...
this.ai = ai
        .defaultToolCallbacks(new SyncMcpToolCallbackProvider(client))
        ...
12. Restart the application and then try the interaction again:
http :8080/lee/inquire question=="do you have any silly dogs?"

http :8080/lee/inquire question=="fantastic. When can I schedule an appointment to pick up Prancer from the San Francisco location?"
You should see this time that the model responds basically the same as last time, except that the request was handled in the newly minted scheduler component, not in the google service! We love it when a plan comes together.
Production-worthy AI
When you deploy your application to production, Google Cloud offers two great choices that are compatible with PostgreSQL, both supporting pgvector capabilities. You can use Google Cloud SQL, which is basically just PostgreSQL hosted by Google. What else could you need? We reckon, for a lot of use cases, not much! But if you really want superior performance, availability, and scale, then look no further than Google Cloud's AlloyDB.
AlloyDB scales to workloads of all sizes, and it offers a 99.99% availability SLA, inclusive of maintenance. And it's purpose-built for your AI workloads. We're using PostgreSQL in a Docker image in this example, but when it's time to deploy, we're going to spin up an AlloyDB instance.
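Because AlloyDB is PostgreSQL-compatible, repointing the app at it is mostly a matter of swapping the connection parameters. A hypothetical sketch, where the host, user, password, and database name are placeholders for your own instance:

# application.properties: swap the local Docker database for AlloyDB
spring.datasource.url=jdbc:postgresql://<alloydb-host>:5432/mydatabase
spring.datasource.username=<db_user>
spring.datasource.password=<db_password>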
Codebase
The code for this walkthrough is available on GitHub.
Next Steps
You’ve just built a production-worthy, AI-ready, Spring AI- and Google-powered application in no time at all. You’ve only begun to scratch the surface! Check out Spring AI 1.0 at the Spring Initializr today and learn more about Google Cloud here.
Resources
Spring AI 1.0 blog
Josh’s YouTube channel
AI Summary and Description: Yes
**Summary:** The text provides an extensive walkthrough of Spring AI 1.0, emphasizing its integration with Java and Spring ecosystems to facilitate AI model incorporation. This significant advancement positions Java developers to leverage AI technologies seamlessly within their existing applications, enhancing both functionality and user experience.
**Detailed Description:**
The text serves as a comprehensive guide to Spring AI 1.0, detailing its functionalities and applications within AI engineering, particularly focused on Java. Below are the major points outlined:
– **Integration with Java Ecosystem:**
– Spring AI 1.0 provides a robust solution for AI engineering tailored specifically to Java programmers, particularly those using Spring Boot.
– The library aims to simplify integration hurdles, allowing developers to incorporate AI models directly into their applications.
– **Support for Various AI Models:**
– The library supports diverse AI models:
– **Image Models:** Generate images from text prompts.
– **Transcription Models:** Convert audio files into text.
– **Embedding Models:** Facilitate semantic similarity searches by converting arbitrary data into vectors.
– **Chat Models:** Engage in conversation, serve as assistants, or aid in document tasks.
– **Chat Model Management:**
– Strategies to maintain chat model focus include:
– System prompts to define behavior.
– Memory to retain conversation context.
– Access to external functions and enterprise data.
– Secondary model evaluations to ensure accuracy.
– **Spring Boot Integration:**
– The framework provides a familiar environment for developers, allowing straightforward configuration and application setup.
– It outlines dependency management through Maven or Gradle for seamless integration.
– **Development Steps:**
– Instructions cover setting up a Google Cloud environment, configuring the application, and establishing endpoints for user interactions with AI.
– Demonstrates a comprehensive example of building a Spring AI application from scratch, including database configurations and integration of large language models (LLMs).
– **Model Context Protocol (MCP):**
– The document introduces MCP, a protocol for the seamless integration of tools written in any programming language with LLMs. This allows for broader functionality and inter-platform communication.
– **Production Deployment:**
– Highlights options for production environments, mentioning Google Cloud SQL and AlloyDB as viable databases for hosting AI workloads, thus ensuring scalability and performance.
– **Next Steps and Resources:**
– Encourages developers to explore further and leverage Spring AI 1.0 in conjunction with Google Cloud services, promoting continuous learning and enhancement of AI capabilities within applications.
This text is particularly valuable for professionals involved in software development and AI, as it empowers them with practical knowledge to enhance existing systems and manage AI integrations effectively. It represents a shift in how Java developers can operationalize AI technologies, reinforcing the importance of secure, efficient infrastructure in AI projects.