Source URL: https://cloud.google.com/blog/products/databases/use-model-endpoint-management-on-alloydb/
Source: Cloud Blog
Title: Build richer gen AI experiences using model endpoint management
Feedly Summary: Model endpoint management is available on AlloyDB, AlloyDB Omni and Cloud SQL for PostgreSQL.
Model endpoint management helps developers build new experiences using SQL and provides a flexible interface for calling gen AI models running anywhere, right from the database. You can generate embeddings inside the database, perform quality control on your vector search, and analyze sentiment in the database, making it easier to monitor results. This feature is available through the google_ml_integration extension, which enables integration with Vertex AI for both AlloyDB and Cloud SQL for PostgreSQL.
Previously, the google_ml_integration extension only allowed users to call models hosted on the Vertex AI platform. With model endpoint management, you can leverage models running on any platform, including your own local environment. We also added ease-of-use support for models from OpenAI, Hugging Face, and Anthropic, as well as Vertex AI's latest embedding models. We have preconfigured the connectivity details and input/output transformation functions for these providers, so you can simply register the model and set up the authentication details.
For Vertex AI models, we have pre-registered embedding and Gemini models so that you can start calling them right away. Newer embedding models also have built-in support, meaning you can access the latest versions of pre-registered models and start making prediction calls out of the box.
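For example, calling a pre-registered Vertex AI embedding model needs no registration step at all. Here is a minimal sketch, assuming the google_ml_integration extension is installed and its Vertex AI integration is configured:

```sql
-- Call a pre-registered Vertex AI embedding model; no google_ml.create_model
-- step is required for pre-registered models.
SELECT google_ml.embedding('text-embedding-005', 'hello world');
```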
In this blog, we’ll walk you through three example workflows that leverage model endpoint management to build richer generative AI experiences.
Generating embeddings with OpenAI embeddings models
Leveraging Gemini to evaluate vector search results
Running sentiment analysis to analyze user sentiment
First, register your model.
To use your own model, register it with the google_ml.create_model function, where you specify the model endpoint's connectivity details. You can then configure a set of optional parameters that transform the model's input and output arguments into a format suitable for your database. Here's an example of registering Anthropic's Claude model.
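A minimal sketch of such a registration; the model ID, request URL, and secret name below are illustrative assumptions rather than fixed values:

```sql
-- Hypothetical registration of an Anthropic Claude endpoint. The model_id and
-- model_auth_id are names you choose; the secret is assumed to hold your
-- Anthropic API key, registered as shown later in this post.
CALL google_ml.create_model(
    model_id => 'claude-haiku',
    model_provider => 'anthropic',
    model_type => 'generic',
    model_request_url => 'https://api.anthropic.com/v1/messages',
    model_auth_type => 'secret_manager',
    model_auth_id => 'anthropic-secret');
```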
Once you register your model, you can call it with the google_ml.predict_row function for any AI model, or you can use the google_ml.embedding convenience function to call an embedding model.
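As an illustration, calling the hypothetical Claude registration above with predict_row might look like the following; the request body follows Anthropic's Messages API, and the model name inside it is an assumption:

```sql
-- Hypothetical predict_row call against the Claude endpoint registered above.
SELECT google_ml.predict_row(
    model_id => 'claude-haiku',
    request_body => '{
      "model": "claude-3-haiku-20240307",
      "max_tokens": 256,
      "messages": [{"role": "user", "content": "Say hello."}]
    }'::json);
```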
#1: Generate embeddings with OpenAI embeddings models
Model endpoint management allows you to use the embedding convenience function with any embeddings model, even ones that don't run on Google Cloud. Say you want to generate embeddings with OpenAI's ada embeddings model. With our ease-of-use support, you only need to configure your authentication credentials, register the model, and start generating embeddings. First, configure authentication for the endpoint you want to reach: either create a PostgreSQL function that specifies your API key in the header of the API call, or create a secret in Secret Manager and register that secret with model endpoint management.
To register your secret, specify the secret path and create an ID for the secret. You can find the secret path in Secret Manager by clicking on the secret, then clicking "copy resource name" on the specific version of the secret you want to use.
```sql
CALL google_ml.create_sm_secret(
    secret_id => 'open-ai-secret',
    secret_path => 'projects/{project_id}/secrets/{secret_id}/versions/{secret_version}');
```
Once your secret has been registered, you can register the openai-ada model and point it at the secret, open-ai-secret. Our ease-of-use support handles the input and output formatting, so you can generate embeddings from data in your database and use the output embedding directly for vector search.
```sql
CALL google_ml.create_model(
    model_id => 'openai-ada-002',
    model_provider => 'open_ai',
    model_type => 'text_embedding',
    model_qualified_name => 'text-embedding-ada-002',
    model_auth_type => 'secret_manager',
    model_auth_id => 'open-ai-secret');
```
You then only need to specify the name of the registered model in the first argument and the text in the second. For instance, to generate an embedding for the text "I love Google Databases", you would invoke the embedding function like so:
```sql
SELECT google_ml.embedding('openai-ada-002', 'I love Google Databases');
```
If you want to generate an embedding in-line while performing a vector search, combine the embedding function with vector search in SQL using the following syntax:
```sql
SELECT id, name FROM items
ORDER BY embedding
    <-> google_ml.embedding('openai-ada-002', 'I love Google Databases')::vector
LIMIT 10;
```
Model endpoint management also has built-in integrations with Vertex AI's latest embedding models, giving you access to any of Vertex AI's supported text embedding models. We recommend the embedding() function for in-line SQL queries or for generating stored embeddings on datasets smaller than 100k rows.
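For the stored-embeddings case, a sketch along these lines can backfill a table, assuming the pgvector extension is installed and a hypothetical items table with a description column (text-embedding-ada-002 produces 1536-dimensional vectors):

```sql
-- Hypothetical backfill of stored embeddings on a small table.
ALTER TABLE items ADD COLUMN IF NOT EXISTS embedding vector(1536);

UPDATE items
SET embedding = google_ml.embedding('openai-ada-002', description)::vector
WHERE embedding IS NULL;
```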
#2: Leverage Gemini to evaluate vector search results
In addition to a deep integration with embedding models, model endpoint management gives developers out-of-the-box support for the latest Gemini models. Gemini Pro and Gemini Flash Lite are both available as pre-registered models in AlloyDB and Cloud SQL for PostgreSQL. With Gemini, you can generate content, perform sentiment analysis, or analyze the quality of vector search results. Let's see how you might analyze the quality of your vector search results with Gemini using the predict_row function.
Suppose you have a table apparels with id, product_description, and embedding columns. We can use model endpoint management to call Gemini to validate the vector search results, comparing a user's search query against the returned product descriptions. This lets us see discrepancies between the user's query and the products returned by the vector search.
```sql
SELECT
  LLM_RESPONSE
FROM (
  SELECT
    json_array_elements(
      google_ml.predict_row(
        model_id => 'gemini-1.5-pro:streamGenerateContent',
        request_body => CONCAT('{
          "contents": [
            { "role": "user",
              "parts": [
                { "text": "Read this user search text: ', user_text, ' Compare it against the product inventory data set: ', content, ' Return a response with 3 values: 1) MATCH: if the 2 contexts are at least 85% matching or not: YES or NO 2) PERCENTAGE: percentage of match, make sure that this percentage is accurate 3) DIFFERENCE: A clear short easy description of the difference between the 2 products. Remember if the user search text says that some attribute should not be there, and the record has it, it should be a NO match." }
              ]
            }
          ]
        }')::json
      )
    ) -> 'candidates' -> 0 -> 'content' -> 'parts' -> 0 -> 'text' AS LLM_RESPONSE
  FROM (
    SELECT
      id || ' – ' || product_description AS literature,
      product_description AS content,
      'I want womens tops, pink casual only pure cotton.' AS user_text
    FROM
      apparels
    ORDER BY
      embedding <=> embedding('text-embedding-005',
        'I want womens tops, pink casual only pure cotton.')::vector
    LIMIT
      5) AS xyz) AS X;
```
We pass the vector search results to Gemini, which evaluates how well the user's query matches the product descriptions qualitatively and notes differences in natural language. This lets you build quality control into your vector search use case so that your vector search application improves over time. For the full end-to-end use case, follow this codelab.
#3: Run sentiment analysis to analyze user sentiment
One of the benefits of calling Gemini in the database is its versatility. Above, we showed how you can use it to check the quality of your vector search. Now, let’s take a look at how you might use it to analyze the sentiment of users.
Say you are an e-commerce company that wants to perform sentiment analysis on user reviews stored in the database. You have a table products, which stores each product's name and description, and another table, product_reviews, which stores user reviews of those products, linked by product id. You just added headphones to your online offering and want to see how they are doing in terms of customer sentiment. You can use Gemini through model endpoint management to classify each review's sentiment as positive or negative in the database and view the results as a separate column.
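For concreteness, here is a hypothetical minimal version of that schema, with column names inferred from the queries below:

```sql
-- Assumed schema for the sentiment analysis example.
CREATE TABLE products (
  product_id  SERIAL PRIMARY KEY,
  name        TEXT NOT NULL,
  description TEXT
);

CREATE TABLE product_reviews (
  review_id      SERIAL PRIMARY KEY,
  product_id     INTEGER REFERENCES products (product_id),
  product_review TEXT NOT NULL
);
```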
First, create a wrapper function in SQL that uses the predict_row function to send Gemini a prompt along with the text whose sentiment you want to analyze.
```sql
-- Pass in the prompt for Gemini, including the text to analyze
CREATE OR REPLACE FUNCTION get_sentiment(prompt text)
RETURNS VARCHAR(100)
LANGUAGE plpgsql
AS $$
DECLARE
  prompt_output VARCHAR(100);
  predict_row_input text;
BEGIN
  -- Build the Gemini request body around the caller's prompt
  SELECT '{
    "contents": [{"role": "user","parts": [{"text": "Only return just the output value for the input. input: ' || prompt || '. output:"}]}]}' INTO predict_row_input;
  -- Execute the prediction call and extract the first candidate's text
  SELECT trim(replace(google_ml.predict_row('gemini-1.5-pro:generateContent',
      predict_row_input::json) -> 'candidates' -> 0 -> 'content' -> 'parts' -> 0 -> 'text' #>> '{}',
      E'\n', ''))
  INTO prompt_output;
  -- Return the sentiment label produced by the model
  RETURN prompt_output;
END; $$;
```
Now let's say you want to analyze the sentiment of a single review. You could do it like so:
```sql
SELECT
  get_sentiment(
    'Please output a sentiment for a given review input. The sentiment value return should be a single word positive/negative/neutral. Input review: These headphones are amazing! Great sound quality and comfortable to wear.');
```
You can then generate predictions on only the reviews containing the word "headphones" by using a LIKE clause and calling your get_sentiment function:
```sql
SELECT
  review_id,
  product_review,
  get_sentiment(
    'Please output a sentiment for a given review input. The sentiment value return should be a single word positive/negative/neutral. Input review:'
    || product_review)
FROM product_reviews
WHERE product_id IN (SELECT product_id FROM products WHERE name LIKE '%Headphones%');
```
This outputs whether each headphone review is positive, negative, or neutral, letting you see user sentiment around this new product. Later, you can aggregate the results to see whether sentiment is mostly positive or negative.
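For instance, a sketch of such an aggregation, reusing the get_sentiment wrapper and the same hypothetical tables:

```sql
-- Count headphone reviews by sentiment label.
SELECT
  get_sentiment(
    'Please output a sentiment for a given review input. The sentiment value return should be a single word positive/negative/neutral. Input review:'
    || product_review) AS sentiment,
  COUNT(*) AS review_count
FROM product_reviews
WHERE product_id IN (SELECT product_id FROM products WHERE name LIKE '%Headphones%')
GROUP BY 1
ORDER BY review_count DESC;
```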
Get started
Model endpoint management is now available in AlloyDB, AlloyDB Omni and Cloud SQL for PostgreSQL. To get started with it, follow our documentation on AlloyDB and Cloud SQL for PostgreSQL.
AI Summary and Description: Yes
Summary: The text introduces the model endpoint management feature in AlloyDB, AlloyDB Omni, and Cloud SQL for PostgreSQL, which facilitates developers in building AI experiences by allowing integration with various generative AI models. This feature supports the generation of embeddings, sentiment analysis, and vector search evaluation, making it relevant for AI and cloud computing professionals looking to enhance their applications with AI capabilities.
Detailed Description:
The introduction of model endpoint management in AlloyDB, AlloyDB Omni, and Cloud SQL for PostgreSQL significantly enhances developers’ ability to integrate and utilize generative AI models in their applications. Here are the key features and points of interest:
– **Model Endpoint Management**:
– Allows developers to interface with generative AI models directly from the database.
– Supports generating embeddings, performing quality control on vector searches, and conducting sentiment analysis.
– **Integration with Various Platforms**:
– Initially limited to models hosted on Vertex AI, it now expands support to models from OpenAI, Hugging Face, and Anthropic.
– The google_ml_integration extension makes it convenient to connect and authenticate models from various platforms.
– **Pre-configured Models**:
– Pre-registered embedding and Gemini models streamline the process of calling these models, requiring less setup from the user.
– Users can start making prediction calls instantly.
– **Example Workflows**:
– The blog outlines specific scenarios demonstrating the use of model endpoint management:
1. **Generating Embeddings**: Example of utilizing OpenAI’s embeddings models directly within the database.
2. **Evaluating Vector Searches**: Using Gemini to validate the relevancy of vector search results against a user’s query.
3. **Sentiment Analysis**: Applying Gemini for sentiment analysis on user reviews, allowing businesses to gauge customer feedback effectively.
– **Practical Implementation**:
– The text includes SQL snippets showcasing how to register models, create secrets for authentication, and call models for embedding and sentiment analysis.
– Developers can leverage existing data in their databases to enhance AI functionality without the complexity of external calls.
This feature is particularly relevant for professionals in AI, cloud computing, and database management, providing them with tools to enhance their applications and reduce integration complexity, resulting in improved user experiences and more efficient data processing capabilities.