Cloud Blog: Tailor your search engine with AI-powered hybrid search in Spanner

Source URL: https://cloud.google.com/blog/topics/developers-practitioners/hybrid-search-in-spanner-combine-full-text-and-vector-search/
Source: Cloud Blog
Title: Tailor your search engine with AI-powered hybrid search in Spanner

Feedly Summary: Search is at the heart of how we interact with the digital ecosystem, from online shopping to finding critical information. Enter generative AI, and user expectations are higher than ever. For applications to meet diverse user needs, they need to deliver fast, accurate and contextually relevant results, regardless of how queries are framed. For example:

Online shoppers expect to find “waterproof hiking boots with ankle support” just as easily as a specific “SummitEdge Pro” model.

Legal professionals need to pinpoint precise case citations or explore nuanced legal concepts with varied search terms.

Doctors require precision when searching for critical patient information. A doctor looking for “allergy to penicillin” must locate the record accurately, whether the information is labeled as “drug sensitivities” or misspelled as “peniciln”. 

Spanner, Google’s always-on multi-model database with virtually unlimited scale, addresses these challenges with AI-powered hybrid search capabilities. Spanner allows developers to combine vector search, full-text search, and machine learning (ML) model reranking capabilities in a unified platform directly integrated with the operational data store, using a familiar SQL interface. 
In this post, we will explore how you can build a customized search engine for an ecommerce marketplace using Spanner.


Building a tailored search engine on Spanner
For ecommerce – along with many other industries – a single search method often falls short, resulting in dissatisfied users, incomplete information, or lost revenue. Keyword search excels at precision but struggles with alternate phrasing or natural language; vector search captures semantics but may overlook specific terms. Combining the strengths of both would enable organizations to deliver a more effective search experience.
SpanMart, a hypothetical ecommerce marketplace, allows users to search for products using keywords or natural language. Its products table supports multiple search methods with two specialized columns and associated indexes:

A description_tokens column: This is a tokenized version of the description column, breaking down the text into individual terms. A search index (products_by_description) on this column accelerates full-text search, acting like an inverted index in information retrieval.

An embedding column: This stores vector representations of the product descriptions, capturing semantic meaning rather than individual words. Similar descriptions are mapped close together in the “embedding space”. These embeddings are generated using models like Vertex AI Embeddings. A vector index (products_by_embedding) organizes these embeddings using a ScaNN tree structure for efficient semantic searches.
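As a rough mental model (plain Python, not how Spanner implements these indexes), the two columns back two different lookup structures: an inverted index mapping terms to product ids, and embedding vectors compared by cosine distance. The data and functions below are purely illustrative:

```python
import math
from collections import defaultdict

# Conceptual sketch only: Spanner builds and maintains these structures
# internally via the search index and vector index.

def build_inverted_index(descriptions):
    """Map each term to the set of product ids whose description contains it."""
    index = defaultdict(set)
    for pid, text in descriptions.items():
        for term in text.lower().split():
            index[term].add(pid)
    return index

def cosine_distance(a, b):
    """Cosine distance, the metric configured on the vector index
    (distance_type="COSINE"): 0 for identical directions, up to 2."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / norm

descriptions = {1: "waterproof hiking boots", 2: "leather dress shoes"}
index = build_inverted_index(descriptions)
print(index["hiking"])                           # {1}
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))   # 1.0 (orthogonal vectors)
```

Keyword lookups hit the inverted index directly, while semantic lookups rank rows by distance in the embedding space.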

Here’s how the products table and its indexes are defined in Spanner:

```sql
CREATE TABLE products (
  id INT64,
  description STRING(MAX),
  description_tokens TOKENLIST AS (TOKENIZE_FULLTEXT(description)) HIDDEN,
  embedding ARRAY<FLOAT32>(vector_length=>768),
) PRIMARY KEY(id);

CREATE SEARCH INDEX products_by_description ON products(description_tokens);
CREATE VECTOR INDEX products_by_embedding ON products(embedding)
  WHERE embedding IS NOT NULL
  OPTIONS(distance_type="COSINE", num_leaves=1000);
```

With these components in place, SpanMart can build an intelligent search pipeline that integrates: 

Vector search for semantic relevance.
Full-text search for precise keyword matching.
Result fusion for combining the results from different retrieval methods. 
ML model reranking for advanced result refinement.

This pipeline operates entirely within Spanner, where the operational data is stored. By avoiding integration with separate search engines or vector databases, Spanner eliminates the need for multiple technical stacks, complex ETL pipelines, and intricate application logic for inter-system communication. This reduces architectural and operational overhead and avoids potential performance inefficiencies.
The diagram below illustrates a high-level overview of how these components work together in Spanner.

Combining the power of vector and full-text search
When a user searches for products on SpanMart, the system first uses the embedding model to convert the user query into a vector that captures its semantic meaning. Then, SpanMart can build two queries:

Approximate nearest neighbors (ANN) vector search:

```sql
SELECT id, description
FROM products @{FORCE_INDEX=products_by_embedding}
WHERE embedding IS NOT NULL
ORDER BY APPROX_COSINE_DISTANCE(embedding, @vector,
  OPTIONS=>JSON'{"num_leaves_to_search": 10}')
LIMIT 200;
```

Full-text search (FTS):

```sql
SELECT id, description
FROM products
WHERE SEARCH(description_tokens, @query)
ORDER BY SCORE(description_tokens, @query) DESC
LIMIT 200;
```

These two queries excel in different scenarios and complement each other. For instance, when a user searches for a specific product model number, such as “Supercar T-6468”, the full-text search query can accurately find the exact model, while the vector search query suggests similar items. Conversely, for more complex natural language queries, such as “gift for an 8-year-old who enjoys logical reasoning but not a toy”, full-text search may struggle to yield useful results, whereas vector search can provide a relevant list of recommendations. Combining both queries would produce robust results for both styles of searches.
Reciprocal rank fusion (RRF)
RRF is a simple yet effective technique for combining results from multiple search queries. It calculates a relevance score for each record based on its position in all result sets, rewarding records ranked highly in individual searches. This method is particularly useful when the relevance scores from the individual searches are calculated in different spaces, making them difficult to compare directly. RRF addresses this by focusing on the relative rankings instead of scores within each result set.
Here’s how RRF works in our example:

Calculate rank reciprocals: For each product, calculate its rank reciprocal in each result set by taking the inverse of its rank after adding a constant (e.g., 60). This constant prevents top-ranked products from dominating the final score and allows lower-ranked products to contribute meaningfully. For instance, a product ranked 5th in one result set would have a rank reciprocal of 1/(5 + 60) = 1/65 in that result set.

Sum rank reciprocals: Sum the rank reciprocals from all result sets to get the final RRF score of a product.

The formula for RRF is:

RRF(d) = Σ_{r ∈ R} 1 / (k + rank_r(d))

where:

d is a product description

R is the set of retrievers (in this case, the two search queries)

rank_r(d) is the rank of product description d in the results of retriever r

k is a constant
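The RRF computation can be sketched in a few lines of plain Python (toy result lists, k = 60 as in the example above):

```python
from collections import defaultdict

def rrf_scores(result_sets, k=60):
    """Fuse ranked result lists with reciprocal rank fusion.

    result_sets: list of ranked id lists, best match first.
    Returns ids sorted by descending RRF score.
    """
    scores = defaultdict(float)
    for results in result_sets:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

ann = ["p3", "p1", "p7"]   # hypothetical vector-search results
fts = ["p1", "p9", "p3"]   # hypothetical full-text results
print(rrf_scores([ann, fts]))  # ['p1', 'p3', 'p9', 'p7']
```

Note how p1, ranked well in both result sets, overtakes p3, which topped only one of them.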

Implementing RRF within Spanner’s SQL interface is relatively straightforward. Here’s how:

```sql
@{optimizer_version=7}
WITH ann AS (
  SELECT offset + 1 AS rank, id, description
  FROM UNNEST(ARRAY(
    SELECT AS STRUCT id, description
    FROM products @{FORCE_INDEX=products_by_embedding}
    WHERE embedding IS NOT NULL
    ORDER BY APPROX_COSINE_DISTANCE(embedding, @vector,
      OPTIONS=>JSON'{"num_leaves_to_search": 10}')
    LIMIT 200)) WITH OFFSET AS offset
),
fts AS (
  SELECT offset + 1 AS rank, id, description
  FROM UNNEST(ARRAY(
    SELECT AS STRUCT id, description
    FROM products
    WHERE SEARCH(description_tokens, @query)
    ORDER BY SCORE(description_tokens, @query) DESC
    LIMIT 200)) WITH OFFSET AS offset
)
SELECT SUM(1 / (60 + rank)) AS rrf_score, id, ANY_VALUE(description) AS description
FROM ((
  SELECT rank, id, description
  FROM ann
)
UNION ALL (
  SELECT rank, id, description
  FROM fts
))
GROUP BY id
ORDER BY rrf_score DESC
LIMIT 50;
```

Explanations:

Common table expressions (CTEs): These are the WITH clauses, which are used in this query to improve readability. However, due to a current limitation, they may cause the query optimizer to default to an older version that lacks full-text search support. For now, the query uses the @{optimizer_version=7} hint to suggest a more recent optimizer version. 

ANN CTE: This is the same as the previous ANN query, but with a twist. We assign a rank to each product in the results. While Spanner doesn’t support a direct way to assign ranks, there’s a workaround. By converting the results into an array of structs, we can use the offset of each element within the array as its rank. Since array offsets start at zero, we use offset + 1 to represent the actual rank. Note that this is purely a SQL language workaround without performance impact. The query planner effectively optimizes away the array conversion and directly assigns an offset to each row in the result set.

FTS CTE: Similarly, this part mirrors the previous full-text search query, with the rank assigned using the array offset.

Combining and ranking: The results from both CTEs are unioned, and grouped by the product id. For each product, we calculate the rrf_score and then select the top 50 products.

While RRF is an effective technique, Spanner’s versatile SQL interface empowers application developers to explore and implement various other result fusion methods. For instance, developers can normalize scores across different searches to a common range and then combine them using a weighted sum, assigning different importance to each search method. This flexibility allows for fine-grained control over the search experience and enables developers to tailor it to specific application requirements.
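As an illustration of that weighted-sum alternative, here is a minimal Python sketch (toy scores, hypothetical 0.6/0.4 weights) that min-max normalizes each result set into [0, 1] and then combines them:

```python
def min_max_normalize(scores):
    """Map raw scores into [0, 1]; assumes at least two distinct values."""
    lo, hi = min(scores.values()), max(scores.values())
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}

def weighted_fusion(ann_scores, fts_scores, w_ann=0.6, w_fts=0.4):
    """Combine normalized scores with a weighted sum; ids missing from
    one result set contribute 0 from that set."""
    ann_n = min_max_normalize(ann_scores)
    fts_n = min_max_normalize(fts_scores)
    ids = set(ann_n) | set(fts_n)
    return {i: w_ann * ann_n.get(i, 0.0) + w_fts * fts_n.get(i, 0.0)
            for i in ids}

# Hypothetical raw scores: cosine similarities vs. full-text relevance scores
# live on different scales, which is why normalization is needed first.
ann = {"p1": 0.92, "p2": 0.85, "p3": 0.70}
fts = {"p1": 12.0, "p4": 9.5, "p2": 3.0}
fused = weighted_fusion(ann, fts)
print(max(fused, key=fused.get))  # p1
```

Tuning the weights shifts the balance between semantic recall and keyword precision for a given application.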
Using an ML model to rerank search results 
ML model-based reranking is a powerful way of refining search results to surface the most relevant items for users. Because applying an advanced model to every candidate is computationally expensive, reranking runs only after the initial retrieval, using methods like vector search, full-text search, or their combination as discussed earlier, has narrowed the results down to a small set of promising candidates.
Spanner’s integration with Vertex AI makes it possible to perform ML model-based reranking directly within Spanner. You can use a model deployed to your Vertex AI endpoint, including those available from the Vertex AI Model Garden. Once the model is deployed, you can create a corresponding reranker MODEL in Spanner.

```sql
CREATE OR REPLACE MODEL reranker
INPUT (text STRING(MAX), text_pair STRING(MAX))
OUTPUT (score FLOAT32)
REMOTE
OPTIONS (endpoint = '//aiplatform.googleapis.com/projects/<project_id>/locations/<location>/endpoints/<endpoint_id>');
```

In this example, SpanMart employs a Cross-Encoder model for reranking. This model takes two text inputs –  text and text_pair – and outputs a relevance score indicating how well the two texts align. Unlike vector search, which uses an embedding model to independently map each text into a fixed-dimensional space before measuring their similarity, a Cross-Encoder directly evaluates the two texts together. This allows the Cross-Encoder to capture richer contextual and semantic nuances in complex queries, such as “gift for an 8-year-old who enjoys logical reasoning but not a toy”. In a more advanced setup, the reranker could leverage a custom-trained model that incorporates additional signals such as product reviews, promotions, and user-specific data like browsing and purchase history, to offer an even more comprehensive search experience.
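The reranking step itself is simple once a scorer is available. In the minimal Python sketch below, a toy word-overlap function stands in for the Cross-Encoder; a real deployment would obtain scores from the Vertex AI endpoint instead, and all names and data here are illustrative:

```python
def rerank(query, candidates, score_fn, top_n=10):
    """Rerank fused candidates with a (query, text) scoring function.

    score_fn plays the role of the Cross-Encoder: it sees the query and a
    candidate description together and returns a relevance score.
    """
    scored = [(score_fn(query, text), pid, text) for pid, text in candidates]
    scored.sort(key=lambda t: t[0], reverse=True)
    return [(pid, text) for _, pid, text in scored[:top_n]]

def toy_score(query, text):
    """Stand-in scorer: counts shared words between query and description."""
    return len(set(query.lower().split()) & set(text.lower().split()))

candidates = [("p1", "wooden puzzle for kids"),
              ("p2", "logic puzzle book for kids"),
              ("p3", "plush toy bear")]
print(rerank("logic puzzle for kids", candidates, toy_score, top_n=2))
# [('p2', 'logic puzzle book for kids'), ('p1', 'wooden puzzle for kids')]
```

Because the scorer sees both texts at once, even this toy version can reorder candidates that the initial retrieval ranked differently.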
Once this model is defined in Spanner, we can proceed to add reranking on the initial search results using the following query:

```sql
@{optimizer_version=7}
WITH ann AS (
  SELECT offset + 1 AS rank, id, description
  FROM UNNEST(ARRAY(
    SELECT AS STRUCT id, description
    FROM products @{FORCE_INDEX=products_by_embedding}
    WHERE embedding IS NOT NULL
    ORDER BY APPROX_COSINE_DISTANCE(embedding, @vector,
      OPTIONS=>JSON'{"num_leaves_to_search": 10}')
    LIMIT 200)) WITH OFFSET AS offset
),
fts AS (
  SELECT offset + 1 AS rank, id, description
  FROM UNNEST(ARRAY(
    SELECT AS STRUCT id, description
    FROM products
    WHERE SEARCH(description_tokens, @query)
    ORDER BY SCORE(description_tokens, @query) DESC
    LIMIT 200)) WITH OFFSET AS offset
),
rrf AS (
  SELECT SUM(1 / (60 + rank)) AS rrf_score,
    id,
    ANY_VALUE(description) AS description
  FROM ((
    SELECT rank, id, description
    FROM ann
  )
  UNION ALL (
    SELECT rank, id, description
    FROM fts
  ))
  GROUP BY id
  ORDER BY rrf_score DESC
  LIMIT 50
)
SELECT id, text AS description
FROM ML.PREDICT(MODEL reranker, (
  SELECT id, description AS text, @query AS text_pair
  FROM rrf
))
ORDER BY score DESC
LIMIT 10;
```

Explanations:

ANN, FTS and RRF CTEs: These are the same previously defined approximate nearest neighbors, full-text search and reciprocal rank fusion queries, respectively.

ML.PREDICT Ranking: This step applies the reranker model to each product description as text from the RRF results, along with the search query as text_pair. The model assigns a relevance score to each product. The products are then sorted by these scores, and the top 10 are selected.
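Putting the pieces together, the whole pipeline can be summarized in plain Python, with stub functions standing in for the two Spanner queries and the ML.PREDICT reranker (all names and toy data here are illustrative, not a Spanner client API):

```python
def hybrid_search(query_vec, query_text, ann_search, fts_search, rerank_fn,
                  k=60, fuse_limit=50, final_limit=10):
    """End-to-end sketch: retrieve, fuse with RRF, then rerank.

    ann_search / fts_search return ranked lists of (id, description);
    rerank_fn scores a (query_text, description) pair. All three stand in
    for the Spanner queries and the ML.PREDICT call shown above.
    """
    scores, texts = {}, {}
    for results in (ann_search(query_vec), fts_search(query_text)):
        for rank, (pid, desc) in enumerate(results, start=1):
            scores[pid] = scores.get(pid, 0.0) + 1.0 / (k + rank)
            texts[pid] = desc
    fused = sorted(scores, key=scores.get, reverse=True)[:fuse_limit]
    reranked = sorted(fused, key=lambda p: rerank_fn(query_text, texts[p]),
                      reverse=True)
    return [(p, texts[p]) for p in reranked[:final_limit]]

# Toy stand-ins for vector search, full-text search, and the reranker.
def ann_stub(vec):
    return [("p1", "logic puzzle"), ("p2", "plush bear")]

def fts_stub(text):
    return [("p2", "plush bear"), ("p3", "logic game")]

def overlap(query, text):
    return len(set(query.split()) & set(text.split()))

top = hybrid_search(None, "logic puzzle", ann_stub, fts_stub, overlap)
print(top[0])  # ('p1', 'logic puzzle')
```

In production the two retrieval stages and the reranker run inside a single Spanner SQL query, as shown above; this sketch only makes the data flow between the stages explicit.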

Get started
In this post, we demonstrated one approach to combine full-text search and vector search in Spanner, but developers are encouraged to explore other approaches, such as refining full-text search results with vector search, or combining multiple search results with customized fusion methods.
Learn more about Spanner and try it out today. For additional information, check out:

Spanner vector search documentation and codelab

Spanner full-text search overview

Spanner Vertex AI integration overview and tutorial on generating ML predictions using SQL

AI Summary and Description: Yes

Summary: The text discusses the evolving demands of search technologies powered by generative AI, highlighting the importance of delivering accurate and contextually relevant results across various applications. It introduces Google Spanner’s capabilities in integrating vector and full-text search methods, providing a comprehensive solution designed for the e-commerce sector and beyond. The insights presented are particularly beneficial for professionals navigating the intersection of AI technology, database management, and operational efficiencies.

Detailed Description:
The text outlines several key concepts and technologies relevant to the development of intelligent and effective search systems using Google Spanner. These include:

– **Evolving User Expectations**:
– Users now demand fast and accurate search results that meet diverse expectations across various sectors, such as e-commerce, legal, and healthcare.
– Complex queries require systems that can understand synonyms, context, and semantic meanings rather than relying solely on keyword matching.

– **Google Spanner Overview**:
– Spanner is a multi-model database designed to scale without sacrificing performance, offering a SQL interface for developers.
– It supports hybrid search capabilities, combining vector search and full-text search to optimize search results.

– **Design of a Tailored Search Engine**:
– An example (SpanMart) illustrates how to build a search engine that integrates different search methodologies, such as:
– **Vector Search**: Uses embeddings to capture semantic meanings (i.e., addresses the intricacy of language).
– **Full-Text Search**: Utilizes traditional keyword search methods to match queries directly.
– This dual approach enables Spanner to provide a more robust search experience by addressing the shortcomings of each method when used in isolation.

– **Search Pipeline Components**:
– The text delineates several architectural components for implementing search in Spanner:
– **Tokens and Embedding Columns**: To store tokenized descriptions and vector representations necessary for both search methodologies.
– **Result Fusion Techniques**: Discusses Reciprocal Rank Fusion (RRF) to combine search results effectively across different methods.
– **ML Model Reranking**: Leveraging machine learning models from Vertex AI for refining and ranking search results based on relevance.

– **Implementation Details**:
– Code snippets illustrate how to create the database schema, handle search queries, and integrate machine learning-based reranking, thus giving practical insights into workflows and potential architectures.

– **Future Exploration**:
– Encourages proactive experimentation with hybrid search solutions and advanced methods for integrating results and improving user experience.

This comprehensive approach allows security and compliance professionals to recognize the importance of maintaining effective security architectures intertwined with AI and data management strategies, ensuring that user data and queries remain secure while still meeting evolving consumer demands.