Cloud Blog: Boost your Search and RAG agents with Vertex AI’s new state-of-the-art Ranking API

Source URL: https://cloud.google.com/blog/products/ai-machine-learning/launching-our-new-state-of-the-art-vertex-ai-ranking-api/
Source: Cloud Blog
Title: Boost your Search and RAG agents with Vertex AI’s new state-of-the-art Ranking API

Feedly Summary: The AI era has supercharged expectations: users now issue more complex queries and demand pinpoint results, meaning there’s an 82% chance of losing a customer if they can’t quickly find what they need. Similarly, AI agents require ultra-relevant context for reliable task execution. However, when traditional search methods deliver noise – with generally up to 70% of retrieved passages lacking a true answer – both agentic workflows and user experiences suffer from untrustworthy and unreliable results.
To help businesses meet these rising expectations, we’re launching our new state-of-the-art Vertex AI Ranking API. It makes it easy to boost the precision of information surfaced within search, agentic workflows, and retrieval-augmented generation (RAG) systems. This means you can elevate your legacy search system and AI application in minutes, not months.

Go beyond simple retrieval
First-stage retrieval often returns a noisy candidate set, which is where precise ranking becomes essential. Think of the Vertex AI Ranking API as the precision filter at the crucial final stage of your retrieval pipeline. It sifts through the initial candidate set, identifying and elevating only the most pertinent information. This refinement step is key to unlocking higher quality, more trustworthy, and more efficient AI applications.
Vertex AI Ranking API acts as this powerful, yet easy-to-integrate, refinement layer. It takes the candidate list from your existing search or retrieval system and re-orders it based on deep semantic understanding, ensuring the best results rise to the top. Here’s how it helps you uplevel your systems:

Upgrade legacy search systems: Easily add state-of-the-art relevance scoring to existing search outputs, improving user satisfaction and business outcomes on commercial searches without overhauling your current stack.

Strengthen RAG systems: Send fewer, more relevant documents to your generative models. This improves answer trustworthiness while reducing latency and operating costs by optimizing context window usage.

Support intelligent agents: Guide AI agents with highly relevant information, streamlining their context and traces, and significantly improving the success rate of task completion.

Figure 1: Ranking API usage in a typical search and retrieval flow
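
The retrieve-then-rerank flow described above can be sketched in a few lines. This is a minimal illustration, not the Ranking API itself: `Candidate`, `rerank`, and `toy_overlap_score` are hypothetical names, and the word-overlap scorer is only a placeholder for the call to a semantic reranker.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Candidate:
    doc_id: str
    text: str


def rerank(
    query: str,
    candidates: List[Candidate],
    score_fn: Callable[[str, str], float],
    top_n: int,
) -> List[Tuple[Candidate, float]]:
    """Score every candidate against the query and keep the top_n best."""
    scored = [(c, score_fn(query, c.text)) for c in candidates]
    # list.sort is stable, so ties keep the first-stage retrieval order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def toy_overlap_score(query: str, text: str) -> float:
    """Placeholder scorer: fraction of query words that appear in the text.

    A real pipeline would call a semantic reranker here instead.
    """
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


# Candidates as they might come back from an existing search system.
candidates = [
    Candidate("a", "Shipping times for international orders"),
    Candidate("b", "How to reset your account password"),
    Candidate("c", "Password rules and account security tips"),
]
top = rerank("reset account password", candidates, toy_overlap_score, top_n=2)
print([c.doc_id for c, _ in top])  # ['b', 'c']
```

Passing only the `top_n` survivors to a generative model is what keeps the context window small in a RAG setup.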

What’s new in Ranking API
Today, we’re launching our new semantic reranker models:

semantic-ranker-default-004 – our most accurate model for any use case
semantic-ranker-fast-004 – our fastest model for latency-critical use cases

Our models establish a new benchmark for ranking performance:

State-of-the-art ranking: Based on evaluations using the industry-standard BEIR dataset, our model leads in accuracy among competitive standalone reranking API services. Accuracy is measured with nDCG (normalized discounted cumulative gain), a metric that evaluates a ranking system by assessing how well the order of ranked items aligns with their actual relevance, rewarding systems that place relevant results at the top. We’ve published our evaluation scripts to ensure reproducibility of results.
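
To make the metric concrete, here is a minimal sketch of nDCG@k using one common formulation (linear gain with a log2 position discount). The function names are illustrative, and the published evaluation scripts may use a different variant or tooling.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k positions (linear gain)."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(ranked_relevances: Sequence[float], k: int) -> float:
    """DCG of the given order divided by the DCG of the ideal order.

    ranked_relevances holds the graded relevance of each judged item,
    listed in the order the ranker returned them.
    """
    ideal = sorted(ranked_relevances, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    if ideal_dcg == 0:
        return 0.0  # no relevant items at all
    return dcg_at_k(ranked_relevances, k) / ideal_dcg


print(ndcg_at_k([3, 2, 1, 0, 0], k=5))  # 1.0, already in ideal order
print(ndcg_at_k([0, 0, 3], k=3))        # 0.5, the only relevant item is ranked last
```

The second call shows why position matters: the same relevant document earns half the score when it sits in third place instead of first.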

Figure 2: semantic-ranker-default-004 leads in nDCG@5 on BEIR datasets compared to other rankers.

Industry-leading low latency: Our default model (semantic-ranker-default-004) is at least 2x faster than competitive reranking API services at any scale. Our fast model (semantic-ranker-fast-004) is tuned for latency-critical applications and typically exhibits 3x lower latency than our default model.

We’re also launching long context ranking with a limit of 200k total tokens per API request. Providing longer documents to the Ranking API allows it to better understand nuanced relationships between queries and information, such as customer reviews or product specifications in retail.
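
A client may want to check a candidate list against the 200k-token limit before sending it. The sketch below is a rough illustration under loud assumptions: `estimate_tokens` counts whitespace-separated words as a stand-in for the service's tokenizer, and counting the query once against the budget is a guess about how the limit is applied.

```python
from typing import Callable, List

MAX_TOTAL_TOKENS = 200_000  # per-request limit stated above


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def trim_to_budget(
    query: str,
    documents: List[str],
    limit: int = MAX_TOTAL_TOKENS,
    count_tokens: Callable[[str], int] = estimate_tokens,
) -> List[str]:
    """Keep documents in their given order until the next one would exceed the budget."""
    remaining = limit - count_tokens(query)  # assumption: the query counts once
    kept: List[str] = []
    for doc in documents:
        cost = count_tokens(doc)
        if cost > remaining:
            break
        kept.append(doc)
        remaining -= cost
    return kept


docs = ["a b c", "d e", "f g h i"]
print(trim_to_budget("q1 q2", docs, limit=8))  # ['a b c', 'd e']
```

Because the input is kept in first-stage order, the documents dropped are the ones the retriever already considered least promising.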

Real-world impact across domains
The benefits aren’t just theoretical. Benchmarks on industry-specific datasets demonstrate that integrating the Ranking API can significantly boost the quality of search results across diverse high-value domains such as retail, news, finance, and healthcare.

Figure 3: nDCG@5 performance improvement with semantic-ranker-default-004 in various high-value domains based on internal datasets. Lexical & Semantic search baseline uses the best result of Vertex AI text-embedding-004 and BM25 based retrieval.

Elevate your search results in minutes
We designed the Vertex AI Ranking API for seamless integration. Adding this powerful relevance layer is straightforward, with several options:

Try it live: Experience the difference on real-world data by enabling our Ranking API in the interactive Vertex Vector Search demo (link)  

Build with Vertex AI: Integrate directly into any existing system for maximum flexibility (link)

Enable it in RAG Engine: Select Ranking API in your RAG Engine to get more robust and accurate answers from your generative AI applications (link) 

Use it in AlloyDB: For a truly streamlined experience, leverage the built-in ai.rank() SQL function directly within AlloyDB – a novel integration simplifying search use cases with AlloyDB (link)

Use it with AI frameworks: Take advantage of our native integrations with popular AI frameworks like Genkit and LangChain (link)

Use it in Elasticsearch: Quickly boost accuracy with our built-in Ranking API integration in Elasticsearch (link)
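
For the direct integration path, a request might be assembled as follows. This is a hedged sketch: the URL pattern and the JSON field names (`model`, `query`, `topN`, `records` with `id`, `title`, `content`) reflect the Discovery Engine `rankingConfigs:rank` REST method as we understand it and should be verified against the current API reference. `build_rank_request` is a hypothetical helper, and the project ID is a placeholder.

```python
import json
from typing import Dict, List, Tuple

API_HOST = "https://discoveryengine.googleapis.com"


def build_rank_request(
    project_id: str,
    query: str,
    records: List[Dict[str, str]],
    model: str = "semantic-ranker-default-004",
    top_n: int = 10,
) -> Tuple[str, str]:
    """Return the (url, json_body) pair for a rank call. Nothing is sent."""
    url = (
        f"{API_HOST}/v1/projects/{project_id}/locations/global"
        "/rankingConfigs/default_ranking_config:rank"
    )
    body = {
        "model": model,      # or semantic-ranker-fast-004 for lower latency
        "query": query,
        "topN": top_n,       # assumed field: how many ranked records to return
        "records": records,  # candidates from the existing retrieval system
    }
    return url, json.dumps(body)


url, body = build_rank_request(
    project_id="my-project",  # placeholder
    query="reset account password",
    records=[
        {"id": "b", "title": "Password help", "content": "How to reset your account password"},
        {"id": "a", "title": "Shipping", "content": "Shipping times for international orders"},
    ],
    top_n=1,
)
print(url)
```

The body would then be sent as an authenticated POST (for example with an OAuth access token in the `Authorization` header), and the response used to re-order the candidates.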

AI Summary and Description: Yes

**Summary:** The text introduces the Vertex AI Ranking API, which enhances search precision and retrieval operations within AI applications. This tool addresses the challenges of user expectations for accurate results and provides businesses with an advanced mechanism to uplift their existing systems, thus potentially increasing customer satisfaction and operational efficiency.

**Detailed Description:**

The Vertex AI Ranking API is a new offering aimed at improving the precision of information retrieval within AI systems. By addressing the common issues of irrelevant results that plague traditional search methods, this API enhances both user experiences and the reliability of AI agents. Here are the major points of relevance:

– **Market Need:**
  – Users have increased expectations for accuracy in search results, with a high likelihood of losing customers if they cannot find relevant information quickly.
  – Traditional search results often contain noise, leading to trust issues in AI outputs.

– **API Functionality:**
  – **Precision Enhancer:** The API serves as a “precision filter,” refining the candidate set of results to ensure that only the most pertinent information is surfaced.
  – **Integration Ease:** It can be easily integrated into legacy systems to enhance relevance scoring without the need for extensive revisions to existing infrastructures.

– **Key Benefits:**
  – **Upgrade Legacy Systems:** Businesses can improve outcomes on commercial searches by enhancing the relevance of existing search outputs.
  – **Support for RAG Systems:** Fewer, more relevant documents are sent to generative models, enhancing answer quality while reducing latency and cost.
  – **Intelligent Agents:** The API supplies AI agents with highly relevant context, leading to higher task completion success rates.

– **Performance Metrics:**
  – Two new semantic reranker models have been introduced: semantic-ranker-default-004 for accuracy and semantic-ranker-fast-004 for latency-critical use cases.
  – The default model leads competitive reranking API services in nDCG@5 on BEIR datasets.
  – The default model is reported to be at least 2x faster than competitive reranking services, and the fast model typically has 3x lower latency than the default model.

– **Real-World Application:**
  – The Ranking API has shown significant improvements in high-value domains such as retail, news, finance, and healthcare, demonstrating its versatility and effectiveness across various industries.

– **Integration Options:**
  – Users can try the API in a live demo or integrate it directly into existing systems, with built-in options for RAG Engine, AlloyDB, Elasticsearch, and popular AI frameworks.

By providing this advanced ranking capability, the Vertex AI Ranking API represents a significant step forward for organizations seeking to enhance their AI functionalities and address the challenges posed by user expectations in the digital landscape. This offering is particularly relevant for professionals involved in AI, infrastructure, and cloud computing security, as it contributes to better data handling and AI application reliability.