Simon Willison’s Weblog: Claude can now search the web

Source URL: https://simonwillison.net/2025/Mar/20/claude-can-now-search-the-web/#atom-everything
Source: Simon Willison’s Weblog
Title: Claude can now search the web

Feedly Summary: Claude can now search the web
Claude 3.7 Sonnet on the paid plan now has a web search tool that can be turned on as a global setting.
This was sorely needed. ChatGPT, Gemini and Grok all had this ability already, and despite Anthropic’s excellent model quality its absence was one of the big remaining reasons to keep other models in daily rotation.
Surprisingly there are no details on how it works under the hood. Is this a partnership with someone like Bing, or is it Anthropic’s own proprietary index populated by their own crawlers?
I think it may be their own infrastructure, but I’ve been unable to confirm that.
Their support site offers some inconclusive hints.
Does Anthropic crawl data from the web, and how can site owners block the crawler? talks about their ClaudeBot crawler but the language indicates it’s used for training data, with no mention of a web search index.
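Blocking the crawler presumably follows the usual robots.txt conventions. As a minimal sketch, assuming the user agent string is simply ClaudeBot and using example.com as a stand-in, here’s how a site owner could check whether their existing robots.txt already disallows it, using only Python’s standard library:

```python
from urllib.robotparser import RobotFileParser

# Fetch and parse the site's robots.txt (example.com is a placeholder).
rp = RobotFileParser("https://example.com/robots.txt")
rp.read()

# A site that wanted to block the crawler entirely would serve:
#   User-agent: ClaudeBot
#   Disallow: /
# can_fetch() reports whether the named user agent may request a given URL.
print(rp.can_fetch("ClaudeBot", "https://example.com/some-page/"))
```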
Blocking and Removing Content from Claude looks a little more relevant, and has a heading “Blocking or removing websites from Claude web search” which includes this eyebrow-raising tip:

Removing content from your site is the best way to ensure that it won’t appear in Claude outputs when Claude searches the web.

And then this bit, which does mention "our partners":

The noindex robots meta tag is a rule that tells our partners not to index your content so that they don’t send it to us in response to your web search query. Your content can still be linked to and visited through other web pages, or directly visited by users with a link, but the content will not appear in Claude outputs that use web search.
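
So the opt-out signal is the standard robots meta tag rather than anything Claude-specific. A minimal sketch of checking a page for it, again using only Python’s standard library and example.com as a stand-in:

```python
import urllib.request
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects the content of any <meta name="robots"> tags on a page."""
    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attrs = dict(attrs)
            if (attrs.get("name") or "").lower() == "robots":
                self.directives.append(attrs.get("content") or "")

# A page opted out of web search indexing would include something like:
#   <meta name="robots" content="noindex">
html = urllib.request.urlopen("https://example.com/").read().decode("utf-8", "replace")
parser = RobotsMetaParser()
parser.feed(html)
print(any("noindex" in d.lower() for d in parser.directives))
```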

Both of those documents were last updated "over a week ago", so it’s not clear to me if they reflect the new state of the world given today’s feature launch or not.
I got a delightful response trying out Claude search, where it mistook my recent Squadron automata for a software project.

Tags: anthropic, claude, generative-ai, llm-tool-use, ai, llms

AI Summary and Description: Yes

Summary: The introduction of a web search feature in Claude 3.7 Sonnet enhances its utility, aligning it more closely with competitors like ChatGPT, Gemini, and Grok. However, the lack of transparency about the underlying search infrastructure raises questions about data handling and compliance.

Detailed Description:
The text discusses Claude 3.7 Sonnet, an AI language model from Anthropic, which has introduced a web search capability. This feature is significant for several reasons related to AI and information security:

– **Enhanced Functionality**: The web search tool can now be activated as a global setting, increasing Claude’s competitiveness with existing models that already offered this capability.
– **Underlying Infrastructure**: There are no clear details on the technology that powers this search tool. The text speculates about whether Anthropic uses its own infrastructure or partners with an established search engine like Bing.
– **Data Crawler Insights**: The mention of ClaudeBot raises concerns about how Anthropic collects data, particularly in terms of compliance and ethical implications surrounding data usage.
– **Content Removal Strategy**: The advice on removing content from websites to prevent it from appearing in Claude outputs suggests a potential gap in content control and privacy measures that website owners may need to consider.
– **Regulatory Compliance**: The reference to the noindex robots meta tag highlights the importance of compliance in content indexing strategies, a key area for legal and governance frameworks related to digital content.

Overall, this development could have implications for privacy, information security, and compliance, especially considering how AI models interact with publicly available data and the responsibilities that come with it. Professionals in security and compliance must pay attention to how AI organizations handle data collection, user privacy, and transparency surrounding AI outputs.