Source URL: https://simonwillison.net/2025/Sep/10/claude-web-fetch-tool/#atom-everything
Source: Simon Willison’s Weblog
Title: Claude API: Web fetch tool
Feedly Summary: Claude API: Web fetch tool
New in the Claude API: if you pass the web-fetch-2025-09-10 beta header you can add {“type": "web_fetch_20250910", "name": "web_fetch", "max_uses": 5} to your "tools" list and Claude will gain the ability to fetch content from URLs as part of responding to your prompt.
What’s particularly interesting here is their approach to safety for this feature:
Enabling the web fetch tool in environments where Claude processes untrusted input alongside sensitive data poses data exfiltration risks. We recommend only using this tool in trusted environments or when handling non-sensitive data.
To minimize exfiltration risks, Claude is not allowed to dynamically construct URLs. Claude can only fetch URLs that have been explicitly provided by the user or that come from previous web search or web fetch results. However, there is still residual risk that should be carefully considered when using this tool.
My first impression was that this looked like an interesting new twist on this kind of tool. Prompt injection exfiltration attacks are a risk with something like this because malicious instructions that sneak into the context might cause the LLM to send private data off to an arbitrary attacker’s URL, as described by the lethal trifecta. But what if you could enforce, in the LLM harness itself, that only URLs from user prompts could be accessed in this way?
Unfortunately this isn’t quite that smart. From later in that document:
For security reasons, the web fetch tool can only fetch URLs that have previously appeared in the conversation context. This includes:
URLs in user messages
URLs in client-side tool results
URLs from previous web search or web fetch results
The tool cannot fetch arbitrary URLs that Claude generates or URLs from container-based server tools (Code Execution, Bash, etc.).
Note that URLs in "user messages" are obeyed. That’s a problem, because in many prompt-injection vulnerable applications it’s those user messages (the JSON in the {"role": "user", "content": "…"} block) that often have untrusted content concatenated into them – or sometimes in the client-side tool results which are also allowed by this system!
Anthropic do provide a much stronger mechanism here: you can allow-list domains using the "allowed_domains": ["docs.example.com"] parameter.
Provided you use allowed_domains and restrict them to domains which absolutely cannot be used for exfiltrating data (which turns out to be a tricky proposition) it should be possible to safely build some really neat things on top of this new tool.
Tags: apis, security, ai, prompt-injection, generative-ai, llms, claude, exfiltration-attacks, llm-tool-use, lethal-trifecta
AI Summary and Description: Yes
Summary: The introduction of the web-fetch tool in the Claude API illustrates a significant advancement in AI capabilities, particularly for accessing external information. However, this feature raises substantial security concerns regarding data exfiltration and prompt injection risks, necessitating the use of strict safeguards to mitigate potential threats.
Detailed Description: The Claude API’s new web-fetch tool allows users to fetch content from URLs as part of AI responses, which introduces both opportunities and risks.
– **Feature Overview**:
– The web-fetch tool can be activated using a specific beta header, allowing the inclusion of structured parameters that define its function.
– This capability allows Claude to provide more contextual and up-to-date information directly from the web within its conversations with users.
– **Security Considerations**:
– **Data Exfiltration Risks**: Enabling this tool in environments with untrusted inputs or sensitive data poses serious risks. Malicious actors could exploit vulnerabilities to extract private information by tricking the model into sending it to an attacker-controlled URL.
– **Controlled URL Fetching**: To mitigate these risks, the web-fetch tool can only retrieve URLs that are explicitly provided by users, derived from the conversation context, or obtained through previous outputs from the web-search or web-fetch functionalities.
– **Prompt Injection Concern**: There is a heightened risk of prompt injection attacks where attackers could embed malicious instructions in user messages. Since user inputs are respected by the fetching mechanism, this vulnerability needs careful monitoring.
– **Recommended Safeguards**:
– While Claude does restrict fetching arbitrary URLs, its reliance on potentially unsafe user-generated content remains a concern. Anthropic provides a stronger security feature through the allow-listing of domains, meaning only specified, safe domains can be accessed by the tool.
– Implementing allow-lists effectively can help in constructing secure applications on top of the web-fetch functionality, but identifying truly safe domains poses significant challenges.
– **Implications**:
– Security and compliance professionals should carefully evaluate the integration of such features within their systems. This includes assessing the level of control they have over allowed URLs and the associated risk of prompt injections.
– Organizations must devise strategies to ensure the safe usage of advanced AI tools while maintaining data privacy and security prerequisites.
In conclusion, while the web-fetch tool in the Claude API opens up new avenues for AI, it necessitates diligent attention to security frameworks to avert potential threats related to data integrity and confidentiality.