The Cloudflare Blog: From Googlebot to GPTBot: who’s crawling your site in 2025

Source URL: https://blog.cloudflare.com/from-googlebot-to-gptbot-whos-crawling-your-site-in-2025/
Source: The Cloudflare Blog
Title: From Googlebot to GPTBot: who’s crawling your site in 2025

Feedly Summary: From May 2024 to May 2025, crawler traffic rose 18%, with GPTBot growing 305% and Googlebot 96%.

AI Summary and Description: Yes

Summary: The text discusses the evolution of web crawlers, focusing on the rise of AI crawlers that collect data for training large language models (LLMs). It highlights trends in the crawling landscape between May 2024 and May 2025, including the dominance of crawlers from OpenAI and Google and growing concerns over content rights and data privacy. The analysis is relevant to professionals in AI, cloud, and information security because AI crawlers affect both web infrastructure and compliance.

Detailed Description: The content articulates the transformative impact of AI on web crawling, detailing how traditional web crawlers have adapted to new challenges and opportunities in the era of AI and LLMs. Key points include:

– **Historical Context**: Web crawlers have indexed the web and supported search engines since the early 1990s, which makes them a long-standing part of internet infrastructure.

– **Rise of AI Crawlers**: A new category of crawlers, termed AI crawlers, has emerged, designed to collect data for AI model training. These have raised concerns over content rights and the unauthorized use of website data.

– **Trends and Statistics**:
  – Combined AI and search crawler traffic increased by 18% from May 2024 to May 2025.
  – Googlebot traffic grew 96% over the same period, reflecting increased crawling alongside new AI features launched by Google.
  – Notable shifts in the AI crawler space include:
    – GPTBot from OpenAI rose from a 2.2% share of crawler traffic to 7.7%, with its request volume up 305%.
    – ClaudeBot from Anthropic fell from an 11.7% share to 5.4%.

– **Webmasters’ Responses**: Some site owners use robots.txt files and firewall rules to control access by AI crawlers. Because robots.txt is a passive measure that crawlers can choose to ignore, there is a growing trend toward enforceable protections.

– **Future Implications**: As AI crawlers reshape the landscape of web content access, websites must find a balance between leveraging visibility through AI and protecting their content rights.
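
The share figures in the Trends and Statistics item mix two kinds of numbers: a crawler's share of traffic and the growth in its own request volume. The sketch below shows how they can relate. It assumes, as a simplification not stated in the summary, that both GPTBot shares are fractions of the same total that grew 18%; the function names are illustrative.

```python
def relative_change(before: float, after: float) -> float:
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100.0


def implied_request_growth(share_before: float, share_after: float,
                           total_growth_pct: float) -> float:
    """Request growth implied by a share change, ASSUMING both shares are
    fractions of one total that grew by `total_growth_pct` percent."""
    scale = 1.0 + total_growth_pct / 100.0
    return (share_after * scale / share_before - 1.0) * 100.0


# Shares (percent) quoted in the summary, May 2024 -> May 2025.
shares = {
    "GPTBot": (2.2, 7.7),
    "ClaudeBot": (11.7, 5.4),
}
TOTAL_GROWTH_PCT = 18.0  # overall crawler traffic growth quoted in the summary

for bot, (before, after) in shares.items():
    print(
        f"{bot}: {after - before:+.1f} points of share, "
        f"{relative_change(before, after):+.1f}% relative share change, "
        f"{implied_request_growth(before, after, TOTAL_GROWTH_PCT):+.1f}% implied request growth"
    )
```

Under that assumption, GPTBot's share grew 250% in relative terms and its implied request growth is about 313%, which is close to the reported 305%; the gap is plausibly due to rounding in the published shares. The point is that a share can shrink, as ClaudeBot's did, without telling you directly how the crawler's absolute request count changed.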

This analysis points to a major shift in how data is collected and controlled, driven by AI advancements, and to an increasingly complex relationship between webmasters and AI technology providers. Security and compliance professionals will need to consider new strategies for data governance and rights management in this changing environment.
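
As a starting point for the robots.txt approach mentioned above, a site owner can check what a given policy actually permits before deploying it. The following is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content, the `example.com` URLs, and the `allowed` helper are hypothetical, and the user-agent tokens are taken from the crawler names in the summary.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: the rules and paths are illustrative only.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /private/

User-agent: *
Allow: /
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `robots_txt` permits `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse in memory, no network access
    return parser.can_fetch(user_agent, url)


for bot in ("GPTBot", "ClaudeBot", "Googlebot"):
    for path in ("/", "/private/report.html"):
        url = "https://example.com" + path
        print(f"{bot:10} {path:22} allowed={allowed(ROBOTS_TXT, bot, url)}")
```

This only reports what the file asks crawlers to do. Whether a crawler honors it is up to the crawler, which is why the summary points to firewall rules and other enforceable protections as a complement.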