Tag: data scraping
-
The Register: Cloudflare builds an AI to lead AI scraper bots into a horrible maze of junk content
Source URL: https://www.theregister.com/2025/03/21/cloudflare_ai_labyrinth/
Source: The Register
Title: Cloudflare builds an AI to lead AI scraper bots into a horrible maze of junk content
Feedly Summary: Slop-making machine will feed unauthorized scrapers what they so richly deserve, hopefully without poisoning the internet. Cloudflare has created a bot-busting AI to make life hell for AI crawlers.… AI…
-
Slashdot: AI Crawlers Haven’t Learned To Play Nice With Websites
Source URL: https://slashdot.org/story/25/03/19/1027251/ai-crawlers-havent-learned-to-play-nice-with-websites?utm_source=rss1.0mainlinkanon&utm_medium=feed
Source: Slashdot
Title: AI Crawlers Haven’t Learned To Play Nice With Websites
Feedly Summary:
AI Summary and Description: Yes
Summary: SourceHut is experiencing service disruptions due to aggressive web crawling by AI companies collecting data for training large language models (LLMs). They have implemented mitigations, including blocking certain cloud providers due to…
-
The Cloudflare Blog: Trapping misbehaving bots in an AI Labyrinth
Source URL: https://blog.cloudflare.com/ai-labyrinth/
Source: The Cloudflare Blog
Title: Trapping misbehaving bots in an AI Labyrinth
Feedly Summary: How Cloudflare uses generative AI to slow down, confuse, and waste the resources of AI Crawlers and other bots that don’t respect “no crawl” directives.
AI Summary and Description: Yes
Summary: The text introduces Cloudflare’s “AI Labyrinth,” an…
-
The Register: AI crawlers haven’t learned to play nice with websites
Source URL: https://www.theregister.com/2025/03/18/ai_crawlers_sourcehut/
Source: The Register
Title: AI crawlers haven’t learned to play nice with websites
Feedly Summary: SourceHut says it’s getting DDoSed by LLM bots. SourceHut, an open source git-hosting service, says web crawlers for AI companies are slowing down services through their excessive demands for data.…
AI Summary and Description: Yes
Summary: The…
-
Hacker News: AI haters build tarpits to trap and trick AI scrapers that ignore robots.txt
Source URL: https://arstechnica.com/tech-policy/2025/01/ai-haters-build-tarpits-to-trap-and-trick-ai-scrapers-that-ignore-robots-txt/
Source: Hacker News
Title: AI haters build tarpits to trap and trick AI scrapers that ignore robots.txt
Feedly Summary: Comments
AI Summary and Description: Yes
Summary: The text discusses the creation of a new piece of malware named Nepenthes, designed by a software developer to combat AI web crawlers that ignore “no scraping” directives…
-
Hacker News: Nepenthes is a tarpit to catch AI web crawlers
Source URL: https://zadzmo.org/code/nepenthes/
Source: Hacker News
Title: Nepenthes is a tarpit to catch AI web crawlers
Feedly Summary: Comments
AI Summary and Description: Yes
Summary: The text describes “Nepenthes,” a tarpit software devised to trap web crawlers, particularly those scraping data for large language models (LLMs). It offers unique functionalities and deployment setups, with explicit…