Tag: robots.txt

  • Hacker News: Please stop externalizing your costs directly into my face

    Source URL: https://drewdevault.com/2025/03/17/2025-03-17-Stop-externalizing-your-costs-on-me.html
    Source: Hacker News
    Title: Please stop externalizing your costs directly into my face
    AI Summary and Description: Yes
    Summary: The text reflects a sysadmin’s frustration with the disruptive impact of LLM crawlers on operational stability. It discusses ongoing battles against the misuse of computing resources by malicious bots, underscoring…

  • The Register: AI crawlers haven’t learned to play nice with websites

    Source URL: https://www.theregister.com/2025/03/18/ai_crawlers_sourcehut/
    Source: The Register
    Title: AI crawlers haven’t learned to play nice with websites
    Feedly Summary: SourceHut says it’s getting DDoSed by LLM bots. SourceHut, an open source git-hosting service, says web crawlers for AI companies are slowing down services through their excessive demands for data.…
    AI Summary and Description: Yes
    Summary: The…

  • Slashdot: BlueSky Proposes ‘New Standard’ for When Scraping Data for AI Training

    Source URL: https://tech.slashdot.org/story/25/03/17/0434237/bluesky-proposes-new-standard-for-when-scraping-data-for-ai-training?utm_source=rss1.0mainlinkanon&utm_medium=feed
    Source: Slashdot
    Title: BlueSky Proposes ‘New Standard’ for When Scraping Data for AI Training
    AI Summary and Description: Yes
    Summary: The article discusses Bluesky’s proposal for user data consent regarding scraping for generative AI training and archiving. This initiative signifies a potential shift in how user data privacy is managed…

  • The Cloudflare Blog: No hallucinations here: track the latest AI trends with expanded insights on Cloudflare Radar

    Source URL: https://blog.cloudflare.com/expanded-ai-insights-on-cloudflare-radar/
    Source: The Cloudflare Blog
    Title: No hallucinations here: track the latest AI trends with expanded insights on Cloudflare Radar
    Feedly Summary: Today, we are launching a new dedicated “AI Insights” page on Cloudflare Radar that incorporates this graph and builds on it with additional metrics.
    AI Summary and Description: Yes **Short Summary…

  • Hacker News: AI haters build tarpits to trap and trick AI scrapers that ignore robots.txt

    Source URL: https://arstechnica.com/tech-policy/2025/01/ai-haters-build-tarpits-to-trap-and-trick-ai-scrapers-that-ignore-robots-txt/
    Source: Hacker News
    Title: AI haters build tarpits to trap and trick AI scrapers that ignore robots.txt
    AI Summary and Description: Yes
    Summary: The text discusses the creation of a new malware named Nepenthes, designed by a software developer to combat AI web crawlers that ignore “no scraping” directives…

  • Slashdot: OpenAI’s Bot Crushes Seven-Person Company’s Website ‘Like a DDoS Attack’

    Source URL: https://tech.slashdot.org/story/25/01/11/0449242/openais-bot-crushes-seven-person-companys-website-like-a-ddos-attack?utm_source=rss1.0mainlinkanon&utm_medium=feed
    Source: Slashdot
    Title: OpenAI’s Bot Crushes Seven-Person Company’s Website ‘Like a DDoS Attack’
    AI Summary and Description: Yes
    Summary: The incident highlights serious implications for both security and compliance, showcasing how AI bots can unintentionally cause significant disruptions to online businesses through excessive data scraping. The lack of a properly…

  • Hacker News: OpenAI’s bot crushed this seven-person company’s web site ‘like a DDoS attack’

    Source URL: https://techcrunch.com/2025/01/10/how-openais-bot-crushed-this-seven-person-companys-web-site-like-a-ddos-attack/
    Source: Hacker News
    Title: OpenAI’s bot crushed this seven-person company’s web site ‘like a DDoS attack’
    AI Summary and Description: Yes
    Summary: The text highlights a significant incident involving Triplegangers’ CEO Oleksandr Tomchuk, whose e-commerce site was subjected to aggressive scraping by OpenAI’s bot, leading to operational disruptions and…

  • Hacker News: The Rise of the AI Crawler

    Source URL: https://vercel.com/blog/the-rise-of-the-ai-crawler
    Source: Hacker News
    Title: The Rise of the AI Crawler
    AI Summary and Description: Yes
    Summary: The text analyzes traffic and behaviors of AI crawlers such as OpenAI’s GPTBot and Anthropic’s Claude, revealing their significant presence and operation patterns on the web. Insights include their JavaScript rendering limitations, content…

  • The Cloudflare Blog: Robotcop: enforcing your robots.txt policies and stopping bots before they reach your website

    Source URL: https://blog.cloudflare.com/ai-audit-enforcing-robots-txt
    Source: The Cloudflare Blog
    Title: Robotcop: enforcing your robots.txt policies and stopping bots before they reach your website
    Feedly Summary: Today, the AI Audit dashboard gets an upgrade: you can now quickly see which AI services are honoring your robots.txt policies and then automatically enforce the policies against those that aren’t.
    AI…