Tag: web scraping
-
Hacker News: FOSS infrastructure is under attack by AI companies
Source URL: https://thelibre.news/foss-infrastructure-is-under-attack-by-ai-companies/ Source: Hacker News Title: FOSS infrastructure is under attack by AI companies Feedly Summary: Comments AI Summary and Description: Yes Summary: The text discusses recent disruptions faced by open-source projects due to aggressive AI crawlers that disregard robots.txt protocols, leading to significant operations challenges and increased workloads for system administrators. It highlights…
-
Slashdot: AI Crawlers Haven’t Learned To Play Nice With Websites
Source URL: https://slashdot.org/story/25/03/19/1027251/ai-crawlers-havent-learned-to-play-nice-with-websites?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: AI Crawlers Haven’t Learned To Play Nice With Websites Feedly Summary: AI Summary and Description: Yes Summary: SourceHut is experiencing service disruptions due to aggressive web crawling by AI companies collecting data for training large language models (LLMs). They have implemented mitigations, including blocking certain cloud providers due to…
-
The Register: We did not have Brave clashing with Rupert Murdoch on our 2025 bingo card, but there it is
Source URL: https://www.theregister.com/2025/03/13/brave_news_corp_content/ Source: The Register Title: We did not have Brave clashing with Rupert Murdoch on our 2025 bingo card, but there it is Feedly Summary: Indie browser maker asks judge for legal shield against copyright threats over AI summaries Brave has gone to court to head off potential legal action from News Corp…
-
Hacker News: AI haters build tarpits to trap and trick AI scrapers that ignore robots.txt
Source URL: https://arstechnica.com/tech-policy/2025/01/ai-haters-build-tarpits-to-trap-and-trick-ai-scrapers-that-ignore-robots-txt/ Source: Hacker News Title: AI haters build tarpits to trap and trick AI scrapers that ignore robots.txt Feedly Summary: Comments AI Summary and Description: Yes Summary: The text discusses the creation of a new malware named Nepenthes, designed by a software developer to combat AI web crawlers that ignore “no scraping” directives…