Tag: scraping

  • Hacker News: AI haters build tarpits to trap and trick AI scrapers that ignore robots.txt

    Source URL: https://arstechnica.com/tech-policy/2025/01/ai-haters-build-tarpits-to-trap-and-trick-ai-scrapers-that-ignore-robots-txt/ Source: Hacker News Title: AI haters build tarpits to trap and trick AI scrapers that ignore robots.txt Feedly Summary: Comments AI Summary and Description: Yes Summary: The text discusses the creation of a new malware named Nepenthes, designed by a software developer to combat AI web crawlers that ignore “no scraping” directives…

  • Hacker News: Texas Is Enforcing Its State Data Privacy Law. So Should Other States

    Source URL: https://www.eff.org/deeplinks/2025/01/texas-enforcing-its-state-data-privacy-law-so-should-other-states Source: Hacker News Title: Texas Is Enforcing Its State Data Privacy Law. So Should Other States Feedly Summary: Comments AI Summary and Description: Yes Summary: The provided text discusses the Texas Attorney General’s lawsuit against Allstate Corporation for alleged violations of the Texas Data Privacy and Security Act (TDPSA), specifically regarding the…

  • Slashdot: Developer Creates Infinite Maze That Traps AI Training Bots

    Source URL: https://slashdot.org/story/25/01/23/2135205/developer-creates-infinite-maze-that-traps-ai-training-bots?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: Developer Creates Infinite Maze That Traps AI Training Bots Feedly Summary: AI Summary and Description: Yes Summary: The text discusses the development of an open-source program called Nepenthes, designed to trap AI web crawlers in an endless loop of link generation, effectively wasting their resources. This innovative approach provides…

  • Hacker News: Thoughts on a Month with Devin

    Source URL: https://www.answer.ai/posts/2025-01-08-devin.html Source: Hacker News Title: Thoughts on a Month with Devin Feedly Summary: Comments AI Summary and Description: Yes **Summary:** The text provides an in-depth analysis of an AI-driven programming assistant named Devin, highlighting both its potential and failures in software development tasks. The initial successes in API interactions and documentation are contrasted…

  • Hacker News: Nepenthes is a tarpit to catch AI web crawlers

    Source URL: https://zadzmo.org/code/nepenthes/ Source: Hacker News Title: Nepenthes is a tarpit to catch AI web crawlers Feedly Summary: Comments AI Summary and Description: Yes Summary: The text describes “Nepenthes,” a tarpit software devised to trap web crawlers, particularly those scraping data for large language models (LLMs). It offers unique functionalities and deployment setups, with explicit…

  • Slashdot: OpenAI’s Bot Crushes Seven-Person Company’s Website ‘Like a DDoS Attack’

    Source URL: https://tech.slashdot.org/story/25/01/11/0449242/openais-bot-crushes-seven-person-companys-website-like-a-ddos-attack?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: OpenAI’s Bot Crushes Seven-Person Company’s Website ‘Like a DDoS Attack’ Feedly Summary: AI Summary and Description: Yes Summary: The incident highlights serious implications for both security and compliance, showcasing how AI bots can unintentionally cause significant disruptions to online businesses through excessive data scraping. The lack of a properly…

  • Hacker News: OpenAI’s bot crushed this seven-person company’s web site ‘like a DDoS attack’

    Source URL: https://techcrunch.com/2025/01/10/how-openais-bot-crushed-this-seven-person-companys-web-site-like-a-ddos-attack/ Source: Hacker News Title: OpenAI’s bot crushed this seven-person company’s web site ‘like a DDoS attack’ Feedly Summary: Comments AI Summary and Description: Yes Summary: The text highlights a significant incident involving Triplegangers’ CEO Oleksandr Tomchuk, whose e-commerce site was subjected to aggressive scraping by OpenAI’s bot, leading to operational disruptions and…

  • Wired: Meta Secretly Trained Its AI on a Notorious Piracy Database, Newly Unredacted Court Docs Reveal

    Source URL: https://www.wired.com/story/new-documents-unredacted-meta-copyright-ai-lawsuit/ Source: Wired Title: Meta Secretly Trained Its AI on a Notorious Piracy Database, Newly Unredacted Court Docs Reveal Feedly Summary: One of the most important AI copyright legal battles just took a major turn. AI Summary and Description: Yes Summary: Meta has faced a significant legal setback regarding its training practices for…

  • Krebs on Security: Web Hacking Service ‘Araneida’ Tied to Turkish IT Firm

    Source URL: https://krebsonsecurity.com/2024/12/web-hacking-service-araneida-tied-to-turkish-it-firm/ Source: Krebs on Security Title: Web Hacking Service ‘Araneida’ Tied to Turkish IT Firm Feedly Summary: Cybercriminals are selling hundreds of thousands of credential sets stolen with the help of a cracked version of Acunetix, a powerful commercial web app vulnerability scanner, new research finds. The cracked software is being resold as…

  • Slashdot: YouTube Is Letting Creators Opt Into Third-Party AI Training

    Source URL: https://news.slashdot.org/story/24/12/16/2144228/youtube-is-letting-creators-opt-into-third-party-ai-training?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: YouTube Is Letting Creators Opt Into Third-Party AI Training Feedly Summary: AI Summary and Description: Yes Summary: YouTube is enabling creators to allow third-party companies to use their videos for AI model training, with an opt-out default setting, aiming to enhance creator value in the AI landscape. This initiative…