web crawler – Page 3 – Experimental News Clipping Site

Hacker News: AI haters build tarpits to trap and trick AI scrapers that ignore robots.txt

Jan 28, 2025

Source URL: https://arstechnica.com/tech-policy/2025/01/ai-haters-build-tarpits-to-trap-and-trick-ai-scrapers-that-ignore-robots-txt/
Source: Hacker News
Title: AI haters build tarpits to trap and trick AI scrapers that ignore robots.txt
Feedly Summary: Comments
AI Summary and Description: Yes
Summary: The text discusses the creation of a new malware named Nepenthes, designed by a software developer to combat AI web crawlers that ignore “no scraping” directives…

Slashdot: Developer Creates Infinite Maze That Traps AI Training Bots

Jan 23, 2025, by Kurt Seifried in Uncategorized

Source URL: https://slashdot.org/story/25/01/23/2135205/developer-creates-infinite-maze-that-traps-ai-training-bots?utm_source=rss1.0mainlinkanon&utm_medium=feed
Source: Slashdot
Title: Developer Creates Infinite Maze That Traps AI Training Bots
Feedly Summary:
AI Summary and Description: Yes
Summary: The text discusses the development of an open-source program called Nepenthes, designed to trap AI web crawlers in an endless loop of link generation, effectively wasting their resources. This innovative approach provides…

Hacker News: Nepenthes is a tarpit to catch AI web crawlers

Jan 16, 2025, by Kurt Seifried in Uncategorized

Source URL: https://zadzmo.org/code/nepenthes/
Source: Hacker News
Title: Nepenthes is a tarpit to catch AI web crawlers
Feedly Summary: Comments
AI Summary and Description: Yes
Summary: The text describes “Nepenthes,” a tarpit software devised to trap web crawlers, particularly those scraping data for large language models (LLMs). It offers unique functionalities and deployment setups, with explicit…

Hacker News: Understanding Ruby 3.3 Concurrency: A Comprehensive Guide

Nov 8, 2024, by system automation in Uncategorized

Source URL: https://blog.bestwebventures.in/understanding-ruby-concurrency-a-comprehensive-guide
Source: Hacker News
Title: Understanding Ruby 3.3 Concurrency: A Comprehensive Guide
Feedly Summary: Comments
AI Summary and Description: Yes
Summary: The text provides an in-depth exploration of Ruby 3.3’s enhanced concurrency capabilities, which are critical for developing efficient applications in AI and machine learning. With improved concurrency models like Ractors, Threads, and…

Hacker News: TikTok parent launched a scraper gobbling up world’s data 25x faster than OpenAI

Oct 7, 2024, by system automation in Uncategorized

Source URL: https://fortune.com/2024/10/03/bytedance-tiktok-bytespider-scraper-bot/
Source: Hacker News
Title: TikTok parent launched a scraper gobbling up world’s data 25x faster than OpenAI
Feedly Summary: Comments
AI Summary and Description: Yes
Summary: ByteDance’s aggressive data scraping through its web crawler, Bytespider, highlights the competitive race in generative AI development between major tech firms, particularly in relation to large…
