Tag: web standards

  • Simon Willison’s Weblog: ChatGPT agent’s user-agent

    Source URL: https://simonwillison.net/2025/Aug/4/chatgpt-agents-user-agent/#atom-everything
    Source: Simon Willison’s Weblog
    Title: ChatGPT agent’s user-agent
    Feedly Summary: I was exploring how ChatGPT agent works today. I learned some interesting things about how it exposes its identity through HTTP headers, then made a huge blunder in thinking it was leaking its URLs to Bingbot and Yandex… but it turned out…

  • The Register: Perplexity AI accused of scraping content against websites’ will with unlisted IP ranges

    Source URL: https://www.theregister.com/2025/08/04/perplexity_ai_crawlers_accused_data_raids/
    Source: The Register
    Title: Perplexity AI accused of scraping content against websites’ will with unlisted IP ranges
    Feedly Summary: Cloudflare finds AI search biz ignoring crawl prohibitions and trying to hide its spiders. Perplexity, an AI search startup, has been spotted trying to disguise its content-scraping bots while flouting websites’ no-crawl directives. …