Tag: generated
-
The Register: Search-capable AI agents may cheat on benchmark tests
Source URL: https://www.theregister.com/2025/08/23/searchcapable_ai_agents_may_cheat/ Source: The Register Title: Search-capable AI agents may cheat on benchmark tests Feedly Summary: Data contamination can make models seem more capable than they really are Researchers with Scale AI have found that search-based AI models may cheat on benchmark tests by fetching the answers directly from online sources rather than deriving…
-
Schneier on Security: Subverting AIOps Systems Through Poisoned Input Data
Source URL: https://www.schneier.com/blog/archives/2025/08/subverting-aiops-systems-through-poisoned-input-data.html Source: Schneier on Security Title: Subverting AIOps Systems Through Poisoned Input Data Feedly Summary: In this input integrity attack against an AI system, researchers were able to fool AIOps tools: AIOps refers to the use of LLM-based agents to gather and analyze application telemetry, including system logs, performance metrics, traces, and alerts,…
-
Tomasz Tunguz: When One AI Grades Another’s Work
Source URL: https://www.tomtunguz.com/evolution-of-ai-judges-improving-evoblog/ Source: Tomasz Tunguz Title: When One AI Grades Another’s Work Feedly Summary: Since launching EvoBlog internally, I’ve wanted to improve it. One way of doing this is having an LLM judge the best posts rather than a static scoring system. I appointed Gemini 2.5 to be that judge. This post is a…
-
Slashdot: Google’s ‘AI Overview’ Pointed Him to a Customer Number. It Was a Scam
Source URL: https://yro.slashdot.org/story/25/08/18/0223228/googles-ai-overview-pointed-him-to-a-customer-number-it-was-a-scam?utm_source=rss1.0mainlinkanon&utm_medium=feed Source: Slashdot Title: Google’s ‘AI Overview’ Pointed Him to a Customer Number. It Was a Scam Feedly Summary: AI Summary and Description: Yes Summary: The text discusses a scam where a real estate developer was tricked into providing credit card information to an impersonator posing as a customer service representative for a…