Tag: gs

  • The Register: Search-capable AI agents may cheat on benchmark tests

    Source URL: https://www.theregister.com/2025/08/23/searchcapable_ai_agents_may_cheat/ Source: The Register Title: Search-capable AI agents may cheat on benchmark tests Feedly Summary: Data contamination can make models seem more capable than they really are. Researchers with Scale AI have found that search-based AI models may cheat on benchmark tests by fetching the answers directly from online sources rather than deriving…

  • Simon Willison’s Weblog: ChatGPT release notes: Project-only memory

    Source URL: https://simonwillison.net/2025/Aug/22/project-memory/#atom-everything Source: Simon Willison’s Weblog Title: ChatGPT release notes: Project-only memory Feedly Summary: The feature I’ve most wanted from ChatGPT’s memory feature (the newer version of memory that automatically includes relevant details from summarized prior conversations) just landed: With project-only memory enabled, ChatGPT can use other conversations in that project…

  • Simon Willison’s Weblog: DeepSeek 3.1

    Source URL: https://simonwillison.net/2025/Aug/22/deepseek-31/#atom-everything Source: Simon Willison’s Weblog Title: DeepSeek 3.1 Feedly Summary: The latest model from DeepSeek, a 685B monster (like DeepSeek v3 before it) but this time it’s a hybrid reasoning model. DeepSeek claim: DeepSeek-V3.1-Think achieves comparable answer quality to DeepSeek-R1-0528, while responding more quickly. Drew Breunig points out that their benchmarks…

  • Simon Willison’s Weblog: too many model context protocol servers and LLM allocations on the dance floor

    Source URL: https://simonwillison.net/2025/Aug/22/too-many-mcps/#atom-everything Source: Simon Willison’s Weblog Title: too many model context protocol servers and LLM allocations on the dance floor Feedly Summary: Useful reminder from Geoffrey Huntley of the infrequently discussed significant token cost of using MCP. Geoffrey estimates that…

  • Cloud Blog: Don’t just speculate, investigate! Gemini Cloud Assist now offers root-cause analysis

    Source URL: https://cloud.google.com/blog/products/management-tools/gemini-cloud-assist-investigations-performs-root-cause-analysis/ Source: Cloud Blog Title: Don’t just speculate, investigate! Gemini Cloud Assist now offers root-cause analysis Feedly Summary: Debugging in a complex, distributed cloud environment can feel like searching for a needle in a haystack. The sheer volume of data, intertwined dependencies, and ephemeral issues make traditional troubleshooting methods time-consuming and often reactive.…

  • The Register: Fake CAPTCHA tests trick users into running malware

    Source URL: https://www.theregister.com/2025/08/22/clickfix_report/ Source: The Register Title: Fake CAPTCHA tests trick users into running malware Feedly Summary: ClickFix tricks. Microsoft’s security team has published an in-depth report into ClickFix, the social engineering attack which tricks users into executing malicious commands in the guise of proving their humanity.… AI Summary and Description: Yes Summary: Microsoft’s security…

  • Schneier on Security: AI Agents Need Data Integrity

    Source URL: https://www.schneier.com/blog/archives/2025/08/ai-agents-need-data-integrity.html Source: Schneier on Security Title: AI Agents Need Data Integrity Feedly Summary: Think of the Web as a digital territory with its own social contract. In 2014, Tim Berners-Lee called for a “Magna Carta for the Web” to restore the balance of power between individuals and institutions. This mirrors the original charter’s…

  • The Register: Don’t cave to Euro censorship or backdoor demands, Uncle Sam warns US tech firms

    Source URL: https://www.theregister.com/2025/08/22/ftc_us_censorship/ Source: The Register Title: Don’t cave to Euro censorship or backdoor demands, Uncle Sam warns US tech firms Feedly Summary: FTC chair: Companies could face enforcement if they give in. The head of America’s consumer watchdog has issued a stark warning to some of the biggest names in the tech sphere –…

  • The Register: DeepSeek’s new V3.1 release points to potent new Chinese chips coming soon

    Source URL: https://www.theregister.com/2025/08/22/deepseek_v31_chinese_chip_hints/ Source: The Register Title: DeepSeek’s new V3.1 release points to potent new Chinese chips coming soon Feedly Summary: Point release retuned with new FP8 datatype for better compatibility with homegrown silicon. Chinese AI darling DeepSeek unveiled an update to its flagship large language model that the company claims is already optimized for…