DeepSeek – Page 5 – Experimental News Clipping Site

Simon Willison’s Weblog: How often do LLMs snitch? Recreating Theo’s SnitchBench with LLM

May 31, 2025

Source URL: https://simonwillison.net/2025/May/31/snitchbench-with-llm/#atom-everything
Source: Simon Willison’s Weblog
Title: How often do LLMs snitch? Recreating Theo’s SnitchBench with LLM
Feedly Summary: A fun new benchmark just dropped! Inspired by the Claude 4 system card – which showed that Claude 4 might just rat you out to the authorities if you told it to “take initiative” in…

Simon Willison’s Weblog: deepseek-ai/DeepSeek-R1-0528

May 31, 2025, by Kurt Seifried in Uncategorized

Source URL: https://simonwillison.net/2025/May/31/deepseek-aideepseek-r1-0528/
Source: Simon Willison’s Weblog
Title: deepseek-ai/DeepSeek-R1-0528
Feedly Summary: Sadly the trend for terrible naming of models has infested the Chinese AI labs as well. DeepSeek-R1-0528 is a brand new and much improved open weights reasoning model from DeepSeek, a major step up from the DeepSeek R1 they released back in January.…

Simon Willison’s Weblog: Talking AI and jobs with Natasha Zouves for News Nation

May 30, 2025, by Kurt Seifried in Uncategorized

Source URL: https://simonwillison.net/2025/May/30/ai-and-jobs-with-natasha-zouves/#atom-everything
Source: Simon Willison’s Weblog
Title: Talking AI and jobs with Natasha Zouves for News Nation
Feedly Summary: I was interviewed by News Nation’s Natasha Zouves about the very complicated topic of how we should think about AI in terms of threatening our jobs and careers. I previously talked with Natasha two years…

Slashdot: Researchers Warn Against Treating AI Outputs as Human-Like Reasoning

May 29, 2025, by Kurt Seifried in Uncategorized

Source URL: https://tech.slashdot.org/story/25/05/29/1411236/researchers-warn-against-treating-ai-outputs-as-human-like-reasoning?utm_source=rss1.0mainlinkanon&utm_medium=feed
Source: Slashdot
Title: Researchers Warn Against Treating AI Outputs as Human-Like Reasoning
Feedly Summary: Researchers at Arizona State University are challenging the misconception of AI language models’ intermediate outputs as “reasoning” or “thinking.” They argue that this anthropomorphization can mislead users about AI’s actual functioning, highlighting…

Simon Willison’s Weblog: Devstral

May 21, 2025, by Kurt Seifried in Uncategorized

Source URL: https://simonwillison.net/2025/May/21/devstral/#atom-everything
Source: Simon Willison’s Weblog
Title: Devstral
Feedly Summary: New Apache 2.0 licensed LLM release from Mistral, this time specifically trained for code. Devstral achieves a score of 46.8% on SWE-Bench Verified, outperforming prior open-source SoTA models by more than 6 percentage points. When evaluated under the same test scaffold (OpenHands, provided by…

Cloud Blog: AI Hypercomputer developer experience enhancements from Q1 25: build faster, scale bigger

May 16, 2025, by Kurt Seifried in Uncategorized

Source URL: https://cloud.google.com/blog/products/compute/ai-hypercomputer-enhancements-for-the-developer/
Source: Cloud Blog
Title: AI Hypercomputer developer experience enhancements from Q1 25: build faster, scale bigger
Feedly Summary: Building cutting-edge AI models is exciting, whether you’re iterating in your notebook or orchestrating large clusters. However, scaling up training can present significant challenges, including navigating complex infrastructure, configuring software and dependencies across numerous…

Simon Willison’s Weblog: Gemini 2.5 Models now support implicit caching

May 9, 2025, by Kurt Seifried in Uncategorized

Source URL: https://simonwillison.net/2025/May/9/gemini-implicit-caching/#atom-everything
Source: Simon Willison’s Weblog
Title: Gemini 2.5 Models now support implicit caching
Feedly Summary: I just spotted a cacheTokensDetails key in the token usage JSON while running a long chain of prompts against Gemini 2.5 Flash – despite not configuring caching myself: {"cachedContentTokenCount": 200658, "promptTokensDetails":…

Simon Willison’s Weblog: What people get wrong about the leading Chinese open models: Adoption and censorship

May 6, 2025, by Kurt Seifried in Uncategorized

Source URL: https://simonwillison.net/2025/May/6/what-people-get-wrong-about-the-leading-chinese-models/#atom-everything
Source: Simon Willison’s Weblog
Title: What people get wrong about the leading Chinese open models: Adoption and censorship
Feedly Summary: While I’ve been enjoying trying out Alibaba’s Qwen 3 a lot recently, Nathan Lambert focuses on the elephant in…

Slashdot: South Korea Says DeepSeek Transferred User Data, Prompts Without Consent

Apr 24, 2025, by Kurt Seifried in Uncategorized

Source URL: https://slashdot.org/story/25/04/24/2021250/south-korea-says-deepseek-transferred-user-data-prompts-without-consent?utm_source=rss1.0mainlinkanon&utm_medium=feed
Source: Slashdot
Title: South Korea Says DeepSeek Transferred User Data, Prompts Without Consent
Feedly Summary: South Korea’s data protection authority has raised significant concerns regarding DeepSeek, a Chinese AI startup, for illegally transferring user information without consent. This incident highlights critical issues surrounding data privacy and…

Slashdot: Google Says DOJ Breakup Would Harm US In ‘Global Race With China’

Apr 22, 2025, by Kurt Seifried in Uncategorized

Source URL: https://tech.slashdot.org/story/25/04/22/0137218/google-says-doj-breakup-would-harm-us-in-global-race-with-china?utm_source=rss1.0mainlinkanon&utm_medium=feed
Source: Slashdot
Title: Google Says DOJ Breakup Would Harm US In ‘Global Race With China’
Feedly Summary: Google is contending that the U.S. Department of Justice’s (DOJ) move to break up its Chrome and Android businesses could undermine national security and hinder America’s competitive edge in…

Tag: DeepSeek