Tag: tasks

  • Simon Willison’s Weblog: AbsenceBench: Language Models Can’t Tell What’s Missing

    Source URL: https://simonwillison.net/2025/Jun/20/absencebench/#atom-everything Source: Simon Willison’s Weblog Title: AbsenceBench: Language Models Can’t Tell What’s Missing Feedly Summary: Here’s another interesting result to file under the “jagged frontier” of LLMs, where their strengths and weaknesses are often unintuitive. Long context models have been getting increasingly good at passing “Needle…

  • Simon Willison’s Weblog: Mistral-Small 3.2

    Source URL: https://simonwillison.net/2025/Jun/20/mistral-small-32/ Source: Simon Willison’s Weblog Title: Mistral-Small 3.2 Feedly Summary: Released on Hugging Face a couple of hours ago, so far there aren’t any quantizations to run it on a Mac but I’m sure those will emerge pretty quickly. This is a minor bump to Mistral Small 3.1, one of my…

  • Simon Willison’s Weblog: Cato CTRL™ Threat Research: PoC Attack Targeting Atlassian’s Model Context Protocol (MCP) Introduces New “Living off AI” Risk

    Source URL: https://simonwillison.net/2025/Jun/19/atlassian-prompt-injection-mcp/ Source: Simon Willison’s Weblog Title: Cato CTRL™ Threat Research: PoC Attack Targeting Atlassian’s Model Context Protocol (MCP) Introduces New “Living off AI” Risk Feedly Summary: Stop me if you’ve heard this one before: A…

  • Simon Willison’s Weblog: How OpenElections Uses LLMs

    Source URL: https://simonwillison.net/2025/Jun/19/how-openelections-uses-llms/#atom-everything Source: Simon Willison’s Weblog Title: How OpenElections Uses LLMs Feedly Summary: The OpenElections project collects detailed election data for the USA, all the way down to the precinct level. This is a surprisingly hard problem: while county and state-level results are widely available, precinct-level results are published in…

  • Security Today: Cloud Security Alliance Brings AI-Assisted Auditing to Cloud Computing

    Source URL: https://news.google.com/rss/articles/CBMi3wFBVV95cUxPNUxPT19wWVJuMXo0RWFnbGc5TUg5Z3o1QXlma2dTMXJhZldSLWZqTWg0TEJtb3NWUEo3bUczQ2lTUW9aVW11SXVQZ0E4UzR2WXRGX2xzelZaTVl2SHc2MUJvV2NScXNuUnJPNWktSmRYc1RHdjY3dE5obzcyRDZlSEdIVEo0V2NJcm1HTWU2emp4SnR2bzY4V1BGc2hUN044RmVrb2JsVWRMRDVTQm93VjVMam9nSEhyT0FmbGdzRTZoTDh0cW5LTkVEanI2dS1iMnVvTEhLa3ZZdDZZZUVJ?oc=5 Source: Security Today Title: Cloud Security Alliance Brings AI-Assisted Auditing to Cloud Computing Feedly Summary: Cloud Security Alliance Brings AI-Assisted Auditing to Cloud Computing AI Summary and Description: Yes Summary: The Cloud Security Alliance’s introduction of AI-assisted auditing for cloud computing signifies a pivotal advancement in enhancing cloud security measures. This development…

  • Simon Willison’s Weblog: Coding agents require skilled operators

    Source URL: https://simonwillison.net/2025/Jun/18/coding-agents/#atom-everything Source: Simon Willison’s Weblog Title: Coding agents require skilled operators Feedly Summary: I wrote this recently in a conversation about whether coding agents can work as a replacement for human programmers. The “agentic” coding tools we have right now work like this: A skilled individual with both deep domain understanding and deep…

  • Simon Willison’s Weblog: Trying out the new Gemini 2.5 model family

    Source URL: https://simonwillison.net/2025/Jun/17/gemini-2-5/ Source: Simon Willison’s Weblog Title: Trying out the new Gemini 2.5 model family Feedly Summary: After many months of previews, Gemini 2.5 Pro and Flash have reached general availability with new, memorable model IDs: gemini-2.5-pro and gemini-2.5-flash. They are joined by a new preview model with an unmemorable name: gemini-2.5-flash-lite-preview-06-17 is a…

  • Cloud Blog: Gemini momentum continues with launch of 2.5 Flash-Lite and general availability of 2.5 Flash and Pro on Vertex AI

    Source URL: https://cloud.google.com/blog/products/ai-machine-learning/gemini-2-5-flash-lite-flash-pro-ga-vertex-ai/ Source: Cloud Blog Title: Gemini momentum continues with launch of 2.5 Flash-Lite and general availability of 2.5 Flash and Pro on Vertex AI Feedly Summary: The momentum of the Gemini 2.5 era continues to build. Following our recent announcements, we’re empowering enterprise builders and developers with even greater access to the intelligence,…