Tag: llm

  • Simon Willison’s Weblog: Agents are models using tools in a loop

    Source URL: https://simonwillison.net/2025/May/22/tools-in-a-loop/#atom-everything Source: Simon Willison’s Weblog Title: Agents are models using tools in a loop Feedly Summary: I was going slightly spare at the fact that every talk at this Anthropic developer conference has used the word “agents” dozens of times, but nobody ever stopped to provide a useful definition. I’m now in the…
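    The definition the post settles on, “models using tools in a loop,” can be sketched in a few lines. This is a minimal illustrative sketch, not Anthropic’s or Simon’s implementation: the `stub_model` function, the tool names, and the message format are all assumptions standing in for a real LLM API.

    ```python
    # Sketch of "an agent is a model using tools in a loop".
    # stub_model stands in for a real LLM call; the message/reply
    # shapes here are illustrative, not any vendor's actual API.

    def add(a, b):
        return a + b

    TOOLS = {"add": add}  # tools the model is allowed to call

    def stub_model(messages):
        # Stand-in for an LLM: request a tool once, then answer
        # using the tool result fed back into the conversation.
        tool_msgs = [m for m in messages if m["role"] == "tool"]
        if not tool_msgs:
            return {"tool": "add", "args": {"a": 2, "b": 3}}
        return {"answer": f"The sum is {tool_msgs[-1]['content']}"}

    def run_agent(prompt):
        messages = [{"role": "user", "content": prompt}]
        while True:  # the loop: call model, run tool, feed result back
            reply = stub_model(messages)
            if "answer" in reply:
                return reply["answer"]
            result = TOOLS[reply["tool"]](**reply["args"])
            messages.append({"role": "tool", "content": result})
    ```

    The loop is the whole trick: the model decides whether to call a tool, the harness executes it, and the result goes back into the context until the model produces a final answer.
    
    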

  • Simon Willison’s Weblog: llm-anthropic 0.16

    Source URL: https://simonwillison.net/2025/May/22/llm-anthropic-016/#atom-everything Source: Simon Willison’s Weblog Title: llm-anthropic 0.16 Feedly Summary: llm-anthropic 0.16 New release of my LLM plugin for Anthropic adding the new Claude 4 Opus and Sonnet models. You can see pelicans on bicycles generated using the new plugin at the bottom of my live blog covering the release. I also released…

  • Simon Willison’s Weblog: Live blog: Claude 4 launch at Code with Claude

    Source URL: https://simonwillison.net/2025/May/22/code-with-claude-live-blog/ Source: Simon Willison’s Weblog Title: Live blog: Claude 4 launch at Code with Claude Feedly Summary: I’m at Anthropic’s Code with Claude event, where they are launching Claude 4. I’ll be live blogging the keynote here. Tags: llm-release, liveblogging, anthropic, claude, generative-ai, ai, llms…

  • Cloud Blog: Train AI for less: Improve ML Goodput with elastic training and optimized checkpointing

    Source URL: https://cloud.google.com/blog/products/ai-machine-learning/elastic-training-and-optimized-checkpointing-improve-ml-goodput/ Source: Cloud Blog Title: Train AI for less: Improve ML Goodput with elastic training and optimized checkpointing Feedly Summary: Want to save some money on large AI training? For a typical PyTorch LLM training workload that spans thousands of accelerators for several weeks, a 1% improvement in ML Goodput can translate to…
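    One of the checkpointing optimizations the post alludes to, writing checkpoints asynchronously so accelerators keep training instead of blocking on storage, can be sketched with the standard library alone. This is a hedged toy, not Google’s implementation: real systems checkpoint sharded device state, not a plain dict, and `pickle` stands in for an optimized serialization format.

    ```python
    # Toy asynchronous checkpointing: snapshot the training state
    # cheaply on the main thread, then persist it in a background
    # thread so the training loop is not blocked on disk I/O.
    import os
    import pickle
    import threading

    def async_checkpoint(state, path):
        snapshot = dict(state)  # cheap in-memory copy taken synchronously

        def _write():
            tmp = path + ".tmp"
            with open(tmp, "wb") as f:
                pickle.dump(snapshot, f)
            os.replace(tmp, path)  # atomic rename: never a torn checkpoint

        t = threading.Thread(target=_write)
        t.start()
        return t  # caller can join() before the next checkpoint
    ```

    The atomic rename matters for Goodput too: a crash mid-write leaves the previous valid checkpoint in place, so restarts lose at most one checkpoint interval of work.
    
    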

  • Simon Willison’s Weblog: Devstral

    Source URL: https://simonwillison.net/2025/May/21/devstral/#atom-everything Source: Simon Willison’s Weblog Title: Devstral Feedly Summary: Devstral New Apache 2.0 licensed LLM release from Mistral, this time specifically trained for code. Devstral achieves a score of 46.8% on SWE-Bench Verified, outperforming prior open-source SoTA models by more than 6 percentage points. When evaluated under the same test scaffold (OpenHands, provided by…

  • Simon Willison’s Weblog: Gemini Diffusion

    Source URL: https://simonwillison.net/2025/May/21/gemini-diffusion/ Source: Simon Willison’s Weblog Title: Gemini Diffusion Feedly Summary: Gemini Diffusion Another of the announcements from Google I/O yesterday was Gemini Diffusion, Google’s first LLM to use diffusion (similar to image models like Imagen and Stable Diffusion) in place of transformers. Google describe it like this: Traditional autoregressive language models generate text…
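    The distinction the post draws, diffusion-style generation versus traditional autoregressive generation, comes down to control flow: one emits tokens strictly left to right, the other refines all positions in parallel over several denoising passes. A toy contrast, with deterministic stand-ins replacing any real model (the `TARGET` sentence, the `[MASK]` token, and the reveal schedule are all illustrative assumptions):

    ```python
    # Toy contrast between autoregressive and diffusion-style text
    # generation. Both "models" deterministically recover a target
    # sentence; the point is the shape of the loop, not the modeling.

    TARGET = "the quick brown fox".split()

    def autoregressive_generate():
        # One token per step, left to right; each step conditions on
        # everything emitted so far.
        out = []
        for i in range(len(TARGET)):
            out.append(TARGET[i])  # stand-in for sampling the next token
        return out

    def diffusion_generate(steps=4):
        # Start fully masked and refine every position in parallel on
        # each pass, unmasking more of the sequence per step.
        seq = ["[MASK]"] * len(TARGET)
        for step in range(steps):
            for i in range(len(seq)):
                if seq[i] == "[MASK]" and i <= step:
                    seq[i] = TARGET[i]  # stand-in for one denoising update
        return seq
    ```

    The parallel refinement loop is why diffusion LLMs can trade a fixed number of passes for very high tokens-per-second throughput, the property Google highlighted in the announcement.
    
    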

  • The Register: Microsoft-backed AI out-forecasts hurricane experts without crunching the physics

    Source URL: https://www.theregister.com/2025/05/21/earth_system_model_hurricane_forecast/ Source: The Register Title: Microsoft-backed AI out-forecasts hurricane experts without crunching the physics Feedly Summary: A model trained on decades of weather data is claimed to be faster and cheaper. Scientists have developed a machine learning model that can outperform official agencies at predicting tropical cyclone tracks, and do it faster and cheaper than…

  • Tomasz Tunguz: My Prompt, My Reality

    Source URL: https://www.tomtunguz.com/user-perception-quality/ Source: Tomasz Tunguz Title: My Prompt, My Reality Feedly Summary: “Now with LLMs, a bunch of the perceived quality depends on your prompt. So you have users prompting with different levels of skill. And the outcome of that prompt may be perceived as low quality, but that’s…