Tag: transformers

  • Simon Willison’s Weblog: Load Llama-3.2 WebGPU in your browser from a local folder

    Source URL: https://simonwillison.net/2025/Sep/8/webgpu-local-folder/#atom-everything
    Summary: Inspired by a comment on Hacker News, I decided to see if it was possible to modify the transformers.js-examples/tree/main/llama-3.2-webgpu Llama 3.2 chat demo (online here,…

  • Simon Willison’s Weblog: Introducing EmbeddingGemma

    Source URL: https://simonwillison.net/2025/Sep/4/embedding-gemma/#atom-everything
    Summary: Brand new open weights (under the slightly janky Gemma license) 308M parameter embedding model from Google: Based on the Gemma 3 architecture, EmbeddingGemma is trained on 100+ languages and is small enough to run on less than 200MB of RAM with…
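
    As a rough illustration of how an embedding model like this is typically used, here is a minimal sketch with the sentence-transformers library. The model id "google/embeddinggemma-300m" is an assumption based on the announcement naming and may differ from the published one:

      # Minimal sketch: embed documents and a query, then rank by similarity.
      # Assumes sentence-transformers >= 3.0; the model id below is a guess
      # based on the announcement, not verified here.
      from sentence_transformers import SentenceTransformer

      model = SentenceTransformer("google/embeddinggemma-300m")

      docs = [
          "EmbeddingGemma is a 308M parameter embedding model from Google.",
          "WebGPU lets you run language models directly in the browser.",
      ]
      query = ["small embedding models"]

      doc_vecs = model.encode(docs)      # one vector per document
      query_vec = model.encode(query)    # one vector for the query

      # similarity() is a built-in helper returning a query-by-document score matrix.
      print(model.similarity(query_vec, doc_vecs))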

  • Simon Willison’s Weblog: Introducing Gemma 3 270M: The compact model for hyper-efficient AI

    Source URL: https://simonwillison.net/2025/Aug/14/gemma-3-270m/#atom-everything
    Summary: New from Google: Gemma 3 270M, a compact, 270-million parameter model designed from the ground up for task-specific fine-tuning with strong instruction-following and text structuring…
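
    To make the task-specific framing concrete, here is a minimal sketch of prompting a model this size through the transformers text-generation pipeline. The model id "google/gemma-3-270m-it" is assumed from Google's usual naming and is not verified here:

      # Minimal sketch: a small structured-text task on a compact
      # instruction-tuned model. The model id is an assumption, not verified.
      from transformers import pipeline

      generator = pipeline("text-generation", model="google/gemma-3-270m-it")
      messages = [{"role": "user",
                   "content": "Extract the date from: 'Project kickoff on 14 Aug 2025.'"}]
      result = generator(messages, max_new_tokens=32)
      print(result[0]["generated_text"])

    At 270 million parameters, fine-tuning runs for narrow tasks like this are feasible on modest hardware, which is the use case the announcement emphasizes.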

  • The Register: Boffins detail new algorithms to losslessly boost AI perf by up to 2.8x

    Source URL: https://www.theregister.com/2025/07/17/new_algorithms_boost_ai_perf/
    Summary: New spin on speculative decoding works with any model – now built into Transformers. We all know that AI is expensive, but a new set of algorithms developed by researchers at the Weizmann…
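
    Speculative decoding, which the article says is now built into Transformers, works by having a small draft model propose several tokens that the larger target model then verifies in a single forward pass, so the output is unchanged ("lossless") while throughput improves. A minimal sketch of the assisted-generation API, with placeholder model ids (any compatible target/draft pair should work):

      # Minimal sketch of speculative (assisted) decoding in transformers.
      # Model ids are placeholders; substitute any target/draft pair you can run.
      from transformers import AutoModelForCausalLM, AutoTokenizer

      tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
      target = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
      draft = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B-Instruct")

      inputs = tokenizer("Speculative decoding works by", return_tensors="pt")
      output = target.generate(**inputs, assistant_model=draft, max_new_tokens=64)
      print(tokenizer.decode(output[0], skip_special_tokens=True))

    The "works with any model" angle refers to dropping the shared-tokenizer requirement; recent Transformers releases also accept tokenizer and assistant_tokenizer arguments to generate() for the case where the two models use different vocabularies.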

  • Simon Willison’s Weblog: AbsenceBench: Language Models Can’t Tell What’s Missing

    Source URL: https://simonwillison.net/2025/Jun/20/absencebench/#atom-everything
    Summary: Here’s another interesting result to file under the “jagged frontier” of LLMs, where their strengths and weaknesses are often unintuitive. Long context models have been getting increasingly good at passing “Needle…
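
    The benchmark inverts the needle-in-a-haystack setup: instead of finding something that is present, the model is shown an original document and a modified copy and must say what was removed. A minimal sketch of that task construction (illustrative only, not the benchmark's own code):

      # Minimal sketch of an AbsenceBench-style task: delete lines from a
      # document, then ask which ones are missing. Illustrative, not official.
      import random

      def make_absence_task(lines, n_removed=3, seed=0):
          rng = random.Random(seed)
          removed = set(rng.sample(range(len(lines)), n_removed))
          redacted = [l for i, l in enumerate(lines) if i not in removed]
          prompt = (
              "Original document:\n" + "\n".join(lines)
              + "\n\nModified document:\n" + "\n".join(redacted)
              + "\n\nWhich lines from the original are missing from the modified version?"
          )
          answer = [lines[i] for i in sorted(removed)]
          return prompt, answer

      doc = [f"Line {i}: fact number {i}." for i in range(20)]
      prompt, answer = make_absence_task(doc)
      print(answer)  # the ground-truth missing lines the model should recover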