Tag: smaller models
-
Docker: Fine-Tuning Local Models with Docker Offload and Unsloth
Source URL: https://www.docker.com/blog/fine-tuning-models-with-offload-and-unsloth/ Source: Docker Title: Fine-Tuning Local Models with Docker Offload and Unsloth Feedly Summary: I’ve been experimenting with local models for a while now, and the progress in making them accessible has been exciting. Initial experiences are often fantastic, many models, like Gemma 3 270M, are lightweight enough to run on common hardware.…
-
The Register: Little LLM on the RAM: Google’s Gemma 270M hits the scene
Source URL: https://www.theregister.com/2025/08/15/little_llm_on_the_ram/ Source: The Register Title: Little LLM on the RAM: Google’s Gemma 270M hits the scene Feedly Summary: A tiny model trained on trillions of tokens, ready for specialized tasks Google has unveiled a pint-sized new addition to its “open" large language model lineup: Gemma 3 270M.… AI Summary and Description: Yes Summary:…
-
The Register: How OpenAI used a new data type to cut inference costs by 75%
Source URL: https://www.theregister.com/2025/08/10/openai_mxfp4/ Source: The Register Title: How OpenAI used a new data type to cut inference costs by 75% Feedly Summary: Decision to use MXFP4 makes models smaller, faster, and more importantly, cheaper for everyone involved Analysis Whether or not OpenAI’s new open weights models are any good is still up for debate, but…
-
Simon Willison’s Weblog: Qwen3-4B Instruct and Thinking
Source URL: https://simonwillison.net/2025/Aug/6/qwen3-4b-instruct-and-thinking/ Source: Simon Willison’s Weblog Title: Qwen3-4B Instruct and Thinking Feedly Summary: Qwen3-4B Instruct and Thinking Yet another interesting model from Qwen—these are tiny compared to their other recent releases (just 4B parameters, 7.5GB on Hugging Face and even smaller when quantized) but with a 262,144 context length, which Qwen suggest is essential…