Tag: model licensing

  • Simon Willison’s Weblog: deepseek-ai/DeepSeek-R1-0528

    Source URL: https://simonwillison.net/2025/May/31/deepseek-aideepseek-r1-0528/ Source: Simon Willison’s Weblog Title: deepseek-ai/DeepSeek-R1-0528 Feedly Summary: deepseek-ai/DeepSeek-R1-0528 Sadly the trend for terrible naming of models has infested the Chinese AI labs as well. DeepSeek-R1-0528 is a brand new and much improved open weights reasoning model from DeepSeek, a major step up from the DeepSeek R1 they released back in January.…

  • Simon Willison’s Weblog: DeepSeek-R1 and exploring DeepSeek-R1-Distill-Llama-8B

    Source URL: https://simonwillison.net/2025/Jan/20/deepseek-r1/ Source: Simon Willison’s Weblog Title: DeepSeek-R1 and exploring DeepSeek-R1-Distill-Llama-8B Feedly Summary: DeepSeek are the Chinese AI lab who dropped the best currently available open weights LLM on Christmas day, DeepSeek v3. That model was trained in part using their unreleased R1 “reasoning" model. Today they’ve released R1 itself, along with a whole…

  • Hacker News: SmolLM2

    Source URL: https://simonwillison.net/2024/Nov/2/smollm2/ Source: Hacker News Title: SmolLM2 Feedly Summary: Comments AI Summary and Description: Yes Summary: The text introduces SmolLM2, a new family of compact language models from Hugging Face, designed for lightweight on-device operations. The models, which range from 135M to 1.7B parameters, were trained on 11 trillion tokens across diverse datasets, showcasing…