Tag: local deployment

  • Simon Willison’s Weblog: Mistral-Small 3.2

    Source URL: https://simonwillison.net/2025/Jun/20/mistral-small-32/
    Summary: Released on Hugging Face a couple of hours ago; so far there aren’t any quantizations to run it on a Mac, but I’m sure those will emerge pretty quickly. This is a minor bump to Mistral Small 3.1, one of my…
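
    Mac-ready quantizations of Mistral releases usually appear as MLX community builds. A minimal sketch of loading one with the mlx-lm Python package, assuming a hypothetical mlx-community repo name (no quantization existed yet when this was posted):

    ```python
    # pip install mlx-lm  (Apple Silicon only)
    from mlx_lm import load, generate

    # Hypothetical repo name: the actual mlx-community quantization had
    # not been published at the time of the post.
    model, tokenizer = load("mlx-community/Mistral-Small-3.2-24B-Instruct-4bit")

    prompt = "What changed between Mistral Small 3.1 and 3.2?"
    print(generate(model, tokenizer, prompt=prompt, max_tokens=200))
    ```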

  • Docker: How to Build, Run, and Package AI Models Locally with Docker Model Runner

    Source URL: https://www.docker.com/blog/how-to-build-run-and-package-ai-models-locally-with-docker-model-runner/
    Summary: As a Senior DevOps Engineer and Docker Captain, I’ve helped build AI systems for everything from retail personalization to medical imaging. One truth stands out: AI capabilities are core to modern infrastructure. This…
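
    Docker Model Runner serves pulled models behind an OpenAI-compatible endpoint, so any OpenAI client can talk to it. A minimal sketch, assuming host-side TCP access is enabled on the default port 12434 and that the ai/smollm2 model has been pulled:

    ```python
    # pip install openai
    from openai import OpenAI

    # Assumptions: Docker Model Runner's host TCP endpoint is enabled on
    # its default port 12434, and `docker model pull ai/smollm2` has run.
    client = OpenAI(base_url="http://localhost:12434/engines/v1",
                    api_key="not-needed")  # local server ignores the key

    resp = client.chat.completions.create(
        model="ai/smollm2",
        messages=[{"role": "user", "content": "Say hello from a local model."}],
    )
    print(resp.choices[0].message.content)
    ```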

  • Simon Willison’s Weblog: Run Your Own AI

    Source URL: https://simonwillison.net/2025/Jun/3/run-your-own-ai/
    Summary: Anthony Lewis published this neat, concise tutorial on using my LLM tool to run local models on your own machine, using llm-mlx. An under-appreciated way to contribute to open source projects is to publish unofficial guides like…
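
    Besides the CLI the tutorial covers, LLM also has a Python API. A minimal sketch, assuming llm and llm-mlx are installed and the model (the example id from the llm-mlx README) has been downloaded:

    ```python
    # pip install llm llm-mlx
    # llm mlx download-model mlx-community/Llama-3.2-3B-Instruct-4bit
    import llm

    # Model id taken from the llm-mlx README example; any downloaded
    # MLX model id works the same way.
    model = llm.get_model("mlx-community/Llama-3.2-3B-Instruct-4bit")
    response = model.prompt("Five creative names for a pet pelican")
    print(response.text())
    ```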

  • Simon Willison’s Weblog: Gemma 3 QAT Models

    Source URL: https://simonwillison.net/2025/Apr/19/gemma-3-qat-models/
    Summary: Interesting release from Google, as a follow-up to Gemma 3 from last month: “To make Gemma 3 even more accessible, we are announcing new versions optimized with Quantization-Aware Training (QAT) that dramatically reduces memory requirements while maintaining…”
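
    The memory claim checks out with simple arithmetic: weights dominate the footprint, and QAT’s int4 weights need a quarter of bf16’s two bytes per parameter. A rough sketch for the 27B model (ignoring KV cache and runtime overhead):

    ```python
    # Rough weight-memory estimate for Gemma 3 27B; real usage adds
    # KV cache, activations, and runtime overhead on top.
    params = 27e9
    bf16_gb = params * 2 / 1e9    # 2 bytes per parameter
    int4_gb = params * 0.5 / 1e9  # 4 bits = 0.5 bytes per parameter
    print(f"bf16: ~{bf16_gb:.0f} GB, int4 (QAT): ~{int4_gb:.1f} GB")
    # bf16: ~54 GB, int4 (QAT): ~13.5 GB -- a roughly 4x reduction
    ```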

  • Hacker News: How to Run DeepSeek R1 Distilled Reasoning Models on Ryzen AI and Radeon GPUs

    Source URL: https://www.guru3d.com/story/amd-explains-how-to-run-deepseek-r1-distilled-reasoning-models-on-amd-ryzen-ai-and-radeon/
    Summary: The text discusses the capabilities and deployment of DeepSeek R1 Distilled Reasoning models, highlighting their use of chain-of-thought reasoning for complex prompt analysis. It details how…
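
    AMD’s guide drives the distilled models through LM Studio, whose local server speaks the OpenAI chat-completions protocol. A minimal sketch, assuming the server is running on its default port 1234 with a DeepSeek R1 distill loaded (the model id shown is an assumption):

    ```python
    # pip install requests
    import requests

    # Assumptions: LM Studio's local server is running on its default
    # port 1234, with a DeepSeek R1 distill loaded under this id.
    resp = requests.post(
        "http://localhost:1234/v1/chat/completions",
        json={
            "model": "deepseek-r1-distill-qwen-7b",
            "messages": [{"role": "user", "content": "How many r's are in strawberry?"}],
        },
        timeout=300,
    )
    print(resp.json()["choices"][0]["message"]["content"])
    ```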

  • Hacker News: Mistral Small 3

    Source URL: https://mistral.ai/news/mistral-small-3/
    Summary: The text introduces Mistral Small 3, a new 24B-parameter model optimized for latency, designed for generative AI tasks. It highlights the model’s competitive performance compared to larger models, its suitability for local deployment, and its potential…
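
    Since the weights are Apache 2.0, local deployment is straightforward; Ollama hosts a build. A minimal sketch with the ollama Python client (the model tag is an assumption; check the Ollama library for the current name):

    ```python
    # pip install ollama
    # ollama pull mistral-small:24b   (tag is an assumption)
    import ollama

    resp = ollama.chat(
        model="mistral-small:24b",
        messages=[{"role": "user", "content": "Why optimize a 24B model for latency?"}],
    )
    print(resp["message"]["content"])
    ```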

  • Slashdot: Microsoft Makes DeepSeek’s R1 Model Available On Azure AI and GitHub

    Source URL: https://slashdot.org/story/25/01/29/2218253/microsoft-makes-deepseeks-r1-model-available-on-azure-ai-and-github
    Summary: Microsoft has enhanced its Azure AI Foundry platform by integrating DeepSeek’s R1 model, facilitating efficient experimentation and deployment of AI applications for developers. The model has passed extensive security evaluations,…
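
    On the GitHub side, the model is reachable through GitHub Models with a personal access token, using the azure-ai-inference client. A sketch, assuming the GitHub Models endpoint and “DeepSeek-R1” as the model id:

    ```python
    # pip install azure-ai-inference
    import os
    from azure.ai.inference import ChatCompletionsClient
    from azure.ai.inference.models import UserMessage
    from azure.core.credentials import AzureKeyCredential

    # Assumptions: GITHUB_TOKEN is set, and "DeepSeek-R1" is the id the
    # GitHub Models endpoint exposes for this model.
    client = ChatCompletionsClient(
        endpoint="https://models.inference.ai.azure.com",
        credential=AzureKeyCredential(os.environ["GITHUB_TOKEN"]),
    )
    result = client.complete(
        model="DeepSeek-R1",
        messages=[UserMessage(content="Summarize chain-of-thought reasoning in one paragraph.")],
    )
    print(result.choices[0].message.content)
    ```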

  • Hacker News: How to run DeepSeek R1 locally

    Source URL: https://workos.com/blog/how-to-run-deepseek-r1-locally
    Summary: DeepSeek R1 is an open-source large language model (LLM) designed for local deployment to enhance data privacy and performance in conversational AI, coding, and problem-solving tasks. Its capability to outperform OpenAI’s flagship model…
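
    Guides like this one typically run the distilled R1 variants through Ollama, which exposes a native HTTP API on port 11434. A minimal sketch calling that API directly, assuming a distill such as deepseek-r1:14b has been pulled:

    ```python
    # pip install requests
    import requests

    # Assumption: `ollama pull deepseek-r1:14b` has been run; the full
    # 671B R1 is far beyond typical local hardware.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "deepseek-r1:14b", "prompt": "What is 17 * 24?", "stream": False},
        timeout=300,
    )
    print(resp.json()["response"])
    ```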

  • Hacker News: I Run LLMs Locally

    Source URL: https://abishekmuthian.com/how-i-run-llms-locally/
    Summary: The text discusses how to set up and run Large Language Models (LLMs) locally, highlighting hardware requirements, tools, model choices, and practical insights on achieving better performance. This is particularly relevant for professionals focused on…

  • Simon Willison’s Weblog: Meta AI release Llama 3.3

    Source URL: https://simonwillison.net/2024/Dec/6/llama-33/
    Summary: This new Llama-3.3-70B-Instruct model from Meta AI makes some bold claims: “This model delivers similar performance to Llama 3.1 405B with cost effective inference that’s feasible to run locally on common developer workstations.” I have…
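
    The “feasible to run locally” claim comes down to quantization arithmetic; a rough sketch of 70B weight memory at common precisions (real usage adds KV cache and overhead):

    ```python
    def weight_gb(params_billions: float, bits: int) -> float:
        """Approximate weight memory in GB, ignoring KV cache and overhead."""
        return params_billions * bits / 8

    for bits in (16, 8, 4):
        print(f"70B @ {bits:>2}-bit: ~{weight_gb(70, bits):.0f} GB")
    # 16-bit: ~140 GB, 8-bit: ~70 GB, 4-bit: ~35 GB -- only the 4-bit
    # build fits in the 64 GB of RAM a high-end workstation might have.
    ```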