Tag: Huggingface

  • Simon Willison’s Weblog: Comma v0.1 1T and 2T – 7B LLMs trained on openly licensed text

    Source URL: https://simonwillison.net/2025/Jun/7/comma/#atom-everything
    Source: Simon Willison’s Weblog
    Title: Comma v0.1 1T and 2T – 7B LLMs trained on openly licensed text
    Feedly Summary: It’s been a long time coming, but we finally have some promising LLMs to try out which are trained entirely on openly licensed text! EleutherAI released the Pile four and a half…

  • Cloud Blog: Accelerate your gen AI: Deploy Llama4 & DeepSeek on AI Hypercomputer with new recipes

    Source URL: https://cloud.google.com/blog/products/ai-machine-learning/deploying-llama4-and-deepseek-on-ai-hypercomputer/
    Source: Cloud Blog
    Title: Accelerate your gen AI: Deploy Llama4 & DeepSeek on AI Hypercomputer with new recipes
    Feedly Summary: The pace of innovation in open-source AI is breathtaking, with models like Meta’s Llama4 and DeepSeek AI’s DeepSeek. However, deploying and optimizing large, powerful models can be complex and resource-intensive. Developers and…

  • Simon Willison’s Weblog: Qwen3-8B

    Source URL: https://simonwillison.net/2025/May/2/qwen3-8b/#atom-everything
    Source: Simon Willison’s Weblog
    Title: Qwen3-8B
    Feedly Summary: Having tried a few of the Qwen 3 models now, my favorite is a bit of a surprise to me: I’m really enjoying Qwen3-8B. I’ve been running prompts through the MLX 4bit quantized version, mlx-community/Qwen3-8B-4bit. I’m using llm-mlx like this: llm install llm-mlx llm…
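
    A minimal sketch of the same idea using the mlx-lm Python package (which the llm-mlx plugin builds on) rather than the llm CLI shown in the post; the model ID comes from the post, while the exact generate() arguments are an assumption about the current mlx-lm API:

      # Assumes: pip install mlx-lm (Apple Silicon only); model ID from the post.
      from mlx_lm import load, generate

      # Downloads mlx-community/Qwen3-8B-4bit from Hugging Face on first use.
      model, tokenizer = load("mlx-community/Qwen3-8B-4bit")

      # Wrap the question in the model's chat template before generating.
      messages = [{"role": "user", "content": "Explain database sharding in one paragraph."}]
      prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

      print(generate(model, tokenizer, prompt=prompt, max_tokens=256))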

  • Docker: Introducing Docker Model Runner: A Better Way to Build and Run GenAI Models Locally

    Source URL: https://www.docker.com/blog/introducing-docker-model-runner/
    Source: Docker
    Title: Introducing Docker Model Runner: A Better Way to Build and Run GenAI Models Locally
    Feedly Summary: Docker Model Runner is a faster, simpler way to run and test AI models locally, right from your existing workflow.
    AI Summary and Description: Yes
    Summary: The text discusses the launch of Docker…
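
    Docker Model Runner serves pulled models behind an OpenAI-compatible endpoint, so the standard openai client can talk to it; a rough sketch in which the base URL, port, and model name are assumptions that depend on your Docker Desktop setup (the post has the authoritative details):

      # Assumes: a model has been pulled (e.g. `docker model pull ai/llama3.2`) and
      # host-side TCP access is enabled; base_url and model name are illustrative.
      from openai import OpenAI

      client = OpenAI(
          base_url="http://localhost:12434/engines/v1",  # assumed Model Runner endpoint
          api_key="not-needed",                          # local endpoint, no auth required
      )

      resp = client.chat.completions.create(
          model="ai/llama3.2",  # whichever model you pulled with `docker model pull`
          messages=[{"role": "user", "content": "Say hello from a local model."}],
      )
      print(resp.choices[0].message.content)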

  • Hacker News: Sharding Pgvector

    Source URL: https://pgdog.dev/blog/sharding-pgvector
    Source: Hacker News
    Title: Sharding Pgvector
    Feedly Summary: Comments
    AI Summary and Description: Yes
    Summary: The text discusses the implementation of a sharding strategy for handling vector indices in the pgvector database, focusing specifically on large-scale embeddings. It highlights the challenges of scaling vector searches and presents an approach using two indexing…
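
    The sharding described in the post happens at the PgDog routing layer, but each shard remains an ordinary Postgres node running pgvector; a minimal per-shard sketch with psycopg, where the table and connection string are made up for illustration:

      # Assumes: pip install psycopg, a Postgres node with the pgvector extension,
      # and an illustrative `items` table; a router like PgDog would send this to one shard.
      import psycopg

      with psycopg.connect("dbname=shard0") as conn, conn.cursor() as cur:
          cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
          cur.execute("CREATE TABLE IF NOT EXISTS items (id bigserial PRIMARY KEY, embedding vector(3))")
          cur.execute("INSERT INTO items (embedding) VALUES ('[1,2,3]'), ('[4,5,6]')")

          # IVFFlat is one of pgvector's two index types (the other is HNSW).
          cur.execute("CREATE INDEX IF NOT EXISTS items_embedding_idx ON items USING ivfflat (embedding vector_l2_ops)")

          # Nearest-neighbour lookup by L2 distance; with sharding, per-shard results
          # would be merged by the router before returning to the client.
          cur.execute("SELECT id FROM items ORDER BY embedding <-> '[1,2,2]' LIMIT 5")
          print(cur.fetchall())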

  • Simon Willison’s Weblog: mlx-community/OLMo-2-0325-32B-Instruct-4bit

    Source URL: https://simonwillison.net/2025/Mar/16/olmo2/#atom-everything
    Source: Simon Willison’s Weblog
    Title: mlx-community/OLMo-2-0325-32B-Instruct-4bit
    Feedly Summary: mlx-community/OLMo-2-0325-32B-Instruct-4bit OLMo 2 32B claims to be “the first fully-open model (all data, code, weights, and details are freely available) to outperform GPT3.5-Turbo and GPT-4o mini”. Thanks to the MLX project here’s a recipe that worked for me to run it on my Mac,…