Tag: vocabulary

  • Simon Willison’s Weblog: deepseek-ai/DeepSeek-V3-Base

    Source URL: https://simonwillison.net/2024/Dec/25/deepseek-v3/#atom-everything
    Source: Simon Willison’s Weblog
    Title: deepseek-ai/DeepSeek-V3-Base
    Feedly Summary: No model card or announcement yet, but this new model release from Chinese AI lab DeepSeek (an arm of Chinese hedge fund High-Flyer) looks very significant. It’s a huge model: 685B parameters, 687.9 GB on disk (TIL how to size a git-lfs…

  • Hacker News: Probably pay attention to tokenizers

    Source URL: https://cybernetist.com/2024/10/21/you-should-probably-pay-attention-to-tokenizers/
    Source: Hacker News
    Title: Probably pay attention to tokenizers
    Feedly Summary: Comments
    AI Summary and Description: Yes
    Summary: The text delves into the critical role of tokenization in AI applications, particularly those utilizing Retrieval-Augmented Generation (RAG). It emphasizes how understanding tokenization can significantly affect the performance of AI models, especially in contexts…