vocabulary – Page 2 – Experimental News Clipping Site

Simon Willison’s Weblog: Anomalous Tokens in DeepSeek-V3 and r1

Jan 26, 2025

Source URL: https://simonwillison.net/2025/Jan/26/anomalous-tokens-in-deepseek-v3-and-r1/#atom-everything
Source: Simon Willison’s Weblog
Title: Anomalous Tokens in DeepSeek-V3 and r1
Feedly Summary: Glitch tokens are tokens or strings that trigger strange behavior in LLMs, hinting at oddities in their tokenizers or model weights. Here’s a fun exploration of them across DeepSeek v3 and R1.…

Simon Willison’s Weblog: deepseek-ai/DeepSeek-V3-Base

Dec 25, 2024, by system automation in Uncategorized

Source URL: https://simonwillison.net/2024/Dec/25/deepseek-v3/#atom-everything
Source: Simon Willison’s Weblog
Title: deepseek-ai/DeepSeek-V3-Base
Feedly Summary: No model card or announcement yet, but this new model release from Chinese AI lab DeepSeek (an arm of Chinese hedge fund High-Flyer) looks very significant. It’s a huge model – 685B parameters, 687.9 GB on disk (TIL how to size a git-lfs…

Hacker News: Probably pay attention to tokenizers

Oct 23, 2024, by system automation in Uncategorized

Source URL: https://cybernetist.com/2024/10/21/you-should-probably-pay-attention-to-tokenizers/
Source: Hacker News
Title: Probably pay attention to tokenizers
Feedly Summary: Comments
AI Summary and Description: Yes
**Summary:** The text delves into the critical role of tokenization in AI applications, particularly those utilizing Retrieval-Augmented Generation (RAG). It emphasizes how understanding tokenization can significantly affect the performance of AI models, especially in contexts…
