Tag: tokenization
-
Hacker News: I built a large language model "from scratch"
Source URL: https://brettgfitzgerald.com/posts/build-a-large-language-model/
Feedly Summary: Comments
AI Summary and Description: Yes
Summary: The text provides a detailed account of the author's experience learning about and building a Large Language Model (LLM) based on insights from Sebastian Raschka's book. It emphasizes the technical processes…
-
Simon Willison’s Weblog: Anomalous Tokens in DeepSeek-V3 and r1
Source URL: https://simonwillison.net/2025/Jan/26/anomalous-tokens-in-deepseek-v3-and-r1/#atom-everything
Feedly Summary: Glitch tokens (previously) are tokens or strings that trigger strange behavior in LLMs, hinting at oddities in their tokenizers or model weights. Here's a fun exploration of them across DeepSeek v3 and R1.…
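Glitch-token hunts like the one linked above usually start from a decode→encode round trip: a token is suspicious if re-encoding its own decoded string does not give the token id back. A minimal, self-contained sketch of that idea follows; the toy vocabulary, greedy encoder, and lowercasing step are all hypothetical illustrations, not DeepSeek's actual tokenizer.

```python
# Toy round-trip probe for anomalous tokens. A real probe would use the
# model's actual tokenizer; here a tiny hand-built vocab stands in.

VOCAB = {0: "the", 1: "cat", 2: " ", 3: "The"}
STR_TO_ID = {s: i for i, s in VOCAB.items()}

def encode(text):
    """Greedy longest-match encoding over the toy vocab.

    The encoder lowercases its input first (a stand-in for tokenizer
    normalization), so token 3 ("The") can be decoded but never
    re-encoded -- the kind of mismatch that produces glitch tokens.
    """
    text = text.lower()
    ids, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in STR_TO_ID:
                ids.append(STR_TO_ID[text[i:j]])
                i = j
                break
        else:
            i += 1  # skip characters the vocab cannot represent
    return ids

def anomalous_tokens(vocab):
    """Token ids whose decode->encode round trip loses the id."""
    return [i for i, s in vocab.items() if encode(s) != [i]]

print(anomalous_tokens(VOCAB))  # token 3 fails the round trip
```

Real anomalous tokens arise for analogous reasons: strings that exist in the vocabulary but are unreachable (or barely seen) under the tokenizer's actual normalization and merge rules.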
-
Cloud Blog: Cloud CISO Perspectives: Talk cyber in business terms to win allies
Source URL: https://cloud.google.com/blog/products/identity-security/cloud-ciso-perspectives-talk-cyber-in-business-terms-to-win-allies/
Feedly Summary: Welcome to the first Cloud CISO Perspectives for January 2025. We're starting off the year at the top with boards of directors, and how talking about cybersecurity in business terms can help us better convey…
-
Hacker News: A16Z 2025 Big Ideas for Crypto
Source URL: https://a16zcrypto.com/posts/article/big-ideas-crypto-2025/
Feedly Summary: Comments
AI Summary and Description: Yes
Summary: The text outlines emerging trends in AI, crypto, and governance that may shape the technology landscape in 2025. It highlights the transition of AIs into agentic roles, the necessity of unique digital identities,…
-
Hacker News: Don’t Look Twice: Faster Video Transformers with Run-Length Tokenization
Source URL: https://rccchoudhury.github.io/rlt/
Feedly Summary: Comments
AI Summary and Description: Yes
Summary: The text presents a novel approach called Run-Length Tokenization (RLT) aimed at optimizing video transformers by eliminating redundant tokens. This content-aware method results in substantial speed improvements for training and…
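The compression step behind RLT is classic run-length encoding applied to tokens: a run of identical consecutive tokens (e.g. patches of a static region repeated across frames) collapses to one token plus a run length. A pure-Python sketch of that step, using hashable placeholders where the real method operates on video patch tokens:

```python
from itertools import groupby

def run_length_tokenize(patch_tokens):
    """Collapse runs of identical consecutive patch tokens into
    (token, run_length) pairs, so a static region repeated across
    frames costs a single token instead of one per frame."""
    return [(tok, sum(1 for _ in run)) for tok, run in groupby(patch_tokens)]

# Six input tokens compress to three (token, length) pairs.
print(run_length_tokenize([5, 5, 5, 7, 7, 5]))
```

The speedup comes from the transformer then attending over the shorter compressed sequence, with the run lengths available to preserve positional/temporal information.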