Tag: deepspeed

  • Simon Willison’s Weblog: Shisa V2 405B: Japan’s Highest Performing LLM

    Source URL: https://simonwillison.net/2025/Jun/3/shisa-v2/
    Source: Simon Willison’s Weblog
    Title: Shisa V2 405B: Japan’s Highest Performing LLM
    Feedly Summary: Leonard Lin and Adam Lensenmayer have been working on Shisa for a while. They describe their latest release as “Japan’s Highest Performing LLM”. Shisa V2 405B is the highest-performing LLM ever…

  • Hacker News: Liger-kernel: Efficient triton kernels for LLM training

    Source URL: https://github.com/linkedin/Liger-Kernel
    Source: Hacker News
    Title: Liger-kernel: Efficient triton kernels for LLM training
    Summary: Liger Kernel is a collection of Triton kernels aimed at making large language model (LLM) training more efficient by significantly improving throughput and reducing memory usage. It is particularly relevant for AI…