Tag: computational costs
-
Hacker News: NanoGPT (124M) quality in 3.25B training tokens (vs. 10B)
Source URL: https://github.com/KellerJordan/modded-nanogpt Source: Hacker News Title: NanoGPT (124M) quality in 3.25B training tokens (vs. 10B) Feedly Summary: Comments AI Summary and Description: Yes Summary: The provided text outlines a modified PyTorch trainer for GPT-2 that achieves training efficiency improvements through architectural updates and a novel optimizer. This is relevant for professionals in AI and…