Nano LLMs (Collection): Really small LLMs pre-trained on a data-efficient 1B tokens • 3 items • Updated 1 day ago
Article: Scaling Pedagogical Pretraining: From Optimal Mixing to 10 Billion Tokens • 2 days ago