AI & ML interests
Efficient Training Recipes for Large Models (mostly LLMs)
Are you familiar with reverse residual connections or looping in language models?

Excited to share my Looped-GPT blog post and codebase: https://github.com/sanyalsunny111/Looped-GPT

TL;DR: looping during pre-training improves generalization.

[Plot: GPT-2 LMs pre-trained on 15.73B OpenWebText (OWT) tokens]

P.S. This is my first post here; I have ~4 followers and zero expectations for reach.
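For context on what "looping" means here: a looped LM reuses one set of transformer block weights across several passes through depth, rather than stacking independently parameterized layers. Below is a minimal PyTorch sketch of that idea, assuming a standard pre-norm GPT-2 style block; the names `Block`, `LoopedStack`, and `n_loops` are illustrative placeholders, not identifiers from the Looped-GPT codebase.

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """A standard pre-norm GPT-2 style transformer block (illustrative)."""
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Causal mask: True marks future positions that attention may not see.
        seq_len = x.size(1)
        mask = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=x.device),
            diagonal=1,
        )
        # Self-attention sub-layer with a residual connection.
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out
        # MLP sub-layer with a residual connection.
        x = x + self.mlp(self.ln2(x))
        return x

class LoopedStack(nn.Module):
    """Applies the SAME block n_loops times (weight tying across depth),
    instead of stacking n_loops independently parameterized blocks."""
    def __init__(self, d_model: int, n_heads: int, n_loops: int):
        super().__init__()
        self.block = Block(d_model, n_heads)  # one shared set of weights
        self.n_loops = n_loops

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for _ in range(self.n_loops):
            x = self.block(x)  # reuse the same weights each pass
        return x

# Usage: loop a single block 4 times over a toy batch.
x = torch.randn(2, 16, 64)  # (batch, seq, d_model)
y = LoopedStack(d_model=64, n_heads=4, n_loops=4)(x)
print(y.shape)  # torch.Size([2, 16, 64])
```

The appeal of this pattern is that it keeps the parameter count of a single block while spending compute like a deeper model; how Looped-GPT actually schedules looping during pre-training is detailed in the linked blog post and repo.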