- Large Language Models as Optimizers
  Paper • 2309.03409 • Published • 78
- Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
  Paper • 2404.02258 • Published • 107
- OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework
  Paper • 2404.14619 • Published • 126
- Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
  Paper • 2404.14219 • Published • 259
HAN JUNGU
JUNGU
AI & ML interests
None yet
Recent Activity
liked a Space about 5 hours ago
multimodalart/qwen-image-multiple-angles-3d-camera
liked a model 17 days ago
zai-org/GLM-4.7-Flash
liked a dataset 20 days ago
facebook/research-plan-gen