LOCA-bench: Benchmarking Language Agents Under Controllable and Extreme Context Growth • Paper • 2602.07962 • Published 2 days ago • 21
The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution • Paper • 2510.25726 • Published Oct 29, 2025 • 46
ilovesushiandkimchiandmalaxiangguo/shuPT-xiaohongshu-Llama3-8B-Instruct • Text Generation • 8B • Updated Jan 26, 2025 • 1 • 2
AdaMMS: Model Merging for Heterogeneous Multimodal Large Language Models with Unsupervised Coefficient Optimization • Paper • 2503.23733 • Published Mar 31, 2025 • 10