Music Flamingo: Scaling Music Understanding in Audio Language Models Paper • 2511.10289 • Published Nov 13, 2025 • 17
IGGT: Instance-Grounded Geometry Transformer for Semantic 3D Reconstruction Paper • 2510.22706 • Published Oct 26, 2025 • 41 • 1
IGGT: Instance-Grounded Geometry Transformer for Semantic 3D Reconstruction Paper • 2510.22706 • Published Oct 26, 2025 • 41
A Survey of Reinforcement Learning for Large Reasoning Models Paper • 2509.08827 • Published Sep 10, 2025 • 190
LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion Paper • 2507.02813 • Published Jul 3, 2025 • 60
LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion Paper • 2507.02813 • Published Jul 3, 2025 • 60 • 1
xbench: Tracking Agents Productivity Scaling with Profession-Aligned Real-World Evaluations Paper • 2506.13651 • Published Jun 16, 2025 • 8
xbench: Tracking Agents Productivity Scaling with Profession-Aligned Real-World Evaluations Paper • 2506.13651 • Published Jun 16, 2025 • 8
xbench: Tracking Agents Productivity Scaling with Profession-Aligned Real-World Evaluations Paper • 2506.13651 • Published Jun 16, 2025 • 8 • 2
ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding Paper • 2506.01853 • Published Jun 2, 2025 • 32
Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence Paper • 2505.23747 • Published May 29, 2025 • 69
Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence Paper • 2505.23747 • Published May 29, 2025 • 69 • 3
An Empirical Study of GPT-4o Image Generation Capabilities Paper • 2504.05979 • Published Apr 8, 2025 • 64