omni models
showlab/show-o Any-to-Any • Updated Jun 21, 2025 • 83 • 17
Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models Paper • 2501.11873 • Published Jan 21, 2025 • 68
Keep in Mind's Paper
Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation Paper • 2302.09664 • Published Feb 19, 2023 • 4
Shortened LLaMA: A Simple Depth Pruning for Large Language Models Paper • 2402.02834 • Published Feb 5, 2024 • 17
Rethinking Optimization and Architecture for Tiny Language Models Paper • 2402.02791 • Published Feb 5, 2024 • 13
Self-Discover: Large Language Models Self-Compose Reasoning Structures Paper • 2402.03620 • Published Feb 6, 2024 • 117