
Mixture-of-Experts

Papers

Showing 11–20 of 1312 papers

| Title | Status | Hype |
| --- | --- | --- |
| Revisiting MoE and Dense Speed-Accuracy Comparisons for LLM Training | Code | 7 |
| MiniMax-01: Scaling Foundation Models with Lightning Attention | Code | 7 |
| HiDream-I1: A High-Efficient Image Generative Foundation Model with Sparse Diffusion Transformer | Code | 7 |
| MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention | Code | 7 |
| DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models | Code | 5 |
| Kimi-VL Technical Report | Code | 5 |
| Comet: Fine-grained Computation-communication Overlapping for Mixture-of-Experts | Code | 5 |
| Aria: An Open Multimodal Native Mixture-of-Experts Model | Code | 5 |
| Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent | Code | 5 |
| Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts | Code | 5 |
Page 2 of 132
