SOTAVerified

Mixture-of-Experts

Papers

Showing 11–20 of 1312 papers

Title | Status | Hype
MiniMax-01: Scaling Foundation Models with Lightning Attention | Code | 7
HiDream-I1: A High-Efficient Image Generative Foundation Model with Sparse Diffusion Transformer | Code | 7
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention | Code | 7
Revisiting MoE and Dense Speed-Accuracy Comparisons for LLM Training | Code | 7
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models | Code | 5
Kimi-VL Technical Report | Code | 5
Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts | Code | 5
Aria: An Open Multimodal Native Mixture-of-Experts Model | Code | 5
Jamba-1.5: Hybrid Transformer-Mamba Models at Scale | Code | 5
Chatlaw: A Multi-Agent Collaborative Legal Assistant with Knowledge Graph Enhanced Mixture-of-Experts Large Language Model | Code | 5
Page 2 of 132

No leaderboard results yet.