SOTAVerified

Mixture-of-Experts

Papers

Showing 1–10 of 1312 papers

Title | Status | Hype
DeepSeek-V3 Technical Report | Code | 16
Qwen2.5 Technical Report | Code | 13
Qwen2 Technical Report | Code | 13
A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications | Code | 9
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding | Code | 9
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence | Code | 9
Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters | Code | 9
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model | Code | 9
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention | Code | 7
HiDream-I1: A High-Efficient Image Generative Foundation Model with Sparse Diffusion Transformer | Code | 7
Page 1 of 132

No leaderboard results yet.