| Title | Date | Tags | Code |
|---|---|---|---|
| Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free | May 10, 2025 | Attribute, Mixture-of-Experts | Code Available |
| Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models | Jun 3, 2024 | Language Modeling | Code Available |
| Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts | Sep 24, 2024 | Computational Efficiency, Mixture-of-Experts | Code Available |
| Fast Inference of Mixture-of-Experts Language Models with Offloading | Dec 28, 2023 | Mixture-of-Experts, Quantization | Code Available |
| MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts | Oct 9, 2024 | GPU, Mixture-of-Experts | Code Available |
| Training Sparse Mixture Of Experts Text Embedding Models | Feb 11, 2025 | Mixture-of-Experts, RAG | Code Available |
| Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models | Jul 2, 2024 | Mixture-of-Experts, Parameter-Efficient Fine-Tuning | Code Available |
| OLMoE: Open Mixture-of-Experts Language Models | Sep 3, 2024 | Language Modeling | Code Available |
| DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale | Jun 30, 2022 | CPU, GPU | Code Available |
| JetMoE: Reaching Llama2 Performance with 0.1M Dollars | Apr 11, 2024 | GPU, Mixture-of-Experts | Code Available |