| Title | Date | Tags | Code | Rating |
| --- | --- | --- | --- | --- |
| LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training | Jun 24, 2024 | Mixture-of-Experts | Code Available | 5/5 |
| DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models | Jan 11, 2024 | Language Modelling, Large Language Model | Code Available | 5/5 |
| Chatlaw: A Multi-Agent Collaborative Legal Assistant with Knowledge Graph Enhanced Mixture-of-Experts Large Language Model | Jun 28, 2023 | Hallucination, Knowledge Graphs | Code Available | 5/5 |
| OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models | Jan 29, 2024 | Decoder, Mixture-of-Experts | Code Available | 5/5 |
| DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale | Jun 30, 2022 | CPU, GPU | Code Available | 4/5 |
| JetMoE: Reaching Llama2 Performance with 0.1M Dollars | Apr 11, 2024 | GPU, Mixture-of-Experts | Code Available | 4/5 |
| OLMoE: Open Mixture-of-Experts Language Models | Sep 3, 2024 | Language Modelling | Code Available | 4/5 |
| MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts | Oct 9, 2024 | GPU, Mixture-of-Experts | Code Available | 4/5 |
| Training Sparse Mixture Of Experts Text Embedding Models | Feb 11, 2025 | Mixture-of-Experts, RAG | Code Available | 4/5 |
| Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free | May 10, 2025 | Attribute, Mixture-of-Experts | Code Available | 4/5 |
| Mixtral of Experts | Jan 8, 2024 | Code Generation, Common Sense Reasoning | Code Available | 4/5 |
| Fast Inference of Mixture-of-Experts Language Models with Offloading | Dec 28, 2023 | Mixture-of-Experts, Quantization | Code Available | 4/5 |
| Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models | Jun 3, 2024 | Language Modelling | Code Available | 4/5 |
| Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models | Jul 2, 2024 | Mixture-of-Experts, Parameter-Efficient Fine-Tuning | Code Available | 4/5 |
| MoH: Multi-Head Attention as Mixture-of-Head Attention | Oct 15, 2024 | Mixture-of-Experts | Code Available | 4/5 |
| Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts | Sep 24, 2024 | Computational Efficiency, Mixture-of-Experts | Code Available | 4/5 |
| BlackMamba: Mixture of Experts for State-Space Models | Feb 1, 2024 | Language Modelling | Code Available | 3/5 |
| Learning Heterogeneous Mixture of Scene Experts for Large-scale Neural Radiance Fields | May 4, 2025 | Mixture-of-Experts, NeRF | Code Available | 3/5 |
| Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters | Mar 18, 2024 | Continual Learning, Incremental Learning | Code Available | 3/5 |
| MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts | May 2, 2024 | Combinatorial Optimization, Mixture-of-Experts | Code Available | 3/5 |
| A Survey on Mixture of Experts | Jun 26, 2024 | In-Context Learning, Mixture-of-Experts | Code Available | 3/5 |
| A Survey on Inference Optimization Techniques for Mixture of Experts Models | Dec 18, 2024 | Computational Efficiency, Distributed Computing | Code Available | 3/5 |
| AnyGraph: Graph Foundation Model in the Wild | Aug 20, 2024 | Graph Learning, Mixture-of-Experts | Code Available | 3/5 |
| Generalizing Motion Planners with Mixture of Experts for Autonomous Driving | Oct 21, 2024 | Autonomous Driving, Data Augmentation | Code Available | 3/5 |
| MoAI: Mixture of All Intelligence for Large Language and Vision Models | Mar 12, 2024 | All, Mixture-of-Experts | Code Available | 3/5 |
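
Most of the papers listed above build on the same core mechanism: a learned router scores the experts for each token and dispatches the token to only its top-k experts, so compute stays roughly constant as the parameter count grows. The sketch below is a minimal, illustrative PyTorch version of that shared top-k gating pattern, not the method of any specific paper in the table; the names `ExpertMLP` and `SparseMoE` and all sizes (`d_model=512`, `d_hidden=2048`, `n_experts=8`, `top_k=2`) are assumed for demonstration.

```python
# Minimal sketch of sparse top-k MoE routing (illustrative; sizes and names
# are assumptions, not taken from any paper in the table above).
import torch
import torch.nn as nn
import torch.nn.functional as F


class ExpertMLP(nn.Module):
    """One feed-forward expert (hypothetical dimensions)."""

    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_hidden),
            nn.GELU(),
            nn.Linear(d_hidden, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class SparseMoE(nn.Module):
    """Token-level top-k gating: each token goes to its k highest-scoring experts."""

    def __init__(self, d_model: int = 512, d_hidden: int = 2048,
                 n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(
            ExpertMLP(d_model, d_hidden) for _ in range(n_experts)
        )
        self.gate = nn.Linear(d_model, n_experts)  # the router
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model)
        scores = self.gate(x)                           # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # keep k experts per token
        weights = F.softmax(weights, dim=-1)            # renormalize over kept experts
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            # Gather the tokens routed to expert e and mix its output back in.
            token_ids, slot = (idx == e).nonzero(as_tuple=True)
            if token_ids.numel():
                out[token_ids] += (
                    weights[token_ids, slot].unsqueeze(-1) * expert(x[token_ids])
                )
        return out


tokens = torch.randn(16, 512)
print(SparseMoE()(tokens).shape)  # torch.Size([16, 512])
```

The per-expert Python loop is kept only for readability; production systems such as DeepSpeed Inference instead batch tokens by expert and fuse the dispatch and combine steps, and training-time variants typically add a load-balancing auxiliary loss so the router does not collapse onto a few experts.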