Mixture-of-Experts

Papers


Showing 51–100 of 1312 papers

Title	Date	Tasks	Status	Hype
A Survey on Inference Optimization Techniques for Mixture of Experts Models	Dec 18, 2024	Computational Efficiency, Distributed Computing	Code Available	3
LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation	Aug 28, 2024	Computational Efficiency, Hallucination	Code Available	3
A Survey on Mixture of Experts	Jun 26, 2024	In-Context Learning, Mixture-of-Experts	Code Available	3
Learning Heterogeneous Mixture of Scene Experts for Large-scale Neural Radiance Fields	May 4, 2025	Mixture-of-Experts, NeRF	Code Available	3
YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer Architectures and Cross-dataset Stem Augmentation	Jul 5, 2024	Drum Transcription, Drum Transcription in Music (DTM)	Code Available	3
Scaling Laws for Fine-Grained Mixture of Experts	Feb 12, 2024	Mixture-of-Experts	Code Available	3
BlackMamba: Mixture of Experts for State-Space Models	Feb 1, 2024	Language Modeling, Language Modelling	Code Available	3
Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters	Mar 18, 2024	Continual Learning, Incremental Learning	Code Available	3
Reservoir History Matching of the Norne field with generative exotic priors and a coupled Mixture of Experts -- Physics Informed Neural Operator Forward Model	Jun 2, 2024	Denoising, Mixture-of-Experts	Code Available	3
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling	Dec 23, 2023	Instruction Following, Language Modeling	Code Available	3
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts	Jan 8, 2024	Mamba, Mixture-of-Experts	Code Available	3
Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning	Sep 11, 2023	Mixture-of-Experts, parameter-efficient fine-tuning	Code Available	2
Harder Tasks Need More Experts: Dynamic Routing in MoE Models	Mar 12, 2024	Computational Efficiency, Mixture-of-Experts	Code Available	2
QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models	Oct 25, 2023	GPU, Mixture-of-Experts	Code Available	2
Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks	Jan 5, 2024	Arithmetic Reasoning, Code Generation	Code Available	2
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer	Jan 23, 2017	Computational Efficiency, GPU	Code Available	2
Object Detection using Event Camera: A MoE Heat Conduction based Detector and A New Benchmark Dataset	Dec 9, 2024	Computational Efficiency, Mixture-of-Experts	Code Available	2
Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models	Feb 22, 2024	All, Mixture-of-Experts	Code Available	2
Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models	Oct 2, 2024	Mixture-of-Experts, Navigate	Code Available	2
Motion In-Betweening with Phase Manifolds	Aug 24, 2023	Mixture-of-Experts, motion in-betweening	Code Available	2
Multi-Task Dense Prediction via Mixture of Low-Rank Experts	Mar 26, 2024	Decoder, Mixture-of-Experts	Code Available	2
Flex-MoE: Modeling Arbitrary Modality Combination via the Flexible Mixture-of-Experts	Oct 10, 2024	Mixture-of-Experts	Code Available	2
Monet: Mixture of Monosemantic Experts for Transformers	Dec 5, 2024	Dictionary Learning, Mixture-of-Experts	Code Available	2
Fast Feedforward Networks	Aug 28, 2023	Mixture-of-Experts	Code Available	2
MoFE-Time: Mixture of Frequency Domain Experts for Time-Series Forecasting Models	Jul 9, 2025	Mixture-of-Experts, Time Series	Code Available	2
Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models	Apr 16, 2024	image-classification, Image Classification	Code Available	2
MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks	Jun 7, 2024	Computational Efficiency, Mixture-of-Experts	Code Available	2
MoEUT: Mixture-of-Experts Universal Transformers	May 25, 2024	Language Modeling, Language Modelling	Code Available	2
No Language Left Behind: Scaling Human-Centered Machine Translation	Jul 11, 2022	Machine Translation, Mixture-of-Experts	Code Available	2
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing	Dec 19, 2024	Mixture-of-Experts	Code Available	2
Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts	Oct 14, 2024	Mixture-of-Experts	Code Available	2
Mixture of A Million Experts	Jul 4, 2024	Computational Efficiency, Language Modeling	Code Available	2
Mixture of Lookup Experts	Mar 20, 2025	Mixture-of-Experts	Code Available	2
Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models	May 23, 2024	Mixture-of-Experts, Visual Question Answering	Code Available	2
Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation	Mar 18, 2024	Mixture-of-Experts, parameter-efficient fine-tuning	Code Available	2
MiniDrive: More Efficient Vision-Language Models with Multi-Level 2D Features as Text Tokens for Autonomous Driving	Sep 11, 2024	Autonomous Driving, Feature Engineering	Code Available	2
ModuleFormer: Modularity Emerges from Mixture-of-Experts	Jun 7, 2023	Language Modelling, Lightweight Deployment	Code Available	2
LoRA-IR: Taming Low-Rank Experts for Efficient All-in-One Image Restoration	Oct 20, 2024	All, Computational Efficiency	Code Available	2
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment	Feb 24, 2025	image-classification, Image Classification	Code Available	2
Demystifying the Compression of Mixture-of-Experts Through a Unified Framework	Jun 4, 2024	Mixture-of-Experts	Code Available	2
MC-MoE: Mixture Compressor for Mixture-of-Experts LLMs Gains More	Oct 8, 2024	Mixture-of-Experts, Quantization	Code Available	2
Delta Decompression for MoE-based LLMs Compression	Feb 24, 2025	Diversity, Mixture-of-Experts	Code Available	2
DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification	Dec 14, 2024	Mixture-of-Experts, Object	Code Available	2
Mixture of Tokens: Continuous MoE through Cross-Example Aggregation	Oct 24, 2023	Language Modelling, Large Language Model	Code Available	2
LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training	Nov 24, 2024	Math, Mixture-of-Experts	Code Available	2
MDFEND: Multi-domain Fake News Detection	Jan 4, 2022	Fake News Detection, Mixture-of-Experts	Code Available	2
Decomposing the Neurons: Activation Sparsity via Mixture of Experts for Continual Test Time Adaptation	May 26, 2024	feature selection, Mixture-of-Experts	Code Available	2
Learning Robust Stereo Matching in the Wild with Selective Mixture-of-Experts	Jul 7, 2025	Inductive Bias, Mixture-of-Experts	Code Available	2
KAN4TSF: Are KAN and KAN-based models Effective for Time Series Forecasting?	Aug 21, 2024	Mixture-of-Experts, Time Series	Code Available	2
A Closer Look into Mixture-of-Experts in Large Language Models	Jun 26, 2024	Computational Efficiency, Diversity	Code Available	2


