| Title | Date | Tags | Code | # |
|---|---|---|---|---|
| Higher Layers Need More LoRA Experts | Feb 13, 2024 | Mixture-of-Experts | Code Available | 2 |
| TESTAM: A Time-Enhanced Spatio-Temporal Attention Model with Mixture of Experts | Mar 5, 2024 | Graph Attention, Graph Embedding | Code Available | 2 |
| Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models | Oct 2, 2024 | Mixture-of-Experts, Navigate | Code Available | 2 |
| Multi-Task Dense Prediction via Mixture of Low-Rank Experts | Mar 26, 2024 | Decoder, Mixture-of-Experts | Code Available | 2 |
| Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts | Oct 14, 2024 | Mixture-of-Experts | Code Available | 2 |
| KAN4TSF: Are KAN and KAN-based models Effective for Time Series Forecasting? | Aug 21, 2024 | Mixture-of-Experts, Time Series | Code Available | 2 |
| Monet: Mixture of Monosemantic Experts for Transformers | Dec 5, 2024 | Dictionary Learning, Mixture-of-Experts | Code Available | 2 |
| Motion In-Betweening with Phase Manifolds | Aug 24, 2023 | Mixture-of-Experts, Motion In-Betweening | Code Available | 2 |
| No Language Left Behind: Scaling Human-Centered Machine Translation | Jul 11, 2022 | Machine Translation, Mixture-of-Experts | Code Available | 2 |
| MoFE-Time: Mixture of Frequency Domain Experts for Time-Series Forecasting Models | Jul 9, 2025 | Mixture-of-Experts, Time Series | Code Available | 2 |
| Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models | May 23, 2024 | Mixture-of-Experts, Visual Question Answering | Code Available | 2 |
| MoEUT: Mixture-of-Experts Universal Transformers | May 25, 2024 | Language Modeling | Code Available | 2 |
| MoE-FFD: Mixture of Experts for Generalized and Parameter-Efficient Face Forgery Detection | Apr 12, 2024 | Mixture-of-Experts | Code Available | 2 |
| ModuleFormer: Modularity Emerges from Mixture-of-Experts | Jun 7, 2023 | Language Modeling, Lightweight Deployment | Code Available | 2 |
| A Closer Look into Mixture-of-Experts in Large Language Models | Jun 26, 2024 | Computational Efficiency, Diversity | Code Available | 2 |
| LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training | Nov 24, 2024 | Math, Mixture-of-Experts | Code Available | 2 |
| Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts | Mar 7, 2025 | Mixture-of-Experts, State Space Models | Code Available | 2 |
| Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models | Feb 22, 2024 | Mixture-of-Experts | Code Available | 2 |
| Mixture of Lookup Experts | Mar 20, 2025 | Mixture-of-Experts | Code Available | 2 |
| Mixture of A Million Experts | Jul 4, 2024 | Computational Efficiency, Language Modeling | Code Available | 2 |
| Mixture of Tokens: Continuous MoE through Cross-Example Aggregation | Oct 24, 2023 | Language Modeling, Large Language Model | Code Available | 2 |
| Decomposing the Neurons: Activation Sparsity via Mixture of Experts for Continual Test Time Adaptation | May 26, 2024 | Feature Selection, Mixture-of-Experts | Code Available | 2 |
| CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts | May 9, 2024 | Image Captioning, Instruction Following | Code Available | 2 |
| Aurora: Activating Chinese chat capability for Mixtral-8x7B sparse Mixture-of-Experts through Instruction-Tuning | Dec 22, 2023 | Instruction Following, Mixture-of-Experts | Code Available | 2 |
| MDFEND: Multi-domain Fake News Detection | Jan 4, 2022 | Fake News Detection, Mixture-of-Experts | Code Available | 2 |
| MiniDrive: More Efficient Vision-Language Models with Multi-Level 2D Features as Text Tokens for Autonomous Driving | Sep 11, 2024 | Autonomous Driving, Feature Engineering | Code Available | 2 |
| CNMBERT: A Model for Converting Hanyu Pinyin Abbreviations to Chinese Characters | Nov 18, 2024 | Fill Mask | Code Available | 2 |
| CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling | Sep 28, 2024 | Image Classification | Code Available | 2 |
| MC-MoE: Mixture Compressor for Mixture-of-Experts LLMs Gains More | Oct 8, 2024 | Mixture-of-Experts, Quantization | Code Available | 2 |
| Object Detection using Event Camera: A MoE Heat Conduction based Detector and A New Benchmark Dataset | Dec 9, 2024 | Computational Efficiency, Mixture-of-Experts | Code Available | 2 |
| Superposition in Transformers: A Novel Way of Building Mixture of Experts | Dec 31, 2024 | Mixture-of-Experts | Code Available | 2 |
| M4: Multi-Proxy Multi-Gate Mixture of Experts Network for Multiple Instance Learning in Histopathology Image Analysis | Jul 24, 2024 | Mixture-of-Experts, Multiple Instance Learning | Code Available | 1 |
| M^3ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design | Oct 26, 2022 | Mixture-of-Experts, Multi-Task Learning | Code Available | 1 |
| M^4oE: A Foundation Model for Medical Multimodal Image Segmentation with Mixture of Experts | May 15, 2024 | Image Segmentation, Mixture-of-Experts | Code Available | 1 |
| M3oE: Multi-Domain Multi-Task Mixture-of-Experts Recommendation Framework | Apr 29, 2024 | AutoML, Mixture-of-Experts | Code Available | 1 |
| LMHaze: Intensity-aware Image Dehazing with a Large-scale Multi-intensity Real Haze Dataset | Oct 21, 2024 | Image Dehazing, Mamba | Code Available | 1 |
| M3-Jepa: Multimodal Alignment via Multi-directional MoE based on the JEPA framework | Sep 9, 2024 | Computational Efficiency, Cross-Modal Retrieval | Code Available | 1 |
| LOLA -- An Open-Source Massively Multilingual Large Language Model | Sep 17, 2024 | Diversity, Language Modeling | Code Available | 1 |
| LLMBind: A Unified Modality-Task Integration Framework | Feb 22, 2024 | AI Agent, Audio Generation | Code Available | 1 |
| PAD-Net: An Efficient Framework for Dynamic Networks | Nov 10, 2022 | Image Classification | Code Available | 1 |
| LLMCarbon: Modeling the end-to-end Carbon Footprint of Large Language Models | Sep 25, 2023 | GPU, Mixture-of-Experts | Code Available | 1 |
| Long-Tailed Visual Recognition via Self-Heterogeneous Integration with Knowledge Excavation | Apr 3, 2023 | Mixture-of-Experts, Transfer Learning | Code Available | 1 |
| Making Neural Networks Interpretable with Attribution: Application to Implicit Signals Prediction | Aug 26, 2020 | Interpretable Machine Learning, Mixture-of-Experts | Code Available | 1 |
| LITE: Modeling Environmental Ecosystems with Multimodal Large Language Models | Apr 1, 2024 | Decision Making, Language Modeling | Code Available | 1 |
| AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality | Oct 14, 2024 | Mixture-of-Experts, Parameter-Efficient Fine-Tuning | Code Available | 1 |
| A Time Series is Worth Five Experts: Heterogeneous Mixture of Experts for Traffic Flow Prediction | Sep 26, 2024 | Mixture-of-Experts, Prediction | Code Available | 1 |
| Learning to Skip the Middle Layers of Transformers | Jun 26, 2025 | Mixture-of-Experts | Code Available | 1 |
| LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in Large Language Models | Nov 1, 2024 | Benchmarking, Mixture-of-Experts | Code Available | 1 |
| ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model | Feb 20, 2025 | Mixture-of-Experts, Question Answering | Code Available | 1 |
| Parameter Efficient Adaptation for Image Restoration with Heterogeneous Mixture-of-Experts | Dec 12, 2023 | Denoising, Diversity | Code Available | 1 |