| Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models | Feb 22, 2024 | Mixture-of-Experts | Code Available | 2 | 5 |
| Learning Robust Stereo Matching in the Wild with Selective Mixture-of-Experts | Jul 7, 2025 | Inductive Bias, Mixture-of-Experts | Code Available | 2 | 5 |
| No Language Left Behind: Scaling Human-Centered Machine Translation | Jul 11, 2022 | Machine Translation, Mixture-of-Experts | Code Available | 2 | 5 |
| Object Detection using Event Camera: A MoE Heat Conduction based Detector and A New Benchmark Dataset | Dec 9, 2024 | Computational Efficiency, Mixture-of-Experts | Code Available | 2 | 5 |
| CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts | May 9, 2024 | Image Captioning, Instruction Following | Code Available | 2 | 5 |
| LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models via MoE-Style Plugin | Dec 15, 2023 | Language Modelling, Mixture-of-Experts | Code Available | 2 | 5 |
| LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes | Jan 7, 2025 | Mixture-of-Experts, Representation Learning | Code Available | 2 | 5 |
| Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts | Oct 14, 2024 | Mixture-of-Experts | Code Available | 2 | 5 |
| Monet: Mixture of Monosemantic Experts for Transformers | Dec 5, 2024 | Dictionary Learning, Mixture-of-Experts | Code Available | 2 | 5 |
| Motion In-Betweening with Phase Manifolds | Aug 24, 2023 | Mixture-of-Experts, Motion In-Betweening | Code Available | 2 | 5 |
| MoFE-Time: Mixture of Frequency Domain Experts for Time-Series Forecasting Models | Jul 9, 2025 | Mixture-of-Experts, Time Series | Code Available | 2 | 5 |
| HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference | Apr 8, 2025 | CPU, GPU | Code Available | 2 | 5 |
| A Closer Look into Mixture-of-Experts in Large Language Models | Jun 26, 2024 | Computational Efficiency, Diversity | Code Available | 2 | 5 |
| WINA: Weight Informed Neuron Activation for Accelerating Large Language Model Inference | May 26, 2025 | Language Modelling | Code Available | 2 | 5 |
| MoEUT: Mixture-of-Experts Universal Transformers | May 25, 2024 | Language Modelling | Code Available | 2 | 5 |
| Yuan 2.0-M32: Mixture of Experts with Attention Router | May 28, 2024 | ARC, Math | Code Available | 2 | 5 |
| Multi-Task Dense Prediction via Mixture of Low-Rank Experts | Mar 26, 2024 | Decoder, Mixture-of-Experts | Code Available | 2 | 5 |
| MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks | Jun 7, 2024 | Computational Efficiency, Mixture-of-Experts | Code Available | 2 | 5 |
| CNMBERT: A Model for Converting Hanyu Pinyin Abbreviations to Chinese Characters | Nov 18, 2024 | Fill Mask | Code Available | 2 | 5 |
| MoE-FFD: Mixture of Experts for Generalized and Parameter-Efficient Face Forgery Detection | Apr 12, 2024 | Mixture-of-Experts | Code Available | 2 | 5 |
| CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling | Sep 28, 2024 | Image Classification | Code Available | 2 | 5 |
| Flex-MoE: Modeling Arbitrary Modality Combination via the Flexible Mixture-of-Experts | Oct 10, 2024 | Mixture-of-Experts | Code Available | 2 | 5 |
| Mixture of Tokens: Continuous MoE through Cross-Example Aggregation | Oct 24, 2023 | Language Modelling, Large Language Model | Code Available | 2 | 5 |
| ModuleFormer: Modularity Emerges from Mixture-of-Experts | Jun 7, 2023 | Language Modelling, Lightweight Deployment | Code Available | 2 | 5 |
| Mixture of A Million Experts | Jul 4, 2024 | Computational Efficiency, Language Modeling | Code Available | 2 | 5 |
| MDFEND: Multi-domain Fake News Detection | Jan 4, 2022 | Fake News Detection, Mixture-of-Experts | Code Available | 2 | 5 |
| MC-MoE: Mixture Compressor for Mixture-of-Experts LLMs Gains More | Oct 8, 2024 | Mixture-of-Experts, Quantization | Code Available | 2 | 5 |
| MiniDrive: More Efficient Vision-Language Models with Multi-Level 2D Features as Text Tokens for Autonomous Driving | Sep 11, 2024 | Autonomous Driving, Feature Engineering | Code Available | 2 | 5 |
| Mixture of Lookup Experts | Mar 20, 2025 | Mixture-of-Experts | Code Available | 2 | 5 |
| Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models | Apr 16, 2024 | Image Classification | Code Available | 2 | 5 |
| Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models | Oct 2, 2024 | Mixture-of-Experts, Navigate | Code Available | 2 | 5 |
| M4: Multi-Proxy Multi-Gate Mixture of Experts Network for Multiple Instance Learning in Histopathology Image Analysis | Jul 24, 2024 | Mixture-of-Experts, Multiple Instance Learning | Code Available | 1 | 5 |
| M^4oE: A Foundation Model for Medical Multimodal Image Segmentation with Mixture of Experts | May 15, 2024 | Image Segmentation, Mixture-of-Experts | Code Available | 1 | 5 |
| Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis | Sep 7, 2023 | Image Generation, Mixture-of-Experts | Code Available | 1 | 5 |
| Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference | Jan 16, 2024 | GPU, Mixture-of-Experts | Code Available | 1 | 5 |
| M3-Jepa: Multimodal Alignment via Multi-directional MoE based on the JEPA framework | Sep 9, 2024 | Computational Efficiency, Cross-Modal Retrieval | Code Available | 1 | 5 |
| M3oE: Multi-Domain Multi-Task Mixture-of-Experts Recommendation Framework | Apr 29, 2024 | AutoML, Mixture-of-Experts | Code Available | 1 | 5 |
| Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark | Jun 12, 2024 | Benchmarking, Mixture-of-Experts | Code Available | 1 | 5 |
| Specialized federated learning using a mixture of experts | Oct 5, 2020 | Federated Learning, Mixture-of-Experts | Code Available | 1 | 5 |
| M^3ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design | Oct 26, 2022 | Mixture-of-Experts, Multi-Task Learning | Code Available | 1 | 5 |
| Making Neural Networks Interpretable with Attribution: Application to Implicit Signals Prediction | Aug 26, 2020 | Interpretable Machine Learning, Mixture-of-Experts | Code Available | 1 | 5 |
| Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts | Aug 22, 2023 | Mixture-of-Experts, NeRF | Code Available | 1 | 5 |
| LLMCarbon: Modeling the end-to-end Carbon Footprint of Large Language Models | Sep 25, 2023 | GPU, Mixture-of-Experts | Code Available | 1 | 5 |
| LMHaze: Intensity-aware Image Dehazing with a Large-scale Multi-intensity Real Haze Dataset | Oct 21, 2024 | Image Dehazing, Mamba | Code Available | 1 | 5 |
| AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality | Oct 14, 2024 | Mixture-of-Experts, Parameter-Efficient Fine-Tuning | Code Available | 1 | 5 |
| A Time Series is Worth Five Experts: Heterogeneous Mixture of Experts for Traffic Flow Prediction | Sep 26, 2024 | Mixture-of-Experts, Prediction | Code Available | 1 | 5 |
| Enhancing Fast Feed Forward Networks with Load Balancing and a Master Leaf Node | May 27, 2024 | Computational Efficiency, Mixture-of-Experts | Code Available | 1 | 5 |
| LOLA -- An Open-Source Massively Multilingual Large Language Model | Sep 17, 2024 | Diversity, Language Modeling | Code Available | 1 | 5 |
| EWMoE: An effective model for global weather forecasting with mixture-of-experts | May 9, 2024 | Mixture-of-Experts, Weather Forecasting | Code Available | 1 | 5 |
| Emotion-Qwen: Training Hybrid Experts for Unified Emotion and General Vision-Language Understanding | May 10, 2025 | Descriptive, Emotion Recognition | Code Available | 1 | 5 |