| Demystifying the Compression of Mixture-of-Experts Through a Unified Framework | Jun 4, 2024 | Mixture-of-Experts | CodeCode Available | 2 | 5 |
| Learning Robust Stereo Matching in the Wild with Selective Mixture-of-Experts | Jul 7, 2025 | Inductive BiasMixture-of-Experts | CodeCode Available | 2 | 5 |
| Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks | Jan 5, 2024 | Arithmetic ReasoningCode Generation | CodeCode Available | 2 | 5 |
| Delta Decompression for MoE-based LLMs Compression | Feb 24, 2025 | DiversityMixture-of-Experts | CodeCode Available | 2 | 5 |
| LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes | Jan 7, 2025 | Mixture-of-ExpertsRepresentation Learning | CodeCode Available | 2 | 5 |
| Text2Human: Text-Driven Controllable Human Image Generation | May 31, 2022 | DiversityHuman Parsing | CodeCode Available | 2 | 5 |
| Decomposing the Neurons: Activation Sparsity via Mixture of Experts for Continual Test Time Adaptation | May 26, 2024 | feature selectionMixture-of-Experts | CodeCode Available | 2 | 5 |
| Towards a Multimodal Large Language Model with Pixel-Level Insight for Biomedicine | Dec 12, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 | 5 |
| I2MoE: Interpretable Multimodal Interaction-aware Mixture-of-Experts | May 25, 2025 | Mixture-of-Expertsmultimodal interaction | CodeCode Available | 2 | 5 |
| No Language Left Behind: Scaling Human-Centered Machine Translation | Jul 11, 2022 | Machine TranslationMixture-of-Experts | CodeCode Available | 2 | 5 |
| Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models | Feb 22, 2024 | AllMixture-of-Experts | CodeCode Available | 2 | 5 |
| Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional MoEs | Jun 9, 2022 | Image CaptioningImage Classification | CodeCode Available | 2 | 5 |
| Monet: Mixture of Monosemantic Experts for Transformers | Dec 5, 2024 | Dictionary LearningMixture-of-Experts | CodeCode Available | 2 | 5 |
| MoFE-Time: Mixture of Frequency Domain Experts for Time-Series Forecasting Models | Jul 9, 2025 | Mixture-of-ExpertsTime Series | CodeCode Available | 2 | 5 |
| Motion In-Betweening with Phase Manifolds | Aug 24, 2023 | Mixture-of-Expertsmotion in-betweening | CodeCode Available | 2 | 5 |
| Higher Layers Need More LoRA Experts | Feb 13, 2024 | Mixture-of-Experts | CodeCode Available | 2 | 5 |
| Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models | Apr 16, 2024 | image-classificationImage Classification | CodeCode Available | 2 | 5 |
| MoEUT: Mixture-of-Experts Universal Transformers | May 25, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 | 5 |
| Multi-Task Dense Prediction via Mixture of Low-Rank Experts | Mar 26, 2024 | DecoderMixture-of-Experts | CodeCode Available | 2 | 5 |
| Object Detection using Event Camera: A MoE Heat Conduction based Detector and A New Benchmark Dataset | Dec 9, 2024 | Computational EfficiencyMixture-of-Experts | CodeCode Available | 2 | 5 |
| MoE-FFD: Mixture of Experts for Generalized and Parameter-Efficient Face Forgery Detection | Apr 12, 2024 | Mixture-of-Experts | CodeCode Available | 2 | 5 |
| ModuleFormer: Modularity Emerges from Mixture-of-Experts | Jun 7, 2023 | Language ModellingLightweight Deployment | CodeCode Available | 2 | 5 |
| MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks | Jun 7, 2024 | Computational EfficiencyMixture-of-Experts | CodeCode Available | 2 | 5 |
| Aurora:Activating Chinese chat capability for Mixtral-8x7B sparse Mixture-of-Experts through Instruction-Tuning | Dec 22, 2023 | Instruction FollowingMixture-of-Experts | CodeCode Available | 2 | 5 |
| Mixture of Tokens: Continuous MoE through Cross-Example Aggregation | Oct 24, 2023 | Language ModellingLarge Language Model | CodeCode Available | 2 | 5 |
| CNMBERT: A Model for Converting Hanyu Pinyin Abbreviations to Chinese Characters | Nov 18, 2024 | fill-maskFill Mask | CodeCode Available | 2 | 5 |
| CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling | Sep 28, 2024 | image-classificationImage Classification | CodeCode Available | 2 | 5 |
| MiniDrive: More Efficient Vision-Language Models with Multi-Level 2D Features as Text Tokens for Autonomous Driving | Sep 11, 2024 | Autonomous DrivingFeature Engineering | CodeCode Available | 2 | 5 |
| MC-MoE: Mixture Compressor for Mixture-of-Experts LLMs Gains More | Oct 8, 2024 | Mixture-of-ExpertsQuantization | CodeCode Available | 2 | 5 |
| MDFEND: Multi-domain Fake News Detection | Jan 4, 2022 | Fake News DetectionMixture-of-Experts | CodeCode Available | 2 | 5 |
| SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models | Nov 1, 2024 | Mixture-of-Experts | CodeCode Available | 2 | 5 |
| Few-Shot and Continual Learning with Attentive Independent Mechanisms | Jul 29, 2021 | Continual LearningFew-Shot Learning | CodeCode Available | 1 | 5 |
| FineMoGen: Fine-Grained Spatio-Temporal Motion Generation and Editing | Dec 22, 2023 | Mixture-of-ExpertsMotion Generation | CodeCode Available | 1 | 5 |
| M4: Multi-Proxy Multi-Gate Mixture of Experts Network for Multiple Instance Learning in Histopathology Image Analysis | Jul 24, 2024 | Mixture-of-ExpertsMultiple Instance Learning | CodeCode Available | 1 | 5 |
| M^4oE: A Foundation Model for Medical Multimodal Image Segmentation with Mixture of Experts | May 15, 2024 | Image SegmentationMixture-of-Experts | CodeCode Available | 1 | 5 |
| M3-Jepa: Multimodal Alignment via Multi-directional MoE based on the JEPA framework | Sep 9, 2024 | Computational EfficiencyCross-Modal Retrieval | CodeCode Available | 1 | 5 |
| M3oE: Multi-Domain Multi-Task Mixture-of Experts Recommendation Framework | Apr 29, 2024 | AutoMLMixture-of-Experts | CodeCode Available | 1 | 5 |
| Specialized federated learning using a mixture of experts | Oct 5, 2020 | Federated LearningMixture-of-Experts | CodeCode Available | 1 | 5 |
| FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models | May 26, 2025 | Mixture-of-Experts | CodeCode Available | 1 | 5 |
| M^3ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design | Oct 26, 2022 | Mixture-of-ExpertsMulti-Task Learning | CodeCode Available | 1 | 5 |
| Making Neural Networks Interpretable with Attribution: Application to Implicit Signals Prediction | Aug 26, 2020 | Interpretable Machine LearningMixture-of-Experts | CodeCode Available | 1 | 5 |
| Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference | Jan 16, 2024 | GPUMixture-of-Experts | CodeCode Available | 1 | 5 |
| LMHaze: Intensity-aware Image Dehazing with a Large-scale Multi-intensity Real Haze Dataset | Oct 21, 2024 | Image DehazingMamba | CodeCode Available | 1 | 5 |
| LOLA -- An Open-Source Massively Multilingual Large Language Model | Sep 17, 2024 | DiversityLanguage Modeling | CodeCode Available | 1 | 5 |
| AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality | Oct 14, 2024 | Mixture-of-Expertsparameter-efficient fine-tuning | CodeCode Available | 1 | 5 |
| A Time Series is Worth Five Experts: Heterogeneous Mixture of Experts for Traffic Flow Prediction | Sep 26, 2024 | Mixture-of-ExpertsPrediction | CodeCode Available | 1 | 5 |
| LLMCarbon: Modeling the end-to-end Carbon Footprint of Large Language Models | Sep 25, 2023 | GPUMixture-of-Experts | CodeCode Available | 1 | 5 |
| Long-Tailed Visual Recognition via Self-Heterogeneous Integration with Knowledge Excavation | Apr 3, 2023 | Mixture-of-ExpertsTransfer Learning | CodeCode Available | 1 | 5 |
| EWMoE: An effective model for global weather forecasting with mixture-of-experts | May 9, 2024 | Mixture-of-ExpertsWeather Forecasting | CodeCode Available | 1 | 5 |
| Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark | Jun 12, 2024 | BenchmarkingMixture-of-Experts | CodeCode Available | 1 | 5 |