| M3oE: Multi-Domain Multi-Task Mixture-of Experts Recommendation Framework | Apr 29, 2024 | AutoML, Mixture-of-Experts | Code Available | 1 |
| Condense, Don't Just Prune: Enhancing Efficiency and Performance in MoE Layer Pruning | Nov 26, 2024 | Mixture-of-Experts | Code Available | 1 |
| ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model | Feb 20, 2025 | Mixture-of-Experts, Question Answering | Code Available | 1 |
| Distilling the Knowledge in a Neural Network | Mar 9, 2015 | Knowledge Distillation, Mixture-of-Experts | Code Available | 1 |
| MoCaE: Mixture of Calibrated Experts Significantly Improves Object Detection | Sep 26, 2023 | Instance Segmentation, Mixture-of-Experts | Code Available | 1 |
| Modality Interactive Mixture-of-Experts for Fake News Detection | Jan 21, 2025 | Fake News Detection, Misinformation | Code Available | 1 |
| M^3ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design | Oct 26, 2022 | Mixture-of-Experts, Multi-Task Learning | Code Available | 1 |
| Mastering Massive Multi-Task Reinforcement Learning via Mixture-of-Expert Decision Transformer | May 30, 2025 | Mixture-of-Experts | Code Available | 1 |
| MiCE: Mixture of Contrastive Experts for Unsupervised Image Clustering | May 5, 2021 | Clustering, Contrastive Learning | Code Available | 1 |
| LLMBind: A Unified Modality-Task Integration Framework | Feb 22, 2024 | AI Agent, Audio Generation | Code Available | 1 |
| LITE: Modeling Environmental Ecosystems with Multimodal Large Language Models | Apr 1, 2024 | Decision Making, Language Modeling | Code Available | 1 |
| DirectMultiStep: Direct Route Generation for Multi-Step Retrosynthesis | May 22, 2024 | Diversity, Mixture-of-Experts | Code Available | 1 |
| LLMCarbon: Modeling the end-to-end Carbon Footprint of Large Language Models | Sep 25, 2023 | GPU, Mixture-of-Experts | Code Available | 1 |
| Contrastive Learning and Mixture of Experts Enables Precise Vector Embeddings | Jan 28, 2024 | Contrastive Learning, Descriptive | Code Available | 1 |
| Lifting the Curse of Capacity Gap in Distilling Language Models | May 20, 2023 | Knowledge Distillation, Mixture-of-Experts | Code Available | 1 |
| MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts | Oct 18, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in Large Language Models | Nov 1, 2024 | Benchmarking, Mixture-of-Experts | Code Available | 1 |
| Multi-Head Mixture-of-Experts | Apr 23, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Addressing Confounding Feature Issue for Causal Recommendation | May 13, 2022 | Mixture-of-Experts, Recommendation Systems | Code Available | 1 |
| C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing | Apr 10, 2025 | In-Context Learning, Mixture-of-Experts | Code Available | 1 |
| LMHaze: Intensity-aware Image Dehazing with a Large-scale Multi-intensity Real Haze Dataset | Oct 21, 2024 | Image Dehazing, Mamba | Code Available | 1 |
| Multi-Task Reinforcement Learning with Mixture of Orthogonal Experts | Nov 19, 2023 | Diversity, Mixture-of-Experts | Code Available | 1 |
| Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts | Feb 10, 2020 | Language Modelling, Mixture-of-Experts | Code Available | 1 |
| Layerwise Recurrent Router for Mixture-of-Experts | Aug 13, 2024 | Attribute, Mixture-of-Experts | Code Available | 1 |
| Large Multi-modality Model Assisted AI-Generated Image Quality Assessment | Apr 27, 2024 | Image Quality Assessment, Mixture-of-Experts | Code Available | 1 |
| Learning Soccer Juggling Skills with Layer-wise Mixture-of-Experts | Jul 24, 2022 | Deep Reinforcement Learning, Humanoid Control | Code Available | 1 |
| RetGen: A Joint framework for Retrieval and Grounded Text Generation Modeling | May 14, 2021 | Dialogue Generation, Language Modeling | Code Available | 1 |
| Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE | Feb 10, 2025 | Diversity, Language Modeling | Code Available | 1 |
| JanusDNA: A Powerful Bi-directional Hybrid DNA Foundation Model | May 22, 2025 | GPU, Long-range modeling | Code Available | 1 |
| Learning to Skip the Middle Layers of Transformers | Jun 26, 2025 | Mixture-of-Experts | Code Available | 1 |
| LOLA -- An Open-Source Massively Multilingual Large Language Model | Sep 17, 2024 | Diversity, Language Modeling | Code Available | 1 |
| A Time Series is Worth Five Experts: Heterogeneous Mixture of Experts for Traffic Flow Prediction | Sep 26, 2024 | Mixture-of-Experts, Prediction | Code Available | 1 |
| AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality | Oct 14, 2024 | Mixture-of-Experts, parameter-efficient fine-tuning | Code Available | 1 |
| Dynamic Language Group-Based MoE: Enhancing Code-Switching Speech Recognition with Hierarchical Routing | Jul 26, 2024 | Attribute, Language Modelling | Code Available | 1 |
| MiLo: Efficient Quantized MoE Inference with Mixture of Low-Rank Compensators | Apr 3, 2025 | Mixture-of-Experts, Quantization | Code Available | 1 |
| Patcher: Patch Transformers with Mixture of Experts for Precise Medical Image Segmentation | Jun 3, 2022 | Decoder, Image Segmentation | Code Available | 1 |
| HyperMoE: Towards Better Mixture of Experts via Transferring Among Experts | Feb 20, 2024 | Mixture-of-Experts, Multi-Task Learning | Code Available | 1 |
| PFL-MoE: Personalized Federated Learning Based on Mixture of Experts | Dec 31, 2020 | Decision Making, Federated Learning | Code Available | 1 |
| HyperFormer: Enhancing Entity and Relation Interaction for Hyper-Relational Knowledge Graph Completion | Aug 12, 2023 | Attribute, Knowledge Graph Completion | Code Available | 1 |
| HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts | Dec 12, 2023 | Mixture-of-Experts | Code Available | 1 |
| BrainMAP: Learning Multiple Activation Pathways in Brain Networks | Dec 23, 2024 | Mamba, Mixture-of-Experts | Code Available | 1 |
| HydraSum: Disentangling Stylistic Features in Text Summarization using Multi-Decoder Models | Oct 8, 2021 | Abstractive Text Summarization, Decoder | Code Available | 1 |
| Hierarchical Time-Aware Mixture of Experts for Multi-Modal Sequential Recommendation | Jan 24, 2025 | Contrastive Learning, Mixture-of-Experts | Code Available | 1 |
| Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach | Oct 18, 2023 | Blind Super-Resolution, Decoder | Code Available | 1 |
| Graph Sparsification via Mixture of Graphs | May 23, 2024 | Graph Learning, Mixture-of-Experts | Code Available | 1 |
| EvoMoE: An Evolutional Mixture-of-Experts Training Framework via Dense-To-Sparse Gate | Dec 29, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| BiMediX: Bilingual Medical Mixture of Experts LLM | Feb 20, 2024 | Mixture-of-Experts, Multiple-choice | Code Available | 1 |
| GraphMETRO: Mitigating Complex Graph Distribution Shifts via Mixture of Aligned Experts | Dec 7, 2023 | Diversity, Graph Neural Network | Code Available | 1 |
| Heterogeneous Mixture of Experts for Remote Sensing Image Super-Resolution | Feb 12, 2025 | Image Super-Resolution, Mixture-of-Experts | Code Available | 1 |
| GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation | Oct 15, 2024 | Explainable Recommendation, Language Modelling | Code Available | 1 |